Mistral AI releases Saba: AI models focusing on Middle East and Southeast Asian languages
Author: LoRATime: 18 Feb 2025485
Mistral AI recently launched a new language model called Saba, which focuses on improving understanding of language and cultural differences in the Middle East and Southeast Asia.
The Saba model has 24 billion parameters, and while smaller than many competitors, Mistral AI claims it provides higher speeds and lower costs while ensuring accuracy. Its architecture may be similar to the Mistral Small3 model. Saba is capable of running efficiently on low-performance systems, and even at a single GPU setup that can achieve speeds of more than 150 tokens per second.
The model is particularly good at dealing with Arabic and Hindi, including South Hindi, such as Tamil and Malayalam. Mistral AI benchmarks show that Saba excels in Arabic while maintaining comparable abilities to English.
Saba has been applied in real-world scenarios, including Arabic virtual assistants and dedicated tools in the energy, financial markets and healthcare sectors. Its understanding of local idioms and cultural references enables it to effectively generate content in a specific area.
Users can access Saba through paid APIs or local deployments. Like other models of Mistral AI, Saba is not an open source model.
Mistral's benchmark test shows that Saba performs well in Arabic and has comparable English skills | Source: Mistral AI
The launch of Saba reflects the AI field's attention to the needs of language models in specific regions. Similar research is being conducted by other organizations such as the OpenGPT-X project (release of Teuken-7B model), OpenAI (developing a Japanese-specific GPT-4 model) and the EuroLingua project (focusing on European languages).
Traditional large language models mainly rely on a large number of English text data sets for training, and it is easy to ignore the nuances of specific languages. Saba aims to fill this gap and provide more accurate and more language processing capabilities that are in line with the local cultural context.