Xiaomi MiMo

Xiaomi MiMo is a family of large language models (LLMs) developed by Xiaomi. Its first model, MiMo-7B, has 7 billion parameters and was launched in April 2025. MiMo connects billions of IoT devices in Xiaomi's "Human × Car × Home" ecosystem, integrating its smartphones, electric vehicles and home appliances into a unified, context-aware network.

Development

MiMo was developed by Xiaomi as a language model focused on reasoning. Its development team was led by Luo Fuli, who worked at DeepSeek before joining Xiaomi in late 2025. The model was trained using multi-token prediction and reinforcement learning, with a particular emphasis on mathematical reasoning and code generation tasks. In March 2026, Xiaomi CEO Lei Jun announced that the company planned to invest at least US$8.7 billion in artificial intelligence over the following three years.

Models

MiMo-7B

MiMo-7B is the first model in the MiMo family. The base model, MiMo-7B-Base, was pre-trained on approximately 25 trillion tokens drawn from web pages, academic papers, books, and synthetic reasoning data. MiMo-7B-RL underwent supervised fine-tuning and reinforcement learning on 130,000 mathematics and code problems.

MiMo-7B-RL-0530, released in May 2025, scaled the fine-tuning dataset from 500,000 to 6 million instances and extended the RL window from 32,000 to 48,000 tokens. These changes improved its AIME 2024 score from 68.2 to 80.1.

MiMo-VL-7B is a vision-language model combining a Vision Transformer encoder with the MiMo-7B backbone. It was trained in four stages consuming 2.4 trillion tokens. Its reinforcement learning variant used Mixed On-Policy Reinforcement Learning (MORL), which integrated reward signals across perception, grounding, and reasoning. Xiaomi also released MiMo-Audio-7B, an audio-language model for voice conversion, style transfer, and speech editing.

MiMo-V2-Flash

MiMo-V2-Flash was launched in December 2025. It is an open-source Mixture-of-Experts model with 309 billion total parameters, of which 15 billion are active, and was trained on 27 trillion tokens using FP8 mixed precision. It uses hybrid attention that interleaves Sliding Window Attention and Global Attention layers at a 5:1 ratio.
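
The 5:1 interleaving and the sparse activation described above can be illustrated with a minimal sketch. This is not Xiaomi's implementation: the function names, the layer count, the window size, and the choice to place the global layer last in each group of six are all assumptions made for illustration. Only the 5:1 ratio and the parameter counts come from the text.

```python
from typing import List, Optional


def layer_schedule(num_layers: int, swa_per_global: int = 5) -> List[str]:
    """Assign an attention type to each layer.

    Assumption: every group holds `swa_per_global` sliding-window layers
    followed by one global layer. The real ordering is not stated in the text.
    """
    period = swa_per_global + 1
    return [
        "global" if (i + 1) % period == 0 else "sliding"
        for i in range(num_layers)
    ]


def causal_mask(seq_len: int, window: Optional[int] = None) -> List[List[bool]]:
    """Build a causal mask where mask[q][k] is True if query q may attend to key k.

    window=None gives full causal (global) attention; otherwise each query
    sees only itself and the previous `window - 1` positions.
    """
    return [
        [k <= q and (window is None or q - k < window) for k in range(seq_len)]
        for q in range(seq_len)
    ]


def active_fraction(total_params: float, active_params: float) -> float:
    """Share of parameters used per token in a Mixture-of-Experts model."""
    return active_params / total_params


if __name__ == "__main__":
    # Hypothetical 12-layer stack: 10 sliding-window layers, 2 global layers.
    print(layer_schedule(12))
    # Hypothetical window of 2 tokens over a 4-token sequence.
    print(causal_mask(4, window=2)[3])  # [False, False, True, True]
    # 15 billion active out of 309 billion total, roughly 4.9%.
    print(round(active_fraction(309e9, 15e9), 4))  # 0.0485
```

Under these assumptions, most layers pay attention cost proportional to the window size rather than the full sequence length, while the occasional global layer keeps long-range information reachable.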

MiMo-V2-Pro

Xiaomi publicly introduced MiMo-V2-Pro on 18 March 2026. It has over 1 trillion total parameters, of which 42 billion are active, and a 1-million-token context window. Before the official release, the model had appeared anonymously on OpenRouter under the codename "Hunter Alpha", where it drew substantial usage and topped daily charts for several days, according to Xiaomi and Reuters. During its listing on OpenRouter, the model reportedly processed over one trillion tokens. Xiaomi later said Hunter Alpha was an early internal test build of MiMo-V2-Pro, and Reuters reported that some users had mistaken the model for a possible DeepSeek system before Xiaomi confirmed its origin.

The model was released as a proprietary API product, and Luo Fuli stated that Xiaomi intended to open-source a variant at an unspecified future date. Xiaomi partnered with several API platforms, such as OpenClaw, to launch the model. These platforms initially offered a one-week free trial of the model; citing an overwhelming response, Xiaomi later extended the free trial until 2 April 2026.

MiMo-V2-Omni

Alongside MiMo-V2-Pro, Xiaomi launched MiMo-V2-Omni on 18 March 2026. It handles image, video, audio, and text inputs. Before the official release, it appeared on OpenRouter under the codename "Healer Alpha".

MiMo-V2-TTS

MiMo-V2-TTS, a text-to-speech model, was released on the same date as MiMo-V2-Pro and MiMo-V2-Omni. It is a speech synthesis model capable of emotional transitions, mid-sentence tone shifts, singing, and the synthesis of regional dialects such as Sichuan, Cantonese, Henan, and Taiwanese.
