A team of researchers has introduced Light-R1-32B, a new open-source model optimized to solve advanced mathematical problems. It is available on Hugging Face under a permissive Apache 2.0 license, free for companies and researchers to take, deploy, fine-tune, or modify as they wish, even for commercial purposes.
The 32-billion-parameter model surpasses the performance of similarly sized (and even larger) open-source models such as DeepSeek-R1-Distill-Llama-70B and DeepSeek-R1-Distill-Qwen-32B on the third-party American Invitational Mathematics Examination (AIME) benchmark, which contains 15 math problems designed for extremely advanced students and gives human test-takers a time limit of 3 hours.
Developed by Liang Wen, Fenrui Xiao, Xin He, Yunke Cai, Qi An, Zhenyu Duan, Yimin Du, Junchen Liu, Lifu Tang, Xiaowei Lv, Haosheng Zou, Yongchao Deng, Shousheng Jia, and Xiangzheng Zhang, the model surpasses previous open-source alternatives on competition-level math benchmarks.
Incredibly, the researchers completed the model's training in fewer than six hours on 12 Nvidia H800 GPUs, at an estimated total cost of $1,000 (roughly 72 GPU-hours, or about $14 per GPU-hour). This makes Light-R1-32B one of the most accessible and practical approaches for developing specialized math AI models. However, it is important to remember that the model was trained on a variant of Alibaba's open-source Qwen2.5-32B-Instruct, which itself is presumed to have had much higher upfront training costs.
In addition to the model, the team has released its training datasets, training scripts, and evaluation tools, offering a transparent and accessible framework for building math-focused AI models.
The arrival of Light-R1-32B follows similar efforts from rivals, such as Microsoft with its Orca-Math series.
A new mathematical king emerges
Light-R1-32B is designed to tackle complex mathematical reasoning, particularly on the AIME (American Invitational Mathematics Examination) benchmarks.
It was trained from Qwen2.5-32B-Instruct, starting from a model without long chain-of-thought (CoT) reasoning. The team applied curriculum-based supervised fine-tuning (SFT) and direct preference optimization (DPO) to refine its problem-solving capabilities.
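To make the DPO step concrete, here is a minimal PyTorch sketch of the standard DPO loss (Rafailov et al., 2023). This illustrates the general technique, not the team's released training code; the inputs are assumed to be summed per-response log-probabilities from the trainable policy and a frozen reference model.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct preference optimization loss over preference pairs.

    Each tensor holds the summed log-probability of a full response under
    either the trainable policy or the frozen reference model. `beta`
    controls how far the policy may drift from the reference.
    """
    # Implicit rewards: log-ratio of policy vs. reference likelihoods
    chosen_rewards = policy_chosen_logps - ref_chosen_logps
    rejected_rewards = policy_rejected_logps - ref_rejected_logps
    # Push the preferred response's reward above the rejected one's
    return -F.logsigmoid(beta * (chosen_rewards - rejected_rewards)).mean()

# Toy usage with made-up log-probabilities for two preference pairs
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-15.0, -11.0]),
                torch.tensor([-13.0, -10.0]), torch.tensor([-14.0, -10.5]))
```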
When evaluated, Light-R1-32B scored 76.6 on AIME24 and 64.6 on AIME25, surpassing DeepSeek-R1-Distill-Qwen-32B, which scored 72.6 and 54.9, respectively.
This improvement suggests that the curriculum-based training approach effectively strengthens mathematical reasoning, even when training starts from models that initially lack long CoT.
Fair benchmarking
To ensure fair benchmarking, the team decontaminated its training data against common reasoning benchmarks, including AIME24/25, MATH-500 and GPQA Diamond, preventing data leakage.
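The article does not spell out the matching rule used for decontamination. A common approach, sketched below purely as an illustration, is to drop any training example that shares a long word n-gram with a benchmark problem; the dataset variables and the `question` field are hypothetical names.

```python
import re

def ngrams(text, n=13):
    """Lowercase the text, strip punctuation, and return its word n-grams."""
    words = re.sub(r"[^a-z0-9\s]", " ", text.lower()).split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def decontaminate(train_examples, benchmark_problems, n=13):
    """Drop training examples that share any n-gram with a benchmark item."""
    bench_grams = set()
    for problem in benchmark_problems:
        bench_grams |= ngrams(problem, n)
    return [ex for ex in train_examples
            if not ngrams(ex["question"], n) & bench_grams]
```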
They also implemented difficulty-based response filtering using DeepScaleR-1.5B-Preview, ultimately forming a dataset of 76,000 examples for the first stage of supervised fine-tuning. A second, more difficult dataset of 3,000 examples further improved performance.
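How exactly the smaller model grades difficulty is not detailed here. One plausible reading, sketched under that assumption, is to sample DeepScaleR-1.5B-Preview several times per problem and keep only the problems it rarely solves as candidates for the harder second stage; `sample_answer` and `is_correct` are hypothetical helpers.

```python
def pass_rate(problem, sample_answer, is_correct, k=8):
    """Fraction of k sampled answers that are correct for one problem."""
    hits = sum(is_correct(problem, sample_answer(problem)) for _ in range(k))
    return hits / k

def select_hard_problems(problems, sample_answer, is_correct,
                         max_pass_rate=0.25, k=8):
    """Keep problems the weak model rarely solves (stage-two candidates)."""
    return [p for p in problems
            if pass_rate(p, sample_answer, is_correct, k) <= max_pass_rate]
```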
After training, the team merged multiple trained versions of Light-R1-32B, which resulted in additional gains. Notably, the model maintains strong generalization on scientific reasoning tasks (GPQA) despite being specialized for math.
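The merging method is not described in this article. A common technique for combining same-architecture checkpoints is uniform weight averaging (a "model soup"); the sketch below assumes that approach and uses hypothetical checkpoint file names.

```python
import torch

def average_checkpoints(paths):
    """Uniformly average the weights of same-architecture checkpoints."""
    state_dicts = [torch.load(p, map_location="cpu") for p in paths]
    merged = {}
    for key in state_dicts[0]:
        stacked = torch.stack([sd[key].float() for sd in state_dicts])
        merged[key] = stacked.mean(dim=0)
    return merged

# Hypothetical usage: merge an SFT checkpoint with two DPO checkpoints
# merged = average_checkpoints(["sft_stage2.pt", "dpo_a.pt", "dpo_b.pt"])
# torch.save(merged, "light-r1-32b-merged.pt")
```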
How companies can benefit
Light-R1-32B is released under the Apache 2.0 license, a permissive open-source license that allows free use, modification, and commercial deployment without requiring that derivative works be open-sourced.
This makes it an attractive option for enterprises, AI developers, and software engineers looking to integrate or customize the model for proprietary applications.
The license also includes a royalty-free global patent grant, reducing legal risks for businesses while discouraging patent disputes. Companies can freely deploy Light-R1-32B in commercial products, maintaining full control over their innovations while benefiting from an open and transparent AI ecosystem.
For CEOs, CTOs and IT leaders, Apache 2.0 ensures cost efficiency and vendor independence, eliminating licensing fees and restrictive dependencies on proprietary solutions. AI developers and engineers gain the flexibility to fine-tune, integrate, and extend the model without restrictions, making it ideal for specialized math reasoning, research, and enterprise AI applications. However, as the license provides no warranty or liability coverage, organizations should conduct their own security, compliance, and performance assessments before deploying Light-R1-32B in critical environments.
Transparency in low-cost training and optimization for mathematical problem-solving
The researchers point out that Light-R1-32B provides a validated, cost-effective way to train strong chain-of-thought models in specialized domains.
By sharing their methodology, training data, and code, they aim to lower the cost barriers to high-performance AI development.
Future work includes exploring reinforcement learning (RL) to further improve the model's reasoning capabilities.