
Moonshot AI decrypts o1: Long-CoT is the key, and model thinking needs to play the long game

Author: LoRA · 18 Feb 2025

Flood Sung, a researcher at Moonshot AI, recently published a 10,000-word article that discloses, for the first time, the research and development thinking behind the k1.5 model and reflects in depth on the technical implications of OpenAI's o1 model.

According to Flood Sung, the importance of Long-CoT (long chain-of-thought) was verified more than a year ago by Tim Zhou Xinyu, a co-founder of Moonshot AI: training a small model on multi-digit arithmetic, and converting the fine-grained calculation steps into long chain-of-thought data for SFT (supervised fine-tuning), already produced significant results.
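The article does not publish the actual data pipeline, but the idea it describes can be sketched as follows: take a multi-digit addition problem, unroll the column-by-column carry procedure into explicit reasoning text, and pair it with the prompt as one SFT sample. Everything below (function names, prompt wording, sample format) is an illustrative assumption, not Moonshot AI's implementation.

```python
import random

def addition_cot(a: int, b: int) -> dict:
    """Build one hypothetical SFT sample: a multi-digit addition problem
    paired with a long chain-of-thought spelling out every carry step."""
    da, db = str(a)[::-1], str(b)[::-1]          # digits, least-significant first
    steps, carry = [], 0
    for i in range(max(len(da), len(db))):
        x = int(da[i]) if i < len(da) else 0
        y = int(db[i]) if i < len(db) else 0
        s = x + y + carry
        steps.append(f"column {i}: {x} + {y} + carry {carry} = {s}, "
                     f"write {s % 10}, carry over {s // 10}")
        carry = s // 10
    if carry:
        steps.append(f"final carry {carry} becomes the leading digit")
    return {
        "prompt": f"What is {a} + {b}? Think step by step.",
        "response": "\n".join(steps) + f"\nAnswer: {a + b}",
    }

# Generate a small synthetic long-CoT dataset for supervised fine-tuning.
random.seed(0)
dataset = [addition_cot(random.randint(100, 999), random.randint(100, 999))
           for _ in range(1000)]
print(dataset[0]["prompt"])
print(dataset[0]["response"])
```

The point of the exercise is that the model is fine-tuned on the *process*, not just the final answer, which is what makes the resulting traces "long-chain" thinking data.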


However, for cost reasons, Moonshot AI had previously concentrated on optimizing Long Context (long text input). Flood Sung explained that Long Context mainly concerns the input side: with the help of prefill and the Mooncake serving technology, cost and speed can be kept under control. Long-CoT, by contrast, concerns the output side, which is inherently more expensive and slower to process.
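The cost asymmetry the article points to comes from how inference works: input tokens are processed in parallel during prefill, while output tokens must be decoded one at a time. A toy latency model makes this concrete; the throughput figures below are made-up illustrative numbers, not measurements of any real system.

```python
def estimated_latency(input_tokens: int, output_tokens: int,
                      prefill_tps: float = 10_000.0,
                      decode_tps: float = 50.0) -> float:
    """Toy model: prefill is parallel and fast per token, while decoding
    is sequential (one forward pass per generated token), hence slow."""
    return input_tokens / prefill_tps + output_tokens / decode_tps

# Long Context workload: huge input, short answer.
long_context = estimated_latency(input_tokens=100_000, output_tokens=500)

# Long-CoT workload: modest input, very long reasoning trace as output.
long_cot = estimated_latency(input_tokens=2_000, output_tokens=20_000)

print(f"long-context request: ~{long_context:.0f}s, long-CoT request: ~{long_cot:.0f}s")
```

Even with these rough assumptions, the long-CoT request is far slower despite touching fewer total tokens, which is why shifting focus from input length to output length changes the cost structure.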

But the release of OpenAI's o1 made the team rethink its technical priorities. "Performance is what matters most," said Flood Sung. "Cost and speed will keep improving as the technology advances; the key is to achieve breakthrough performance first." On this basis, Moonshot AI has thrown itself fully into Long-CoT research, aiming to give models a free-thinking ability closer to that of humans.

The release of this technical deep-dive marks the point at which Moonshot AI began to systematically benchmark against the o1 model and carry out substantive research in the field.

Flood Sung's full article on decrypting o1: https://mp.weixin.qq.com/s/sJmT-tM3A-mglZ1d4OI80A