In recent years, training large language models (LLMs) has grown so expensive and complex that only a handful of large technology companies can afford the necessary computing resources. Google, however, recently introduced a method called SALT (Small model Aided Large model Training), an innovation that may reshape the landscape of AI training.
According to a recent research paper from Google Research and DeepMind, "A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs," SALT introduces a two-stage training process that is not only more efficient but also more practical than conventional pretraining.
The first stage of SALT is knowledge distillation. Here, a small language model (SLM) acts as a teacher, passing its understanding on to the larger model. The small model shares what it has learned through "soft labels" (probability distributions over next tokens rather than single hard answers), helping the large model master basic concepts early in training. This stage is especially effective on "easy" examples, where the small model makes predictions with high confidence.
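To make the soft-label idea concrete, here is a minimal sketch of a distillation loss: the student is trained to match the teacher's full probability distribution rather than a single correct token. The function names and the temperature value are illustrative assumptions, not the paper's exact formulation, and a real implementation would operate on tensors of logits from both models.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution at a given temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's distribution against the teacher's soft labels.

    A higher temperature softens both distributions, exposing the teacher's
    relative preferences among wrong answers, not just its top pick.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))

# Hypothetical next-token logits: the teacher is confident in token 0.
teacher = [4.0, 1.0, 0.5]
aligned_student = [3.5, 1.2, 0.4]   # already mirrors the teacher
confused_student = [0.5, 1.0, 4.0]  # prefers a different token

# The loss is lower when the student's distribution matches the teacher's.
assert distillation_loss(teacher, aligned_student) < distillation_loss(teacher, confused_student)
```

The key property shown here is that soft labels carry more signal than hard labels: the student is rewarded for matching the teacher's entire distribution, which is where a confident small model can transfer "easy" knowledge cheaply.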
The second stage is self-supervised learning. Here the large model begins to learn independently, focusing on the more complex patterns and challenging examples that exceed the small model's ability. The handoff between stages relies on carefully designed schedules, such as linear decay and linear ratio decay of the distillation weight, which let the large model transition smoothly while gradually reducing its dependence on the small model.
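A linear-decay handoff can be sketched as a blended training objective whose distillation weight shrinks to zero over the first stage. The function names and the two-term mixing form are illustrative assumptions; the paper's exact schedules may differ in detail.

```python
def kd_weight(step, kd_steps):
    """Linearly decay the distillation weight from 1.0 to 0.0 over the first stage."""
    if step >= kd_steps:
        return 0.0  # stage two: pure self-supervised learning
    return 1.0 - step / kd_steps

def blended_loss(distill_loss, self_supervised_loss, step, kd_steps):
    """Mix the teacher-guided loss with the model's own self-supervised loss.

    Early on the teacher dominates; by the end of the schedule the large
    model trains entirely on its own objective.
    """
    w = kd_weight(step, kd_steps)
    return w * distill_loss + (1.0 - w) * self_supervised_loss

# Hypothetical loss values at three points in a 1000-step distillation stage:
print(blended_loss(2.0, 4.0, step=0, kd_steps=1000))    # all teacher: 2.0
print(blended_loss(2.0, 4.0, step=500, kd_steps=1000))  # even mix: 3.0
print(blended_loss(2.0, 4.0, step=1000, kd_steps=1000)) # all self-supervised: 4.0
```

The design point is that the transition is gradual rather than a hard switch, so the large model is never abruptly cut off from the teacher's guidance.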
In their experiments, Google researchers found that using a 1.5-billion-parameter small model to help train a 2.8-billion-parameter large model cut training time on the Pile dataset by 28%. After fine-tuning, the large model's accuracy on math problems rose from 31.84% to 34.87%, and its reading-comprehension accuracy rose from 63.7% to 67%. The method therefore improves training efficiency and delivers measurable gains in performance at the same time.
SALT is expected to lower the barrier to entry for AI development, allowing smaller research institutions and companies previously constrained by resources to participate in building AI models. Broader access to model development could, in turn, lead to more specialized and distinctive AI solutions, driving innovation and new applications across related fields.