Meta executives are obsessed with surpassing GPT-4, and training data faces copyright disputes!

Author: LoRA Time: 15 Jan 2025

As the AI copyright case Kadrey v. Meta moves forward, the court has unsealed internal Meta communications. They show that company executives were fixated on surpassing OpenAI's GPT-4 model while developing Llama 3.

Ahmad Al-Dahle, Meta’s vice president of generative AI, wrote in an October 2023 message: “Honestly, our goal has to be GPT-4. We have 64,000 GPUs! We need to learn how to build cutting-edge technology to win this competition.”


Although Meta releases open AI models, the company's AI leadership was apparently more focused on competitors that do not disclose their model weights, such as Anthropic and OpenAI, and treated their Claude and GPT-4 models as the benchmarks to beat. The French AI startup Mistral comes up many times in the messages, but Meta executives appear to have been quite dismissive of it. "Mistral is a piece of cake for us and we should be able to do better," Al-Dahle said in one message.

Major AI companies are racing to launch advanced models, and these court documents show how much pressure Meta felt in that race. In multiple messages, Meta's AI leaders said they were "very active" in getting the data needed to train Llama. One executive even said: "Llama3 is the only thing I care about." They also discussed how to improve the dataset to raise Llama 3's performance.

However, the plaintiffs in the case allege that Meta executives may have cut corners on data use in their rush to launch AI models, including training on copyrighted books. Meta researcher Hugo Touvron said that the dataset mix used for Llama 2 “didn’t work well” and discussed how Llama 3 could be improved with better data sources. Al-Dahle asked: "Do we have the right data set? Is there anything that we can't use for stupid reasons?"

Meta CEO Mark Zuckerberg has previously said he is working to narrow the performance gap between the Llama models and closed-source models from OpenAI, Google and other companies. The internal messages reveal the intense pressure inside Meta to reach that goal. Zuckerberg wrote in a July 2024 letter: "This year, Llama3 is competitive among the most advanced models and leads in some areas."

Meta finally released Llama 3 in April 2024. The open AI model performed well against its competitors, surpassing the open options from Mistral. But the data used to train it, which Zuckerberg allegedly approved for use, is now under scrutiny in multiple lawsuits.
