Midjourney's new research: Improving LLMs' creative text generation capabilities

Author: LoRA Time: 25 Mar 2025

Midjourney recently collaborated with New York University to publish a study on training text-generating large language models (LLMs). The research aims to improve LLMs' creative writing ability so that they produce more varied text. Midjourney proposed two new techniques, "Diversified Direct Preference Optimization" (DDPO) and "Diversified Odds Ratio Preference Optimization" (DORPO), to widen the range of text that AI models generate so that output is more diverse while remaining coherent and readable.

Researchers note that current LLMs perform well in areas such as factual Q&A or code assistance, where a single best answer usually exists. Creative writing, by contrast, is open-ended, so there should be multiple valid responses to the same prompt. However, instruction-tuned LLMs tend to converge on similar storylines and themes. To address this problem, Midjourney's research team modified existing preference optimization methods and introduced DDPO and DORPO. The core of both techniques is to guide model training using "deviation", the degree to which one response differs from other responses to the same prompt.
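The idea of weighting training by deviation can be illustrated with a minimal sketch. It assumes that deviation is the mean cosine distance between one response's embedding and the embeddings of the other responses to the same prompt, and that this value simply scales the standard DPO loss of the preferred response. The function names, the distance measure, and the default `beta` are illustrative assumptions, not the paper's code, and the exact formulation in the paper may differ.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def deviations(embeddings):
    """Assumed deviation: mean distance from each response to the
    other responses written for the same prompt (needs >= 2 responses)."""
    n = len(embeddings)
    return [
        sum(cosine_distance(embeddings[i], embeddings[j])
            for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair: -log(sigmoid(margin))."""
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically stable softplus(-margin), equal to -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def deviation_weighted_dpo_loss(policy_chosen_logp, policy_rejected_logp,
                                ref_chosen_logp, ref_rejected_logp,
                                chosen_deviation, beta=0.1):
    """Hypothetical diversified variant: scale the pair's DPO loss by the
    deviation of the preferred response, so rarer responses count more."""
    return chosen_deviation * dpo_loss(
        policy_chosen_logp, policy_rejected_logp,
        ref_chosen_logp, ref_rejected_logp, beta)


if __name__ == "__main__":
    # Toy embeddings: two near-identical responses and one outlier.
    devs = deviations([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
    print(devs)
    print(deviation_weighted_dpo_loss(-10.0, -12.0, -11.0, -11.0, devs[2]))
```

In this toy setup the outlier response receives the largest deviation, so a preference pair in which it is the chosen response contributes more to the total loss than a pair built on one of the two near-duplicates.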

Experimental results show that DDPO produces significantly more diverse output than standard DPO while maintaining output quality. Llama-3.1-8B trained with DDPO strikes the best balance between quality and diversity, and its responses are more diverse than GPT-4o's while remaining coherent. Even when the training dataset is reduced in size, the DDPO model retains a measure of diversity.

This research has practical significance for companies that use AI to generate creative text. In fields such as marketing copywriting, corporate storytelling, and film, television, and game scriptwriting, improving the diversity and quality of AI-generated content is crucial. Midjourney's research offers a new approach to this problem.

Promising directions for future research include integrating deviation-based learning methods into enterprise AI models to increase response diversity in customer-facing applications; applying these methods to other generation tasks such as poetry, scriptwriting, or game stories; and developing hybrid training methods that balance diversity with instruction following.

Midjourney's research team plans to release its code, which should be a valuable resource for developers who want to apply these techniques. By adopting them, AI teams can hope to move beyond rigid, formulaic output and build AI systems that are not only intelligent but also truly imaginative.

Paper: https://huggingface.co/papers/2503.17126
