PhotoDoodle AI: Turn photos into works of art from just a few examples

Author: LoRA | Time: 26 Feb 2025

PhotoDoodle, a new AI image editing system from a research team in China and Singapore that includes researchers at ByteDance, is changing how we think about image creation. Built on the Flux.1 model, it learns an artistic style from a small number of samples and accurately executes specific editing instructions, opening up new possibilities for creative expression.

Based on Flux.1

At the heart of PhotoDoodle is OmniEditor, a system the research team built first by adapting the Flux.1 image generation model from German startup Black Forest Labs with LoRA (low-rank adaptation). This approach does not require retraining all of the original model's weights. Instead, it adds small dedicated matrices that can steer the model toward anything from a minor concept to a full style change.
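The idea of adding small matrices next to frozen weights can be illustrated with a minimal NumPy sketch. It shows the generic low-rank update for a single linear layer, not PhotoDoodle's actual code. The function name `lora_forward`, the layer sizes, the rank and the `alpha` scaling are all assumptions chosen for illustration.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """Apply a frozen linear layer W plus a low-rank update B @ A.

    x: (batch, d_in), W: (d_out, d_in), A: (rank, d_in), B: (d_out, rank).
    Equivalent to x @ (W + (alpha / rank) * B @ A).T, but never builds
    the full-size update matrix.
    """
    rank = A.shape[0]
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_in, d_out, rank = 64, 64, 4               # illustrative sizes (assumption)

W = rng.standard_normal((d_out, d_in))      # pretrained weights, kept frozen
A = rng.standard_normal((rank, d_in)) * 0.01  # small trainable matrix
B = np.zeros((d_out, rank))                 # trainable, zero-initialised

x = rng.standard_normal((2, d_in))
base = x @ W.T
out = lora_forward(x, W, A, B)

full_params = W.size                        # weights a full fine-tune would touch
lora_params = A.size + B.size               # weights the adapter trains instead
print(full_params, lora_params)
```

Because `B` starts at zero, the adapted layer initially behaves exactly like the original one, and training only has to update the two small matrices.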

The researchers then trained OmniEditor with a variant called EditLoRA, allowing it to replicate a specific artistic style. From curated image pairs created in collaboration with artists, the system learns the subtleties of each style.

PhotoDoodle adds interesting elements such as monsters, magic effects and decorative illustrations while retaining the original image composition. | Photo: Huang et al.

"Position encoding cloning": Keep the picture harmonious and unified

PhotoDoodle's most notable innovation is its "position encoding cloning" technique. It lets the model keep track of where each pixel sits in the original image, so the composition stays intact when new elements are added and those elements blend naturally into the background.
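One way to picture position encoding cloning is as index bookkeeping on the token sequence fed to the model. The sketch below is a hedged illustration, not the authors' implementation. It assumes the image is split into a grid of tokens with (row, column) position ids, and that the source-image tokens are appended after the tokens of the image being generated. The helper names `grid_position_ids` and `build_sequence_position_ids`, and the shifted layout used for contrast, are hypothetical.

```python
import numpy as np

def grid_position_ids(height, width):
    """Return (height * width, 2) array of (row, col) ids, one per token."""
    rows, cols = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    return np.stack([rows.ravel(), cols.ravel()], axis=1)

def build_sequence_position_ids(height, width, clone=True):
    """Position ids for [target tokens, source tokens] in one sequence.

    clone=True : source tokens reuse the target's ids, so token k of the
                 source and token k of the target share the same location.
    clone=False: source ids are shifted sideways by one image width, shown
                 only as a contrasting layout (assumption).
    """
    target_ids = grid_position_ids(height, width)
    if clone:
        source_ids = target_ids.copy()
    else:
        source_ids = target_ids + np.array([0, width])
    return np.concatenate([target_ids, source_ids], axis=0)

cloned = build_sequence_position_ids(2, 3, clone=True)
shifted = build_sequence_position_ids(2, 3, clone=False)
n_tokens = 2 * 3

# With cloning, each source token lines up with the target token at the same spot.
print(np.array_equal(cloned[:n_tokens], cloned[n_tokens:]))
print(np.array_equal(shifted[:n_tokens], shifted[n_tokens:]))
```

In this sketch the cloning step only copies indices and introduces no new weights, so every source token shares its spatial position with the matching token of the edited image.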

This addresses a key weakness of earlier image editing AI, which tends either to restyle the entire image or to edit only a local region, and so struggles to add new decorative elements while preserving the original perspective and background. PhotoDoodle achieves this without training any additional parameters, which keeps processing efficient.

PhotoDoodle transforms everyday photos in a variety of art styles, from cute cartoon monsters to hand-drawn lines and color effects. | Photo: Huang et al.

Outlook: single-image training

In testing, PhotoDoodle handled instructions ranging from "make the cat whiter" to "add a pink monster climbing up a building." Compared with prior methods, it scores well on benchmarks such as the similarity between the edited image and its text description, and it outperforms competing systems on both targeted edits and global image changes.

A comparison of PhotoDoodle with existing AI image editing systems shows clear differences in how well each executes specific prompts. | Photo: Huang et al.

Currently, PhotoDoodle needs dozens of image pairs and thousands of training steps to learn a new style. The research team is now working toward more efficient single-image training, and has released a dataset covering six art styles with more than 300 image pairs. The code has also been open-sourced on GitHub, providing a foundation for future research.

Address: https://github.com/showlab/PhotoDoodle
