The Groundlight research team recently open-sourced a new AI framework aimed at solving complex visual reasoning problems, so that AI can not only recognize images but also reason more deeply about them. Current vision-language models (VLMs) perform poorly when they must combine visual and textual cues for logical reasoning. To address this, the research team adopted a reinforcement learning approach, using GRPO (Group Relative Policy Optimization) to improve learning efficiency.
To validate the method, the researchers designed a code-decoding task that requires the model to interpret an encoded message using a randomly generated decoder image. The results show that a model with only 3 billion parameters reached 96% accuracy on this task. GRPO optimizes the learning process by comparing groups of sampled outputs against one another, which improves training stability. The work also proposes techniques such as selective model upgrades and integrating pre-trained models to strengthen reasoning capability without significantly increasing computational overhead.
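To make the "comparing groups of sampled outputs" idea concrete, here is a minimal sketch of the group-relative advantage computation at the core of GRPO. The function name and the toy reward values are illustrative assumptions, not code from the r1_vlm repository; the point is only that each completion is scored relative to the other completions sampled for the same prompt, so no separate value (critic) model is needed.

```python
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Compute GRPO-style advantages for a group of sampled completions.

    `rewards` has shape (group_size,), one scalar reward per completion
    sampled for the same prompt. Each completion's advantage is its reward
    normalized by the group mean and standard deviation.
    """
    return (rewards - rewards.mean()) / (rewards.std() + eps)


# Toy example: 4 completions for one decoding prompt, rewarded by how closely
# they match the target decoded message (the reward values are made up).
rewards = torch.tensor([0.0, 0.5, 1.0, 0.25])
advantages = group_relative_advantages(rewards)
print(advantages)  # completions above the group average get positive advantage
```

Completions that score above the group average are reinforced and those below it are discouraged, which is what keeps training stable without a learned value function.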
Project: https://github.com/groundlight/r1_vlm
Demo: https://huggingface.co/spaces/Groundlight/grpo-vlm-decoder