
Gemini AI to support MP4 and other video formats as its file analysis function gets an upgrade

Author: LoRA Time: 08 Apr 2025 1048

For Google Gemini users, a notable update is on the way: Gemini is about to expand its file analysis capabilities with support for eight common video formats. With this change, Gemini moves beyond analyzing text and tables into video processing.


The feature was uncovered by Android Authority in an APK code analysis of the Google app, version 16.13.38 beta. According to the code snippets found there, Gemini's file handling will be significantly expanded to support mainstream video formats including MP4, AVI, 3GP, FLV, MOV, MPEG, MPG and WEBM.

What does richer file support mean?

Until now, Gemini's file analysis has focused mainly on text, code, and tables. For most AI users, especially content creators, teachers, developers and researchers, video content analysis has remained a task with high technical barriers and heavy resource requirements.

With the new video format support, users will be able to upload video files directly to Gemini, and the AI will help extract information, generate summaries, and even identify semantic content, which should improve both work efficiency and the quality of interaction.

Supported format list

Judging from the fields found in the current beta, Gemini will be compatible with the following video types:

  • 3GP

  • AVI

  • FLV

  • MOV

  • MP4

  • MPEG

  • MPG

  • WEBM

The list spans video formats from the early mobile era (3GP) to today's mainstream (MP4, WEBM), and should cover most of the needs of individual users and content teams.
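To make the list concrete, here is a minimal sketch of how a user might pre-check a file against the eight reported formats before uploading. It only looks at the file extension; the function name and the check itself are hypothetical and are not taken from Google's code.

```python
from pathlib import Path

# The eight formats named in the report. The set name and the check
# below are illustrative assumptions, not part of any Gemini API.
REPORTED_VIDEO_EXTENSIONS = {"3gp", "avi", "flv", "mov", "mp4", "mpeg", "mpg", "webm"}


def is_reported_video_format(filename: str) -> bool:
    """Return True if the file extension is one of the eight reported formats."""
    # Path.suffix returns the last extension including the dot, e.g. ".MP4".
    suffix = Path(filename).suffix.lower().lstrip(".")
    return suffix in REPORTED_VIDEO_EXTENSIONS
```

A check like this says nothing about the codec inside the file, so a file that passes could still be rejected by the actual service.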

Upload restrictions and permission mechanisms

According to strings found in the code, Gemini may limit the total duration of the videos a user uploads, for example:

“The total duration of your video must be within one hour.”
“The total duration of your video must be no more than X minutes.”

This suggests that the limit may vary with the user's account tier (for example, free accounts versus premium subscribers). Google has not officially announced specific rules, so users who upload or process video at scale should keep an eye on this.
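As a rough illustration of what a total-duration cap implies for batch uploads, the sketch below adds up clip lengths and compares the sum with a limit. The 60-minute default mirrors the "one hour" string; the function name, the per-tier parameter, and the inclusive comparison are assumptions, since the real rules have not been published.

```python
from typing import Iterable


def within_total_duration(durations_seconds: Iterable[float], limit_minutes: float = 60) -> bool:
    """Return True if the combined length of all clips fits within the limit.

    limit_minutes is a hypothetical per-tier setting; the default of 60
    reflects the "within one hour" string and nothing more.
    """
    total_seconds = sum(durations_seconds)
    return total_seconds <= limit_minutes * 60
```

For example, a 20-minute clip plus a 30-minute clip would fit under a one-hour cap, while a 50-minute clip plus a 12-minute clip would not.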

What possibilities are there in the future?

Beyond video processing, Android Authority also found a noteworthy field in the same version: "GitHub Attachment Type". This suggests that Gemini may eventually support parsing the contents of GitHub repositories directly.

If this feature ships, it could make code review considerably more efficient for developers. For example, you could give Gemini a link to an open source project and have it analyze the code structure, identify potential problems, and even summarize the project documentation.

Why does this matter to users?

For AI beginners and technology enthusiasts, this upgrade means more than additional format support: it is an important signal that Gemini is moving towards becoming a "multimodal AI assistant". Here are some practical scenarios:

  • Content creators can have the AI organize clip content or summarize video scripts

  • Educators can upload teaching videos and have the AI extract key points or automatically generate Q&A

  • Developers and researchers can interact with code repositories and multimedia content more easily

On privacy and efficiency, pairing these features with on-device or edge processing could also make future Gemini applications safer, faster and more practical.

Summary

Google is quietly building stronger multimedia understanding into Gemini. From text to video to code repositories, Gemini's boundaries are expanding. For users, this means AI can take on more of the work automatically, saving time, lowering technical barriers, and providing more professional feedback and analysis.

We recommend watching for the official launch of this feature and adjusting how you use Gemini according to your own needs. If you are interested in applying AI to multimedia scenarios, this upgrade is worth following.