
Sakana AI Releases "AI CUDA Engineer": 10 to 100x Speedups for PyTorch Operations

Author: LoRA · 21 Feb 2025, 10:29

Japanese artificial intelligence startup Sakana AI has announced the launch of "AI CUDA Engineer", an AI agent system designed to automate the production of highly optimized CUDA kernels and significantly improve the efficiency of machine learning operations. According to the company's announcement on the X platform, the system speeds up common PyTorch operations by 10 to 100 times through evolutionary, large language model (LLM)-driven code optimization, marking a notable step for AI in the field of GPU performance optimization.

Sakana AI notes that CUDA kernels are the core of GPU computing, but writing and optimizing them by hand requires deep expertise and presents a high technical barrier. Existing frameworks such as PyTorch are easy to use, yet their built-in operations often fall short of hand-optimized kernels in performance. "AI CUDA Engineer" addresses this with an intelligent workflow: it automatically converts PyTorch code into efficient CUDA kernels, tunes their performance with evolutionary algorithms, and can even fuse multiple kernels to further reduce runtime.
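The evolutionary tuning step described above can be illustrated with a toy sketch. The candidates, cost model, and tuning knobs below (`block_size`, `unroll`) are hypothetical stand-ins: in the real system each candidate would be LLM-generated CUDA source, compiled and timed on a GPU, whereas here a candidate is just a pair of parameters scored by a made-up cost function.

```python
import random

def measure_runtime(candidate):
    """Hypothetical cost model standing in for 'compile and benchmark
    this kernel candidate'; lower is better, optimum at (256, 4)."""
    block_size, unroll = candidate
    return abs(block_size - 256) / 256 + abs(unroll - 4) / 4 + 1.0

def mutate(candidate):
    """Randomly perturb one candidate's tuning knobs within bounds."""
    block_size, unroll = candidate
    return (
        max(32, min(1024, block_size + random.choice((-32, 32)))),
        max(1, min(8, unroll + random.choice((-1, 1)))),
    )

def evolve(generations=200, population_size=8, seed=0):
    """Simple elitist evolutionary search: keep the fastest half of the
    population each generation and refill it with mutated survivors."""
    random.seed(seed)
    population = [
        (random.choice(range(32, 1025, 32)), random.randint(1, 8))
        for _ in range(population_size)
    ]
    for _ in range(generations):
        population.sort(key=measure_runtime)
        survivors = population[: population_size // 2]
        population = survivors + [
            mutate(random.choice(survivors)) for _ in survivors
        ]
    return min(population, key=measure_runtime)

best = evolve()
```

Because selection is elitist, the best candidate found so far is never discarded, so the search's cost estimate improves monotonically over generations; the same loop structure applies whether candidates are parameter tuples, as here, or full kernel source strings.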


X user @shao__meng likened the technology to "installing an automatic transmission for AI development", letting ordinary code "upgrade automatically to racing-grade performance". Another user, @FinanceYF5, noted that the system demonstrates the potential of AI self-optimization and could bring a revolutionary improvement in how efficiently future computing resources are used.

Sakana AI previously made its mark in the industry with projects such as "AI Scientist", and the release of "AI CUDA Engineer" further underscores its ambitions in AI automation. The company claims the system has successfully generated and verified more than 17,000 CUDA kernels covering a wide range of PyTorch operations, and says the released dataset will be a valuable resource for researchers and developers. Industry observers believe the technology not only lowers the barrier to high-performance GPU programming but may also push the training and deployment efficiency of AI models to a new level.

Source: https://x.com/FinanceYF5/status/1892856847780237318