TangoFlux
TangoFlux generates high-quality audio in seconds, is optimized with CRPO, and is fully open source for researchers and creators.
What is TangoFlux?
TangoFlux is a text-to-audio generation model with 515M parameters. It can generate 30 seconds of high-quality 44.1 kHz audio in just 3.7 seconds on a single A40 GPU. It uses the CLAP-Ranked Preference Optimization (CRPO) framework to improve alignment between the generated audio and the text prompt, and it achieves state-of-the-art performance. The code and models are open source.
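To put the speed figures above in perspective, here is a small back-of-the-envelope sketch. It uses only the numbers quoted in the description (30 seconds of audio, 44.1 kHz, 3.7 seconds on an A40). The function names are hypothetical helpers for illustration and are not part of the TangoFlux code.

```python
# Figures quoted in the description above.
AUDIO_SECONDS = 30.0        # length of the generated clip
SAMPLE_RATE_HZ = 44_100     # 44.1 kHz output
GENERATION_SECONDS = 3.7    # reported wall-clock time on a single A40 GPU


def real_time_factor(audio_seconds: float, generation_seconds: float) -> float:
    """Seconds of audio produced per second of compute (hypothetical helper)."""
    return audio_seconds / generation_seconds


def samples_per_channel(audio_seconds: float, sample_rate_hz: int) -> int:
    """Number of audio samples in one channel of the clip (hypothetical helper)."""
    return round(audio_seconds * sample_rate_hz)


rtf = real_time_factor(AUDIO_SECONDS, GENERATION_SECONDS)
n_samples = samples_per_channel(AUDIO_SECONDS, SAMPLE_RATE_HZ)

print(f"real-time factor: {rtf:.1f}x")
print(f"samples per channel: {n_samples:,}")
```

Under these figures, generation runs roughly eight times faster than playback, and a 30-second clip contains about 1.3 million samples per channel.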