InternVL2_5-2B is a multimodal model that combines image and text understanding for applications such as product description generation and visual question answering.
Zonos-v0.1 is a high-fidelity, real-time TTS model with 1.6B-parameter Transformer and Hybrid architectures, supporting multiple languages and flexible voice adjustments for natural expression.
olmOCR converts complex PDFs into structured data for LLM training, supporting text parsing, language filtering, and multi-version comparison for efficient document processing.
SmolLM2-1.7B is a lightweight language model optimized for on-device deployment, with advanced capabilities in instruction following, knowledge, reasoning, and math problem-solving across diverse tasks.
Sana generates high-resolution images from text descriptions quickly and accurately for designers, artists, and researchers, with open-source code and multi-language support.