SALMONN is a large language model (LLM) developed by the Department of Electronic Engineering at Tsinghua University and ByteDance that accepts speech, audio-event, and music input. Unlike models that support only speech or only audio-event input, SALMONN can perceive and understand a variety of audio inputs, which enables emergent capabilities such as multilingual speech recognition and translation, as well as audio-speech co-reasoning. This can be seen as giving the LLM "ears" and cognitive hearing abilities, making SALMONN a step towards hearing-enabled artificial general intelligence.
Target users:
SALMONN can be used in speech recognition, speech translation, audio processing, and other fields.
Example usage scenarios:
Input: gunshots.wav, Output: ...
Input: duck.wav, Output: ...
Input: music.wav, Output: ...
Product features:
Multilingual speech recognition
Multilingual speech translation
Audio-speech co-reasoning