
Say goodbye to voice cloning infringement! Hume AI launches voice control feature to create personalized AI voices

Author: LoRA | 23 Dec 2024

Hume AI, a startup focusing on emotionally intelligent voice interfaces, recently launched an experimental feature called "voice control."

This new tool is designed to help developers and users create personalized AI voices without any coding, prompt engineering, or sound design skills. By precisely adjusting a voice's characteristics, users can easily customize it to suit their needs.


This new feature builds on the company’s previously launched Empathic Voice Interface 2 (EVI2), which enhanced the naturalness, emotional responsiveness and customizability of speech. Unlike traditional voice cloning technology, Hume's products focus on delivering unique and expressive voices to meet the needs of applications as diverse as customer service chatbots, digital assistants, teachers, tour guides, and accessibility features.

Voice control allows developers to adjust voice characteristics along ten different dimensions, including gender, assertiveness, excitement, confidence, and more.

“Male/Female: Gendered qualities of the voice, ranging between more masculine and more feminine.

Assertiveness: The firmness of the voice, ranging between timid and bold.

Buoyancy: The density of the voice, ranging between deflated and buoyant.

Confidence: The certainty of the voice, ranging between shy and confident.

Enthusiasm: The excitement in the voice, ranging between calm and enthusiastic.

Nasality: The openness of the voice, ranging between clear and nasal.

Relaxedness: The tension in the voice, ranging between tense and relaxed.

Smoothness: The texture of the voice, ranging between smooth and staccato.

Softness: The energy behind the voice, ranging between gentle and powerful.

Tightness: How contained the voice sounds, ranging between tight and breathy.”

Users can fine-tune these properties in real time via virtual sliders, making customization simple and straightforward. The feature is currently available on Hume's platform, and users can access it simply by registering for free.
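The slider model described above can be sketched in code. This is a purely illustrative Python model of a ten-dimension control panel, not Hume's actual SDK; the class name, dimension keys, and the [-1.0, 1.0] value range are all assumptions.

```python
from dataclasses import dataclass, field

# Dimension names follow the article's list; the exact keys Hume's
# API uses are not documented here, so these are assumptions.
DIMENSIONS = (
    "gender", "assertiveness", "buoyancy", "confidence", "enthusiasm",
    "nasality", "relaxedness", "smoothness", "softness", "tightness",
)

@dataclass
class VoiceSettings:
    # Each slider maps to a value in [-1.0, 1.0]; 0.0 leaves the
    # base voice unchanged along that dimension.
    values: dict = field(default_factory=lambda: {d: 0.0 for d in DIMENSIONS})

    def set(self, dimension: str, value: float) -> None:
        if dimension not in self.values:
            raise KeyError(f"unknown dimension: {dimension}")
        # Clamp out-of-range slider input instead of rejecting it,
        # mirroring how a UI slider stops at its endpoints.
        self.values[dimension] = max(-1.0, min(1.0, value))

settings = VoiceSettings()
settings.set("assertiveness", 0.8)
settings.set("nasality", -2.5)  # clamps to the slider minimum, -1.0
```

Representing every dimension as a bounded scalar is what makes the "no coding, no sound design" promise workable: the user only ever moves sliders, and the system interprets the numbers.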

Voice control is currently available in beta and integrates with Hume's Empathic Voice Interface (EVI), making it available for a wide range of applications. Developers can select a base voice, adjust its characteristics, and preview the results in real time. This process ensures repeatability and stability from session to session, which is a key feature of real-time applications such as customer service bots or virtual assistants.
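The repeatability the article mentions — the same configuration producing the same voice from session to session — could be achieved by fingerprinting the configuration deterministically. The sketch below is one way to do that in Python; it is an illustration of the design idea, not Hume's implementation, and the function name is hypothetical.

```python
import hashlib
import json

def voice_fingerprint(base_voice: str, settings: dict) -> str:
    """Derive a stable identifier from a base voice plus slider settings.

    Canonical JSON (sorted keys) guarantees that logically identical
    configurations serialize identically, so the hash is reproducible
    across sessions.
    """
    canonical = json.dumps(
        {"base": base_voice, "settings": settings}, sort_keys=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# The same settings, given in a different order, yield the same fingerprint.
a = voice_fingerprint("narrator", {"confidence": 0.5, "nasality": -0.2})
b = voice_fingerprint("narrator", {"nasality": -0.2, "confidence": 0.5})
```

A stable fingerprint like this is exactly what a customer-service bot or virtual assistant needs: the voice a developer previewed is the voice every later session gets.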

The impact of EVI2 is evident in the voice control functionality. Early models introduced features such as conversational prompts and multi-language capabilities that broadened the scope of voice AI applications. For example, EVI2 supports sub-second response times for natural, instant conversations. It also allows speaking styles to be dynamically adjusted during interactions, making it a versatile tool for businesses.

This move addresses the AI industry's reliance on preset voices: brands and applications often struggle to find voices that match their needs. Hume's goal is to develop emotionally intelligent voice AI and push the industry forward. When EVI2 was released in September 2024, it had already significantly improved voice latency and cost-effectiveness while providing a secure alternative to voice cloning.

Hume's research-driven approach is at the heart of its product development, combining cross-cultural voice recordings with emotional survey data. This methodology underpins both EVI2 and the newly launched voice control, allowing them to capture human perception of voices in fine detail.


As competition in the market intensifies, Hume's focus on personalized, emotionally intelligent voices makes it stand out in the voice AI field. Going forward, Hume plans to expand voice control with more adjustable dimensions, improved sound quality, and a larger selection of base voices.

Official blog: https://www.hume.ai/blog/introducing-voice-control

Highlights:

- **Hume AI has launched a "voice control" feature that lets users easily create personalized AI voices.**

- **The feature requires no coding skills; users adjust voice characteristics with sliders.**

- **Hume aims to meet diverse application needs through personalized, emotionally intelligent voice AI.**
