
EVI 3

Hume
Speech-to-text, Speech recognition, Speech synthesis, Text-to-speech (TTS), Retrieval-augmented generation, Audio question answering, Speech-to-speech

EVI 3 from Hume is a revolutionary AI model that combines text understanding and emotional speech synthesis in a single "brain." The neural network can imitate any voice and personality, delivering an unprecedented level of empathy and realism in voice communication.

Today, we’re excited to introduce Hume’s third-generation speech-language model, EVI 3. As a speech-language model, where the same intelligence handles transcription, language, and speech, EVI 3 brings more expressiveness, realism, and emotional understanding to voice AI. And instead of being limited to a handful of speakers, EVI 3 can speak with any voice and personality you create with a prompt. EVI 3 streams in user speech and forms natural, expressive speech and language responses. At conversational latency, it produces the same quality of speech as our text-to-speech model, Octave. Simultaneously, it responds with the same intelligence as the most advanced LLMs of similar latency. It also communicates with reasoning models and web search systems as it speaks, “thinking fast and slow” to match the intelligence of any frontier AI system. In a blind comparison with OpenAI’s GPT-4o, EVI 3 was rated higher, on average, on empathy, expressiveness, naturalness, interruption quality, response speed, and audio quality.
