Multimodal model, Language model, Computer vision, Speech recognition

NOMI GPT

NIO
Object recognition, Audio question answering, Systems control, Instruction following, Speech recognition, Text generation, Question answering, Audio classification, Text classification

NOMI GPT turns the on-board computer of NIO electric vehicles into a full-fledged AI-based "cognitive center". The multimodal model recognizes objects, understands complex voice commands, and provides seamless interaction between the driver and the vehicle's systems.

"Multimodal rejection" is an important part of the "NOMI GPT cognitive center". It is not new to users: it has been online since the launch of NOMI's continuous dialogue function, ensuring a free and smooth interactive experience. After continuous iteration, multimodal rejection now gives NOMI the ability to reject utterances in scenarios such as full-cabin wake-word-free interaction, continuous dialogue, and large-model encyclopedia dialogue.

However, as the encyclopedia capabilities of the NOMI GPT large model improve, NOMI has a richer knowledge reserve and can answer more questions. Multimodal rejection therefore has to listen to and identify questions across a wider range of fields, which places higher demands on its judgment.

How can multimodal rejection accurately determine the direction of a dialogue and the user's intention? The real cockpit environment is very complex. It includes both conventional vehicle-control command and task dialogue scenarios and a wide range of encyclopedia question-and-answer scenarios. Identifying whom the user is addressing, judging the user's intention, and giving a correct response is extremely challenging, and it is a serious test of the system's scene discrimination ability. In the multimodal rejection system, we achieve scene discrimination through a "large model + multimodal perception" technical solution.

The self-developed multimodal rejection model judges voice commands directly. NIO has developed a multimodal rejection model based on voice and text to help NOMI determine which utterances are user instructions and which are casual conversation. We use the speech pre-trained model Wav2Vec and the text pre-trained model TinyBERT for modeling, and jointly pre-train the NOMI multimodal rejection model. At the same time, we also have NOMI perform multi-view contrastive learning to help NOMI

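The description above says the rejection model combines voice and text signals to decide whether an utterance is an instruction for NOMI or just conversation in the cabin. As a rough illustration only, and not NIO's implementation, the sketch below shows one common way to make such a decision: late fusion, where an audio embedding and a text embedding are concatenated and scored by a logistic classifier. The short vectors stand in for the outputs of encoders such as Wav2Vec and TinyBERT; the function names, weights, bias, and threshold are all hypothetical and hand-picked for the example.

```python
import math
from typing import List, Sequence


def fuse(audio_vec: Sequence[float], text_vec: Sequence[float]) -> List[float]:
    """Late fusion: concatenate the audio and text embeddings."""
    return list(audio_vec) + list(text_vec)


def accept_probability(features: Sequence[float],
                       weights: Sequence[float],
                       bias: float) -> float:
    """Logistic score: probability that the utterance is addressed to the assistant."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def should_respond(audio_vec: Sequence[float],
                   text_vec: Sequence[float],
                   weights: Sequence[float],
                   bias: float,
                   threshold: float = 0.5) -> bool:
    """Respond only if the fused score reaches the threshold; otherwise reject."""
    return accept_probability(fuse(audio_vec, text_vec), weights, bias) >= threshold


# Hypothetical toy parameters; a real system would learn these from data.
WEIGHTS = [1.5, -0.5, 2.0, -1.0]
BIAS = -1.0

# Toy embeddings: (audio view, text view) for two utterances.
command = should_respond([0.9, 0.1], [0.8, 0.2], WEIGHTS, BIAS)
chatter = should_respond([0.2, 0.7], [0.1, 0.9], WEIGHTS, BIAS)
print(command, chatter)
```

With these toy numbers the first utterance is accepted and the second is rejected. In practice the threshold trades false wake-ups against missed commands, so it would be tuned on recorded cabin data rather than fixed at 0.5.
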
What is NOMI GPT?
Who developed NOMI GPT?
What tasks does NOMI GPT solve?