iFLYTEK Xinghuo released a large voice model with rhythm, timbre and emotion

Chao News client reporter Gan Jupeng.

On January 30, iFLYTEK held the launch event for the Spark Cognitive Model V3.5 upgrade. Liu Qingfeng, Chairman of iFLYTEK, and Liu Cong, Dean of the iFLYTEK Research Institute, officially released iFLYTEK Xinghuo V3.5, the first large model trained entirely on domestic computing power. Its seven core capabilities have been comprehensively improved; its mathematics, language comprehension, and voice interaction capabilities are reported to surpass GPT-4 Turbo, and the Spark Smart Blackboard has been upgraded. The speech recognition of the Spark voice model in its first batch of 37 mainstream languages surpasses OpenAI Whisper V3, which brings a new upgrade to the iFLYTEK Translator for freer communication and promotes the transformation of human-computer interaction in customer service, automobiles, robots, and other scenarios in the era of the Internet of Everything.

The rollout of large-model applications has accelerated. The Spark ecosystem has grown rapidly to more than 350,000 developers, who are building personal applications that serve hundreds of millions of users. iFLYTEK Xinghuo is also empowering a wide range of industries, working with leading enterprises in insurance, banking, energy, automobiles, communications, and other fields to create benchmark applications powered by large models. In addition, the iFLYTEK Xinghuo open-source model "Xinghuo Open Source-13B", which is deeply adapted to domestic computing power and delivers leading results in scenario applications, was released for the first time in a joint launch with the Shengsi open-source community.

"Through this press conference, we look forward to a spring full of hope and growth. I believe that in 2024 a single spark will start a prairie fire: general artificial intelligence will not only be deeply and widely applied in China's major fields, but we will also reach a new level in original technological innovation and in the underlying capabilities of large models," Liu Qingfeng said.

The seven major capabilities have been comprehensively improved.

iFLYTEK Spark V3.5 benchmarks against GPT-4

On October 24, 2023, iFLYTEK and Huawei announced the official launch of "Feixing-1", the first domestic computing platform at the ten-thousand-card scale, built to support the training of trillion-parameter large models. In the more than 90 days since then, iFLYTEK Xinghuo has used "Feixing-1" without pause to train a larger-parameter model benchmarked against GPT-4, leading to the iFLYTEK Xinghuo V3.5 upgrade released on January 30.

iFLYTEK Xinghuo V3.5, the first publicly available large model trained on domestic computing power, has been comprehensively upgraded in seven areas: language comprehension, text generation, knowledge question answering, logical reasoning, mathematical ability, coding ability, and multimodal ability. Among them, language comprehension and mathematical ability exceed those of GPT-4 Turbo, coding ability reaches 96% of GPT-4 Turbo, and multimodal understanding reaches 91% of GPT-4V.

"With better data and stronger human-machine collaborative training, we cannot look only at the ability of a single 'atom'; we must use technological progress to meet the pressing needs of the real world."

How can technological advances lead to truly effective solutions for human life? Liu Qingfeng presented the capability improvements of iFLYTEK Spark V3.5 to the audience from three angles: new empowerment of human-computer interaction in the era of the Internet of Everything, new empowerment of knowledge learning and content creation, and new gains in digital and intelligent productivity.

The large model brings a new human-computer interaction experience to the era of the Internet of Everything, and its highly human-like speech synthesis is striking. iFLYTEK Spark V3.5 impressed not only in demonstrations of semantic understanding, instruction following, and multi-turn dialogue, but also in emotional perception and human-like synthesis.

"I heard that 'Erbin' (Harbin) is especially popular this year, and as a 'little potato' from the south, I really want to visit. Could you tell me in Northeast dialect what is fun to do there?"

In the live demonstration, Liu Cong, Dean of the iFLYTEK Research Institute, interacted with iFLYTEK Xinghuo V3.5 on stage, and the model amused the audience with an authentic Northeast dialect. Beyond the humor, iFLYTEK Spark V3.5 quickly put together a custom travel guide for Liu Cong and urged him to buy tickets soon, since tickets for the Spring Festival period were in short supply.

The model can not only offer users solutions but also provide emotional interaction, like a caring friend who is attentive to how they feel. This highly human-like quality makes the large model feel more humane.

The large model empowers knowledge learning and content creation. With iFLYTEK Spark V3.5, tasks such as year-end summaries and plans, work-report PPTs, event planning, and policy Q&A are readily at hand. Building on this, iFLYTEK has launched iFLYTEK Zhiwen, an office product that quickly and automatically generates documents and PPTs with one click. Its main functions include one-click document generation, an AI writing assistant, multilingual document generation, automatic AI illustration, a choice of templates, and speaker notes. Liu Cong demonstrated a PPT titled "Hefei 2024 Spring Festival Tourism Promotion Strategy" produced on the spot by iFLYTEK Zhiwen: more than 20 content-rich pages were completed in one pass within a short time.

The large model can also expand on a topic sensibly by drawing on external knowledge, citing a wide range of sources. Progress in capabilities such as key-element extraction and question generation helps users form a closed loop of learning through testing, and supports the creation of easier-to-use agents in a growing number of service fields and learning settings.

The large model also raises digital and intelligent productivity, helping improve quality and efficiency in scientific research, industry, and other areas of pressing public need. With its upgraded mathematics and reasoning abilities and steadily advancing multimodal ability, iFLYTEK Xinghuo V3.5 achieves high-scoring responses in visual question answering, associative reasoning, and similar tasks, with more accurate understanding and better expression.

The capability improvements of iFLYTEK Spark V3.5 have reached a critical point in both quantity and quality. Liu Qingfeng said that in 2024, applications of the iFLYTEK Xinghuo cognitive model will shine in more and more scenarios and fields.

The Spark voice model was officially released.

Continuing to maintain a world-leading level.

"The dream and mission of iFLYTEK since its founding has been to achieve barrier-free communication. It has been 25 years, and our goals and dreams have not changed for a single day."

iFLYTEK, which started out in intelligent voice, has pressed ahead on this track since its founding 25 years ago and has remained at the forefront worldwide. From 2006 to 2019, it won the International Speech Synthesis Competition for 14 consecutive years; from 2016 to 2023, it won the international multi-channel speech separation and recognition competition CHiME four consecutive times; and from 2021 to 2023, it won the international speech translation competition IWSLT for three consecutive years. In addition, it has participated in building the first batch of national new-generation artificial intelligence open innovation platforms, the National Engineering Research Center for Speech and Language Information Processing, and other institutions, and it continues to build up expertise in the field of speech.

Large models bring new opportunities for the development of voice technology. Liu Qingfeng emphasized that giving machines the ability to learn, reason, and make decisions is the main work of cognitive models. "To put it simply, with the help of large models, we give a segment of speech richer attributes, including language, content, prosody, timbre, and emotion."

He said that the performance of the Spark voice model is internationally leading. Its speech recognition in the first batch of 37 mainstream languages, including Chinese, English, French, and Russian, exceeds that of OpenAI Whisper V3. In multilingual speech synthesis, the first batch of 40 languages supported by the Spark voice model achieves a human-likeness rating of more than 83%.

"Based on the evaluation results of the Xinghuo voice model, we are very proud to tell you that iFLYTEK continues to maintain a world-leading level."

Building on this advantage, the capability upgrade of the large voice model is also being applied to consumer hardware products. At the event, Liu Qingfeng introduced the iFLYTEK Translator equipped with the large voice model, which will soon gain two important functions, multilingual automatic recognition and enhanced translation, through upgrades scheduled for the end of January and mid-March this year. Multilingual automatic recognition makes international communication more convenient, and enhanced translation turns the translator into an AI translation assistant. According to reports, the multilingual automatic recognition upgrade of the iFLYTEK Translator will support 35 languages to improve the quality and efficiency of cross-language communication, while enhanced translation will provide bilingual service in English and Chinese, making cross-language communication more worry-free.

The Xinghuo voice model not only helps international communication but also proves versatile across more scenarios, empowering practical applications. Liu Qingfeng said that in scenarios such as automobiles, customer service, homes, and companion robots, the Xinghuo voice model has more room to show its strengths and will bring changes to human-computer interaction. In automobiles, for example, the interactive experience of the intelligent cockpit, intelligent navigation, and first-class control will be further optimized. Industries such as companion robots, shopping-guide robots, diagnostic-assistance robots, smart homes, and wearable devices will also be further energized by voice models.
