Open source for only 12 days, Tongyi Qianwen has topped a number of authoritative large model evaluations


**Source | Tech Planet**

**Text | Jia Ningyu**

Since Alibaba Cloud announced the open-source release on December 1, the Tongyi Qianwen 72B model has gone into leaderboard-sweeping mode, topping one authoritative ranking after another. Today, Tongyi Qianwen claimed yet another major leaderboard title.

On December 12, OpenCompass, an authoritative Chinese large model evaluation platform, updated its leaderboard: Alibaba Cloud's Tongyi Qianwen topped the open-source base (foundation) model list and took the top two spots in the Chinese dataset evaluation.

Caption: Tongyi Qianwen 72B tops the OpenCompass base model leaderboard.

OpenCompass is an open-source large model evaluation platform from the Shanghai Artificial Intelligence Laboratory. It covers open-source models such as Qwen and Llama 2 as well as mainstream closed models such as GPT-4 and ChatGPT, evaluates large model capabilities comprehensively, and is widely recognized in the industry as one of the most authoritative Chinese-capability leaderboards.

The open-source Tongyi Qianwen 72B model (Qwen-72B) took first place on the OpenCompass base model leaderboard with a comprehensive score of 67.1, and surpassed the GPT-4 benchmark in the subject-knowledge and language-understanding evaluations, setting a new record for open-source large models.

In OpenCompass's Chinese dataset evaluation, the Tongyi Qianwen 72B base model and its dialogue model (Qwen-72B-Chat) took the top two spots, opening a clear gap over the other models.

Caption: The Tongyi Qianwen 72B base model and dialogue model take the top two spots in the Chinese dataset evaluation.

Just a few days earlier, Tongyi Qianwen had beaten Llama 2 and other open-source large models from home and abroad to top the latest open-source large model ranking on Hugging Face, the world's largest open-source model community.

Hugging Face is the world's most influential AI open-source community, and its Open LLM Leaderboard is regarded as one of the most credible professional rankings, covering hundreds of open-source models worldwide, including the Qwen series and Llama 2.

The open-source Tongyi Qianwen (Qwen-72B) put in an eye-catching performance, ranking first among all pre-trained models with an overall score of 73.6 and setting a new record for Chinese large models on the Hugging Face leaderboard.

Caption: Tongyi Qianwen 72B tops the Hugging Face rankings.

Tongyi Qianwen 72B is now widely recognized as the most powerful open-source large model at home and abroad, fully meeting the high performance requirements of enterprise and research applications.

Earlier, when the open-source release was announced on December 1, Qwen-72B achieved the best open-source model scores across 10 authoritative benchmark evaluations, surpassing Llama 2-70B and, in some evaluations, even the closed-source GPT-3.5 and GPT-4.

Caption: Some results of Tongyi Qianwen's 72-billion-parameter open-source model surpass the closed-source GPT-3.5 and GPT-4.

Specifically, on English tasks, Qwen-72B achieved the highest open-source model score on the MMLU benchmark. On Chinese tasks, it leads on C-Eval, CMMLU, GaokaoBench and other benchmarks, with scores exceeding GPT-4. In mathematical reasoning, it is far ahead of other open-source models on the GSM8K and MATH evaluations. In code understanding, its performance on HumanEval, MBPP and other assessments has improved substantially, a qualitative leap in capability.

It is understood that Alibaba Cloud has open-sourced four Tongyi Qianwen large language models, with 1.8 billion, 7 billion, 14 billion, and 72 billion parameters, as well as two multimodal large models, the visual-understanding Qwen-VL and the audio-understanding Qwen-Audio, taking the lead in "full-size, full-modality" open source.
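Because the checkpoints are published as open weights, one common way to try them locally is through the Hugging Face transformers library. The snippet below is a minimal illustrative sketch rather than part of the announcement; the model ID, the `trust_remote_code` flag, and the `chat()` helper follow the public Qwen model cards, but exact requirements (tokenizer version, GPU memory) should be checked against the cards themselves.

```python
# Minimal sketch: loading an open-sourced Qwen chat checkpoint with transformers.
# The 7B model is used here for modest hardware; Qwen-72B-Chat follows the same
# pattern given sufficient GPU memory.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-7B-Chat"  # illustrative; swap for "Qwen/Qwen-72B-Chat"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",        # shard across available GPUs
    trust_remote_code=True,   # Qwen ships custom modeling code on the Hub
).eval()

# Qwen chat models expose a chat() helper via their custom code.
response, _history = model.chat(tokenizer, "Briefly introduce Hangzhou.", history=None)
print(response)
```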

To date, cumulative downloads of the Tongyi Qianwen open-source model series have exceeded 1.5 million, and more than 150 new models and applications have been built on them.

Zhou Jingren, CTO of Alibaba Cloud, has said that the open-source ecosystem is crucial to advancing the technology and adoption of China's large models, and that Tongyi Qianwen will keep investing in open source, aiming to become "the most open large model of the AI era" and to work with partners to build out the large model ecosystem.

Developers can try the full model series directly in Alibaba Cloud's ModelScope community, call the model APIs through the Alibaba Cloud DashScope (Lingji) platform, or build custom large model applications on the Alibaba Cloud Bailian platform. Alibaba Cloud's AI platform PAI is also deeply adapted to the full range of Tongyi Qianwen models, offering services such as lightweight fine-tuning, full-parameter fine-tuning, distributed training, offline inference validation, and service deployment.
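As an illustration of the API route, the following sketch calls a Qwen model through the DashScope Python SDK. The model name "qwen-72b-chat", the environment variable, and the response fields shown are assumptions based on the public SDK and should be checked against the current platform documentation.

```python
# Minimal sketch: calling a Qwen model via the DashScope (Lingji) API.
import os
import dashscope

# API key issued in the Alibaba Cloud console (assumed to be in this env var).
dashscope.api_key = os.environ["DASHSCOPE_API_KEY"]

resp = dashscope.Generation.call(
    model="qwen-72b-chat",  # illustrative model name
    messages=[{"role": "user", "content": "Summarize the Qwen-72B open-source release."}],
    result_format="message",  # return chat-style messages
)

if resp.status_code == 200:
    print(resp.output.choices[0].message.content)
else:
    print("Request failed:", resp.code, resp.message)
```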
