Where are domestic large models headed?

Updated on 2024-01-28

Industrialization is the key.

Text | Heike Finance, by Fan Dongcheng

A new wave has risen in the field of AI.

On December 7, Google officially released its long-awaited Gemini multimodal large model.

According to the official announcement, Gemini 1.0 comes in three sizes: Gemini Ultra, Gemini Pro, and Gemini Nano. Gemini Nano is mainly for on-device use; Gemini Pro is suited to scaling across a wide range of tasks; and the most powerful, Gemini Ultra, is still undergoing trust and safety checks, fine-tuning, and reinforcement learning from human feedback, and is expected to reach developers and enterprise customers in early 2024.

Prior to this, IBM announced that it would partner with Meta, AMD, Intel, Oracle, Cornell University, Yale University, the University of California, Berkeley, and others to launch the "AI Alliance" in joint support of open AI innovation. Arvind Krishna, Chairman of IBM, said IBM hopes that through these partnerships the AI Alliance can advance an innovative AI agenda grounded in safety, accountability, and scientific rigor.

Conspicuously absent from the AI Alliance's member list are Google and OpenAI, the company behind ChatGPT. Many in the industry see the alliance as a way of "huddling together" to compete with the giants.

The wave of large models set off by ChatGPT has swept across the industry. One year after ChatGPT's debut, a "war of a hundred models" is raging at home and abroad. According to the report "Innovative Applications of Large Models in Beijing's AI Industry", as of October 2023, a total of 254 companies, universities, and research institutes in China had developed large models with more than 1 billion parameters.

Among them, the development of open source large models is particularly eye-catching.

The first in China to fire the starting gun was Baichuan Intelligence, founded by former Sogou CEO Wang Xiaochuan. In June 2023, Baichuan released Baichuan-7B, a 7-billion-parameter open-source language model free for commercial use; a month later it released the 13-billion-parameter language model Baichuan-13B and the dialogue model Baichuan-13B-Chat. In September, Baichuan announced it would open-source the upgraded Baichuan2-7B, Baichuan2-13B, Baichuan2-13B-Chat, and their 4-bit quantized versions.

Another big open source player is Alibaba Cloud.

Since August 2023, Alibaba Cloud has open-sourced the 7-billion-parameter general model Qwen-7B, the dialogue model Qwen-7B-Chat, the visual language model Qwen-VL, the 14-billion-parameter model Qwen-14B, and its dialogue model Qwen-14B-Chat. On December 1, Alibaba Cloud announced it would open-source the 72-billion-parameter model Qwen-72B, along with the 1.8-billion-parameter model Qwen-1.8B and the audio model Qwen-Audio.

So far, Tongyi Qianwen's open-source lineup covers 1.8 billion, 7 billion, 14 billion, and 72 billion parameters, plus two multimodal models for visual and audio understanding; it can fairly be called "full-size, full-modality" open source.

Alibaba Cloud has officially declared it will be "the most open cloud of the AI era", so betting on ecosystem-building through open source is a natural move, while Tongyi Qianwen's own iteration and evolution are sketching a new picture of real-world adoption.

There has long been a consensus in the industry that open-source and closed-source large models each have their own strengths.

Open source brings abundant resources and feedback, letting a large model iterate faster and quickly form an ecosystem. Meta's Llama and Llama 2, the Tongyi Qianwen open-source "family bucket", ChatGLM2-6B from Zhipu AI and Tsinghua's KEG Lab, and some of Baichuan's models all fall into this camp.

Closed source better protects a company's core technology, enabling more distinctive commercial solutions and services; examples include ChatGPT, Wenxin Yiyan, and Baichuan 53B.

Take Llama: since its launch in February 2023, it has spawned a wave of AI companies and institutions. Stability AI launched Stable Chat, a ChatGPT-like service built on its open-source language model Stable Beluga, itself fine-tuned from Llama; Stanford's Alpaca and the UC Berkeley-led Vicuna are likewise based on the open-source Llama models.

Openness, inclusiveness, and ecosystem growth are what open source is all about.

Tongyi Qianwen, which like Llama 2 has been open-sourced at the 70-billion-parameter level, matches it in influence. After Qwen-7B was open-sourced, it quickly shot up the trending lists on Hugging Face and GitHub.

According to data released at the Apsara Conference on November 1, 2023, Alibaba Cloud's AI model open-source community has gathered more than 2,300 models and attracted over 2.8 million developers, with cumulative model downloads exceeding 100 million. Users can try the Qwen series directly in the ModelScope (Moda) community, call the model API (Application Programming Interface) through Alibaba Cloud's DashScope (Lingji) platform, or build custom large-model applications on Alibaba Cloud's Bailian platform.
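For developers taking the API route, the request follows the familiar chat-completions shape. A minimal sketch of assembling such a request, assuming an OpenAI-style payload layout (the model name and field names here are illustrative, not the exact DashScope schema):

```python
import json

def build_chat_request(model: str, user_prompt: str,
                       system_prompt: str = "You are a helpful assistant.") -> dict:
    """Assemble a chat-completion request body in the common
    OpenAI-style messages format that many model services accept."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }

payload = build_chat_request("qwen-72b-chat", "Summarize the Qwen open-source lineup.")
print(json.dumps(payload, ensure_ascii=False, indent=2))
# Actually sending it would mean POSTing this JSON to the service
# endpoint with an API key (omitted here to keep the sketch offline).
```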

More importantly, Qwen-72B was trained on 3 trillion tokens of high-quality data and posted the best open-source results in 10 authoritative benchmark evaluations, beating Llama 2-70B across the board and surpassing GPT-3.5 and GPT-4 on some tests.

On English tasks, Qwen-72B achieved the highest open-source score on the MMLU benchmark. On Chinese tasks, it outscored GPT-4 on benchmarks such as C-Eval, CMMLU, and GaokaoBench. In mathematical reasoning, it leads other open-source models on the GSM8K and MATH evaluations. In code understanding, its performance on HumanEval, MBPP, and other assessments improved markedly, a qualitative leap in coding ability.

Complex semantic understanding of Chinese is a typical test case. Take phrases built on the many senses of the word "yisi" (meaning): the model can accurately parse what each occurrence means in a sentence or paragraph. For example, "not enough yisi" may mean the other party's gift is not generous enough, "a small yisi" expresses modesty when giving one, and "bu hao yisi" is an apology.

For logical reasoning questions, Tongyi Qianwen can lay out hypotheses to explain its answer. Take the classic "two guards" puzzle: one guard always tells the truth, one always lies, and you must identify the correct door with a single question. After answering with "If I asked the other guard, which door would he say is correct?", the model walks through both cases, asking the truthful guard and asking the lying guard, and fully spells out the logic behind the answer.
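The puzzle's logic can be checked mechanically. A minimal sketch (door labels are arbitrary) that simulates the two cases the model reasons through:

```python
# Two doors: one safe, one a trap. One guard always tells the truth,
# one always lies. You may ask a single question of one guard:
# "Which door would the OTHER guard say is safe?"
# The winning move is then to choose the opposite door.

SAFE, TRAP = "left", "right"

def truthful_answer(door: str) -> str:   # reports the door as-is
    return door

def lying_answer(door: str) -> str:      # reports the opposite door
    return SAFE if door == TRAP else TRAP

def other_guard_says_safe(asked: str) -> str:
    """The asked guard's reply to 'which door would the other guard
    say is safe?'"""
    if asked == "truthful":
        # Truthfully report the liar's (false) answer.
        return truthful_answer(lying_answer(SAFE))
    # Lie about the truthful guard's (true) answer.
    return lying_answer(truthful_answer(SAFE))

# Whichever guard is asked, the door named is the trap,
# so taking the opposite door is always safe.
for guard in ("truthful", "liar"):
    named = other_guard_says_safe(guard)
    chosen = SAFE if named == TRAP else TRAP
    print(guard, "names", named, "-> choose", chosen)
```

Both branches name the trap door, which is exactly the symmetry the model's hypothetical reasoning exploits.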

Qwen-72B can handle long text inputs of up to 32K tokens, surpassing GPT-3.5-16k on the long-text comprehension benchmark L-Eval. Its instruction-following, tool-use, and other skills have been optimized so it integrates better with downstream applications. Moreover, Qwen-72B ships with strong system-prompt capabilities: users need only a single prompt to customize their own AI assistant.

According to Heike Finance, type in "aloof cold sister" and the model adopts a tone like "Say what you want, quickly; don't waste my time" and "Show me some respect". Ask for a "cute anime girl" and the model peppers its answers with emoticons and turns distinctly soft-spoken. It can even take on named film and TV characters: name Li Yunlong from "Drawing Sword", and the model will work his speech patterns and classic lines into its replies.

The split between the open-source and closed-source routes recalls the iOS-versus-Android battle in mobile operating systems, where Android built a distinctive ecosystem with its open-source approach and won a commanding market share. Judging from Tongyi Qianwen's performance, the open-source model has taken an important step.

Open-source large models can help users simplify the process of model training and deployment.

Users need not train from scratch: they can simply fine-tune a pre-trained model to quickly build a high-quality model of their own. This lowers the barrier for every industry to enter the large-model field, and it also lets specific industries push large-model technology forward.

This is the case with MindChat, a Chinese application for psychology scenarios.

MindChat is a psychological counseling tool, in effect an AI counselor, that provides users with psychological assessment and related services conveniently and promptly. Users can confide any worry or confusion to MindChat, even by voice. MindChat empathizes with the user, analyzes the user's emotional and psychological state from the text and tone of voice, and then offers suggestions, including, when needed, a recommendation to see a real-world expert or psychologist.

In the words of MindChat's developer Yan Xin, he hopes a simple, easy-to-use interface can give lonely people an emotional outlet and keep them connected to society.

Yan Xin, who completed his bachelor's degree in 2023, is a member of the XD Lab at East China University of Science and Technology, a team focused on AI applications in social computing and psychological emotion. He found psychological services a natural fit for large models: demand is huge, supply is scarce and often expensive, and large-model technology can make such services broadly affordable. To date, MindChat has served more than 200,000 people with over 1 million Q&A sessions.

Yan Xin and his team have tracked open-source large-model development closely, having previously tried models such as ChatGLM, Baichuan, and InternLM. After Qwen-7B and Qwen-14B launched, they evaluated them against internal data and benchmarks and concluded that Tongyi Qianwen was the best open-source option for this scenario, so they adopted it as their base. Besides MindChat, the team has built a medical and health model (Sun Simiao) and an education-and-exams model, GradChat (Koi), on top of Tongyi Qianwen.

Yan Xin says he and his team are staunch open-source supporters: some of XD Lab's models are open-sourced to give back to the community, while others suited to production scenarios are offered as closed-source API services.

Tao Jia, an individual developer, likewise recognizes how well large models adapt to specific scenarios.

Tao Jia works at the Zhejiang Electric Power Design Institute of China Energy Engineering Group, mainly responsible for macro analysis, planning research, and preliminary optimization of new power systems and integrated energy. From an industry perspective, he says, applications of large models in the power sector, from entry-level domain knowledge Q&A up to mathematical optimization of power dispatching, are all worth exploring. So he tried building a document Q&A application on Tongyi Qianwen's open-source models.

Scenarios in the power sector are quite specialized; one often needs to find content in documents running to hundreds of thousands or even millions of words. Tao Jia used Tongyi Qianwen to build a retrieval Q&A application over a private knowledge base: given an English document, tell the model what to look for, and the model answers which section of the document's table of contents contains the answer.
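The routing step, deciding which table-of-contents entry should hold the answer, can be sketched with a crude keyword-overlap heuristic. The chapter data below is hypothetical, and in Tao Jia's actual setup the large model itself performs this matching:

```python
def best_section(toc: dict, query: str) -> str:
    """Pick the table-of-contents entry whose description shares the
    most words with the query -- a stand-in for asking the model
    which chapter holds the answer."""
    q_words = set(query.lower().split())

    def overlap(item):
        _, desc = item
        return len(q_words & set(desc.lower().split()))

    return max(toc.items(), key=overlap)[0]

# Hypothetical table of contents for a power-planning document.
toc = {
    "Chapter 2": "load forecasting and demand analysis",
    "Chapter 5": "power dispatching optimization methods",
    "Chapter 7": "grid planning and new energy integration",
}
print(best_section(toc, "which optimization methods are used for dispatching"))
# -> Chapter 5
```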

Document retrieval and interpretation in a professional field demand high accuracy and logical rigor. Among the open-source models he tried, Tao Jia says, Tongyi Qianwen worked best, with accurate answers and no strange bugs.

For Tao Jia, closed-source models such as OpenAI's are inconvenient to call by API and ill-suited to B-end users like him who want to customize on their own; open-source models such as Llama are usable, but their Chinese ability is mediocre. With Qwen-14B already reaching over 70% accuracy, Tao Jia has high expectations for Qwen-72B.

That expectation is becoming reality. On December 8, Hugging Face published its latest open-source large-model leaderboard, covering hundreds of open-source models worldwide and testing dimensions from reading comprehension and logical reasoning to mathematical calculation. Tongyi Qianwen surpassed Llama 2 and other open-source models at home and abroad to top the list.

Whether from an individual, an organization, or an industry perspective, open source is conducive to the formation of a more open ecosystem. This not only allows more researchers or developers to enrich applications and services, but also promotes the continuous optimization of large models and keeps moving forward.

There are also problems under the wave of large models.

The "2023-2024 China AI Computing Power Development Assessment Report" released by research firm IDC notes that Chinese enterprises recognize the value of AIGC (generative AI) in accelerating decision-making, improving efficiency, and optimizing user and employee experience; 67% of Chinese enterprises have begun exploring generative AI applications or have already made related investments. At the same time, enterprises must contend with shortages of computing and storage resources, the usability of industry-specific large models, and high investment costs.

Yan Xin admits his team lacks the resources to train a base model from scratch, so they wanted a mainstream, stable model architecture that fits their upstream and downstream tooling while meeting the scenario's needs; what they care most about is whether the vendor behind an open-source model will keep investing in the base model and the ecosystem.

Qin Xuye, co-founder and CEO of Future Speed, takes a similar view. Open-source models, he says, are safe, controllable, and customizable, and more cost-effective: inference may cost as little as one-fiftieth of a paid closed-source model. The company's Xinference platform builds on the Tongyi Qianwen open-source models with a built-in distributed inference framework, helping enterprise users deploy and manage models on compute clusters with ease.

With simple fine-tuning, an open-source large model can meet the needs of many B-end scenarios. Most users his company works with, Qin Xuye says, use smaller models such as Qwen-7B, typically for Q&A applications over external knowledge bases: the large model recalls relevant data, places it in context, and summarizes it into an answer.
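That recall-then-summarize flow boils down to assembling retrieved chunks into the model's context window. A minimal sketch, assuming a plain-text prompt template (the wording, sample data, and character budget are illustrative, not a fixed Qwen format):

```python
def build_rag_prompt(question: str, chunks: list, max_chars: int = 2000) -> str:
    """Assemble retrieved knowledge-base chunks into a single prompt
    for a chat model, keeping chunks until a context budget is spent."""
    context, used = [], 0
    for c in chunks:
        if used + len(c) > max_chars:
            break
        context.append(c)
        used += len(c)
    joined = "\n---\n".join(context)
    return (
        "Answer the question using only the reference material below.\n"
        f"Reference:\n{joined}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Hypothetical retrieved chunks from a knowledge base.
prompt = build_rag_prompt(
    "What is the substation's rated capacity?",
    ["The substation's rated capacity is 220 kV / 180 MVA.",
     "Commissioning is scheduled for Q3."],
)
print(prompt)
```

The model's completion of this prompt is the summarized answer returned to the user.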

In other words, Tongyi Qianwen's "full-size" open-source offering brings large models to more users. Although the models themselves are open source, enterprises can still sell services on top of them, from custom development to technical support. That opens commercial possibilities both for Alibaba Cloud itself and for upstream and downstream companies, a positive cycle from ecosystem to business and from business back to ecosystem.

At the Apsara Conference in November 2023, Cai Chongxin (Joe Tsai), chairman of Alibaba's board of directors, said that without openness there is no ecosystem, and without an ecosystem there is no future; only by standing on more advanced and stable technical capabilities can one open up with greater confidence.

Alibaba has a long tradition of technological openness, with independent open-source projects in operating systems, cloud native, databases, and big data. Seen in this light, the logic of open-sourcing Tongyi Qianwen becomes clearer: it both inherits that tradition and offers more technical products through open source, driving Alibaba Cloud's long-term growth.

Bear in mind that both cloud and AI depend on computing power, and large models demand even more of it. Alibaba Cloud, which already has full-stack AI capabilities, is leveraging its data, computing, and storage resources to draw more users into the Alibaba Cloud system through open source, much as Microsoft is expanding an open MaaS (Model as a Service) approach, relying on connections across the industry chain to form a large-scale, platform-based ecosystem.

According to Heike Finance, alongside the official announcement of Qwen-72B's open-sourcing, Alibaba Cloud also launched the first "Tongyi Qianwen AI Challenge", in which participants can use the entire Tongyi Qianwen open-source "family bucket", including Qwen-72B, for free.

The competition has two tracks. The algorithm track focuses on fine-tuning the Tongyi Qianwen large models, hoping to probe the upper limits of open-source model capability through high-quality data. The agent track encourages developers to build a new generation of AI applications on the Tongyi Qianwen models and the ModelScope Agent-Builder framework, pushing large models into every industry. The organizers are providing 500,000 yuan worth of free cloud computing power plus generous prizes.

The competition also demonstrates Alibaba Cloud's determination to anchor itself in open source. Tongyi Qianwen, and Alibaba Cloud behind it, are fueling AI's ecosystem with diversified, all-around technical services, broadening their own boundaries while advancing the whole industry.
