At a time when China's large-model race is growing ever fiercer and commercialization remains difficult, what sets Zhipu AI apart? And what fresh thinking can this capital-favored company bring to China's large models?
Author | Fighting
Produced by | Industrialist
For a long time, the idiom "stars clustering around the moon" has suited Zhipu AI perfectly.
Some time ago, Zhipu AI's latest financing once again attracted widespread attention. According to public information, the new round exceeded 2.5 billion yuan, and together with earlier rounds it pushed Zhipu AI's valuation past 10 billion yuan.
What is more noteworthy is the luxury lineup of investors: the Social Security Fund's Zhongguancun Independent Innovation Fund (with Legend Capital as fund manager), Meituan, Ant, Alibaba, Tencent, Xiaomi, Kingsoft, Shunwei, BOSS Zhipin, Good Future, Sequoia, Hillhouse and many other institutions, as well as old shareholders including Legend Capital.
In this "war of a hundred models", Zhipu AI is undoubtedly the one on which the highest hopes are pinned.
However, it is worth noting that the only version of ChatGLM3 that Zhipu AI currently licenses commercially is the 6B one, while its benchmark, GPT-3.5, is far larger; there is still some distance to go before its higher-parameter versions are commercialized. Especially now that Alibaba has officially open-sourced a 72B-parameter model, Zhipu will face considerable pressure.
Some questions are worth pondering: What are Zhipu AI's advantages? Where is the room for imagination in its future development? And how will it solve the problems it currently faces? The answers lie on the other side of its frequent fundraising.
1. How solid is the 10-billion-yuan valuation?
From the first generation, open-sourced in March, to the third generation seven months later, Zhipu AI has developed remarkably fast.
Its latest release is the third generation of its base large language model, the ChatGLM3 series. According to the company, its performance is greatly improved over the previous generation, making it the strongest base model under 10B parameters.
Specifically, when models of all sizes are ranked by MMLU score, ChatGLM3-6B places ninth, and the smallest of the eight models above it is Qwen-14B, with 14 billion parameters. Ranked by GSM8K, ChatGLM3-6B-Base even places third, exceeding GPT-3.5's score of 57.1 points.
It is clear that Zhipu AI's ambition to catch up with OpenAI is not groundless.
To dig deeper into Zhipu AI's advantages, one has to start with the many problems in the development and deployment of domestic large models.
How much is a new technology actually worth? Commercialization is the most direct test. Most of China's large-model vendors are still at the technology-development stage; when it comes to commercialization, they are basically still exploring.
Zhipu AI's team was serving B-end customers even before the company was founded, and it now has more than 1,000 customers, which makes its industrial deployment and commercialization prospects more promising.
Another crucial premise for deploying large models is data security. As the only large-model company in China that is fully domestically funded and domestically developed, Zhipu AI has launched a GLM domestic chip adaptation plan, providing different levels of certification and testing for different types of users and chips, so that security and controllability can genuinely be achieved.
In a sense, this advantage alone can win over central state-owned enterprises and large enterprises with special requirements. "For state-owned and central enterprises that want model capabilities or access, Zhipu is an unavoidable option no matter what," an industry insider told Industrialist.
Then there is the human element. In the primary market, early-stage investment is investment in people, and that applies to every startup. Zhipu AI's "predecessor" is Tsinghua's KEG (Knowledge Engineering Group): CEO Zhang Peng holds a Ph.D. in computer science from Tsinghua University; Chairman Liu Debing studied under Academician Gao Wen and was deputy director of the Science and Technology Big Data Research Center at Tsinghua's Institute of Data Science; President Wang Shaolan is a doctoral graduate of Tsinghua's innovation leadership program.
Overall, Zhipu AI has everything in place: deployment experience, a complete talent pool, ample funding, and solid technology. These conditions have made it among the first to stand out in the race between large-model vendors. But this is only the surface.
In terms of technical path, Zhipu AI departed from the mainstream GPT approach and proposed its own GLM (General Language Model) path, which trains more efficiently than GPT and can also understand more complex scenarios.
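GLM's core training objective is autoregressive blank infilling: spans of the input are masked out, and the model learns to regenerate them conditioned on the corrupted text. Below is a minimal, deterministic sketch of how such a training pair could be built; it is a simplification for illustration only (real GLM also shuffles the spans and uses special positional encodings), and the function name and tokens are hypothetical:

```python
def make_glm_example(tokens, spans):
    """Build a (Part A, Part B) pair in the spirit of GLM's
    autoregressive blank-infilling objective (simplified sketch).

    tokens: list of tokens, e.g. ["the", "cat", "sat"]
    spans:  non-overlapping (start, length) spans, left to right.
    Part A is the input with each span replaced by [MASK];
    Part B is what the model learns to generate autoregressively.
    """
    part_a, part_b = [], []
    cursor = 0
    for start, length in spans:
        part_a.extend(tokens[cursor:start])      # keep text before the span
        part_a.append("[MASK]")                  # blank out the span
        part_b.extend(["[START]"] + tokens[start:start + length] + ["[END]"])
        cursor = start + length
    part_a.extend(tokens[cursor:])               # keep the tail
    return part_a, part_b

# Mask "cat" and "the mat" out of a toy sentence:
a, b = make_glm_example(["the", "cat", "sat", "on", "the", "mat"],
                        [(1, 1), (4, 2)])
# a -> ["the", "[MASK]", "sat", "on", "[MASK]"]
# b -> ["[START]", "cat", "[END]", "[START]", "the", "mat", "[END]"]
```

Because the model both reads the corrupted context bidirectionally (Part A) and generates the blanks autoregressively (Part B), one objective covers understanding and generation, which is the source of the claimed training efficiency.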
At the level of deployment, it chose not to launch industry-specific large models, but to convince industry customers to fine-tune on top of its general-purpose base model. In CEO Zhang Peng's view, only a general model of sufficient scale can give rise to emergent, human-like cognitive ability.
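The article does not say how customers fine-tune on the base model; one common, parameter-efficient way to adapt a frozen base is a low-rank (LoRA-style) update. The toy below is a purely illustrative sketch of that idea in plain Python, not Zhipu AI's actual method:

```python
# Illustrative LoRA-style adapter on a frozen base weight (toy 2x2 case).
def matmul(a, b):
    # Naive matrix multiply for small Python lists of lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def madd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen pretrained weight (stand-in)
A = [[0.0], [0.0]]             # d x r adapter, zero-initialized
B = [[0.5, 0.5]]               # r x d adapter; only A and B are trained

W_eff = madd(W, matmul(A, B))  # effective weight is W + A @ B
# At initialization A is zero, so W_eff == W and behavior is unchanged;
# training updates only 2*d*r adapter values instead of d*d base values.
```

The design point for customers is that the general base stays intact (and can be served once for many tenants), while each industry adaptation is a small, cheap add-on.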
In addition, to improve the performance of large language models as AI agents, Tsinghua University and Zhipu AI have launched a new solution, AgentTuning, which effectively enhances the ability of open-source large language models to act as agents.
Zhipu AI has won the favor of capital and of the Internet giants not only because of its technology, but also because of its choices of path, model, and strategy, and the clarity of its large model's underlying positioning.
In CEO Zhang Peng's words, Zhipu AI's entire product line is benchmarked against OpenAI's.
So, beyond its proven path and model, what pieces of the puzzle does Zhipu AI still need to complete?
2. Commercialization, open source, and the unavoidable question of money
Judging from the model versions Zhipu AI licenses commercially, it is currently limited to 6B, that is, 6 billion parameters. For comparison, OpenAI's GPT-3 is an autoregressive language model with 175 billion parameters, and GPT-3.5 has been reported at 137.5 billion.
More notably, Alibaba has recently open-sourced its 72B-parameter model. Bear in mind that large-model applications are still mostly at the "brute-force scale works miracles" stage, where larger parameter counts generally mean better real-world results.
Evidently, although Zhipu AI released China's first open-source model and has a strong technical architecture, the scale of its commercially licensed model still lags behind OpenAI's models and those of major domestic vendors. And with the release of Alibaba's larger open-source model, Zhipu AI's advantage in the 6B class may weaken.
Making up this shortfall requires substantial financial support.
"If Zhipu AI also had a financier like Microsoft behind it, that would be truly eye-catching," an industry insider told Industrialist bluntly.
In fact, as the capabilities of large AI models keep improving, parameter counts naturally have to grow, and so does the demand for computing power and storage. This poses a huge problem of funding and resource scheduling.
Roughly speaking, privatizing and deploying a 130B-scale large model costs close to 40 million yuan a year, yet how much value those 40 million can bring remains unknown. As for deployment, small enterprises currently have weak ability to pay, while large enterprises either build their own models or are still at the stage of understanding the technology, so commercialization is difficult.
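The 40-million-yuan figure can be sanity-checked with a back-of-envelope calculation. Every input below is a hypothetical assumption for illustration, not a number from the article or from Zhipu AI:

```python
# Hypothetical annual cost of privately deploying a 130B-scale model.
gpu_rental = 128 * 150_000   # assumed: 128 accelerator cards at 150k yuan/year
staff      = 15 * 800_000    # assumed: 15-person ops and algorithm team
power_misc = 3_000_000       # assumed: power, storage, networking, misc.

total = gpu_rental + staff + power_misc
print(total)  # 34200000 -- same order of magnitude as the cited ~40 million
```

Whatever the exact split, hardware and specialist staff dominate, which is why the article's point about weak payment ability among small enterprises bites so hard.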
Where the money will come from is a question that needs to be solved urgently.
"Part of the reason for open-sourcing the 6B model is to tell the market: I have a better one, it depends on whether you are willing to spend money," an industry insider told Industrialist. For Zhipu AI, open-sourcing 6B to demonstrate strength and attract investment is the more obvious play.
Another solution is to expand the "circle of friends".
As we all know, the Internet giants have great advantages in computing power, storage capacity, and data resources, all of which Zhipu AI would otherwise have to spend heavily to build. Cooperating with the giants can greatly reduce R&D costs and improve efficiency. In addition, Zhipu AI can use the cloud vendors' market position and channels to promote its own AI technologies and services.
On the other hand, because large models are deployed on the cloud and billed by usage, the more users run the models, the greater the demand for cloud computing power and the higher the cloud vendors' revenue. Cloud vendors can also draw on Zhipu AI's technical strength to sharpen their own competitiveness in artificial intelligence.
In general, cloud vendors get to drive their own cloud revenue, while model vendors reduce their infrastructure investment; it kills two birds with one stone.
At present, Zhipu AI has launched a series of partnerships with Alibaba, Tencent, Meituan and other enterprises.
From this point of view, the reason Zhipu AI is "the moon surrounded by stars" lies in its open, integrated business model, which can promote the deployment of large models and accelerate the large-model ecosystem at a time when domestic competition is intensifying and commercialization remains hard.
This model has also opened up new possibilities and new thinking, both for Zhipu AI itself and for the future development of domestic large models.
3. Where does the future of domestic large models lie?
"If the model could replace half of our staff, the company would consider using it." In a conversation with Industrialist, an industry figure thus summed up how far today's large models still are from commercialization.
Objectively, the domestic large-model landscape is one of "a hundred flowers blooming", yet homogenization has begun to set in. This not only leads to irrational use of computing power and other infrastructure, but also breeds unhealthy competition.
At present, large models are being deployed slowly, and the still-surging wave of large-model startups will inevitably produce plenty of bubbles. For domestic vendors, promoting commercialization through the power of an ecosystem is undoubtedly the best option.
In fact, there is no generational gap between mainstream large models at home and abroad at the algorithm level; the gap lies in computing power and data.
By vigorously supporting leading domestic technology companies in building independent, controllable general-purpose large models, while encouraging every vertical field to use open-source tools to build standardized, controllable tool chains on top of them, China can gradually form a healthy ecosystem in which basic large models and professional small models coexist and evolve together.
As the large-scale model ecology becomes more and more perfect, it will also bring some new changes.
The first is the improvement of model quality. With the advancement of technology and the investment of resources, future large models will have higher accuracy, stronger understanding and wider applicability. This not only means that they are able to understand natural language better, but they are also able to perform more complex tasks such as translation, reasoning, creation, etc.
The second is a richer range of applications. Beyond traditional text processing, large models will play a greater role in speech recognition, image generation and understanding, and recommender systems. This means we will enjoy the convenience of AI in more scenarios.
In addition, future large models will be more customizable, better meeting users' individual needs. Users will be able to select a suitable model according to their actual requirements and tailor its configuration, giving them more flexibility to solve their own problems with large models.
In the large model ecosystem, data will become more shared and open. Institutions and enterprises may strengthen cooperation and share high-quality data resources, thereby promoting the development of large model technology. This cooperation will provide a broader space for the development and application of large models.
Every new wave of technology requires some enterprises to shoulder its missions. For the present, technical architecture is the key criterion by which large models stand out; looking ahead, for anyone who wants to ride the wave of large AI models, the power to build an ecosystem matters more and more.