Reporter Li Jing reports from Beijing.
On December 15, Zhongke Wenge, an artificial intelligence company under the Chinese Academy of Sciences, launched the Yayi 2.0 domestic large model (hereinafter "Yayi 2.0") and published an open-source technical report.
According to Luo Yin, CEO of Zhongke Wenge, the Yayi 1.0 model was launched on June 3 of this year; six months later, version 2.0 brings breakthroughs in four aspects: model training, characteristic capabilities, domain applications, and evaluation metrics. First, in terms of model training, Yayi 2.0 scaled the model from 7 billion parameters to 30 billion parameters. As for training data, about 10 TB were distilled from more than 200 TB of rich, diverse source data, yielding a high-quality training dataset of 2.65 trillion tokens to meet the needs of model training.
The reporter of China Business News learned that the Yayi model is an enterprise-level general model that has previously provided specialized model services in vertical fields for many organizations, including **, **, and research institutions. Following the release of Yayi 2.0, Zhongke Wenge has built on it a number of industry model applications in fields such as security, finance, public opinion, law, and traditional Chinese medicine.
Wang Lei, chairman of Zhongke Wenge, said: "Large models are now blooming everywhere in China, but truly native domestic AI models remain very few. There is still a big gap with the international advanced level in manpower, talent, computing power, algorithms, and data, and the domestic AI industry is still at an early stage of development."
Looking at the current large-model field, ChatGPT, LLaMA, and other large models are already on the market, but Wang Lei believes China still needs to train its own native large models, mainly for three reasons. First, domestic foundational native large models are extremely scarce and independent R&D capability is insufficient; open-source models are unstable in capability, relatively weak in Chinese support, limited in language coverage, and inadequate in security, so they cannot be used in demanding production environments. Second, many important government and enterprise departments need autonomous, controllable, safe, and reliable native models. Because an open-source model is a black box whose pre-training data quality cannot be trusted, the model may be insecure from birth; and when applied in government and enterprise scenarios, the limited feasibility of secondary training constrains application and development. Third, a large model is a massive undertaking integrating large computing power, big data, and large algorithms, and the next generation of technological innovation requires accumulated R&D experience.
In fact, the research and development of the Yayi large model has achieved a number of hard-core technical results. First, the foundational model is domestically produced: the model was developed entirely independently by the company's team of engineers and young scientists and pre-trained from scratch. Second, and very importantly, the team has accumulated two key AI datasets: a massive high-quality pre-training dataset and a domain fine-tuning instruction set. Wang Lei said, "What must still be acknowledged, however, is that in some new industry applications — multi-round dialogue, long-text reading, multimodal intelligent interaction, content safety and controllability, and automatic invocation of intelligent plug-ins — further technical exploration is still needed."
Artificial intelligence can be divided into general and specialized, and general artificial intelligence into three levels: low, medium, and high. It is now undoubtedly at the low level but gradually evolving toward the medium level, and the evolutionary trend of large models is very clear. The Director of the Economic Research Institute of Nankai University and Chief Economist of the China New Generation Artificial Intelligence Development Strategy Research Institute said that, from the perspective of applying large models across industries, two factors matter when they land. The first is fault tolerance, which is high for internal use and low for external use; this determines how the model can be applied in an industry. The second is market size: applications will address head problems first and long-tail problems later.
Gong Weihua, CIO of Bank of Beijing, spoke about deploying large models in banking: "At present, large models have their own advantages as well as shortcomings. Because much of what a large model does is unexplainable and the model is a black box, the risk for a bank of using large-model capabilities directly to serve customers is still very high. Therefore, in the short term there will be few direct external-facing services built on large models, but internally we are willing to conduct training and exploration across various scenarios. In the future, we believe that as science and technology ethics governance and the country's laws and regulations on model applications gradually mature, the application of large models will also mature."
In addition, it is clear that the trillion-yuan artificial intelligence track is leaping from perceptual intelligence to cognitive and decision-making intelligence. The listing of companies with visual recognition technologies such as face recognition marks how large the perceptual intelligence market has become. With the release of ChatGPT, the cognitive intelligence market has entered a period of accelerated monetization over the past two years, and the future market space for decision-making intelligence is even greater, Wang Lei said.
Editor: Zhang Jingchao  Proofreader: Yan Jingning