After a full year of the large-model storm, what key points deserve attention in 2024?

Mondo Social Updated on 2024-01-30

2023 is coming to an end, and the popularity of AI large models has not diminished in the slightest.

Just as Google announced that it would offer developers a new version of the Gemini model and promised to cut usage costs, Microsoft launched Phi-2, a new language model with 2.7 billion parameters. While the leading giants make frequent moves, mid-tier players have begun to band together: Baiao Geometry and Zhipu AI, for example, have started to jointly build a multimodal large model spanning natural language and the language of life.

Although the giants began deploying large-model technology as early as around 2019, 2023 can truly be regarded as the "first year of large models": almost all leading technology companies are deeply involved in research and development, and hot money keeps pouring in, pushing the "thousand-model war" to a new climax. Beyond the large-model "arms race", however, the industry is doing more and more sober thinking: with so many foundation models available, why have so few achieved industrialization? In 2024, the productization, industrialization, and commercialization of AI technology will be the top priority in the development of large models.


In terms of the scale of participating enterprises, the number of large models and the market size, China has become the world's second-largest center of the large-model industry after the United States.

As the "flag bearer" of China's domestic large models, Robin Li cited a set of figures at the Xili Lake Forum last month: as of October this year, as many as 238 large models had been released in China, a full three times the number in June, and close to 30,000 large text-generation models were available on the Hugging Face platform. Proportionally, the United States and China have launched more than 80% of the world's large models under development, far ahead of other countries and regions.

According to estimates by Sutu.com, China's large-model market will be worth about 14.7 billion yuan in 2023, doubling year-on-year, and is expected to exceed 100 billion yuan in 2028. The market is huge, the giants attach great importance to it, and capital keeps increasing its investment. AI matters greatly for improving production efficiency and the quality of the economy, and to some extent it bears on a country's core competitiveness, so it has also been highly valued by the relevant government departments. It is fair to say that the year-long surge of large models is inseparable from policy support, the giants' attention and the enthusiasm of capital.

China is at the forefront of the world in promoting the orderly development of large-model technology. The "Interim Measures for the Management of Generative Artificial Intelligence Services", jointly issued by seven ministries and commissions including the Cyberspace Administration of China, and the "Several Measures for Promoting the Innovation and Development of General Artificial Intelligence in Beijing (2023-2025) (Draft for Comments)", issued by the Beijing Science and Technology Commission, were released one after another, clearing obstacles for the development of large models, providing necessary resources, and preventing the technology from developing in a disorderly way.

In terms of capital, major companies such as Alibaba, Tencent, ByteDance, iFLYTEK, Meituan, JD.com, and NetEase are all deploying large-model technology, while strong start-ups have become hot commodities that VCs compete for, and hot money keeps pouring in. According to a report by the China New Generation Artificial Intelligence Development Strategy Research Institute, as of the end of October there had been 38 large-model investment and financing events in China, and the country had more than 2,200 AI companies.


At the technical level, domestic foundation models such as Wenxin, Ali Tongyi, iFLYTEK Xinghuo, and Zhipu rank near the top of many evaluation leaderboards, and to a certain extent they can already go head-to-head with GPT.

The large-model industry is thriving, but some hidden worries remain, such as the industrialization problem that plagues most practitioners. Any cutting-edge technology must be turned into a product or application before people can use it and it can deliver value. At present, while AI models catch up on the underlying technology, they also need to go deep into industrial scenarios and play a role in the production and operation of enterprises and in the daily lives of users. In fact, the latter is the strength of China's AI industry: rather than playing chess, drawing, or composing poetry, Chinese tech practitioners are more down-to-earth and are good at applying technology to real scenarios and turning it into products, applications, or services, even if that is not so cool.

The industrialization of large models faces many difficulties. The degree of digitalization is uneven across industries, and enterprises of different sizes and in different fields differ significantly in what they need from AI and what applying it costs them. As a result, although many enterprises are now paying attention to large-model technology, very few can truly use it to transform their business, let alone build AI-native applications.

However, there are already some benchmark cases of large-model technology being integrated with industry.

1. Du Xiaoman Xuanyuan model: the first open source financial model in China

The data-driven financial industry is highly digitalized: digital infrastructure such as databases, storage, servers, automation, and information security was applied and popularized there first. As AI technology has spread, the financial industry has long been actively exploring how to combine AI with customer service, risk control, credit granting, marketing and other scenarios to cut costs and raise efficiency while improving customer experience.

In 2023, large-model technology exploded. Du Xiaoman, a pioneering financial technology platform, took the lead in May by open-sourcing "Xuanyuan", China's first 100-billion-level Chinese financial model. In September, "Xuanyuan 70b" was open-sourced and made freely available. As an industry model born from financial scenarios, Xuanyuan is strongly tailored to the sector in its intelligent capabilities, functional services and information security.

This tailoring shows in many ways. For example, the dataset used to train Xuanyuan contains a large amount of financial industry data, including institutional research reports, professional terminology and ** data, which gives the model strong capabilities for understanding and processing financial information.

In terms of technical strength, the Xuanyuan model also holds its own. It has passed the Certified Public Accountant Examination, the Banking ** Insurance ** Professional Qualification, Financial Planner, Economist and other authoritative examinations in the financial field. On the C-Eval large language model leaderboard jointly released by Tsinghua University, Shanghai Jiaotong University and the University of Edinburgh, and on the CMMLU leaderboard jointly launched by Microsoft Research Asia, MBZUAI and Shanghai Jiaotong University, Xuanyuan ranked first among all open-source models in China. C-Eval and CMMLU are currently the two most authoritative professional leaderboards, and topping both at the same time is an excellent result for an industry model like Xuanyuan.

The Xuanyuan model is already being applied in depth in financial scenarios.

Internally, the Xuanyuan model has been put to work in scenarios such as marketing, customer service, risk control, office work and R&D, and has achieved initial results. In assistant scenarios, the adoption rate of the large model reaches 42%, which has helped raise the company's overall R&D efficiency by 20%. In customer service, the large model has improved service efficiency by 25%. In smart office scenarios, the large model's intent recognition accuracy has reached 97%.

Du Xiaoman has always attached great importance to exporting its financial technology capabilities. Xu Dongliang, CTO of Du Xiaoman, revealed that when Xuanyuan was open-sourced in May, hundreds of financial institutions applied for trials. Judging from the feedback of enterprise customers, the professional ability of the Xuanyuan model is well recognized. In version 2.0, the context length for dialogue has been increased to 8k, and the model can also give professional explanations of in-depth financial industry questions such as the "non-interest income growth trend".

2. Ali Tongyi Qianwen large model: implementing the "AI-driven" strategy in the e-commerce industry

In 2023, Alibaba went through many major changes, and "user-first, AI-driven" became its new strategic direction. When the Tongyi Qianwen model was released on April 11, Daniel Zhang, then chairman of Alibaba Group and CEO of Alibaba Cloud Intelligence Group, said, "All software is worth upgrading with a large model, and all Alibaba products will be connected to Tongyi Qianwen."

Ali has indeed done what it said: the e-commerce business, Alibaba's home base, has long since fully embraced AI. Based on the Tongyi Qianwen model, Taotian Group has launched a series of AI tools for both the B-side (merchants) and the C-side (consumers).

B-side tools include official customer service robots, intelligent generation, and self-monitoring of marketing and ad delivery, and merchants called the back-end AI tools more than 1.5 billion times during this year's Double 11 promotion. For the C-side, the AI intelligent assistant ** was launched, and more than 5 million people were invited to try it in the two months after launch. B-side tools can improve merchants' operating efficiency and reduce traffic costs, while C-side functions can significantly improve user experience and create differentiated competitiveness in the e-commerce industry.

In combining large models with e-commerce scenarios, Ali has moved the fastest and gone the farthest, and Jack Ma even mentioned the fresh concept of "AI e-commerce" in a reply on Ali's intranet.

To further strengthen its large-model capabilities and deepen the integration of AI and business, Taotian Group was recently reported to have quietly set up a new AI team, recruiting top AI talent with high-profile, high-salary offers and racing to train "Turing", a large model built exclusively for the e-commerce industry. According to information previously disclosed by Taotian Group, more AI tools will be released to merchants in the coming year, including AI store opening, business consulting and intelligent weekly reports, covering all aspects of merchants' daily operations. Under Alibaba's push, the combination of large models and e-commerce has only just begun. It is foreseeable that in 2024 the leading e-commerce platforms will step up their bets on "large-model e-commerce".

3. iFLYTEK Spark (Xinghuo) model: a benchmark player in large models + education

iFLYTEK's first label is voice intelligence, and its second is intelligent education technology giant. Before large-model technology arrived, iFLYTEK had been working on AI for many years, and a considerable part of its revenue came from intelligent education services such as spoken language evaluation and educational hardware.

After large-model technology took off, the combination of the Xinghuo large model and the education industry gained even more momentum. In May this year, the day after the release of version 1.0 of the iFLYTEK Xinghuo cognitive model, the A-share education technology sector rallied: in addition to iFLYTEK, Xueda Education, Action Education, and Guoxin Culture all hit the daily limit, showing a trend of "a single spark starting a prairie fire".

From version 1.0 to 3.0, the iFLYTEK Xinghuo large model has focused on tackling advanced capabilities and multimodal capabilities, and on the basis of these technical breakthroughs it has developed more functions and applications for schools, education companies, teachers and students. Examples include student and teacher information management and leave-school approval in school administration, a courseware production assistant tailored for teachers, and one-to-one AI heuristic dialogue for students. At the same time, iFLYTEK is applying large-model technology in depth to its educational hardware, such as translation pens, voice recorders, learning machines and office notebooks, strengthening its products and consolidating its advantages in these categories.


Leading players in three industries, finance, e-commerce and education, have all found new growth points through large-model transformation, which shows that the industrialization of large models is not a dream but an inevitable trend.

Du Xiaoman, Alibaba, and iFLYTEK have only made a good start, and there is still a lot of room for the industrialization of large models. Industries with a long history and a low degree of digitalization, such as agriculture, manufacturing, logistics and shipping, and energy, urgently need to embrace large-model technology to improve production efficiency and make the leap from digitalization to intelligence. In view of this, accelerating the productization, industrialization, and commercialization of AI technology will be the number one task of the large-model industry in 2024. Whoever is first to establish a working path to industrial deployment will have the last laugh in the "Thousand Model War". So what lessons do the benchmark players offer for the industrialization of large models?

First, it is important to select training parameters and design functional services in a targeted manner.

There are already many foundation models. What the market lacks are top-tier foundation models that can rival or even surpass GPT, and "industrial models" that let thousands of industries apply the technology at lower cost, with a lower threshold, and faster. Building a strong industrial model requires being an expert who understands both AI technology and the industry.

Du Xiaoman is a good example. On the one hand, it has an AI technology foundation to rely on; on the other hand, it has the industry knowledge, capabilities, scenarios, ecosystem and other resources accumulated over many years in the financial technology industry.

It is reported that although Xuanyuan is trained on top of the BLOOM large model with 176 billion parameters, it also depends on the Chinese pre-training datasets of hundreds of billions of tokens that Du Xiaoman has accumulated over the years, including the basic knowledge and huge parameters of banking, insurance, ** and other industries. Thanks to the latter, the Xuanyuan model has financial information processing capabilities that far exceed those of competing products and general-purpose models, and it can provide targeted functional services for the pain points of the financial industry.

Second, large-model functional services should be deeply "customized" to meet the needs of the industry, rather than built behind closed doors.

Technology companies are prone to the problem of "holding a hammer and looking for a nail". If it cannot meet real needs, even the most powerful technology may amount to nothing more than self-indulgence.

Why were Du Xiaoman, Ali and iFLYTEK the first to taste the sweetness of large-model industrialization? Because Ali is itself the leader of the e-commerce industry, Du Xiaoman has been deeply involved in building the domestic fintech industry since its founding, and iFLYTEK has been deeply engaged in intelligent education for more than ten years. Their understanding of their respective industries is beyond the reach of ordinary enterprises. Only by understanding an industry's operating logic and deep-seated problems can a company see the real pain points of enterprises and practitioners and offer effective solutions.

Take Du Xiaoman as an example. Building on the four basic capabilities of understanding, generation, logic and memory, the Xuanyuan large model incorporates the usage habits and optimization needs of the financial industry and provides a series of targeted functions. For personal credit management services, for instance, the Xuanyuan model gives bank customers functions for managing customer history and analyzing user needs at multiple levels, and gives users natural-language interactive Q&A on professional questions, improving processing efficiency on both sides. Du Xiaoman has gained insight into many needs while serving financial institutions and its own customers, which is why it can build financial model products that are truly usable, useful and easy to use.

Third, the large model is not a one-man show, and it must benefit industry participants.

Small and medium-sized enterprises are the main force of the industrial chain, but with limited funds and staff they often find it hard to adopt new technologies early, especially those with high thresholds. Compared with deep learning, large models require enormous computing power, enormous amounts of data, and enormous algorithms; the threshold is much higher and is out of reach for many enterprises. This is an opportunity for the leading players. If they stick to an inclusive and open route, they can not only give large-model technology a foothold for industrialization, but also capture corresponding value from it.

At the large-model technology and application forum jointly organized by Du Xiaoman and the Guanghua School of Management of Peking University, Xu Dongliang, CTO of Du Xiaoman, expressed a similar view. He believes that large models are an opportunity for small and medium-sized financial institutions to break through, because application innovation lets them accelerate their digital and intelligent upgrading and thereby cross the digital divide.

It is not hard to see that "openness" has become the common denominator of the large models that have successfully landed in industry. Du Xiaoman's Xuanyuan, Ali's Tongyi Qianwen, and iFLYTEK's Xinghuo all follow an open-source, open route. As Xu Dongliang said, opening up large-model capabilities to financial institutions can both accelerate the spread of the technology and lower the threshold for using it, which makes it an inevitable choice for achieving technological inclusion.

Unlike the short-lived hype around emerging technologies such as blockchain, the popularity of large models will not suddenly drop. On the one hand, large-model technology will go deeper into more industries in 2024: on the C-side, explosive, phenomenal applications driven by large models are bound to appear, and on the B-side, the industrialization of large models will only keep growing. On the other hand, large-model technology is in essence a continuation of deep learning. AI technology has been developing for more than 10 years and will remain the foundation of the technology industry for decades to come. Large models are the biggest wave on the AI tide, and that tide will keep surging.
