Google arrives with Gemini, claiming it surpasses GPT-4 and human experts

Mondo Technology Updated on 2024-01-29

On December 6, Google unexpectedly announced Gemini 1.0, a natively multimodal large model that it described as its "largest, strongest, and most versatile". The model challenges GPT-4 from rival OpenAI and reportedly outperformed both GPT-4 and human experts in a series of benchmark tests.

We are getting closer to the vision of a new generation of AI models. After a series of demonstrations, Eli Collins, vice president of product at Google DeepMind, told media outlets including Yicai (CBN) that this is Google's most powerful and versatile model to date.

1. Gemini's multimodal capabilities may surpass GPT-4, which is expected to further expand application scenarios

As a multimodal model, Gemini can recognize and understand five kinds of information at the same time (text, images, audio, video, and code), and it understands that information very accurately. Gemini is available in three versions: Gemini Ultra for highly complex tasks, Gemini Pro, the best model for a wide range of tasks, and Gemini Nano for on-device tasks. Gemini Ultra is the first large model to outperform human experts on the MMLU benchmark, scoring 90.04%. For comparison, human experts scored 89.8% and GPT-4 scored 86.4%. Judging from the evaluation data, Gemini comprehensively surpasses GPT-4V on multimodal tasks, which may further expand the application scenarios of multimodal large models.

In November, OpenAI released GPT-4 Turbo and opened up GPTs, and Google then followed with Gemini. Competition among overseas technology giants in multimodal large models is becoming increasingly fierce, which is driving continued breakthroughs in the capabilities of the underlying models. Combined with GPTs and other forms of AI application, AI applications are expected to enter a period of rapid growth.

2. Prospects for the development of the multimodal AI industry

Against the backdrop of continuous breakthroughs in AI technology, the application and development of multimodal models have shown unprecedented momentum. As more companies and institutions commit to this space, we can expect to see more innovative and groundbreaking results. The application of multimodal models will also have a broad and far-reaching impact on enterprises and consumers, and will promote the further development of artificial intelligence technology. As competition among overseas technology giants in multimodal large models intensifies, the capabilities of the underlying models keep improving and vertical applications are booming, so AI applications are expected to grow explosively.

According to the data, the global artificial intelligence (AI) market is estimated at $119.78 billion and is expected to reach $1,597.1 billion by 2030, growing at a CAGR of 38.1%. The North American AI market was valued at $147.58 billion.
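The growth figures above are linked by the standard compound annual growth rate (CAGR) relationship. As a rough sanity check, the following minimal Python sketch computes how many years of 38.1% annual growth would take the market from $119.78 billion to $1,597.1 billion. The function and variable names are illustrative, and the inputs assume those readings of the figures quoted above; the article does not state the base year of the estimate.

```python
import math


def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over `years` years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def years_to_reach(start_value, end_value, rate):
    """Years of compounding at `rate` needed to grow start_value to end_value."""
    return math.log(end_value / start_value) / math.log(1.0 + rate)


# Assumed readings of the article's figures, in billions of US dollars.
start_usd_bn = 119.78
end_usd_bn = 1597.1
rate = 0.381  # 38.1% CAGR

implied_years = years_to_reach(start_usd_bn, end_usd_bn, rate)
print(f"Implied forecast horizon: {implied_years:.1f} years")
print(f"CAGR if the horizon is 8 years: {cagr(start_usd_bn, end_usd_bn, 8):.1%}")
```

Under these assumptions the implied horizon comes out at about eight years, which is consistent with a forecast that ends in 2030 and starts around 2022.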

North America held the largest market share in 2022. Higher demand for automated and technologically advanced hardware and software products across end-use verticals, together with favorable policies encouraging AI adoption in North American industry, has contributed greatly to the growth of the AI market. In 2019, the United States launched an initiative to position the country as a leader in AI technology. The program focuses on the adoption of AI-based systems by providing guidance for the practical application of AI technologies in various industries and fields. North America is the birthplace of leading tech giants such as Facebook, Amazon, Google, IBM, Microsoft, and Apple, which have made significant contributions to the growth of the North American AI market.

Asia-Pacific is expected to be the fastest-growing AI market. Increasing investment in AI adoption by various organizations is driving demand for AI technology. China-based tech giants have struck deals with investors to divest financial services groups that provide consumer credit, wealth management, and other business-related services. Moreover, the increasing adoption of AI in industries such as automotive, healthcare, retail, and food and beverage is driving the growth of the AI market in the Asia-Pacific region.

3. AI market landscape: multimodality is the main direction

After OpenAI announced that ChatGPT would gain multimodal updates such as web browsing, support for **, voice communication, and text conversion, major domestic and foreign vendors have continued to roll out AI models spanning text, image, and audio**, and industry applications are also being upgraded continuously.

At present, domestic and foreign vendors are still focusing on multimodal large models and developing products to compete with GPT-4. AI start-up Anthropic has developed a competing AI chatbot, Claude. Google is investing in Anthropic and is also developing its own language model, PaLM 2, and its chatbot, Bard. Google has multiple cross-modal AI models and provides a number of functional service modules.

Meta has taken another path and open-sourced its own large model, Llama. Since then, more and more large models have been open-sourced, including Vicuna, WizardLM, Guanaco, and others. Microsoft's KOSMOS-1 model has 1.6 billion parameters and unlocks multimodal capabilities. Overseas large models are iterating at an accelerating pace, with multimodality as the main direction.

Domestic large models are flourishing, and their versions and performance continue to iterate.

Amid fierce competition among large AI models overseas, many domestic internet and technology companies, such as Alibaba, iFLYTEK, and Baichuan, have also begun to develop large models independently. Compared with their overseas counterparts, domestic large models are updating and iterating their versions and performance faster. Baidu released Wenxin Yiyan and continues to iterate on it; Tencent's Hunyuan model received a new upgrade and officially launched its text-to-image function; Huawei's Pangu provides customers with a series of base models with 10 billion, 38 billion, 71 billion, and 100 billion parameters, which can match customers' diverse needs across different scenarios, latency requirements, and response speeds. Many domestic vendors also have a presence upstream and downstream in the multimodal industry chain, including Suzhou Keda, Neta Software, Danghong Technology, Jingyeda, Shengxun Co., Ltd., Weiyi Jiahe, Insai Group, Bohui Technology, Digital Government Communication, Dahua Co., Ltd., Yuncong Technology, Zhongke Chuangda, Tors, New World, Hengsheng Electronics, EclickWorld, 360, Jiadu Technology, Jebsen Co., Ltd., Kunlun Wanwei, iFLYTEK, Wanxing Technology, Tom Cat, Chinese**, Digital Political Communication, etc.

4. Development status of China's multimodal AI-related listed companies

According to the statistics of Southern Fortune Network, there are currently 27 A-share listed companies related to multimodal AI. Their overall operating income in 2022 was about 11097.8 billion yuan, a year-on-year decrease of 3.18%. In 2022, their overall net profit attributable to the parent company was 270.4 billion yuan, a year-on-year decrease of 77.51%. In terms of profitability, the average gross profit margin of the multimodal AI industry in 2022 was 48.72% and the average net profit margin was 11.59%, both down from 2021. In terms of regional distribution, the 27 companies are located in Beijing, Zhejiang, Guangdong, Shanghai, Jiangsu, and other provinces and cities, and are mainly concentrated in East China.

5. Summary

With the launch of Gemini, Google wants to consolidate its "the strong stay strong" advantage in the field of large models. Compared with OpenAI's GPT and Meta's Llama, Gemini has shown a leading edge in model size, training data, optimization strategies, and other respects, which undoubtedly puts pressure on its rivals. At the same time, domestic vendors such as Tencent and Alibaba are also actively investing in the research and development of large models and continue to innovate in the underlying technology. In the future, more and more high-quality large models will enter the "deep-water zone" of generative AI, which will spur major vendors to invest and compete in technology research and development and lead the industry into a virtuous cycle of development.
