Driven by large AI models, custom chips are on the rise

Mondo Technology Updated on 2024-01-28

Recently, technology giants such as Amazon, Microsoft, Meta, and Google have stepped up investment in self-developed chips in hopes of reducing their dependence on NVIDIA. Notably, driven by applications such as artificial intelligence and autonomous driving, most of these giants are choosing to customize chips to meet their own needs, and the importance of custom chips is becoming increasingly prominent.

The trend of custom AI chips is rising

Driven by the boom in large AI models, Nvidia's dominance is pushing more and more tech giants toward building AI chips themselves. On November 28, at the 2023 re:Invent conference, Amazon Web Services (AWS) announced Trainium2, a second-generation AI chip designed for training AI systems, alongside the general-purpose Graviton4 processor. According to AWS CEO Adam Selipsky, Trainium2 is four times more powerful than the first-generation Trainium and twice as energy efficient, which works out to 650 teraflops (trillion floating-point operations per second) of computing power per chip. A cluster of 100,000 Trainium2 chips could train a large language model with 300 billion parameters in a matter of weeks.
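A rough sanity check of that cluster claim can be done with the common heuristic that training requires about 6 × parameters × tokens floating-point operations. The chip throughput, cluster size, and model size below come from the article; the token count and hardware utilization are illustrative assumptions, and the result swings widely with them:

```python
# Back-of-envelope estimate of training time on the cited cluster.
# Uses the common ~6 * params * tokens heuristic for training FLOPs.
# Token count and utilization are assumptions, not article figures.

CHIP_TFLOPS = 650        # per-chip throughput cited in the article
CLUSTER_CHIPS = 100_000  # cluster size cited in the article
PARAMS = 300e9           # 300-billion-parameter model (article figure)
TOKENS = 15e12           # assumed training tokens (illustrative)
UTILIZATION = 0.3        # assumed fraction of peak FLOPs achieved

total_flops = 6 * PARAMS * TOKENS
cluster_flops_per_s = CLUSTER_CHIPS * CHIP_TFLOPS * 1e12 * UTILIZATION
days = total_flops / cluster_flops_per_s / 86_400
print(f"~{days:.0f} days")
```

With these assumed inputs the estimate lands on the order of a couple of weeks, broadly consistent with the article's claim, though a different token budget or utilization figure would move it considerably.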

At its Ignite developer conference on November 16, Microsoft likewise announced two self-developed chips, Maia 100 and Cobalt 100. Maia 100 is designed to accelerate AI computing tasks, helping AI systems perform jobs such as speech and image recognition faster, while Cobalt 100 integrates 128 compute cores. Both chips are manufactured on TSMC's 5nm process and are expected to be deployed in Microsoft's data centers early next year.

Beyond Amazon and Microsoft, other major Nvidia customers such as Meta, Google, and Tesla have poured more resources into AI chip development this year, and even OpenAI has begun preparing a chip project. As more companies enter the large-model field, demand for high-end GPUs such as the A100 and H100 has surged, intensifying the trend of tech giants investing in custom AI chips.

Pursue chip performance and cost

The shortage of high-end GPUs is one reason tech giants have stepped up development of chips for large AI models. As more enterprises enter the field and more large models are released, market demand for high-end GPUs such as the A100 and H100 has risen sharply. Sam Altman, CEO of OpenAI, has repeatedly complained about the shortage of computing power, and according to an earlier report by Barron's, deliveries of Nvidia's high-end GPUs are booked out into 2024. To reduce their reliance on NVIDIA GPUs, companies with the capability have ramped up chip development in order to create, train, and iterate on large-model products.

So why are Amazon, Microsoft, and others all taking the road of independently developing custom chips? One primary reason is that major manufacturers want to optimize chip performance and pursue differentiated solutions. With Moore's Law slowing, the old path of relying on process scaling to drive performance and efficiency is becoming unsustainable, and the best computing performance must come from architectures tailored to specific applications and data sets. In the field of large AI models especially, different vendors have different needs, and more and more companies are finding that one-size-fits-all solutions can no longer meet their computing requirements.

Mohamed Awad, senior vice president and general manager of Arm's infrastructure business unit, said that hyperscale cloud service providers such as Alibaba, AWS, and Microsoft have begun developing their own chips, mainly to maximize the performance and efficiency of each chip and achieve the best optimization. They tailor servers, racks, and even entire data centers around their own use cases and workloads. As technologies such as GPT evolve, the volumes of data and computation will only grow, and through chip customization manufacturers can optimize for that ever-increasing load.

Reducing costs may also be a practical consideration for the major giants. According to an analysis by Bernstein analyst Stacy Rasgon, if ChatGPT's query volume grew to one-tenth that of Google Search, it would initially need about $48 billion worth of GPUs, plus about $16 billion a year in chips to stay afloat. Facing such high operating costs, self-developed custom chips have become the unanimous choice of major technology manufacturers. Some analysts say that, compared with using Nvidia's products, the chip Microsoft is developing under the codename Athena for large-model processing is expected to cut per-chip cost by about one-third.

The future extends from the cloud to the edge

Mohamed Awad believes that more and more manufacturers will adopt custom chip solutions in the infrastructure field. Traditional server systems mostly use an architecture in which a single CPU connects to multiple accelerators over a standard bus. In the AI era, however, such an architecture struggles to keep up with growing data and compute demands because it cannot provide enough memory bandwidth. For this reason, more and more model makers are choosing custom chips so they can flexibly adjust the chip architecture and rebuild the system around it.

In fact, custom chips are nothing new to the major technology manufacturers. Amazon Web Services began designing custom AI chips in 2018, launched its self-developed AI inference chip Inferentia, and in 2023 released its successor, Inferentia2, which triples computing performance. Just days ago, AWS released the training chip Trainium2; the previous generation, Trainium, was launched at the end of 2020. Google's history with custom chips goes back even further: by 2020 it had already deployed the TPU v4 AI chip in its data centers. Google has since transferred the engineering team responsible for AI chips to Google Cloud, aiming to strengthen Google Cloud's ability to develop them.

As for the future of the custom chip market, experts point out that with the momentum of popular applications such as large AI models and automobiles, the market will expand further; automakers such as Tesla have already invested in developing and commercializing custom chips. Going forward, custom chips will extend from cloud computing and HPC to edge computing. While these workloads can be handled by general-purpose chips, chips tailored to specific jobs can deliver better performance or functionality at lower cost and power. Experts also note that this trend is not particularly favorable to general-purpose chip makers, but for other players in the IC industry chain, such as EDA vendors, IP vendors, and wafer foundries, it is good news.
