Translator: Li Rui.
Small models and open-source models are closing in on GPT-4, and OpenAI needs more sophisticated measures to build a technical moat around its LLM business.
In May 2023, an internal document leaked from Google revealed the challenges facing large language models (LLMs) such as ChatGPT and GPT-4. The main point of the document was that neither Google nor OpenAI has a technical "moat" protecting its private LLMs, and that open-source models will eventually dominate the LLM market.
"While the LLMs we have developed still have some advantages in terms of quality, this advantage is shrinking at an alarming rate. Open-source models are faster, more customizable, more private, and more powerful."
Less than a year later, most of the warnings in the document have proven correct. Open-source models are rapidly catching up in quality, they are more flexible, and they are faster to train and fine-tune.
However, as the field of generative AI grows, OpenAI is taking more sophisticated steps to build a moat on the technical side to protect its LLM business. But this strategy doesn't always work.
When OpenAI released ChatGPT, the prevailing view was that LLMs would improve as they grew larger. With 175 billion parameters, GPT-3 requires hundreds of gigabytes of GPU memory and a huge investment to train and run. Some of the open-source LLMs released in 2022 were so large and unwieldy that few enterprises could run them.
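A rough back-of-the-envelope sketch shows where a figure of "hundreds of gigabytes" can come from. It assumes 16-bit weights (2 bytes per parameter), which the article does not state, and it counts only the memory needed to hold the weights, ignoring activations and optimizer state. The function name is illustrative.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory needed just to hold model weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


GPT3_PARAMS = 175e9  # 175 billion parameters, as cited in the text

# Assumption: 16-bit (2-byte) weights; real deployments may use other precisions.
gpt3_fp16_gb = weight_memory_gb(GPT3_PARAMS)
print(f"GPT-3 weights at 16-bit precision: ~{gpt3_fp16_gb:.0f} GB")
```

Under these assumptions the weights alone exceed the memory of any single GPU, so a model of this size has to be split across several accelerators.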
Initially, the high cost of training and running LLMs was itself a moat, and only well-funded companies had the resources to own and develop them. OpenAI used its first-mover advantage to establish a leading position. The company's GPT-3, and later ChatGPT and GPT-4, effectively became the go-to models for building LLM applications.
While other big tech companies were playing catch-up and pouring in money, smaller companies could only hope to buy access to these LLMs through APIs.
However, a study conducted by DeepMind researchers in 2022 showed that developers don't need to run a massive LLM to get state-of-the-art results. The study, which introduced a model called Chinchilla, showed that a smaller model trained on a very large dataset could match the performance of a much larger model. With 70 billion parameters, Chinchilla outperformed other state-of-the-art LLMs at the time, the researchers said.
Although DeepMind did not open-source Chinchilla, its training method opened a new direction of research. In February 2023, Meta released LLaMA, a series of LLMs ranging from 7 billion to 65 billion parameters. The LLaMA models were trained on up to 1.4 trillion tokens, whereas GPT-3 was trained on only 300 billion.
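The shift in training strategy can be illustrated by comparing how many training tokens each model saw per parameter. This is a minimal sketch that uses the GPT-3 figures cited above and assumes the largest LLaMA model (65 billion parameters) trained on 1.4 trillion tokens; the helper name is illustrative.

```python
def tokens_per_param(train_tokens: float, num_params: float) -> float:
    """Ratio of training tokens to model parameters."""
    return train_tokens / num_params


# GPT-3: 175 billion parameters, 300 billion training tokens (figures from the text).
gpt3_ratio = tokens_per_param(300e9, 175e9)

# Assumption: the 65-billion-parameter LLaMA model with 1.4 trillion training tokens.
llama65_ratio = tokens_per_param(1.4e12, 65e9)

print(f"GPT-3:     {gpt3_ratio:.1f} tokens per parameter")
print(f"LLaMA 65B: {llama65_ratio:.1f} tokens per parameter")
```

Under these assumptions the smaller model sees roughly an order of magnitude more data per parameter, which is the trade-off the Chinchilla study pointed toward.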
The LLaMA models are resource-efficient and high-performing, and have been compared with ChatGPT on several key benchmarks. LLaMA is also open-source, which means developers can run it directly on their own servers, even on a single GPU, at very low cost.
Following the release of LLaMA, a series of other open-source models appeared, each building and improving on its predecessors. Many of these models come with licenses that allow developers to build LLM products with them.
Model compression, quantization, low-rank adaptation, and other techniques that have matured over the years have made it increasingly convenient for developers and businesses to adopt open-source models in their applications. New programming frameworks, low-code/no-code tools, and platforms have made it easier for enterprises to customize and run LLMs on their own infrastructure, and hold the promise of innovations such as high-performance LLMs running on edge devices.
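Two of the techniques named above can be sketched in a few lines. The example below is an illustration under stated assumptions, not how any particular library implements them: the layer size, rank, and sample weights are hypothetical, and the quantizer is a simple symmetric 8-bit scheme applied to a plain Python list.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(v / scale) for v in values], scale


def dequantize(q_values, scale):
    """Recover approximate floats from the quantized integers."""
    return [q * scale for q in q_values]


# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 8
full = d_out * d_in
lora = lora_trainable_params(d_out, d_in, rank)
print(f"full fine-tuning: {full:,} params; low-rank update: {lora:,} params "
      f"({100 * lora / full:.2f}% of full)")

# Hypothetical weights; each value is stored as one small integer plus a shared scale.
weights = [0.12, -0.5, 0.33, 0.0, 0.49]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, f"max round-trip error: {max_error:.4f}")
```

Low-rank adaptation shrinks the number of parameters that must be trained, while quantization shrinks the memory needed to store them, at the cost of a small rounding error.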
To be fair, OpenAI's LLMs still have some advantages in terms of performance, and we have not yet seen an open-source LLM that can match GPT-4. But some open-source models have reached and surpassed GPT-3.5 performance, and it is only a matter of time before they catch up with GPT-4 and other state-of-the-art LLMs.
Open-source models will take away the technological advantage of big tech companies and commoditize LLMs. As switching costs fall, more and more businesses will have an incentive to switch from GPT-4 to a low-cost open-source model. Even though these models haven't caught up with GPT-4 in performance, most enterprises have specialized needs that can be met with fine-tuned LLMs, which are low-cost and can also satisfy other requirements such as data ownership and privacy.
Without infrastructure or technical moats, OpenAI needs to turn to other areas to ensure the defensibility of its business. The company has already made a number of strategic moves to build a new moat.
An important part of the company's strategy is to create network effects around its flagship product, ChatGPT. The GPT Store, which OpenAI first announced in November last year, is now up and running. It is an AI version of the Apple App Store that allows users and developers to share their customized versions of ChatGPT, called GPTs, for others to use. While most GPTs are going to die out, some of them will be very useful and able to increase productivity.
OpenAI will also offer an enterprise feature that allows businesses that sign up for the ChatGPT Team plan to have their own private GPT store.
OpenAI's idea is that once the store reaches critical mass, users will stick with ChatGPT, and more of them will sign up for the ChatGPT Plus plan to access the GPT Store. Developers will keep using the platform to reach more users for their products. As more GPTs are released and widely used, they will also provide free publicity for the company, further establishing it as the de facto platform for LLM apps.
OpenAI is strengthening network effects through monetization. "In the first quarter of 2024, U.S. GPT builders will be paid based on how much users interact with their GPTs," the company said. This means builders will be incentivized to maximize user engagement, which increases the stickiness of the product. But it also has the negative effect of replicating all the bad things about social media.
At the same time, OpenAI will strengthen the data network effect to continuously improve its products. If the user is on the free plan, OpenAI will collect their data to further train its models. If a user is on a ChatGPT Plus plan, their data will still be collected unless they opt out of the data collection program.
For example, OpenAI sent users this message in a tweet: "Hey, you can opt out of training on the settings page, whether you are on the free plan or Plus. I'll make sure to let the team know and clarify this on the webpage." (willdepue, @willdepue, 11.01.2024)
Another important effort is to reduce the cost of running ChatGPT. OpenAI CEO Sam Altman said in a recent interview that the company has managed to reduce the operating costs of LLMs by a factor of 40. As open-source LLMs continue to catch up with ChatGPT, cutting costs will allow OpenAI to roll out more features for both free and paid users.
OpenAI is also preparing for future developments. The company is reportedly working on its own device to run ChatGPT, one that may be built specifically around its LLM. This would give it the power of vertical integration, much like Apple's iron-fisted grip on the iOS ecosystem. What we are seeing may be the beginning of a new paradigm shift in computing. As the field evolves and new computing paradigms emerge, OpenAI wants to be ready to launch its own vertical stack.