On December 16th, Robin Li, Founder, Chairman and CEO of Baidu, and Peng Zhang, Founder & President of GeekPark, discussed the topic "AI Native: what kind of products and developers do we need?" The conversation produced many ideas, and this article presents the highlights. A collection of 17 key points is attached at the end of the article.
Q: Recently, Google released Gemini, OpenAI's GPT-4.5 is said to be on the way, and there have been various rumors lately. As you said, only a few foundation models are truly at the frontier. From a domestic point of view, has our distance from foreign players narrowed or widened over the past period? How should we assess our chances of catching up with them, and even of creating our own distinct value in the future?
Robin Li: The market environments of China and the United States are quite different, especially at the application level, so the two countries may develop in rather different directions. The United States has always had a large market for enterprise software and is arguably more advanced there, while China is more advanced in consumer (to-C) products.
If we go back to the technology gap between China and the United States, my view has always been that technology has to serve applications. What exactly do you mean when you say a technology is good or bad? Today, I think the PPT generation capability of Baidu Wenku (our document library product) is the best in the world, and it is built on the Wenxin Yiyan model. If you ask how large our gap with the world's leading level is in this case, I don't think there is one. But for enterprise applications in particular, the market in China is too small, so we are not prepared to optimize for them, and it is entirely possible that we lag behind there. The same goes for minor languages: we don't have the energy to optimize for them, so we may lag behind there too. So when I look at the technical level of a model, I look at its applications and at where you are comparing. Simply chasing leaderboards to run up a benchmark score strikes me as rather pointless. Think of how many resources you have to invest to train a large model: 10,000 GPU cards and a long training run. Then you say you will sit a college entrance exam and see how many points you get. If you get a high score, how much can you earn from it? If you top a leaderboard, how much prize money can its owner give you?
Robin Li: I think how advanced a model's technology is depends mainly on the application scenario the model is in and what it is used for. Once you have thought that through, you can judge the quality of the model. I sometimes say that the ability to evaluate model quality is the core competitiveness of a company that builds models. If you don't know what good looks like and have to rely on a third party to evaluate you and give you a score, that is unreliable; it means you don't know what you're doing.
Q: There is a saying lately that large models are a technology where "brute force works miracles". It seems that if we want to innovate on large models, we also have to rely on brute force: bet heavily on GPU cards, and with enough determination and enough money, the gap can be narrowed through sheer resolve. How should we understand this idea of brute force working miracles? Can we rely on it or not?
Robin Li: I think the phrase "brute force works miracles" mostly describes the process of taking large models from 0 to 1. OpenAI trained with enough computing power and enough data at a time when others did not know this road was passable, and in the end it got through. In fact, they did not invent a new algorithm; they used the transformer and achieved very good results because they used enough cards. I have seen people in American academia joke that all the universities in the United States combined could not train a GPT-3.5 with their cards. The computing power used here really is very large.
But I think that from here on, the game is no longer played this way. It is no longer governed by brute force working miracles; it moves toward the opposite. What is the opposite? I think it is the same as in all business competition: whoever is more efficient wins. You can raise money and so can I. If in the end I spend 10 dollars to get a 100-point result and you spend 10 dollars to get a 120-point result, over time you win. Or if, to get a 100-point result, I spend 100 dollars and you spend 80, you win. I think that is how the value of large models is ultimately reflected in applications.
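The two comparisons in this answer can be written out as a small calculation. This is only an illustrative sketch: the function name `efficiency` and the idea of measuring "points per dollar" are assumptions made here for illustration, not a metric used in the interview.

```python
def efficiency(effect_points: float, cost_dollars: float) -> float:
    """Effect obtained per dollar spent (hypothetical metric, for illustration only)."""
    if cost_dollars <= 0:
        raise ValueError("cost_dollars must be positive")
    return effect_points / cost_dollars


# Case 1: same spend (10 dollars), different effect.
me_same_cost = efficiency(100, 10)    # 10.0 points per dollar
you_same_cost = efficiency(120, 10)   # 12.0 points per dollar

# Case 2: same effect (100 points), different spend.
me_same_effect = efficiency(100, 100)  # 1.0 point per dollar
you_same_effect = efficiency(100, 80)  # 1.25 points per dollar

for label, mine, yours in [
    ("same cost", me_same_cost, you_same_cost),
    ("same effect", me_same_effect, you_same_effect),
]:
    leader = "you" if yours > mine else "me"
    print(f"{label}: me={mine}, you={yours}, more efficient: {leader}")
```

In both cases the side with more effect per dollar comes out ahead, which is the sense in which efficiency, rather than raw spending, decides the competition.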
Right now ** and the public are focused on training large models. But once training is actually done, what does inference cost when the model is put to use in applications? Whether your inference cost is lower than others' for the same quality, or your quality is better than others' at the same cost, is the main line of future competition, and I think it is the opposite of brute force working miracles. You really have to be able to do these things more efficiently than others.
This is also where our accumulated strengths matter. For many years we have had a presence at the chip layer, the framework layer, the model layer, and the application layer, so we can optimize end to end. Take the PPT generation example I just mentioned: when the application has requirements, I pass them down, and the Wenxin model has to be optimized for them. During that optimization, people said the call volume was too large, the cost was too high, and we could not afford the computing power. So one layer down, the framework, PaddlePaddle, has to be thoroughly optimized for the needs of the Wenxin large model. Further down, the question is how the chip adapts to PaddlePaddle and the Wenxin model. Optimizing layer by layer, end to end, we found that from the release in March to now we have reduced the cost of inference to roughly 1% of what it was. Where you used to dare to make only 10,000 calls a day, you now dare to make 1 million, which feels completely different. I think this is the main line of competition in the future.
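The relationship between the 1% cost figure and the jump from 10,000 to 1 million calls a day can be checked with simple arithmetic. The sketch below assumes a fixed daily budget and uses made-up cost units (`BASELINE_COST_PER_CALL = 100` is a hypothetical number chosen so that 1% is a whole unit); only the 1% ratio and the two call counts come from the interview.

```python
def affordable_calls(daily_budget_units: int, cost_per_call_units: int) -> int:
    """Number of whole calls a fixed daily budget can pay for."""
    if cost_per_call_units <= 0:
        raise ValueError("cost_per_call_units must be positive")
    return daily_budget_units // cost_per_call_units


# Hypothetical cost units per call at release time (assumption, not a real price).
BASELINE_COST_PER_CALL = 100
# Daily call volume that was affordable at release, as stated in the interview.
BASELINE_CALLS_PER_DAY = 10_000

# The daily budget implied by those two numbers.
daily_budget = BASELINE_COST_PER_CALL * BASELINE_CALLS_PER_DAY

# Inference cost reduced to 1% of the original.
current_cost_per_call = BASELINE_COST_PER_CALL // 100

calls_now = affordable_calls(daily_budget, current_cost_per_call)
print(f"calls per day at release: {BASELINE_CALLS_PER_DAY:,}")
print(f"calls per day at 1% cost: {calls_now:,}")
```

Under these assumptions, the same budget that paid for 10,000 calls a day pays for 1,000,000 once the per-call cost falls to 1%, which matches the figures quoted above.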
Q: If you look at the whole picture, what are the core problems for China in developing a new generation of AI industry built on large models? If we were to list three core issues, which do you think deserve the most attention right now? Is it computing power?
Robin Li: I can't say there are three. The most critical one, perhaps the only one, is applications. The large model is foundational; if valuable applications are built on it, the industry works end to end and can keep growing. What you may really want to ask is how to develop good applications and where the key factors lie. I think there are several aspects. The first is national industrial policy. If you look at the industries where China leads, in many cases the state showed foresight in its industrial policy: solar photovoltaics, then power batteries, and now new energy vehicles, where China seems to account for something like 60% of world shipments. Why is that? In China, fuel vehicles face purchase quotas and driving restrictions and are subject to vehicle purchase tax, all kinds of measures that suppress demand, while new energy vehicles face no such restrictions, so they naturally develop better. For large models, if the country can introduce industrial policies that encourage the development of AI native applications based on large models, I think that would be a very important success factor.
Second, the current ** environment. Right now ** focuses mainly on foundation models and which capabilities they have. That really isn't the important thing. What matters is whether existing enterprises, be they large companies, small and medium-sized enterprises, or start-ups, whatever they do, see a positive effect on their core business metrics after adopting large models. Attention to this is relatively low, and I think raising it is also very important for large models to succeed and grow. This is easy to say but not easy to do. In fact, large companies respond very slowly. I sometimes even say that large companies represent backward productivity, so you must not just watch what the big companies are doing.
The following is a collection of Robin Li's key points, which are worth savoring.
1. I think what sets this wave of large model technology apart is its generality. We call it "emergent intelligence": the model has learned things it was never taught. Because of this, once you have a strong, leading set of foundational technologies, you can quickly build valuable applications across many scenarios. This is something AI has not achieved in the past 70 years, so it is a completely different opportunity.
2. For most of this year, society's attention has been on the large model itself, on the foundation model. But my view has always been that the value of large models only shows when there are thousands, if not millions, of AI native applications built on top of foundation models. I see that the excitement of **, society, and the public is still mainly about foundation models and has not shifted to AI native applications, which makes me somewhat anxious. That is why I keep stressing that we must compete hard on AI native applications; only by building them does your model become valuable.
3. How advanced a model's technology is depends mainly on the application scenario and what the model is used for. Once you have thought that through, you can judge the quality of the model. The ability to evaluate model quality is the core competitiveness of a company that builds models. You have to know what is good and what is not in order to make a good model. If you don't know, relying on a third party to evaluate you and give you a score is unreliable.
4. "Brute force works miracles" mostly describes the process of taking large models from 0 to 1. From here on, the game is no longer played this way; it moves toward the opposite. What is the opposite? It is the same as in all business competition: whoever is more efficient wins. You can raise money and so can I. If in the end I spend 10 dollars to get a 100-point result and you spend 10 dollars to get a 120-point result, over time you win. Or if, to get a 100-point result, I spend 100 dollars and you spend 80, you win.
5. Is your inference cost lower than others' for the same quality? Or, at the same cost, is your quality better than others'? This is the main line of competition in the future. We have a presence at the chip layer, the framework layer, the model layer, and the application layer, so we can optimize end to end. From the release of Wenxin Yiyan in March to now, we have reduced the cost of inference to roughly 1% of the original. Where we used to dare to make only 10,000 calls a day, we now dare to make 1 million, which feels completely different.
6. I do think that having hundreds of foundation models is a huge waste of social resources, especially while our computing power is still limited. More resources should go to exploring how to combine models with every industry and to exploring possible new super apps.
7. Iterating on a foundation model does not depend on benchmark scores or leaderboard chasing. What someone building applications and businesses really cares about is the core metrics of the business, and those needs steer the evolution of the Wenxin model toward what the market truly demands. That is a virtuous circle.
8. In the era of large models, the real value lies in AI native applications, and they are a great opportunity for large companies, small and medium-sized enterprises, and entrepreneurs alike.
9. There is only one key to the development of large models: applications. There are three key factors in developing good applications. First, industrial policies that encourage AI native applications built on large models. Second, whether existing enterprises see a positive effect on their core business metrics from using large models. Third, super apps: when and in which areas will they appear? That takes more startups working hard and trying all kinds of things.
10. Whether the greatest value of large models will come from new super apps or from transforming existing applications is not yet settled, though I think it is the latter. Look at Microsoft's Office 365 Copilot today: it costs $30 a month and may bring in $5 billion a year, which is many times the annual revenue of all of OpenAI. It has created that much new value by transforming existing products, so I think everyone should look more closely at combining large models with their existing business. This wave of generative AI can create greater value by transforming existing businesses. Microsoft is one example and Adobe is another: its embrace of large models has significantly increased the revenue and profit its existing products can generate.
11. I think that once established companies across almost all existing industries turn the corner and learn to make good use of large models, the benefits and value they gain will add up to the largest share. Of course, this does not mean startups have no opportunity. It is entirely possible for startups to produce three to five super apps, as well as hundreds or thousands of very valuable vertical apps.
12. The ratio of PMs to R&D engineers is changing. In the past, one PM had to be paired with many R&D engineers; today it may be 1 to 1. In other words, many ideas no longer need R&D involvement at the early testing stage, and a PM can put something together on their own, which is different from before.
13. The truly successful product manager for AI native applications is probably not one particular type of person but a blend of many. Such a person may not have studied computer science, but has a strong ability to learn, a feel for products and markets, and no fear of technology. Even without formal training, they can read the latest ** and understand what is being said and what method is used. This type of person is the most likely to become a successful product manager.
14. Why do I take issue with the word "access"? In fact, I am taking issue with your muscle memory. Simply connecting to a model is the easiest thing to do, but also the lowest in value. What does your business have to do with the large model? Can the large model help you grow your DAU? How much does your retention rate grow, how much does user time spent grow, how much does your revenue grow, how much does your profit grow? These are the key metrics of the business. But moving these key metrics is not easy. It cannot be done by inertia or by muscle memory.
15. Our understanding today is very different from what it was a month, half a year, or a year ago. Where did this understanding come from? It is not something you think up out of thin air sitting in a room, and it is not something I came to understand by reading **. It comes from countless developers learning through trial which roads lead somewhere and which do not. Today the vast majority of possibilities remain untried, and entrepreneurs and developers have to try them. Whether a road works or not, the result is valuable experience. If you don't get through, at least you know that road is closed; if you do get through, it is a big opportunity.
16. I think this is a long-term opportunity, but if you don't seize it early, you are likely to fall behind in the competition. There is a chance this year, there will be a chance next year, and I think there will still be a chance five years from now. But why not move earlier? Why not bring out the value and potential of the technology earlier than your peers and competitors? In particular, ask whether it has a positive effect on your North Star metrics, the key metrics of your core business. Once you have thought that through, everything else will fall into place.
17. In the era of large models, the real value lies in AI native applications, and they are a great opportunity for large companies, small and medium-sized enterprises, and entrepreneurs alike.