Seen through Li Yizhou's case: the difference between "shelling", "backdooring", and misappropriation

Mondo Technology | Updated on 2024-02-23

Text | Shidao

01 It seems that doing AI really isn't profitable

There is a saying: to judge whether an industry makes money, look at how many courses it sells.

When an industry is paved with gold, everyone buries their head panning for it, and no one would dream of inviting others in to compete for the work. Only when there is nothing left to fight over do people turn to selling the information gap to harvest a wave of "leeks" (gullible retail customers).

This is where the domestic AI industry stands now: the technical threshold is high, the costs are heavy, the payoff horizon is long, and storytelling alone can no longer attract investors.

If you don't want to grind through that, there is another way: sell courses.

A 199-yuan AI course sold 250,000 copies in a year for 50 million yuan in revenue. Isn't that more profitable than doing development?

But the world is one giant matryoshka doll, and those who "misappropriate" from the "misappropriators" are making money too. On "a certain treasure" (Taobao), Li Yizhou's complete course can be bought for 0.26 yuan, bundled with a full set of AI courses from another "big name", Mr. Crane, and monthly sales are considerable.

Why is AI training so profitable? Put differently, why is training in any hot technology so profitable?

The desire to get rich and the fear of being left behind go hand in hand. Add tempting hooks like "Tsinghua PhD", "even beginners can learn", and "full service", plus a low price of 199 yuan, and almost anyone who hears the pitch is tempted.

Selling courses is "storytelling". Whichever technology is hot, that is where the story is. Investors are tired of hearing stories, but to the "leeks" they still sound fresh.

In 2021, "Metaverse First Lecture" earned 1.6 million yuan in 10 days;

In 2023, "Everyone's Artificial Intelligence Lesson" brought in 50 million yuan in a year.

All the slogans boil down to one sentence: xxx is the last wave of opportunity for ordinary people to change their fate. "xxx" can be the metaverse, blockchain, artificial intelligence, or whatever technology is hot next.

The next one will probably be teaching enterprises how to expand overseas (said half in jest).

02 Besides selling courses, the second shortcut to making money with AI is "shelling", though in Li Yizhou's case it is more accurate to say "backdooring".

In fact, there is no clear-cut definition of "shelling".

The "kernel" of a large model refers to the core architecture and algorithms. If you dig deeper, you could even say that almost all of the models are "shelling" Google Deepmind's Transformer and its three variant architectures.

Still, "shelling" is not devoid of technical content. Let's start with how large models are actually developed.

First, a self-developed large model must be pre-trained and then fine-tuned.

In pre-training, developers "feed knowledge" to the model without intervening, letting it "chew through the books" and learn on its own. The result is a model with general capabilities: the base model.

In fine-tuning (微调), developers step in: they set the model to "work through problems" and intervene so that it adapts to the needs of a specific task.

If you then fine-tune the base model further on an industry-specific dataset, you get a fine-tuned model, also called an industry model or vertical model.
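To make the base-model-to-vertical-model step concrete, here is a minimal sketch using the Hugging Face Transformers and Datasets libraries. The base model ("gpt2"), the corpus file name, and the hyperparameters are illustrative assumptions, not any particular vendor's setup.

```python
# A minimal sketch of fine-tuning a base model on an industry corpus.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "gpt2"  # any open-source base model would play the same role
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical industry-specific corpus: one text sample per line.
dataset = load_dataset("text", data_files={"train": "finance_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="vertical-model",
        num_train_epochs=1,
        per_device_train_batch_size=2,
    ),
    train_dataset=tokenized,
    # Causal-LM objective: the collator copies input_ids into labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # the output is the "industry model" in the article's sense
```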

The "casing" based on multiple scenarios takes place in the above development process.

For example, the pre-training stage has its "data shell": it was previously revealed that, because high-quality Chinese data is scarce, some domestic teams trained on data generated by ChatGPT.
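What that practice roughly amounts to is harvesting a stronger model's outputs as a training corpus (often called distillation). A hedged sketch, assuming the official OpenAI Python client; the prompts and file name are illustrative, not any team's actual pipeline:

```python
# A sketch of the "data shell": collecting ChatGPT outputs as training data.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

seed_prompts = [
    "用中文解释什么是大语言模型。",
    "写一段关于量化交易风险的中文说明。",
]

with open("synthetic_corpus.jsonl", "w", encoding="utf-8") as f:
    for prompt in seed_prompts:
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
        # Each record pairs the prompt with the model's answer,
        # yielding a corpus later fed into pre-training or fine-tuning.
        record = {
            "prompt": prompt,
            "response": resp.choices[0].message.content,
        }
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```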

The fine-tuning stage likewise has its "fine-tuning shells", and some of them are excellent products, such as Perplexity. This AI Q&A engine fine-tuned GPT-3.5 and significantly improved its performance, while staying cheaper and faster than GPT-4. A week earlier, Perplexity had closed a Series B funding round at a valuation of $520 million.

The "shells" above all involve a fair amount of technical work. But the "shelling" people most often talk about is the simple, crude wrapping of open-source models or ChatGPT, where the technical threshold is very low; this is what might better be called a "backdoor".
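To show just how low that threshold is, here is a minimal sketch of what such a "backdoor" looks like: a thin web service that forwards every user question to the OpenAI API and relabels the answer as its own. Flask, the route name, and the model choice are assumptions for illustration, not a description of any specific product.

```python
# A thin "backdoor" proxy: no model of its own, every answer is ChatGPT's.
from flask import Flask, jsonify, request
from openai import OpenAI

app = Flask(__name__)
client = OpenAI()  # reads OPENAI_API_KEY from the environment

@app.post("/v1/ask")
def ask():
    question = request.get_json()["question"]
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": question}],
    )
    # The response is relabeled as if it came from an in-house model.
    return jsonify({
        "model": "our-self-developed-llm",
        "answer": resp.choices[0].message.content,
    })

if __name__ == "__main__":
    app.run(port=8000)
```

A few dozen lines like these are the entire "product", which is why this kind of wrapping draws the most criticism.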

At present, many "backdoor" ChatGPT services in China charge users. Although the practice is arguably infringing, OpenAI has no business in China, so it cannot realistically sue.

However, this does not mean that "backdoor" operators can act with impunity.

According to the Interim Measures for the Administration of Generative Artificial Intelligence Services issued in July 2023, where generative AI services provided from outside China fail to comply with laws, administrative regulations, or the Measures, the Cyberspace Administration of China shall notify the relevant institutions to take technical measures and other necessary measures; where a crime is constituted, criminal responsibility is pursued in accordance with law.

That said, the Measures also relax regulatory requirements in several places and add provisions encouraging the development of AI technology.

For example, they make clear that generative AI algorithms operate under a "filing system" rather than the review-and-approval system the market had expected.

For example, the Measures stipulate that they apply to the use of generative AI technology to provide services that generate text, images, audio, video, or other content to the public within the territory of the People's Republic of China. By contrast, Article 2 of the April consultation draft covered anyone providing services to domestic users, whether they were "developing" or "using" generative AI.

One can read the shift from "development" to "use" as a rough sketch of how the law's reach differs for "backdooring" versus "fine-tuning". In China's current AI startup scene, avoiding "shelling" entirely may simply not be viable; but the higher a developer's technical threshold and the more genuine "R&D" there is, the less likely they are to cross the regulatory red line.

03 "Triple Narrative" or Suspected Fraud Finally, back to the Li Yizhou incident, you will find that this guy is actually playing "Triple Narrative".

At first he claimed his models were "self-developed"; in fact, his platform was a patchwork of "backdoors". And just when you thought it was mere "backdooring", under the mask was outright theft.

From what is known so far, many of the models on his platform are developers' "fine-tunes" of the open-source model Stable Diffusion.

According to Shijie, an AI drawing-model developer known as "Rabbit Raccoon" found that three models he had fine-tuned were "stolen" by Yizhou Intelligence, two of which he had signed to the liblib platform. Zhong Zhong, founder of Alchemy Technology, the independent developer Na Usjia, and others likewise found that models they had released on other platforms appeared on the homepage of Yizhou Intelligence's official website.

The most immediate harm to developers is that they see none of the returns from the models they built. "Rabbit Raccoon" spent as long as half a year training his models, paying 20,000 yuan to rent computing power and another 80,000 yuan on NVIDIA graphics cards.

At present, the site hosts 97 models in total; apart from one official Stable Diffusion XL model, all are third-party models made by domestic or foreign developers.

Which makes one want to ask: helping himself to domestic developers' models this openly, while telling his students they are "self-developed", is he not afraid of being accused of deceiving both ends and sued for fraud?

Some lawyers, however, say it is not clear-cut that this constitutes "fraud": the scope of AI-related content and courses is broad, and as long as the seller delivers the agreed materials after payment, it cannot simply be called fraud. Whether those materials meet the buyer's expectations is a question of contract performance.

What is certain is that Li Yizhou's AI courses are suspected not only of violating the Advertising Law and the Anti-Unfair Competition Law, constituting unfair competition through false publicity, but also of violating the Law on the Protection of Consumer Rights and Interests by infringing consumers' rights to know and to fair trade.

At present, LiblibAI has issued a statement saying it is following the incident closely and may take legal measures to pursue liability for the infringement and defend the rights of the affected creators.

According to reports, 17 developers, with 25 models among them, are now willing to act together. Each individual claim is small, but if a lawyer organizes a joint lawsuit, the outcome is hard to predict.

References:
"The First 'God' Crowned by Sora Has Collapsed", Dong Wenshu, Dong Yuqing
"Disenchanting Large-Model Shells: Questioning the Shell, Understanding the Shell", Jiazi Lightyear, Zhao Jian
"Li Yizhou Caught in Two Hardcore Controversies Again: Yizhou Intelligence Unfiled and Suspected of Illegal Operation, Misappropriating AI Drawing Models", Whip Bull Shi
"Tsinghua PhD Influencer's 'AI Course Sold 50 Million'? An Investigation into AI Course Chaos", The Paper, Fan Jialai, Shi Ruotong, Guo Sihang
