The New York Times sues OpenAI: where do the copyright boundaries of large-model training lie?

Mondo International Updated on 2024-01-31

The question of whether, and how, AI platforms should pay for the knowledge they collect has been pushed to the foreground again.


Written by Ding Mo (**person) Edited by He Rui Proofread by Chen Diyan.

File photo of the New York Times headquarters building. Photo: Xinhua News Agency.

OpenAI has been sued by the New York Times!

According to CCTV Finance and Economics, on December 27, local time, the New York Times, an American newspaper giant, filed a copyright infringement lawsuit against OpenAI, the best-known AI (artificial intelligence) platform in the United States, and its investor Microsoft, accusing the two companies of collecting millions of the newspaper's articles without permission to train artificial intelligence. It is also the first case in the world in which an AI platform has been sued for copyright infringement by a large-scale **.

The New York Times said that OpenAI and Microsoft's unlawful harvesting and dissemination of its intellectual content had harmed the paper's ability to earn revenue from subscriptions, copyright licensing, advertising and other related sources, causing billions of dollars in damages. The amount of the claim was not disclosed, but the paper explicitly demanded the destruction of any related AI models and training data.

Not long ago, Apple, another well-known American company that is preparing to enter the AI platform business, announced that, to ensure the legitimacy of its business, it had reached agreements with a number of media groups such as NBC (National Broadcasting Company) to obtain paid authorization to collect the intellectual content of their newspapers, television programs and other publications, for an amount of up to 50 million US dollars.

These two events, one positive and one negative, will undoubtedly have a far-reaching impact on the future relationship between technology companies and the traditional press and publishing industry. Will the outcome satisfy everyone, or will it open a new front in the two sides' years-long competition over the network economy? That remains to be seen. Either way, the two events have again pushed to the foreground the question of whether, and how, AI platforms should pay for the knowledge they collect.

It is foreseeable that choices will become increasingly polarized. For example, some non-commercial and small news and publishing organizations have chosen to cooperate with AI platforms, while large media conglomerates such as the New York Times have taken steps to block AI from collecting their knowledge data. This looks like a repeat of what happened when Internet portals first appeared.

OpenAI's logo, photographed on February 3, 2023. Photo: Xinhua News Agency.

In China, AI platforms are also in the midst of a storm. So far there has been no news of cooperation between domestic AI platforms and ** institutions, and no dispute has gone to court. Even so, we should take precautions and begin, as soon as possible, to address potential conflicts such as whether, and how, AI platforms should pay for the knowledge they collect.

As we all know, AI platforms, especially knowledge-based ones, do not operate without roots, nor do they generate intelligence out of thin air. They collect existing knowledge and creative data, analyze and summarize it through model-based "learning" and "training" processes, and then generate meaningful text or graphic output.

Taken as a whole, this existing knowledge and creativity is the common wealth of society. Seen from the individual's point of view, it is the fruit of civilization, distilled from the painstaking efforts of countless creators, ancient and modern, at home and abroad. AI platforms use their technological advantages to absorb and exploit this knowledge in a short period of time. If they use all of it for free and take it without notifying anyone, they can fairly be suspected of claiming others' achievements as their own.

At bottom, the issue between the two sides is how benefits are distributed, and an unfair distribution will hinder development.

The strong rise of AI platforms will inevitably reshape the system of knowledge dissemination. If knowledge creators, whether individuals, teams or enterprises, can obtain a better share of the benefits under the new system, AI will be a positive driving force for the development of the knowledge economy.

On the other hand, if knowledge creators are deprived of the rewards they deserve because of the strength of AI platforms, knowledge will become less and less valuable, and that outcome is certainly not in the overall public interest. How should the benefits be distributed? Passively, as with OpenAI, or proactively, as with Apple? This is a real problem that we currently face.

File photo of an Apple retail store. Photo: Xinhua News Agency.

The pros and cons should be anticipated as early as possible. For example, although the centralization of knowledge dissemination channels may put knowledge creators in a weak position, as long as there is support at the level of laws and regulations, the cost of protecting rights will fall and the actual benefits of doing so will rise.

This means that, as far as China's current situation is concerned, AI platforms will have no loopholes to exploit as long as the existing copyright laws and regulations are amended, a category of digital copyright that prohibits "taking without notice" is added, and intellectual content published on the Internet is protected by both legal and technical means.

In that case, AI platforms will, for the sake of their own development, take the initiative to reach agreements with copyright owners: paying directly, paying indirectly through revenue sharing, or providing added value by crediting the author. The bigger the AI platform, the greater the revenue for the copyright owner, which is a win-win situation.

There is also much that can be done through tax regulation. On the one hand, policy can support the development and growth of AI platforms. On the other hand, part of the AI industry's tax revenue can be used, in the form of tax cuts and subsidies, to support the main industries of knowledge creation, such as ** institutions, so that the two promote each other.

Without sharing and common prosperity there can be no lasting symbiosis, and the consequence of dominance by a single player is the premature decline of the entire ecosystem. Prohibiting AI platforms from collecting knowledge without permission is therefore not a harsh demand on an emerging industry. Rather, it should be seen as encouraging the overall development of the relevant enterprises in a fair, orderly and healthy environment, and as a way to avoid wild growth under the law of the jungle.

As a result, AI platform companies will pay more attention, in design and operation, to the efficient use of knowledge and the intelligent development of models, and will serve users more accurately and efficiently. At the same time, knowledge creators such as ** institutions will have an incentive to take the initiative in using AI platforms to share knowledge, forming a virtuous circle that promotes greater economic and intellectual development and better benefits human society.
