OpenAI plays its trump card at the start of the new year, and the AI race escalates

Mondo Technology Updated on 2024-02-19

Source | Bohufn

Author | Chen Ping'an

OpenAI is flooding everyone's feeds again.

At the beginning of 2023, OpenAI set the global technology world alight.

The reason was ChatGPT, the new chatbot model it had released at the end of November 2022. Unlike the simple chatbots that came before it, ChatGPT can answer questions, admit mistakes in its own answers, and refuse inappropriate requests; it can also write poetry, write code, and produce other kinds of writing.

Even Musk could not help remarking: "ChatGPT is scary good. We are not far from dangerously strong AI." Just two months after launch, ChatGPT reached 100 million monthly active users, and OpenAI, the company behind it, secured a $10 billion investment from Microsoft.

At the beginning of 2024, a similar story is playing out again.

In the early morning of February 16, OpenAI played a trump card in AI video generation, announcing a new generative AI model, Sora.

In the demo videos shared on OpenAI's official website, Sora directly outputs footage with multiple characters, multiple scenes, and camera movement. This is a world away from the AI-generated video of a year ago, and the clip length alone "crushes" its peers.

Reportedly, working from a text prompt alone, Sora can output video up to 60 seconds long, at a level of detail far beyond what most people imagined.

This means that, following text and images, OpenAI has extended its advanced AI technology to video.

Musk again commented on the OpenAI model: "AI-enhanced humans will create the best works in the years to come."

On the one hand, there is a technological breakthrough.

Sora can generate videos up to 1 minute long, far exceeding the 18 seconds of Runway's Gen-2 and the 3 seconds of Pika.

More importantly, compared with the obvious "AI feel" of earlier AI-generated video, Sora's output raises the whole AI video industry to a new level of realism and visual refinement.

Judging from the videos on the official website, details such as the dark moles on characters' faces and the neon lights reflected in water on the ground are rendered with near-lifelike fineness. In other words, the quality of Sora's output is remarkable in both resolution and fidelity.

On the other hand, no less interesting is Sora's ability to understand long text prompts. OpenAI wrote in its official blog: "The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world."

What does that mean? Enter a piece of text and Sora automatically generates up to one minute of HD video. What is striking is that Sora not only grasps the complex meaning of the user's text accurately, but also separates out the different elements and turns them into video with a distinct creative concept, as if it were the work of a professional director, camera operator, and editor.

For example, in the clip generated from the theme "a coral reef world filled with colorful fish and marine life, carefully constructed by paper art", Sora advances the story through its choice of camera angles and timing. The clip contains several camera transitions that the prompt never asked for; Sora added them on its own.

According to industry insiders, a video like the ones Sora generates would take even a leading animation studio several days to produce, while Sora needs only a few minutes.

Guosheng Securities believes that, compared with earlier text-to-video models, Sora has crossed over into being a practical productivity tool. Its 1-minute length should allow large-scale use in short video, and its ability to extend clips may eventually make long-form video possible, which could set off a new industrial revolution in content creation.

Of course, Sora isn't perfect. OpenAI's official website notes that it may struggle to accurately simulate the physics of a complex scene, may not understand specific instances of cause and effect, and may confuse the spatial details of a prompt.

Take the newly released demo "Celebrating the Lunar New Year with Chinese Dragons": Sora could not correctly render the Chinese characters that appear on screen, and netizens joked, "Is Chinese to blame for being too difficult?" In another clip, showing an elderly person's birthday cake, the candle flames do not change at all when they should.

However, the OpenAI team has taken AI from the blurry, barely recognizable images of the early days to output that can now pass for the real thing, which shows how frighteningly fast it is developing.

A user on Bilibili commented:

Before Sora came out, I still had a question mark in my mind about GPT-5. How much better could it get? Could OpenAI keep leading? Now I'm truly convinced. It really is far more than a little ahead of the other companies; the moment it launched, it was a crushing blow from a league above. Sora is also built on the Transformer architecture. Isn't this part of GPT-5? When people online said GPT-5 had gone through all the videos on the internet, I didn't believe it. Now I do.

Sora's most direct impact is certainly on the video industry. As a video generation tool, Sora can produce a polished 60-second video from text alone, which greatly lowers the barrier and cost of production, especially for time-sensitive trending content.

The deeper significance of Sora, however, is that AI competition has escalated once again.

In 2023, the release of ChatGPT pulled the world into an AI boom; Chinese companies alone released more than 130 large models. At first, everyone aimed to build their own large models: deep-pocketed companies developed foundation models, while startups turned to industry-specific and vertical models, fine-tuned on specialized datasets on top of open-source models.

But it turned out that the real barrier to large models is their high cost: massive computing power, data service providers that can offer customized services, and top-tier talent.

Take computing power. Most of the chips used to train large models on the market today come from NVIDIA. According to financial reports, the price of the NVIDIA A100 roughly doubled in 2023. The electricity cost alone of deploying 1,000 servers runs as high as 200,000 yuan per month.

The advantage of large companies is that they not only have the financial strength to buy and deploy GPUs at scale, but can also put large models to work quickly and use engineering optimization to improve efficiency.

According to a report by LatePost, in November Alibaba's cross-border e-commerce AI team officially announced its own product, "AIDe", built on Alibaba's "Tongyi Qianwen" model, with functions including translation, marketing, design, and localization services. Statistics show that in November, products optimized with AI received 15% more overseas inquiries than before.

After launching its Skylark model, ByteDance successively developed products such as Doubao, Coze, and Hualu. In Hualu, for example, users can chat with AI agents and create interactive experiences through stories.

Sora uses a Transformer architecture that represents videos and images as collections of smaller units of data called patches, analogous to tokens in GPT. Importantly, like GPT, it follows AI scaling laws: sample quality improves markedly as training compute increases.
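To make the patch idea concrete, here is a minimal sketch of cutting a video into non-overlapping spacetime patches, each flattened into one token-like vector. It is an illustration only, not OpenAI's implementation: it works on raw single-channel pixel values in plain Python lists, and the function name `patchify` and the patch sizes are assumptions chosen for the example.

```python
from typing import List

# A video as nested lists: frames x height x width (one channel, for brevity).
Video = List[List[List[int]]]


def patchify(video: Video, pt: int, ph: int, pw: int) -> List[List[int]]:
    """Split a video into non-overlapping spacetime patches.

    Each patch spans pt frames, ph rows and pw columns, and is flattened
    into a single vector, playing the role a token plays in a text model.
    Hypothetical helper for illustration; not Sora's actual pipeline.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")

    patches = []
    for t0 in range(0, frames, pt):
        for y0 in range(0, height, ph):
            for x0 in range(0, width, pw):
                # Collect every pixel inside this spacetime block.
                patch = [
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                patches.append(patch)
    return patches


# Toy video: 4 frames of 4x4 pixels; each value encodes its own position.
toy_video = [
    [[t * 16 + y * 4 + x for x in range(4)] for y in range(4)]
    for t in range(4)
]
tokens = patchify(toy_video, pt=2, ph=2, pw=2)
print(len(tokens), len(tokens[0]))
```

With a 2x2x2 patch size, the 4x4x4 toy video becomes 8 patches of 8 values each. A Transformer would then process that sequence of patches much as a language model processes a sequence of tokens.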

Some industry insiders note that Sora is not yet open to the public; so far only OpenAI CEO Sam Altman has shown it off, generating videos in response to comments on X. They believe limited computing power may be an important reason Sora has not been opened up for use.

Computing power has become one of the most closely watched resources. In 2018, Altman personally invested in Rain Neuromorphics, an AI chip startup, and in 2019 OpenAI signed a letter of intent to spend $51 million on Rain's chips. Last November, Altman sought billions of dollars in funding for a chip venture code-named "Tigris".

Masayoshi Son, the founder of SoftBank Group, is looking to raise $100 billion to set up a chip company that would complement the business of its semiconductor design company Arm.

However, Sora's stunning debut doesn't mean that others have no chance. The previous stars of the video generation track were Runway and Pika. Although many people think Sora can easily outclass both, Pika founder Guo Wenjing told Titanium Media: "We think this is very exciting news. We are already preparing to go all in and will benchmark directly against Sora."

In fact, OpenAI is not without rivals. Released at the same time as Sora was Google's Gemini 1.5 Pro, which according to official figures supports up to 1 million tokens, far more than other current foundation models, and can process large amounts of information at once, such as 1 hour of video, 11 hours of audio, more than 30,000 lines of code, or more than 700,000 words.
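As a rough illustration of what a 1-million-token window means in practice, the sketch below takes the two figures quoted above at face value (1 million tokens, about 700,000 words) and uses their ratio to estimate whether a text of a given length would fit. The fixed words-per-token ratio is a simplifying assumption; real tokenizers vary by language and content, and the function names are hypothetical.

```python
# Figures quoted in the article for Gemini 1.5 Pro, taken at face value.
CONTEXT_TOKENS = 1_000_000
WORD_CAPACITY = 700_000


def estimated_tokens(word_count: int) -> int:
    """Estimate the token count of a text from its word count.

    Assumes the fixed ratio implied by the quoted figures
    (700,000 words per 1,000,000 tokens). Illustrative only.
    """
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    # Ceiling division in integer arithmetic avoids floating-point rounding.
    return -(-word_count * CONTEXT_TOKENS // WORD_CAPACITY)


def fits_in_context(word_count: int) -> bool:
    """Return True if the estimated token count fits in the context window."""
    return estimated_tokens(word_count) <= CONTEXT_TOKENS


for words in (70_000, 700_000, 800_000):
    print(words, estimated_tokens(words), fits_in_context(words))
```

Under this assumption, a 70,000-word book uses roughly a tenth of the window, while an 800,000-word corpus would not fit in a single request.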

Sora is of course strong proof of OpenAI's lead, but it is better read as a signal that competition is escalating on the "brute force works miracles" track of large models.

