Sora rocks the film and television industry, and a window of opportunity opens for ordinary people

Mondo Entertainment Updated on 2024-02-21

Text: Zinc Scale, Meng Huiyuan.

Edited by Li Wenjie.

Four films, including "Let's Shake the Sun Together" and "Mr. Red Carpet", announced their withdrawal, turning this Spring Festival release window into the "first year of withdrawals". Then the artificial intelligence (AI) giant OpenAI released its first text-to-video model, Sora, ushering in the "GPT moment" of AI video. This Spring Festival of the Year of the Dragon staged a "Song of Ice and Fire" that belongs exclusively to film and television practitioners.

In the videos generated by Sora, the protagonist and the background characters all show strong consistency. The model can sustain a single continuous shot for 60 seconds and render highly detailed backgrounds, multi-angle shots, and characters with a range of emotions. In other words, just by entering a text description, perhaps every ordinary person who uses Sora can become a "big director" like Jia Ling without going into battle in person.

The last product to capture the Internet's attention this quickly was ChatGPT, launched in November 2022. ChatGPT has since not only led the vigorous development of large models worldwide but also begun to show strong productivity in text and related industries. As the "trump card" at the start of this year, Sora is widely expected to take over ChatGPT's mantle and rewrite the development of industries such as video.

However, a 60-second generated video cannot yet carry a movie. If you really want to realize your "director's dream", you have to let the technology "fly a little longer".

The "60 seconds" that shocked the world

"OpenAI released the text-to-video model Sora, and AI video has entered the eve of large-scale application."

"From the perspective of relevant beneficiary segments, the downstream application side includes but is not limited to beautification, advertising and marketing, short dramas, games, office software, etc."

"Sora has three outstanding highlights and has achieved a milestone in the field of AIGC."

"Multimodal models such as AI video generation are expected to play a greater role in IP development in film and television, animation, games, and video, continuing to cut the cost and raise the efficiency of IP development, opening up additional room for monetization, and stimulating demand for computing power."

Since Sora's release in the early hours of February 16, more than 14 brokerages have published more than 19 related research reports in just a few days, all of them praising Sora highly.

Many prominent figures in tech circles were also amazed by Sora's arrival.

Replying to a Sora demo that the user "Bev Jessos" posted on a social platform with the caption "GG Pixar", Musk wrote "GG Humans" beneath the tweet (GG is online gaming slang, originally exchanged between players as a courtesy at the end of a match and later extended to mean "game over"). He also offered praise: "In the next few years, humans will create excellent works with the power of AI."

Zhou Hongyi, chairman of 360 Company, posted on WeChat Moments: "Once AI can connect to the camera and understand all the movies in the world, its ability to understand the world will far exceed the level that can be achieved through text learning alone. In this context, the realization of artificial general intelligence is no longer a distant dream."

Jia Yangqing, former vice president of Alibaba and founder of Lepton AI, commented bluntly that Sora is "really good". He added, "The advent of Sora may bring a wave of opportunities for companies that benchmark themselves against OpenAI to be acquired out of FOMO (acquisitions driven by the fear of missing an opportunity)."

The key question is, why is SORA widely regarded as a leading technology in the film and television industry?

In fact, similar AI models existed before Sora emerged. Google released a new video generation model, VideoPoet, on December 21 last year, which supports text-to-video, image-to-video, video stylization, and other operations. Meta published Emu Video, which can generate video clips from text and image input. Runway's Gen-2 has a Motion Brush function: a swipe anywhere on an image is enough to set the still objects in it moving. Stability AI launched Stable Video Diffusion, which can automatically generate high-quality video clips from images. And Pika, the text-to-video tool that became popular overnight, set off a boom in AI video applications.

But as OpenAI's technical report puts it, "Sora is able to deeply understand the physical world in motion, which can be called a real world model".

Sora's advantage over the models above is that it can represent details accurately, understand how objects exist in the physical world, and generate characters with rich emotions. It can also generate video from prompts or still images, and even fill in missing frames in existing videos.

Hands-on comparison tests by The Beijing News show that, given the same prompt, Pika can generate only 3 seconds of video and Gen-2 only 4 seconds, while Sora can generate up to 1 minute. In terms of content, both Pika and Gen-2 struggle to keep the same character coherent throughout, whereas Sora not only reproduces every detail in the prompt but also keeps its characters coherent, making the video almost indistinguishable from real footage.

A new king has ascended the throne. With Sora's "open high and run wild" momentum so obvious, its competitors can no longer sit still.

Before Sora arrived, the default choice for AI video generation was Runway, especially after it launched its second-generation model, Gen-2, in November last year. Gen-2 not only solved the first generation's problem of low coherence between frames but also gave good results when generating video from images, which earned Runway the nickname "the Midjourney of the AI video world".

But after Sora's release, Runway's CEO, Cristóbal Valenzuela, posted only a brief message on platform X: "Game On."

Domestic companies that develop and deploy AI-related multimodal large models have likewise never stopped chasing the cutting edge.

According to incomplete statistics, more than 10 A-share listed companies, including Wondershare Technology (Wanxing Technology), Bohui Technology, Danghong Technology, eClick (Yidiantianxia), Digital Video, Hanwang Technology, Shensi Electronics, Dongfang Guoxin, Insai Group, Tors, Guomai Culture, and Jiadu Technology, have disclosed their business in the field of video generation models on the investor interactive platform in the past three months.

Among them, eClick said on the investor interactive platform on February 4 that its AIGC creation platform, KreadoAI, can help enterprises close the full content production loop, from script writing, voice cloning, and personalized digital human selection to the output of spoken-presentation videos.

Wondershare Technology said on the interactive platform on February 2 that its video creation product Filmora can be used for all kinds of video creation and editing, and that its "Sky Canopy" large model is a multimedia large model built around video creation AI technology, with multimodal capabilities covering audio, images, video, and more.

Danghong Technology said on the interactive platform on January 5 that it has a self-developed AIGC toolset and has released a solution for generating three-dimensional volumetric video from static content. The solution uses point cloud model conversion and compression algorithms to achieve visually lossless compression of up to 800 times and to switch between different modalities.

What's more, since the second half of 2023, the resources that domestic technology giants have invested in multimodal AI have yielded substantial progress. Alibaba's Animate Anyone and ByteDance's MagicAnimate, for example, are both practical applications of video-to-video technology. Thanks to the continuous development of generative AI worldwide, companies in the video generation field are actively "preparing for war", iterating on text-to-image, video, and other applications, and the technology is expected to bring "revolutionary" opportunities to more related industries in the long term. From a global perspective, the closed loop of the computing power industry chain, running from upstream hardware through midstream servers and switches to downstream applications, is becoming clearer, and everything from the cloud side to the device side and from hardware to software is showing vitality.

This also means that the whole chain, from the core manufacturers of the global computing power industry, to device-side AI companies, to localized computing power companies (including AI server parts, complete servers, computing power leasing, data centers, and so on), is treating Sora's emergence as an opportunity to launch its own burst of upgrades, laying a solid technical foundation for realizing the "director's dream" of ordinary people.

By the next Spring Festival, the day when everyone can be a director may be drawing near. As one netizen put it, users' expectations always run ahead of the pace at which technology is actually deployed.

Although the longest video Sora has released so far runs only one minute, industry insiders say that, at OpenAI's pace of iteration, AI videos dozens of minutes long are not far off: "In the next few years, it will have a disruptive impact on the entire film and television production and short video industry, and the metaverse's highlight moment will come closer and closer."

However, even as Sora-generated videos went viral, many people also spotted shortcomings. Although Sora performs well in image quality, detail, light and shadow, and color, it is still somewhat weak in camera movement and finer content control. In the one-minute scene of a girl strolling along a Tokyo street, for example, her legs deform as she walks and become confused when they cross and swap positions.

In response, the CEO of Perplexity AI said, "Sora, while amazing, is not ready for accurate modeling of physics. The authors of Sora were smart to mention this in the technical report section of the blog, for example that shattering glass is not modeled well."

OpenAI has acknowledged Sora's immaturity and says it is actively working to improve the model's performance and accuracy, in order to bring more innovation and breakthroughs to the film and television industry in the future.

In fact, given the technical characteristics Sora has shown so far, many film and television practitioners believe that for AI video generation to be used in production, it must at least allow details to be adjusted at any time, and the generated video must be stable, with no unintended changes. Sora's current level of precision clearly does not meet these requirements, but it is sufficient for early-stage development, especially concept design. And given the high cost of manual production today, if Sora's iterations reach the stage of commercial application in the film and television industry, its room for growth is foreseeable.

It may seem that Sora needs only a little more time to meet users' expectations. In practice, however, the deployment of AI video technology remains full of uncertainty, ranging from technical complexity to ethical and copyright issues.

Tang Linyao, an associate researcher at the Institute of Law of the Chinese Academy of Social Sciences, believes that the challenges posed by AI-generated video include, but are not limited to, how to effectively distinguish real content from false content and how to ensure that AI works are not used to mislead the public or for other illegal purposes. A further challenge for the rule of law lies in balancing the tension between strong regulation and industry development.

The industry is still looking for answers and solutions to the misuse of generative technologies and the transparency and explainability of AI models. And now, all we can do is let the technology "fly a little longer".
