WeChat official account: Alphabet List (ID: wujicaijing). Author: Bi Andi. Editor: Wang Jing. Title image: generated by Sora.
Ten days after Sora was announced, its peers could no longer sit still.
On February 16, the seventh day of the Chinese Lunar New Year, OpenAI unveiled Sora, a new generative AI model that takes natural-language instructions as input and outputs video up to 60 seconds long.
Text-to-video generative AI models aren't new, but Sora, like ChatGPT, has set the internet on fire with a steep leap in quality. Its high-definition picture, smooth motion, and physical trajectories that look natural at first glance are a world away from the almost ghostly "Will Smith Eats Pasta" video of a year ago, and they overshadow the AI products that peers released only last year.
Ten days is a short time, but in the AI field it is enough to make a difference.
At least two of the AI industry's darlings have made their move: on February 22, StabilityAI officially opened the public beta of Stable Video, a product built on the Stable Video Diffusion model released last November. At almost the same time, Midjourney, another company that has made its name in text-to-image generation, revealed that its next iteration may include video features.
Yesterday's darlings now stand in the shadows, and Sora's peers will find it hard to catch up.
One.
Beyond the immediate shock of watching the Sora demo videos, people quickly began making comparisons, both over time and across competitors.
For the comparison over time, people dug out the AI video of "Will Smith eating pasta" from a year ago. The Hollywood star's face is badly deformed and keeps changing shape, triggering the uncanny valley effect at every moment; the longer you look, the weirder it gets.
Source: generated by AI.
A year later, Sora can generate a video of a woman walking the streets of Tokyo. In picture quality, continuity, and stability, as well as in the woman's gait and expression, it could almost pass for real footage, prompting repeated praise for how far text-to-video has come.
Source: generated by Sora.
Unfortunately, although countless netizens begged under OpenAI CEO Sam Altman's social media account for a Sora version of "Will Smith Eats Pasta," they didn't get what they wanted. Smith himself, however, joined in on the joke and shot a live-action "fake" version. It really is almost convincing: that a real person can film a video and pass it off as AI-generated shows how high the quality of Sora's videos is.
In the comparison across competitors, peers come off as a little embarrassed. OpenAI published the text prompt for each Sora demo video, so although people can't use Sora directly, they can feed the same prompts to existing products on the market and see the difference.
The results are startling. I found Sora's videos amazing when I first saw them, and after watching how its peers performed, I gained a new appreciation of Sora's power.
Given the same prompt of a woman walking the streets of Tokyo, the video generated by Runway looks passable when paused, but in motion it can only be described as "ever-changing": the woman's appearance does not stay stable over time. By contrast, Sora's demo video also changes camera position, from a long shot to a close-up of the face, yet the woman's appearance stays the same throughout. Runway's output is more like a dynamic blend of many images.
Generated by Runway.
StabilityAI, the company that open-sourced Stable Diffusion, produces a picture with good clarity and aesthetics, but the woman's face is badly deformed, resembling a skull, and the result is thoroughly eerie.
Generated by Stable Video.
Pika takes something of a shortcut: its motion is relatively smooth, but the picture is blurry and not realistic.
Generated by Pika.
Netizens have also run the same side-by-side comparisons on other prompts, such as the mammoths, the moving car, the aerial landscape, and the little monster watching a candle.
Two.
Suddenly, the darlings of text-to-video found themselves standing in the shadow cast by OpenAI.
Text-to-video AI tools are not new. In 2023, a number of them were launched, and startups such as Runway, PikaLabs (hereinafter Pika), and StabilityAI all attracted attention in this field, drawing hot money and skyrocketing valuations.
Among them, Runway was founded in 2018; it opened Gen-2 for internal testing in March 2023 and officially released it in June. Before that, it had released Gen-1, an earlier AI video tool. Runway's tools have also been used in a number of films, the most famous being the Oscar-winning "Everything Everywhere All at Once."
Cristobal Valenzuela, CEO and co-founder of Runway, said: "We've seen an explosion of image generation models. I believe that 2023 will be the year of video."
In May, Runway completed a $141 million Series D round with investors including Google and Nvidia, and its valuation tripled to $1.5 billion.
Pika was founded only in April last year. Its Series A financing reached $55 million, valuing it at more than $200 million, and it released its first text-to-video product, Pika 1.0, in November. By December, it had more than 500,000 users, who were generating nearly a million pieces of content every week.
StabilityAI, on the other hand, is known for its text-to-image tool Stable Diffusion, and became a unicorn after receiving $100 million in funding in 2022. Last November it released Stable Video Diffusion. Note the "Diffusion" suffix: this is a foundational video generation model built on Stable Diffusion that users must deploy themselves, not yet a product released to the public.
Three startups, with three of 2023's most important generative products and models, were suddenly blindsided by Sora.
After OpenAI unveiled Sora to the world, Runway CEO Valenzuela posted on social platform X: "Game on." And Emad Mostaque, founder and CEO of StabilityAI, called Altman a "master wizard."
Judging from the demo videos, Sora does have the power to reshape the competitive landscape, and its peers will find it hard to keep up.
After ChatGPT's release, the AI wave fired up investors, AI startups sprang up like mushrooms, and new unicorns emerged one after another. But it is a tale of two extremes: many AI startups have gone downhill, sought acquisition, laid off staff drastically, or even died. According to Zhidong, between November 2023 and January 2024 alone, four AI startups around the world announced their closure, including the AI news startup Artifact and the AI healthcare company Olive.
According to The Information, at least seven AI companies developing generative video have together raised at least $500 million in funding. One investor privately told The Information that he had previously just missed out on investing in a hot AI startup, and was "glad" of it after seeing Sora's videos.
Three.
The AI field is hot, but the barrier to entry is also high, and in this fight, falling behind can be fatal.
In a blog post, the well-known AI startup Hugging Face described three major challenges of text-to-video: the computational challenge, since ensuring spatial and temporal consistency across frames carries a high computational cost, putting the training of such models beyond the means of most researchers; the lack of high-quality datasets, since multimodal datasets for text-to-video generation are scarce and often lack annotations; and the ambiguity of instructions, since describing a video in a way that makes it easier for the model to learn is no easy task.
Even StabilityAI, which seemed to have a firm foothold, is in constant trouble. Last June, Forbes published a long report in which more than 30 former employees and investors of StabilityAI detailed nine major allegations against founder and CEO Mostaque, including: claiming undue credit for Stable Diffusion to raise $100 million; concealing financing difficulties; exaggerating the company's revenue; owing wages to employees; and falsifying his academic credentials and work history.
Leaving everything else aside, it is surprising that one of Silicon Valley's darlings had trouble raising money and inflated its revenue. According to people familiar with the matter, StabilityAI's monthly expenses were about $8 million, yet Mostaque once disclosed that the company's August revenue was about $1.2 million and might exceed $3 million. Mostaque quickly deleted the post, but the cash burn is plain from these numbers.
In November, the same month it released the Stable Video Diffusion model and a year after it closed a $100 million funding round, reports emerged that several senior executives had resigned and that, with its finances fragile, the company was rumored to be considering a sale. Mostaque later denied the reports.
With revenue short and talent draining away, StabilityAI's crisis exposes the fragility behind glamorous, hot startups.
The entry of giants will also intensify the war. They have long been positioning themselves in AI video. In October 2022, Meta and Google made moves one after another. Meta first released its Make-A-Video model, and just a week later Google CEO Sundar Pichai personally announced the company's two latest achievements in this area, Imagen Video and Phenaki, which emphasize quality and length, respectively.
Source: Meta AI
But neither Meta nor Google has yet opened its text-to-video AI tools to the public. Google officials say the data used to train its AI video models still contains problematic content, which could lead Imagen Video to produce graphically violent or pornographic clips with undesirable effects. Such caution from the giants is familiar: in natural-language chatbots, Google also had a model early on but did not launch a consumer product, again citing safety concerns. But the alliance of ChatGPT and Microsoft pushed the giants out of their conservatism, and Sora may well do the same.
Some giants are already making new moves. A week before Sora's announcement, ByteDance announced personnel changes: Zhang Nan would step down as CEO of Douyin Group to focus on the future development of Jianying. According to Times Weekly, citing people close to Jianying, Zhang Nan is personally leading the team in pursuit of a breakthrough in AI-assisted creation and is about to launch an AI generation product.
Facing this newly opened "game," Sora's peers can only go all out.
Just a few days after Sora's announcement, on February 22, StabilityAI officially opened the public beta of Stable Video, moving from a model to a product that anyone can easily use. Clips are still relatively short, only 7 seconds, but the quality is relatively high. Mostaque was modest when promoting the new product on social platforms, explaining the decision to open it up this way: "We wanted to create a large, open Stable Video 2 similar to Sora, but we needed more data and compute."
In addition, Midjourney, which enjoys a strong reputation in text-to-image generation, is also entering the field. Founder David Holz revealed during Office Hours that the next version, Midjourney V7, "may contain video features."
OpenAI is still evaluating Sora at this stage, and it may be a few months before Sora is officially released to the public. The good news is that Sora's peers still have time to respond. The bad news is that the time is running out.
References:
1. Smart Stuff: "AI Entrepreneurship ** Double Heavens: Sora Descends to the World in a Capital Carnival, Several Startups Collapse and Close".
2. Wired Insight: "Sora Is Coming, ByteDance Is Working: Zhang Nan's Heavy Tasks and Challenges".
3. New Tinder: "Forced to Sell Itself, CEO**, Executives Leave, Another AI Unicorn Accident".
4. The Heart of the Machine: "Image Generation Is Tired of Competition, Google Fully Turns to Text-to-Video Generation, Two Powerful Tools Challenge Resolution and Length at the Same Time".
5. Finance Associated Press: "The AI Circle Is Not Peaceful: Well-Known Open-Source Model Developer Stability AI Reported to Be 'Seeking to Sell Itself'".
This content is the author's independent point of view and does not represent the position of Tiger Sniff. Do not reproduce without permission; for authorization, please contact hezuo@huxiu.com