Editor: Editorial Department.
Is the next OpenAI here?
Mistral AI, another pillar of the open source community, has just released its most powerful flagship model, Mistral Large, with performance directly comparable to GPT-4! (Alas, it is not open source.)
Mistral Large has excellent logical reasoning skills and is capable of handling complex multilingual tasks including text understanding, conversion, and generation.
On many mainstream benchmarks, Mistral Large beats Anthropic's Claude 2 and Google's Gemini Pro, ranking second only to GPT-4!
The LLM landscape has shifted once again.
At the same time, another piece of blockbuster news broke in AI circles today: after OpenAI, Microsoft has now brought Mistral under its wing as well!
Mistral's story has been legendary from the very beginning: four weeks after its founding, a six-person team with a 7-page PPT raised 800 million yuan (about 105 million euros), a wish-fulfillment story come to life.
The founder, Arthur Mensch, was born in France in 1993. After three years at Google, he left at the age of 31 and recruited two developers of the LLaMA model to build a company that could compete with OpenAI and Anthropic.
A team of just a few people, on relatively modest funding, has built a model that can go toe to toe with GPT-4.
Now, with the backing of deep-pocketed Microsoft, Mistral's title as "the next OpenAI" looks confirmed.
Mistral is no longer open source? Netizens panic!
Mistral is now in the global spotlight, and its every move draws attention.
Some netizens noticed that Mistral had modified its homepage content and deleted every mention of its commitments to the open source community, which immediately caused panic!
The previous homepage (left); the current homepage (right).
However, there is no need to worry too much about it at the moment.
According to foreign media interviews with Mistral's CEO, the company will still adhere to the open source philosophy, but it will also launch its most powerful models as closed source in order to compete commercially.
So far it has built the open source models named after their sizes, Mistral 7B and Mixtral 8x7B, to give back to the community, and a revenue-generating product line of closed-source models named Large, Medium and Small.
The most powerful model in Europe is here!
That said, the newly released Mistral Large can fairly be called the large model best suited to Europeans.
In a nutshell:
- Mistral Large is fluent in English, French, Spanish, German and Italian, and has a deep understanding of their grammar rules and cultural backgrounds.
- Mistral Large has a 32k-token context window, allowing it to extract information from very large documents precisely and quickly.
- Mistral Large follows specific instructions with exceptional precision, which allows developers to tailor content moderation policies to their needs. For example, Mistral AI uses it for system-level moderation of le Chat.
- Mistral Large natively supports function calling. This feature, combined with the constrained output mode implemented by Mistral AI on La Plateforme, greatly facilitates application development and the modernization of technology stacks.
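To make the function calling idea concrete, here is a minimal sketch of the application side of such a loop. It does not call any real Mistral AI API: the tool `get_order_status`, the JSON Schema style tool description, and the shape of the simulated model reply are all assumptions chosen for illustration.

```python
import json

# Hypothetical tool: stands in for any function an application exposes.
def get_order_status(order_id: str) -> dict:
    orders = {"A100": "shipped", "B200": "processing"}
    return {"order_id": order_id, "status": orders.get(order_id, "unknown")}

# Tool description in JSON Schema style (assumed format, for illustration).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the shipping status of an order.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_order_status": get_order_status}

def dispatch_tool_call(tool_call: dict) -> str:
    """Run the tool the model asked for and return its result as JSON text."""
    name = tool_call["name"]
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    args = json.loads(tool_call["arguments"])  # arguments arrive as a JSON string
    return json.dumps(REGISTRY[name](**args))

# Simulated model output: in a real application this would come from the model.
simulated_call = {
    "name": "get_order_status",
    "arguments": json.dumps({"order_id": "A100"}),
}
print(dispatch_tool_call(simulated_call))
```

In a real integration, the JSON result would be sent back to the model so it can phrase the final answer; a constrained output mode helps ensure the arguments parse cleanly.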
Currently, this new flagship model of Mistral AI is only available on the Azure AI and Mistral AI platforms.
Pricing on Azure AI is as follows: $0.024 per 1,000 output tokens and $0.008 per 1,000 input tokens.
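As a rough illustration of what this pricing means per request, the sketch below reads the figures as $0.024 per 1,000 output tokens and $0.008 per 1,000 input tokens. The function name and the example token counts are hypothetical, and actual billing may differ.

```python
# Prices as read from the article, in USD per 1,000 tokens.
INPUT_USD_PER_1K = 0.008
OUTPUT_USD_PER_1K = 0.024

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    input_cost = input_tokens / 1000 * INPUT_USD_PER_1K
    output_cost = output_tokens / 1000 * OUTPUT_USD_PER_1K
    return input_cost + output_cost

# Example: a prompt that fills a 32k-token context, with a 500-token answer.
cost = estimate_cost_usd(input_tokens=32_000, output_tokens=500)
print(f"Estimated cost: ${cost:.3f}")
```

Because output tokens cost three times as much as input tokens, long generated answers dominate the bill faster than long prompts do.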
The most impressive thing about Mistral Large is its super reasoning ability.
As a flagship model, Mistral Large has demonstrated impressive strength in terms of common sense, reasoning, and knowledge.
Although a sizable gap with GPT-4 remains, Claude 2 and Gemini Pro 1.0 are already clearly beaten.
As a European large model, Mistral Large outperforms Llama 2 70B in French, German, Spanish and Italian.
It also outperforms Mistral's own smaller models.
When it comes to programming and math, Mistral Large's abilities are also outstanding.
Not only has it been greatly improved compared with other models, but it has also achieved good results in mainstream test benchmarks.
A "small cup" model arrives too
By contrast, the smaller Mistral Small is optimized for latency and cost.
Compared with Mixtral 8x7B, Mistral Small delivers better performance and lower latency, making it an intermediate solution between Mistral AI's open source models and its flagship model.
Like Mistral Large, Mistral Small benefits from the same innovations in RAG enablement and function calling.
In addition, Mistral has streamlined its service endpoints:
- Open-weight endpoints with competitive pricing, including open-mistral-7b and open-mixtral-8x7b.
- New optimized model endpoints, including mistral-small-2402 and mistral-large-2402, while mistral-medium remains available without an update.
Partnership with Microsoft made official: Mistral AI stands even taller
Alongside the model release, Mistral AI also officially announced an in-depth partnership with Microsoft.
This is Microsoft's second major investment in a top model company in the AI field, after OpenAI.
Although only founded in April 2023, Mistral AI has already had a significant impact on the AI landscape in Europe.
The release of the open source models Mistral 7B and Mixtral amazed developers and caused a stir in the AI community.
Today's backing from Microsoft has convinced even more people that Mistral is the next OpenAI.
Mistral AI is a French AI startup, and Microsoft's cooperation with it has undoubtedly allowed Microsoft to establish its own AI presence in Europe as well.
The goal of the collaboration between the two companies is to bridge the gap between basic AI research and real-world solutions.
If a multi-year partnership is established in the future, Mistral AI will have access to Microsoft Azure's AI infrastructure.
What Microsoft's backing means for Mistral AI is self-evident.
Not only will the development and deployment of Mistral AI's next-generation LLM be greatly accelerated, but it will also open up new business opportunities. Based in Europe, Mistral AI will expand its influence to the global market!
Specifically, the collaboration between Microsoft and Mistral AI focuses on three key areas:
- Supercomputing infrastructure: Microsoft will support Mistral AI with Azure AI supercomputing infrastructure for AI training and inference workloads.
- Expanded marketplace: Microsoft and Mistral AI will offer Mistral AI's advanced models to customers through Models as a Service (MaaS) in Azure AI Studio and the Azure Machine Learning model catalog.
- AI R&D: Microsoft and Mistral AI will explore collaboration to develop proprietary models for select customers, including for European public sector workloads.
At the moment, the two companies have not disclosed financial details.
Recently, Mistral AI raised roughly 450 million euros in a round led by technology investor Andreessen Horowitz.
Compared with its competitors in the United States, however, Mistral AI's funding is clearly modest.
OpenAI has received more than $10 billion in investment from Microsoft alone, and Anthropic has received $6 billion in funding from Google and Amazon.
In October last year, Google pledged to invest $2 billion in Anthropic, according to The Wall Street Journal.
So as soon as this partnership was announced, Mistral AI's reputation as the "European OpenAI" became even more firmly established.
For Microsoft, the investment brings plenty of benefits too: it is an opportunity to gain a foothold in the European AI field.
Originally, as the sole provider of OpenAI models on EU servers in the Azure cloud, Microsoft was already leading the AI race in Europe.
However, AI is not treated as well in Europe as it is in the United States.
Many countries in Europe are conservative and critical of AI, especially when it comes to data protection.
A European AI model running on a European server provider may be reassuring and a good remedy.
A legendary AI start-up, founded 9 months ago, challenges the giants of Silicon Valley
The story of Mistral's seed round, in which a team of 6 people with a 7-page PPT raised 800 million yuan, is worth telling.
At the beginning of 2023, Arthur Mensch was still working at Google and was just 30 years old.
Later that year, he left Google to start his own company, which reached a valuation of $2 billion in just nine months!
Mensch joined Google in early 2020 as a researcher at DeepMind, where his research focused on improving the efficiency of AI and machine learning systems. He was 27 years old.
Later, together with Timothée Lacroix and Guillaume Lample, two young researchers who had previously worked together on the LLaMA model, he decided to set up a company to build and deploy AI models more efficiently.
They believe that small teams can outperform large companies in Silicon Valley in terms of flexibility, and the open source model is a tool to achieve this.
Although he has raised more than $500 million from various investors, his company, Mistral AI, is still small compared with Microsoft-backed OpenAI, Google, and even Anthropic.
These giants, as well as the giant unicorns they have heavily backed, have invested billions of dollars to build the world's most advanced AI systems.
But Mensch was not worried about competing with these behemoths.
"Our goal is to be the most capital-efficient company in the AI space," says Mensch. "That's why we were founded."
As for the just-launched Mistral Large, he believes the model can compete with OpenAI's state-of-the-art language model GPT-4 and Google's new model Gemini Ultra on certain reasoning tasks.
Mensch revealed that the new model cost less than 20 million euros (about $22 million) to develop.
Mistral's office at its headquarters in Paris.
By comparison, OpenAI's CEO, Sam Altman, said at the time of the GPT-4 release last year that the cost of training his company's large model was close to $100 million.
As the team has continued to astonish the industry with highly efficient open source models, it has also won endorsements from major companies such as Microsoft, Nvidia, and Salesforce.
The giants have also acquired a small stake in Mistral AI through cash or computing power.
With the release of Mistral Large, the bold claims they made in a 7-page PPT nine months ago have been fully delivered.
This is how the team of six was formed.
While studying at the École Polytechnique and the École Normale Supérieure de Paris, Arthur Mensch became acquainted with two other founders, Timothée Lacroix and Guillaume Lample.
Both worked on the Meta AI team, and Lample even led the development of LLaMA.
These young people, all in their early thirties, already had extensive experience in LLM development.
At the time, no more than 100 people in the entire world had expertise in building, training, and optimizing LLMs.
The other three are Jean-Charles Samuelian and Charles Gorintin, respectively the CEO and CTO of Paris-based health start-up Alan, and Cédric O, former French Secretary of State for Digital Affairs.
How an AI scientist started his own unicorn company
Tall and with thick dark hair, Mensch looks neither like the typical tech geek nor the usual CEO.
His friends and colleagues say that he always jokes lightly with friends while drinking beer.
A sports enthusiast, he ran the Paris Marathon in under three and a half hours a few months before submitting his PhD thesis in 2018.
From an early age, Mensch was torn between academic pursuits and entrepreneurship. He was born in the western suburbs of Paris; his mother was a physics teacher and his father owned a small tech company.
The future CEO graduated from France's top schools for mathematics and machine learning. His mentors describe him as an enthusiastic and engaged student who could quickly master projects in areas where he had little or no background.
"I do love exploring new things," Mensch says. "I get bored easily."
During his Ph.D., Mensch's research focused on optimizing software for analyzing three-dimensional brain images from functional magnetic resonance imaging (fMRI), so that it could handle millions of images.
At the end of 2020, Mensch joined DeepMind, where he was involved in the development of large language models.
In 2022, he published the famous Chinchilla paper as a lead author.
This study redefines the relationship between the size of an AI model, the amount of data needed to train it, and its performance, known as the AI scaling law.
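The relationship can be illustrated with a widely cited rule of thumb associated with that work, not the paper's exact fitted law: training compute is roughly C ≈ 6·N·D FLOPs for N parameters and D training tokens, and a compute-optimal model uses roughly 20 tokens per parameter. Both constants below are approximations, and the function name is hypothetical.

```python
import math

# Rule-of-thumb constants (approximations, not exact fitted values).
FLOPS_PER_PARAM_TOKEN = 6.0   # C ≈ 6 * N * D
TOKENS_PER_PARAM = 20.0       # D ≈ 20 * N at the compute-optimal point

def compute_optimal_split(compute_flops: float) -> tuple[float, float]:
    """Return (parameters, training tokens) for a given compute budget."""
    # Substitute D = 20 * N into C = 6 * N * D and solve for N.
    n_params = math.sqrt(compute_flops / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    n_tokens = TOKENS_PER_PARAM * n_params
    return n_params, n_tokens

# Example: a budget of about 5.88e23 FLOPs.
params, tokens = compute_optimal_split(5.88e23)
print(f"{params:.2e} parameters, {tokens:.2e} tokens")
```

The practical takeaway is that, for a fixed budget, a smaller model trained on more data can match or beat a larger undertrained one, which fits Mistral's emphasis on capital efficiency.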
As the AI race heated up in 2022, Mensch grew frustrated that AI labs at large companies were publishing less research on large language models and sharing less of it with the research community.
After the release of ChatGPT, Google decided to speed up its efforts to catch up.
Mensch's team grew from a small team of 10 to 30 and eventually to a large team of 70 people.
"I feel like I should have left before things got too bureaucratic," Mensch said. "I don't want to develop those opaque technologies in big tech."
In its initial proposal to investors in the spring of 2023, Mistral criticized the emerging oligopoly dominated by American companies that develop proprietary closed-source models.
It was an important principle for Mensch and his partners to release their initial AI system as open-source software, allowing anyone to use or modify it for free.
It is also a way to attract developers and potential customers who want more control over the AI they use.
Although Mistral Large, Mistral's current state-of-the-art model, is not open source, Mensch said:
"Finding a balance between building a business model and sticking to our open source values is very delicate. We want to create something new, a new architecture, but we also want to offer some additional products and services to our customers."