Provide 1000+ AI efficiency tools丨Welcome to pay attention.
AI Singularity.com-AI Tools Special Issue丨March 5th.
The AI competition on the other side of the ocean is becoming more and more intense, and involution and tearing are also intensifying.
Just after the Spring Festival, Google released a new generation of Gemini 15 models, providing million-level tokens contextual window throughput, violently crushing GPT-4 eight streets.
Last Friday, Elon Musk, the founder of artificial intelligence startup XAI, sued OpenAI and its main management, including Ultraman, in his personal name, demanding that the latter "restore" the original intention of the company's establishment - open source large models.
New hatred and old hatred are superimposed, and the offensive of competitors is becoming more and more fierce. The praying mantis catches the cicada and the yellow finch is behind, and there are still people secretly speeding up.
On March 4, local time, Anthropic, an artificial intelligence startup founded by OpenAI's defector teammates, quietly released a new generation of large language model Claude 3.
Anthropic has long been considered one of OpenAI's strongest rivals.
Its core team is the founding team members of OpenAI, which parted ways because of different routes.
The release of the Claude 3 this time is a new generation model that the company is most proud of.
The lonely GPT-4 ushered in a real rival this time
It is also a multi-modal large model that supports the interpretation of images and documents. They directly threw out a technical report of more than 46 pages, and the high-profile Wang Po sold melons-
Comprehensively surpasses GPT-4!
A data report for the assessment was also released to back it up. Interestingly, Google's latest gemini is also really miserable. In other words, GPT-4 is the industry benchmark.
It's not fun to slap your face like this...
In addition to the general text thrust leading, in terms of multimodal understanding, the claim also overtakes the visual large model GPT-4V.
Listening to the official blowing of Claude, netizens were happy.
There are also fun people who directly started to make memes, tweeting (X) to question Ultraman - when will GPT-5 be released?
Similar to Google's Gemini, the new generation of Claude 3 is divided into three versions, namely Haiku, Sonnet, and Opus.
Judging from the parameter size of the model, it can be understood as a medium cup, a large cup, and an extra-large cup.
Sonnet is free for everyone to use.
Although Anthropic does not give the specific training parameters of the model, it does give a rough scenario for the three sizes of the model:
haiku: It's the most responsive model and the least expensive option, and it still performs pretty well on most text-only tasks, as well as multimodal capabilities (such as visual recognition).Judging by the official statement of Anthropic, it is clear that beyond GPT-4 refers to the size of OPUS.SONNET: Suitable for scenarios that need to balance performance and cost, it performs on plain text tasks on the same level as OPUS later, but is more economical in cost, suitable for enterprises and individual users who need slightly better performance but have limited budgets.
OPUS: Strong reasoning, mathematical, and coding abilities close to human comprehension, suitable for scenarios that require highly intelligent and complex task handling, such as enterprise automation, complex finance**, research and development, and more.
Starting this week, Claude will be open to 159 countries – anyone who knows it, the Ring of China.
Among them, the APIs of OPUS and SONNET models have been launched, and developers can use them directly.
The most comprehensive version of Sonnet is free to try by signing up and logging in, while OPUS is available to Claude Pro subscribers for $20 per month, which is on par with GPT-4.
Now that the official has put the words out, let's test it!
Claude 3 hands-on experience: GPT-4 best replacement
Again, practice makes the real chapter.
However, all assessments have their own subjectivity and limitations, so the results are for reference only, so please experience them for yourself.
First of all, Claude claims that his training dataset is only up to August 2023, and EVA is somewhat skeptical of this reply.
Because I immediately asked a question about the grudge between "Musk and Ultraman", and it didn't answer.
Although Claude is okay with the identity of the two, the process is serious nonsense.
Obviously, the AI hallucination phenomenon is still prominent.
Next, I ask a killer question that tests AI chatbots in particular
Jack Ma, Pony Ma, Marx and Musk, do they all have the surname "Ma"?
This question,Ordinary AI robots can't answer it out of ten.,In my impression, only GPT-4 and Kunlun Wanwei's Tiangong passed the test.。
The answer given to me by Claude 3 was by far the most perfect one.
Detailed, detailed, and logically clear.
Let's ask the conventional social question again: why is Chinese football getting worse and worse?
In addition to the first three points that everyone knows, the last three points are the first thing I didn't think of, which is justified.
Let's ask another recent hot topic in the AI industry: The gap between China and the United States in AI is mainly in **?
Claude 3 gives a full and detailed answer, and clearly points out the key to the problem - talent, computing power, and data advantages. In addition, it also highlights the lag of domestic enterprises in AI commercialization.
Next, let's examine the multimodal capabilities of Claude 3.
I uploaded a framed photo of Musk and Ultraman for him to decipher their relationship.
Claude 3 recognized Musk, but not Ultraman, maybe in the training dataset, he is not famous enough?
One thing to say, Ultraman came out of the circle later.
Put another science diagram of the atmospheric water cycle, Claude 3 has a strong analytical ability.
Take a screenshot of Claude's own website and let it generate a web source**, and it's stress-free.
Guess the name of the place, did you guess it in front of the screen?
Only provide examples of the work and let it reason about the creative artist behind it.
Bingo, the answer given is also accurate, it's hard to beat it!
Finally, it should be mentioned that Claude 3 also has a limit on the number of questions asked in a time period, and after playing for a while, Eva's number of questions will be used up. And there are too many people using it today, and the response is very slow.
Judging from the subjective feelings experienced so far, the Claude 3 does reach the average GPT-4. As for you, you have to say crush or surpass , because there are still too few questions in the test to be objectively demonstrated.
In addition, Claude 3 and GPT-4 have their own strengths and weaknesses in answering some questions, and it is difficult to say who is better and who is necessarily the best.
And from an objective point of view, GPT-4 is a relatively mature commercial model, which pays more attention to stability and reliability, while the fledgling Claude 3 is not.
Claude 3 is a big breakthrough in technology, but commercialization is the difficulty
Although Anthropic has once again shown its technological strength, after a long year, the technical competition of large models has moved from the battle of routes to the battle of commercialization.
How to move from the model to the actual product landing and create greater commercial value is the sword of Damoris hanging over the head of developers.
Anthropic is behind Google and Amazon, OpenAI is backed by Microsoft, from a practical point of view, if Anthropic cannot achieve a commercial breakthrough this year, it will be gradually separated from OpenAI, and it is not even ruled out that it will be acquired by the owner.
In 2024, Anthropic is still under great pressure.
According to the latest report from The Information, Anthropic's expectation for investors is that it is expected to generate more than 8$500 million annualized return.
In comparison, OpenAI is now able to bring in $1,316 billion a month. With the blessing of Microsoft, the pace of commercialization of OpenAI is still accelerating, which has also directly triggered Musk's dissatisfaction and lawsuits.
The arms race for AI systems has just begun, and the launch of a large model that works well is only the first step in a long march.
[**丨ai singularity network丨The whole network account has the same name丨Welcome to follow].
AI Singularity丨Provide 1000+ AI efficiency tools丨Welcome to pay attention.