The iFLYTEK Spark V3.5 experience: close to GPT-4 overall

Mondo Technology Updated on 2024-02-01

On January 30, iFLYTEK held the iFLYTEK Spark Cognitive Model V3.5 upgrade press conference, where iFLYTEK Spark V3.5 was officially released. It is the first large model open to the public to be trained on the domestic computing power platform "Feixing No. 1".

Liu Qingfeng, chairman of iFLYTEK, said that iFLYTEK Spark V3.5 is already close to GPT-4 overall: its mathematics and language abilities are better than GPT-4 Turbo's, its code ability reaches 96% of GPT-4 Turbo's, and its multimodal comprehension reaches 91% of GPT-4V's.

In addition, iFLYTEK released the Spark voice model for the first time, which it says leads in the world's mainstream languages and drives the human-computer interaction revolution in the era of the Internet of Everything. The Spark open-source model "Spark Open Source-13B" was also released for the first time. It is deeply adapted to domestic computing power and helps developers, universities, and enterprises develop independently.

So how does the latest iFLYTEK Spark V3.5 actually perform? Today, we will try it out with you.

1. AIGC core competency experience.

For large models, the most critical thing is naturally the core dialogue capabilities of AIGC, so we will mainly test these capabilities first.

For the tests, we compare the latest GPT-4 in ChatGPT with iFLYTEK Spark V3.5 to see whether iFLYTEK Spark V3.5 can really catch up with ChatGPT. Unless otherwise specified, both are used through their web versions.

Without further ado, let's get started.

1. Full voice interaction.

The most impressive thing at this press conference was iFLYTEK Spark V3.5's full voice interaction: you can talk to iFLYTEK Spark directly by voice, just like chatting with a real person. So let's start the evaluation there.

At present, full voice dialogue is available in the app. After upgrading, a robot assistant icon appears at the far right of the input box at the bottom; tap it to enter full voice interaction.

iFLYTEK Spark V3.5's full voice interaction is very smooth. First, the AI's voice is very natural, and it even uses filler words such as "um" when speaking, which makes it almost indistinguishable from a real person. Second, iFLYTEK Spark V3.5 is very responsive and picks up quickly after you finish speaking. Moreover, its answers are accurate and connected to the context, with no irrelevant replies. In short, the "interaction" feels more like natural "communication".

As for GPT-4, note that because of an unstable network connection during our use, the conversation frequently reconnected and had long waits. Aside from this, GPT-4's voice dialogue is also relatively good, and its voice is likewise as natural and smooth as a real person's.

iFLYTEK Spark V3.5's full voice interaction currently offers two switchable voices: Ling Xiaoyue (female) and Ling Feiyi (male).

2. Language comprehension.

In terms of language comprehension, let's first test the two large models by asking them to try to analyze the thoughts and emotions expressed in the following paragraph:

Where is the road to redemption for all unfortunate fate? If wisdom and understanding can lead us to the path of salvation, can all people attain such wisdom and understanding? I often think that ugly women make beautiful women. I often think that fools cite wise men. I often think it's a coward who shines on the hero. I often think that sentient beings have transformed the Buddha.

iFLYTEK Spark V3.5 and GPT-4 each give the following understanding:

Judging from the answers, both iFLYTEK Spark V3.5 and GPT-4 are acceptable, but on the whole GPT-4 is more accurate and detailed.

Let's make it a little more difficult and have them analyze the following ironic couplet:

The reeds on the wall are top-heavy and shallow; Bamboo shoots in the mountains, with thick skin at the tip of the mouth and a hollow belly.

Both iFLYTEK Spark V3.5 and GPT-4 give a perfect answer:

Then we raise the difficulty and test the two models with a passive-aggressive remark:

The company starts work at 09:00 in the morning. I arrived at 09:00 to clock in and happened to run into the boss. The boss saw me and said: "Dear Mr. Wang, you are really punctual. With employees as punctual as you, how could our company's performance be anything but good?" What does the boss mean?

iFLYTEK Spark V3.5 misunderstood this question and did not catch the sarcasm in the boss's words:

Comparatively speaking, GPT-4 answered better and understood the sarcasm in the boss's words.

Another passive-aggressive sentence:

I really envy your **, so well maintained.

This time, iFLYTEK Spark V3.5 accurately grasped the sarcasm in the sentence:

GPT-4 also recognizes that there is sarcasm in this, but it gives a wrong understanding of what it is satirizing:

After testing, iFLYTEK Spark V3.5 and GPT-4 each won and lost some rounds in Chinese language comprehension, and overall they can be said to be at the same level. Both can identify most of the hidden meanings behind the language, and their comprehension is satisfactory.

3. Logical reasoning.

Next we test the logical reasoning of iFLYTEK Spark V3.5 and GPT-4 with some questions selected from logical-thinking training exercises. The first one is:

Suppose you have a pond with an unlimited amount of water and two empty jugs with capacities of 5 liters and 6 liters. Q: How do you get exactly 3 liters of water from the pond using these two jugs?

For this question, iFLYTEK Spark V3.5's answer has clear steps and clear logic, and the procedure works in practice.

GPT-4 also lists steps for this question, but following its method you cannot get 3 liters of water.
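
For readers who want to check a proposed answer to the jug question, here is a minimal Python sketch. It is not either model's output; the function name `solve_jugs` and the state representation are our own assumptions. It runs a breadth-first search over the amounts of water in the two jugs, so the sequence it returns uses the fewest operations.

```python
from collections import deque

def solve_jugs(cap_a, cap_b, target):
    """Breadth-first search over (a, b) fill levels.

    Returns the shortest list of states from (0, 0) to a state in which
    either jug holds exactly `target` liters, or None if unreachable.
    """
    start = (0, 0)
    parent = {start: None}          # state -> previous state
    queue = deque([start])
    while queue:
        a, b = queue.popleft()
        if a == target or b == target:
            path = []
            state = (a, b)
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        pour_ab = min(a, cap_b - b)  # amount that can move from jug A to jug B
        pour_ba = min(b, cap_a - a)  # amount that can move from jug B to jug A
        moves = [
            (cap_a, b), (a, cap_b),       # fill either jug from the pond
            (0, b), (a, 0),               # empty either jug
            (a - pour_ab, b + pour_ab),   # pour A into B
            (a + pour_ba, b - pour_ba),   # pour B into A
        ]
        for nxt in moves:
            if nxt not in parent:
                parent[nxt] = (a, b)
                queue.append(nxt)
    return None

path = solve_jugs(5, 6, 3)
for a, b in path:
    print(f"5 L jug: {a} L, 6 L jug: {b} L")
```

One valid sequence that can be checked by hand: fill the 5-liter jug and pour it into the 6-liter jug; fill the 5-liter jug again and top up the 6-liter jug, leaving 4 liters; empty the 6-liter jug and pour the 4 liters in; fill the 5-liter jug once more and top up the 6-liter jug, which takes 2 liters and leaves exactly 3.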

Then another question:

A, B, C, and D play a chess tournament in which every two players play one game. A beats D, and A, B, and C win the same number of games. Q: How many games did D win?

For this question, iFLYTEK Spark V3.5 and GPT-4 take different approaches, but both give the correct answer:
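
The chess question is small enough to verify by brute force. The sketch below is our own check, not either model's reasoning, and it assumes a round-robin with no draws (an assumption the question does not state explicitly). It enumerates every possible set of results and keeps those that satisfy the two conditions.

```python
from itertools import combinations, product

players = ["A", "B", "C", "D"]
games = list(combinations(players, 2))   # the 6 pairings of a round-robin
ad_index = games.index(("A", "D"))

d_wins = set()
# product(*games) picks one winner per game; no draws are assumed.
for winners in product(*games):
    if winners[ad_index] != "A":         # condition 1: A beats D
        continue
    wins = {p: winners.count(p) for p in players}
    if wins["A"] == wins["B"] == wins["C"]:   # condition 2: equal win counts
        d_wins.add(wins["D"])

print(d_wins)
```

Under these assumptions the only consistent outcome is that D wins no games: A, B, and C each beat D and win exactly one game among themselves.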

And then there's the question:

There are three classes in the fourth grade, each with two class leaders, and only one leader from each class attends each class leaders' meeting. The first meeting was attended by A, B, and C; the second by B, D, and E; the third by A, E, and F. Which two class leaders are in the same class?

For this question, iFLYTEK Spark V3.5 gives a correct and complete answer:

GPT-4 also gives the right answer, and the idea is clear.
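
The class-leader question can also be checked mechanically. The key observation is that two leaders who attended the same meeting cannot be from the same class. The following sketch (our own illustration, with hypothetical helper names `met` and `pairings`) tries every way of splitting the six leaders into three pairs and keeps the splits in which no pair ever attended a meeting together.

```python
leaders = list("ABCDEF")
meetings = [set("ABC"), set("BDE"), set("AEF")]

def met(x, y):
    """True if x and y attended the same meeting (so they are in different classes)."""
    return any(x in m and y in m for m in meetings)

def pairings(people):
    """Yield every way to split `people` into unordered pairs."""
    if not people:
        yield []
        return
    first, rest = people[0], people[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

valid = [p for p in pairings(leaders) if not any(met(x, y) for x, y in p)]
print(valid)
```

Only one split survives: A with D, B with F, and C with E.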

When it comes to logical thinking, there are some questions similar to brain teasers, which can also test the thinking and reaction ability of large models, such as the following question:

If 1=7, 2=17, 3=27, 4=37, 5=47, 6=57, then 7=?

Neither iFLYTEK Spark V3.5 nor GPT-4 spotted the misleading condition in this question, and both gave the wrong answer:

Another question with a thinking trap:

You are in a race and overtake the person in 2nd place. What place are you in now?

Both iFLYTEK Spark V3.5 and GPT-4 avoided the trap in this question, answering not "first" but "the new second".

Overall, both iFLYTEK Spark V3.5 and GPT-4 showed very good logical reasoning and trap-avoidance ability, with iFLYTEK Spark V3.5 scoring a narrow win on the first question.

4. Answer questions in mathematics.

We have tested the logical reasoning ability of two large models before, and similar to it, there is actually the ability to answer mathematical questions, which can further test the "IQ level" of the large model.

Let's take a look at the following question first:

In △ABC, a, b, c are the sides opposite the interior angles A, B, C, and 2a sin A = (2 sin B + sin C) b + (2 sin C + sin B) c. (1) Find the size of A; (2) Find the maximum value of sin B + sin C.

iFLYTEK Spark V3.5 answered part (1) correctly but got part (2) wrong; the maximum value should be 1.

GPT-4, on the other hand, did not answer either part correctly.
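
For reference, the triangle question can be worked through as follows. This is our own derivation, assuming the question is read with capital letters for the angles and lowercase letters for the opposite sides, and with R denoting the circumradius (a symbol not in the original question).

```latex
\begin{align*}
&\text{Law of sines: } \sin A = \tfrac{a}{2R},\quad \sin B = \tfrac{b}{2R},\quad \sin C = \tfrac{c}{2R} \\
&2a\sin A = (2\sin B + \sin C)\,b + (2\sin C + \sin B)\,c
  \;\Longrightarrow\; 2a^2 = 2b^2 + 2c^2 + 2bc \\
&a^2 = b^2 + c^2 + bc \ \text{ and } \ a^2 = b^2 + c^2 - 2bc\cos A
  \;\Longrightarrow\; \cos A = -\tfrac{1}{2},\quad A = \tfrac{2\pi}{3} \\
&\sin B + \sin C = \sin B + \sin\!\left(\tfrac{\pi}{3} - B\right)
  = \tfrac{1}{2}\sin B + \tfrac{\sqrt{3}}{2}\cos B
  = \sin\!\left(B + \tfrac{\pi}{3}\right) \le 1
\end{align*}
```

Equality holds at B = C = π/6, which lies in the allowed range 0 < B < π/3, so the maximum of sin B + sin C is 1, in line with the value stated above.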

Then I found another question:

A middle school's scoring rules for the girls' standing long jump are: 1.33 meters earns 5 points, and each additional 0.03 meters adds 5 points, up to 1.84 meters for 90 points. Beyond 1.84 meters, each additional 0.1 meters adds 5 points, and the full score is 120 points. If a girl scored 70 points before training and 105 points after a period of training, by how many meters did her long jump improve?

iFLYTEK Spark V3.5 gave the correct answer along with the solution process:

GPT-4 gave only the correct answer at first, and provided detailed steps only after being asked about the solution process.
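
The long jump question is simple arithmetic over a two-segment scoring rule. The sketch below is our own check; it assumes the distances in the question read as 1.33 m, 0.03 m, 1.84 m, and 0.1 m, and it works in whole centimeters to avoid floating-point rounding. The function name `required_cm` is hypothetical.

```python
def required_cm(score):
    """Minimum jump distance in centimeters for a given score.

    Assumed rule: 133 cm earns 5 points; every extra 3 cm adds 5 points
    up to 184 cm (90 points); beyond that every extra 10 cm adds 5 points.
    """
    if score <= 90:
        return 133 + (score - 5) // 5 * 3
    return 184 + (score - 90) // 5 * 10

before = required_cm(70)    # distance for 70 points
after = required_cm(105)    # distance for 105 points
gain_cm = after - before
print(f"improvement: {gain_cm / 100} m")
```

The two segments agree at the boundary (17 steps of 3 cm take 133 cm to 184 cm), and the improvement works out to 4 × 3 cm + 3 × 10 cm = 42 cm, that is, 0.42 meters.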

Finally, try a slightly more difficult question:

Given the functions f(x) = e^x - ax - 1 and g(x) = kx^2, when a > 0, find the range of f(x).

For this question, iFLYTEK Spark V3.5 gave the correct answer, and although its solution process is brief, the reasoning is relatively clear.

GPT-4 gave a relatively long solution, but the result is wrong.
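
For reference, a short derivation of the range, assuming the function in the question is read as f(x) = e^x - ax - 1 (the exponent is our reading of the flattened text):

```latex
\begin{align*}
f(x) &= e^{x} - ax - 1, \qquad f'(x) = e^{x} - a, \qquad f'(x) = 0 \iff x = \ln a \quad (a > 0) \\
f'(x) &< 0 \ \text{for } x < \ln a, \qquad f'(x) > 0 \ \text{for } x > \ln a \\
f_{\min} &= f(\ln a) = a - a\ln a - 1, \qquad f(x) \to +\infty \ \text{as } x \to \pm\infty \\
\text{range of } f &= [\,a - a\ln a - 1,\ +\infty)
\end{align*}
```

As a sanity check, a = 1 gives a minimum of 0 at x = 0, which matches the familiar inequality e^x ≥ x + 1.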

The three questions above all come from a mock mathematics exam for the third year of high school. They show that iFLYTEK Spark V3.5's mathematical ability has reached at least high-school level, and in actual use iFLYTEK Spark V3.5 does turn out to be better than GPT-4 at solving math problems. Overall, though, both have room for improvement.

5. Text generation.

Text generation is probably the feature people use most often when they ask large models to help with copywriting. We tested the two models here as well.

First of all, I want them to help me write a recruitment copy:

Recruitment requirements: have a professional background in economics, have work experience in media, excellent writing, and be able to travel frequently. Recruitment treatment: There are five insurances and one housing fund, the monthly salary starts from 15k, the working environment is new and elegant, there are gifts on holidays, and a trip team building once a year. Copywriting requirements: The style is light and humorous, within 500 words.

The copy from iFLYTEK Spark V3.5 has basically nothing to fault:

GPT-4's copy is also good overall, but its language is not as lighthearted and humorous as iFLYTEK Spark V3.5's.

Next comes story continuation. We start with the most classic opening and let them continue:

After the apocalypse, I became the only survivor on Earth. I was sitting in my room talking to myself when suddenly there was a knock on the door.

The continuations from both iFLYTEK Spark V3.5 and GPT-4 are logical and fluent, have a beginning and an end, and include some descriptive detail, so both are relatively good.

People in the workplace often need to write plans, activity proposals, and so on, and the content generation ability of large models can help complete these tasks faster. Here, IT Home tested with the request "our company plans to carry out a reading activity; help me write an activity plan".

The plan from iFLYTEK Spark is relatively complete, covering time, place, goal, process, preliminary preparation, result evaluation, and other items, with plenty of detail, so it is highly usable.

GPT-4's proposal is more concise and has less detail, but it is also fairly complete.

Overall, in terms of text generation, iFLYTEK Spark V3.5 is on par with GPT-4, and there are no problems with the generated content.

6. Code ability.

Using AI large models to help write code is a common use case for programmers, and it can also be regarded as an important part of a large model's content generation capability.

We first test the two models with the following question: Please use C to generate the following code: given a string s, find the longest palindromic substring in s. A string is called a palindrome if it reads the same in reverse order. Please follow the template below: Public Class Solution }

The criterion is whether the generated code can be used directly, so we ran it in a program-running tool to see whether it runs directly and correctly. Since I don't understand code, I also asked a programmer at IT Home to help with the evaluation.

First, let's look at iFLYTEK Spark V3.5. The code it gives follows a standard format and the algorithm is relatively concise, so it looks clean.

We ran it in the testing tool and found that the code runs directly and the output is accurate, which means it can be used as is.

As for GPT-4, the code it gives also follows a standardized format, is relatively concise, and includes comments.

It also runs successfully in the testing tool, and its performance is good.
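
For readers unfamiliar with the task, here is a minimal sketch of one common approach to the longest-palindromic-substring problem, expanding around each possible center. It is written in Python rather than the language requested in the test prompt, and it is our own illustration, not the code produced by either model.

```python
def longest_palindrome(s: str) -> str:
    """Return the longest palindromic substring of s (first one found on ties)."""
    best_start, best_len = 0, 0
    for center in range(len(s)):
        # Try an odd-length center (center, center) and an even-length one (center, center + 1).
        for lo, hi in ((center, center), (center, center + 1)):
            while lo >= 0 and hi < len(s) and s[lo] == s[hi]:
                lo -= 1
                hi += 1
            length = hi - lo - 1       # the loop overshoots by one on each side
            if length > best_len:
                best_start, best_len = lo + 1, length
    return s[best_start:best_start + best_len]

print(longest_palindrome("babad"))
print(longest_palindrome("cbbd"))
```

This approach takes quadratic time in the length of the string and constant extra space, which is usually enough for an interview-style question like the one used in the test.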

Besides writing code, a model must also be able to analyze code, so next we give them a snippet and ask what it is for:

    # python 3
    def remove_common_prefix(x, prefix, ws_prefix):
        x["completion"] = x["completion"].str[len(prefix):]
        if ws_prefix:
            # keep the single whitespace as prefix
            x["completion"] = " " + x["completion"]
        return x

Explain what this code is for.

iFLYTEK Spark's answer concisely and clearly explains the main function of this code, and the answer is accurate.

GPT-4 also explains what the code does, which is likewise fine, and it additionally points out a small error in the code, namely the non-standard quotation marks at the end, so GPT-4 is slightly better here.
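
To make the behavior of the snippet concrete, here is a plain-Python analogue. The original operates on a table whose "completion" column supports the `.str[...]` accessor; the version below is a simplified stand-in of our own that works on an ordinary list of strings, so it is an illustration of the logic rather than the original function.

```python
def remove_common_prefix(completions, prefix, ws_prefix):
    """Drop the first len(prefix) characters from every completion.

    Like the original snippet, this slices by length and does not check
    that each string actually starts with `prefix`.
    """
    trimmed = [text[len(prefix):] for text in completions]
    if ws_prefix:
        # keep a single whitespace as prefix
        trimmed = [" " + text for text in trimmed]
    return trimmed

rows = ["Answer: yes", "Answer: no"]
print(remove_common_prefix(rows, "Answer: ", ws_prefix=False))
print(remove_common_prefix(rows, "Answer: ", ws_prefix=True))
```

In other words, the function strips a shared prefix from every completion and optionally puts a single leading space back.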

To sum up, both iFLYTEK Spark V3.5 and GPT-4 are currently very capable at coding, and there is basically no difference in level between the two.

7. Industry knowledge.

Finally, let's test the mastery of industry knowledge of the two.

Let's start with a topic in the field of chemistry:

Which of the following statements about the lanthanides is FALSE? (a) The most common oxidation state of the lanthanides is +3. (b) Lanthanide complexes often have high coordination numbers (>6). (c) All lanthanides react with aqueous acid to produce hydrogen. (d) The atomic radius of the lanthanides increases gradually from La to Lu in the periodic table.

Both iFLYTEK Spark V3.5 and GPT-4 give the correct answer. iFLYTEK Spark V3.5's answer is relatively straightforward, while GPT-4's is a bit more detailed.

Then ask them another question about medicine:

What is the valve attached to the perimeter of the left atrioventricular orifice of the heart?

Both iFLYTEK Spark V3.5 and GPT-4 give accurate answers.

In terms of knowledge, it is also necessary to consider the mastery of the latest information by large models, that is, the update of their knowledge base. Here are a few questions to test.

Start by asking "When was the Apple Vision Pro released?"

iFLYTEK Spark V3.5 gave the correct answer along with a brief introduction to the product. This shows that its knowledge base is very up to date, which is quite a surprise.

GPT-4 did not answer directly. Next, we asked a sports question:

What team is NBA star Chris Paul on now?

iFLYTEK Xinghuo gave a correct and complete answer:

GPT-4 still didn't answer, pointing to the search engine.

Overall, in terms of industry knowledge, iFLYTEK Spark V3.5 is basically on par with GPT-4 in depth of knowledge, but its knowledge base is currently significantly more up to date than GPT-4's.

8. Multimodal capability.

Multimodal capabilities have also been significantly improved in iFLYTEK Spark V3.5, so finally let's test how well it performs here.

The first is basic text-to-image ability. We asked them to draw "Monkey King wreaks havoc in the Heavenly Palace"; both iFLYTEK Spark V3.5 and GPT-4 quickly produced drawings, and both were fairly faithful to the prompt.

But on the whole, GPT-4's paintings are a bit more elaborate and detailed.

Then there is image-to-text ability: we found a picture to see whether they can identify the joke in it.

iFLYTEK Spark V3.5 accurately identifies the joke in the picture and also recognizes that it is a scene from "Tom and Jerry", but it also describes some elements that are not in the picture.

GPT-4 can also accurately see where the joke in the picture is and does not generate superfluous information, but it does not point out that this is a scene from "Tom and Jerry". Overall, each has its own advantages and disadvantages.

In terms of multimodal experience, another function that draws attention is video generation. Here we tried to have the two models generate a short video about Superman.

iFLYTEK Spark V3.5 quickly generated a short video introducing Superman, with a virtual digital person doing the narration, which is very good.

GPT-4 does not currently support video generation.

In general, iFLYTEK Spark V3.5's multimodal capabilities are also very comprehensive, and the actual experience is very good. Compared with GPT-4, each has its own merits, and the two are evenly matched.

Overall, after several rounds of technical iteration, the current iFLYTEK Spark V3.5 has few problems in its basic functions and feels comprehensive and mature in use.

2. Experience of other basic functions.

Finally, let's look at iFLYTEK Spark's other basic functions. IT Home covers two aspects: terminal coverage and feature richness.

In terminal coverage, iFLYTEK Spark has always been relatively ahead. As early as the V1.5 upgrade in June last year, it achieved full coverage of Android, iOS, mini program, PC, and H5, so everyone can use the iFLYTEK Spark large model on mainstream devices.

ChatGPT currently covers the web, mobile, PC, Mac, and Linux, but has no mini program or H5 version, so each has its own strengths compared with iFLYTEK Spark.

In terms of functionality, the current iFLYTEK Spark is also very comprehensive. For example, the iFLYTEK AI assistant function introduced in the earlier V1.5 version provides specialized services for specific application scenarios, covering "workplace, life, travel, writing, fun, and emotion", and you can even create your own AI assistant.

It can be seen on iFLYTEK Xinghuo that at present, various types of Xinghuo AI assistants are still very comprehensive, and almost all application scenarios can be covered.

There is a similar feature on ChatGPT.

In addition to the Spark AI assistants, iFLYTEK Spark also has a unique iFLYTEK companion ("Friend") function. You can send the system specific knowledge, past conversations, or what you read, write, and think every day, and customize your own exclusive AI personality, so that users of the iFLYTEK Spark app can experience an AI with "not only knowledge, but also personality".

For another example, iFLYTEK Xinghuo also has a wealth of plug-in functions, including PPT generation, email generation, resume generation, operation copy generation, mind map, AI interviewer, etc., which are very complete.

These are capabilities that GPT-4 does not currently have.

Epilogue.

Previously, Liu Qingfeng, chairman of iFLYTEK, said in an interview that iFLYTEK Spark would fully benchmark against GPT-4 in April 2024.

Judging from this hands-on experience with the new iFLYTEK Spark V3.5, it can indeed match GPT-4 in overall ability, and it even leads to some extent in logical reasoning, mathematical ability, and how up to date its knowledge base is.

In short, the iFLYTEK Spark cognitive model V3.5 shows us the unlimited development potential of domestic large models in technology and application. We look forward to iFLYTEK Spark's continued evolution, so that our AI large model technology and application ecosystem can truly achieve international leadership.
