"Of every ten AI applications, five are office agents, three are AIGC, and two are revived digital humans." So, is the agent the AGI endgame product of large models?
Author | Fighting
Editor | Pi Ye
Produced by | Industrialist
In April, researchers at Stanford and Google created a "Westworld simulation" in which 25 agents carried out human-like everyday behaviors, such as waking up, making breakfast, and going to work, with artists painting and writers writing.
This is the "AI agent experiment" that people are talking about today. In the second half of this year, players in China's large model market seem to be turning to the AI agent, which they see as a clearly visible AGI endgame.
According to one set of data, as of mid-November there had been 13 financing events in the AI agent track, with total financing of about 73.5 billion yuan, or an average of roughly 5.65 billion yuan per company.
The field is hot overseas as well. "At least 100 projects are working to commercialize AI agents, and nearly 100,000 developers are building autonomous agents," Matt Schlicht once wrote in overseas media.
Why are AI agents so popular?
As for what AI agents make imaginable, one highly praised answer is: "A large language model can only make Snake, while an AI agent can make Honor of Kings."
Mature AI agents can significantly reduce the cost of software production. In the future, much of the software and many of the test plans in the coding workflow will be written by agents; such code need not pursue long-term reusability and can be used once and discarded. At present, a software giant often employs tens of thousands or even 100,000 people. With AI agents, the manpower and capital required for R&D and delivery will fall sharply, and software will gain the flexibility to address more long-tail needs.
In addition, AI agents may provide a framework in which LLMs can think and analyze more deeply, and so make more complex and reliable decisions.
In short, as Microsoft co-founder Bill Gates said: "Whoever wins the personal agent, that's the big thing, because you will never go to a search site again, you will never go to a productivity site, you'll never go to Amazon again."
It is worth noting that, amid this huge technological change, we have not yet felt the dividends and changes that AI agents promise. Clearly, their development still faces some difficulties.
Some questions are worth asking. What is the current state of AI agent development in China and overseas? What are the keys to putting AI agents into practice? And what does the future hold for them?
1. The current state of AI agents: overseas vs. domestic
At present, several domestic technology companies have produced well-known large models, and the agent applications built on them have gradually entered the public eye.
For example, Baidu applies its Wenxin model to intelligent search and autonomous driving; Alibaba has applied its Tongyi Qianwen model to AutoNavi Map, Youku, Hema, and other products; and Huawei applies its Pangu model to intelligent meteorology, voice recognition, and more.
A startup called Facewall Intelligence has launched an AI agent product, ChatDev, which can develop a piece of software or a small game in a short time; all the user needs to do is give it a requirement.
It is worth noting that collaborative office seems to be a "must-win" field for giants building AI agents.
For example, the DingTalk Magic Wand kit brings together DingTalk's AI capabilities across chat AI, document AI, meeting AI, Yida AI, Teambition AI, and others. The "Meeting Assistant" feature in Tencent Meeting provides intelligent support such as automatic meeting-minute summaries, transcription, and translation. The intelligent work platform Ruliu is powered by the Wenxin large model and offers functions such as intelligent creation and intelligent recommendation. Feishu, ByteDance's office software, has announced its intelligent assistant "myAI", which aims to improve the efficiency of team collaboration.
Some investors have joked: "Of every ten AI applications, five are office agents, three are AIGC, and two are revived digital humans." This reflects not only the state of AI agent development in China; overseas companies such as Google and Microsoft are also putting AI agents into collaborative office scenarios.
Overseas, in fact, the AI agent concept has gone through several stages from its emergence to its explosion.
In the single-agent stage, specialized agents are developed and deployed for specific tasks in particular fields and scenarios. Take GPT-Engineer as an example: give it a requirement, and it can write a rough first version of the code.
In the multi-agent cooperation stage, agents with different roles work together automatically to complete complex tasks. For example, if you ask MetaGPT to build a ticket analysis tool, it hands the task to five roles, including product manager, architect, and project manager, to simulate the full decision-making workflow of software development.
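The hand-off between fixed roles can be pictured as a simple pipeline. The sketch below is illustrative only and does not use MetaGPT's actual API: the role functions are hypothetical stand-ins for LLM calls, and only the three roles named in the text are modeled.

```python
from typing import Callable, List, Tuple

# Each role is a hypothetical stand-in for an LLM call: it reads the
# previous role's output and returns its own artifact as plain text.
Role = Tuple[str, Callable[[str], str]]


def product_manager(requirement: str) -> str:
    return f"PRD for: {requirement}"


def architect(prd: str) -> str:
    return f"System design based on [{prd}]"


def project_manager(design: str) -> str:
    return f"Task list derived from [{design}]"


# The order of roles is fixed in advance (an assumption of this sketch).
ROLES: List[Role] = [
    ("product manager", product_manager),
    ("architect", architect),
    ("project manager", project_manager),
]


def run_pipeline(requirement: str, roles: List[Role]) -> List[Tuple[str, str]]:
    """Pass the requirement through each role in a fixed order."""
    artifacts = []
    current = requirement
    for name, act in roles:
        current = act(current)  # each role consumes the previous artifact
        artifacts.append((name, current))
    return artifacts


if __name__ == "__main__":
    for name, artifact in run_pipeline("a ticket analysis tool", ROLES):
        print(f"{name}: {artifact}")
```

The point of the design is that the user supplies one requirement and every later artifact is derived from the previous role's output, with no human step in between.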
With the release of Microsoft's new tool AutoGen, however, AI agents soon opened a new chapter.
AutoGen lets multiple LLM agents solve tasks by chatting with one another. The agents can play a variety of roles, such as programmer, designer, or a combination of roles, and the task is solved through their conversation. This differs from MetaGPT, where the roles are predefined; AutoGen lets developers define their own agents and have them talk to each other.
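To make the idea of developer-defined agents solving a task through conversation concrete, here is a minimal sketch. It deliberately avoids AutoGen's real API: `ChatAgent`, `run_chat`, and the scripted reply functions are hypothetical names, and the reply functions stand in for LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ChatAgent:
    """A developer-defined agent: a name plus a reply policy.

    `reply_fn` is a hypothetical stand-in for an LLM call; it receives the
    incoming message and returns the agent's response.
    """
    name: str
    reply_fn: Callable[[str], str]


def run_chat(first: ChatAgent, second: ChatAgent, task: str,
             max_turns: int = 6, stop_word: str = "DONE") -> List[Tuple[str, str]]:
    """Alternate messages between two agents until one says `stop_word`."""
    transcript = []
    message = task
    speakers = [first, second]
    for turn in range(max_turns):
        speaker = speakers[turn % 2]
        message = speaker.reply_fn(message)
        transcript.append((speaker.name, message))
        if stop_word in message:
            break
    return transcript


def programmer_reply(message: str) -> str:
    # Scripted behavior: revise when the reviewer reports a bug.
    if "bug" in message:
        return "Fixed the bug, please review again."
    return "Here is a first draft of the code."


def reviewer_reply(message: str) -> str:
    # Scripted behavior: approve once a fix has been submitted.
    if "Fixed" in message:
        return "Looks good. DONE"
    return "I found a bug in the draft."


if __name__ == "__main__":
    programmer = ChatAgent("programmer", programmer_reply)
    reviewer = ChatAgent("reviewer", reviewer_reply)
    for name, text in run_chat(programmer, reviewer, "Write a snake game"):
        print(f"{name}: {text}")
```

With these scripted policies the conversation ends after four messages, when the reviewer approves the fix. Swapping in different reply functions changes the roles without touching the conversation loop.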
This is a new and creative agent framework. Within two weeks of AutoGen's release, its star count jumped from 390 to 10K, and it attracted more than 5,000 members on Discord.
Microsoft moved into AI agents early. The release of Microsoft 365 Copilot in March 2023 prompted an LLM-based application development paradigm known as the agent. Today, Microsoft Copilot Studio supports the seamless integration of custom ChatGPT assistants into everyday office systems such as CRM, ERP, and OA.
Microsoft's AI agent capabilities, then, derive mainly from its own business, and AutoGen looks more like an outward release of those business-based capabilities. In this it differs from OpenAI.
OpenAI's GPTs, together with GPT-4 Turbo and customizable AI agents, let everyone build their own large model application. Many industry insiders believe that the ultra-low threshold for creation, combined with an App Store-style business model, will allow OpenAI to build a GPTs ecosystem quickly.
OpenAI provides the basic building blocks of an agent, such as tool invocation and memory backed by knowledge base files. With this release, AI agents enter a new stage in which everyone has the possibility of building their own agent.
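A rough sketch of those two building blocks, tool invocation and knowledge base lookup, follows. It is not OpenAI's API: the tool registry, the in-memory knowledge base, its sample entry, and the string-based routing are all assumptions made for illustration, with the routing rules standing in for the model's own decision about which capability to use.

```python
from typing import Callable, Dict

# Hypothetical knowledge base: in a real product this would be built from
# uploaded files; here it is a small in-memory dictionary with made-up text.
KNOWLEDGE_BASE: Dict[str, str] = {
    "refund policy": "Refunds are accepted within 30 days of purchase.",
}


def add(a: float, b: float) -> float:
    return a + b


# Tool registry: maps a tool name to a callable the agent may invoke.
TOOLS: Dict[str, Callable[..., float]] = {"add": add}


def answer(query: str) -> str:
    """Route a query to a tool, the knowledge base, or a fallback."""
    words = query.lower().split()
    # Tool invocation: "<tool name> <number> <number>".
    if len(words) == 3 and words[0] in TOOLS:
        try:
            result = TOOLS[words[0]](float(words[1]), float(words[2]))
        except ValueError:
            return "Could not parse tool arguments."
        return f"Tool '{words[0]}' returned {result}"
    # Knowledge base lookup: return the first passage whose topic appears.
    for topic, passage in KNOWLEDGE_BASE.items():
        if topic in query.lower():
            return passage
    return "No tool or knowledge base entry matched."


if __name__ == "__main__":
    print(answer("add 2 3"))
    print(answer("What is your refund policy?"))
```

In a real agent the language model would choose the tool and its arguments; the registry pattern, where each capability is a named callable, is what makes that choice executable.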
It is worth noting that AI agent architectures and products have now emerged in many fields, including retail, real estate, tourism, customer service, human resources, finance, and manufacturing.
Examples include Amazon Alexa, Aktify, and Regie.AI in retail; Epique, PropertyPen, and ListingCopy in real estate; Agent4, EBI.AI, JasonAI, and AIDE in customer service; and Autonomous HR Chatbot, AI Interview Coach, and CareerSai in human resources.
Overall, overseas AI agents are relatively complete in terms of underlying technology, architecture, and specific product applications, and tech giants such as OpenAI, Microsoft, and Google have a first-mover advantage. By comparison, AI agents in China still show a gap in both depth and breadth.
A question worth thinking about is, what is the key to the implementation of the agent?
2. The key to landing agents: the model, industry experience, or the carrier?
Most agents on the market today, including the GPTs launched by OpenAI, are really just chatbots built on a specific knowledge base or body of professional data. They are used mainly for question-and-answer interactions, such as retrieving industry information and reports.
However, there is still a lot of room for improvement in linking to and operating other programs. At present, GPTs cannot be used to operate ERP systems such as SAP or Kingdee directly, because doing so involves applying for, authorizing, maintaining, and connecting to the management software's APIs.
For enterprises, if AI agents such as GPTs are used only for knowledge Q&A, their role will be very limited, much like a toy, because they cannot yet reach into the enterprise's business processes.
There are many reasons for this: model capability, industry experience, scenario fit, and other factors all affect what an agent can do.
AI agents need to be able to perceive their environment, make decisions, and take appropriate actions. The most important of these steps is understanding the input given to the agent, reasoning about it, planning, making accurate decisions, and translating those decisions into an executable sequence of atomic actions that achieves the end goal.
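The perceive, plan, and act cycle can be sketched in a few lines. The toy environment below (a room whose temperature the agent adjusts) and every name in it are assumptions chosen for illustration; in an LLM-based agent the `plan` function would be a model call rather than arithmetic.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Environment:
    """A toy environment: a room with a temperature the agent can change."""
    temperature: int = 30
    log: List[str] = field(default_factory=list)

    def apply(self, action: str) -> None:
        # Each atomic action changes the state by one small step.
        if action == "cool_by_one":
            self.temperature -= 1
        elif action == "heat_by_one":
            self.temperature += 1
        self.log.append(action)


def perceive(env: Environment) -> Dict[str, int]:
    """Read the part of the environment the agent can observe."""
    return {"temperature": env.temperature}


def plan(observation: Dict[str, int], target: int) -> List[str]:
    """Compare the observation with the goal and emit atomic actions."""
    gap = observation["temperature"] - target
    if gap > 0:
        return ["cool_by_one"] * gap
    if gap < 0:
        return ["heat_by_one"] * (-gap)
    return []  # goal already reached


def run_agent(env: Environment, target: int, max_cycles: int = 3) -> int:
    """Repeat perceive -> plan -> act until the goal is met or cycles run out."""
    for _ in range(max_cycles):
        actions = plan(perceive(env), target)
        if not actions:
            break
        for action in actions:
            env.apply(action)
    return env.temperature


if __name__ == "__main__":
    room = Environment(temperature=30)
    print(run_agent(room, target=26), room.log)
```

Re-perceiving at the start of each cycle, instead of trusting the earlier plan, is what lets the loop recover if an action did not have the expected effect.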
Currently, many studies utilize LLMs as the cognitive core of AI agents, and the development of these models provides quality assurance for completing this step. As a result, GPT-4-based agents behave more intelligently.
But for now, all large models, including GPT-4, still need to improve.
"There are still big problems with the base models, and we have to wait for better ones before AI agents can really land," an industry insider on the front line of large model technology told Industrialist.
To address insufficient model capability, Zhipu AI and Tsinghua KEG have proposed AgentTuning, a fine-tuning method for aligning agent capabilities. It uses a small amount of data to fine-tune existing models, significantly boosting their agent capability while preserving their original general capability.
Industry experience is also crucial to putting AI agents into practice.
"If a paper proposes some different training method, OpenAI's internal Slack will scoff at it, because these are all things we have already played with. But when new AI agent papers come out, we discuss them seriously and with excitement." This is from a recent talk by OpenAI co-founder Andrej Karpathy.
In short, what can actually be built on top of large models still depends on industry experience, and this is exactly what large model giants such as OpenAI lack.
Enterprises that introduce AI agents for process optimization must put them through rigorous evaluation on cost control, investment budget, efficiency, security, and more. This requires technology vendors to provide platform-level solutions, not just AI agent automation for individual scenarios.
When large enterprises introduce new AI technology, there is no room for trial-and-error costs, so the vendor's solution must be a genuine agent "digital employee" equipped with the industry's know-how, terminology, and business rules. Only such a standardized AI agent can be brought onto the enterprise's internal roster for unified management and scheduling.
For example, an AI agent in the medical industry needs to have medical knowledge and be able to understand and process medical data. An AI agent in the financial industry needs to be financially literate and able to understand and process financial data.
The effectiveness of AI agents is also limited by the application scenario. In travel booking, AI agents perform well thanks to factors such as rich APIs. In scenarios such as legal assistance, however, new knowledge emerges frequently and APIs are incomplete, so practical applications face more challenges.
This can be seen from the growth of domestic AI agents in collaborative office platforms.
In fact, collaborative office platforms already have good APIs and plug-in systems, which makes it easier to integrate large models into existing tools.
In addition, many businesses and organizations are using collaborative office software, which means that large models can quickly reach a large number of potential users. A broad user base can accelerate the iteration and optimization process of large models to better meet user needs.
There are also a large number of data resources to help improve the performance of the model, and the rich scenarios can also promote the continuous improvement of large model technology.
DingTalk, Feishu and WeCom also have different advantages when they are used as agent carriers. DingTalk provides a complete organizational structure management function, which can easily create, manage and adjust the team structure, so that enterprises can quickly build an organizational structure that meets their needs.
Feishu emphasizes real-time collaboration and communication, supporting functions such as multi-person document editing and group discussion, which help teams complete collaborative tasks efficiently. Its distinctive integration makes the whole office workflow more standardized.
WeCom is interoperable with WeChat, which makes it possible for its AI agent to provide more personalized and scenario-based services with the help of WeChat's huge user data and application scenarios.
From this point of view, it is natural for domestic AI agents to cluster in collaborative office. What matters more, though, is finding a suitable scenario or carrier in which AI agents can land.
However, in addition to collaborative office, there are many other carriers that may be more suitable for the landing application of AI agent.
For example, intelligent customer service, intelligent assistant, RPA, CRM, etc. Specifically, in terms of intelligent customer service, AI agents can automatically answer users' questions, handle complaints and suggestions, and improve customer satisfaction and efficiency. In terms of smart assistants, Apple's Siri, Google's Google Assistant, and Amazon's Alexa are all representatives of smart assistants.
In intelligent process automation, many enterprises use tools such as UiPath and Blue Prism to automate certain business processes.
In smart marketing, many platforms, such as HubSpot and Salesforce, have integrated AI agents. These agents use data analysis and machine learning to provide accurate marketing recommendations, helping enterprises better understand customer needs and improve sales performance.
All in all, model capability is the core, industry experience is the key, and the carrier is the guarantee; each is essential to putting AI agents into practice. It is worth noting that conditions in the domestic software industry have forced domestic vendors to develop customized, personalized delivery capabilities. This demonstrates the potential of domestic enterprises to put technology into practice and will further promote the landing of agents.
3. What is the endgame of AI Agent?
In the "Westworld simulation" mentioned at the start of this article, the agents can interact with one another and with the environment (noticing each other's actions, initiating conversations or greetings), reflect on these observations (forming their own opinions), and make daily plans. They have their own memories and goals, which produce believable individuals and emergent social behaviors that were not designed in advance.
For example, starting from a single user-specified task, namely that one AI agent wants to hold a Valentine's Day party, the agents spontaneously spread invitations, meet new people, arrange to attend together, and coordinate to show up at the party at the right time.
This is a representative application among agent projects. People were surprised by it because the agents' interactions were something humans had not anticipated. For a time after AI agents took off, it was widely believed that agents made up for the shortcomings of large models, were more practical, and would be an important landing direction for large models.
As building agents becomes simpler and the agent ecosystem matures, consumer-facing (C-end) agents will flourish, agents will become more down-to-earth for users, and a new round of growth will follow.
But for now, commercializing this path faces a number of problems. Take gaming as an example: revenue currently comes mainly from sales of in-game equipment and similar items, and the value of AI agents cannot be captured through these established monetization channels. Judging from how agents have landed so far, without disruptive capabilities it is impossible to know whether C-end users will pay for them.
More noteworthy still, as C-end agents proliferate, their application value is subject to diminishing marginal returns. In other words, it will take time to verify whether AI agents can become the core application through which large models break out commercially on the C-end. And even if they do, that phase is unlikely to last long.
In fact, the final foothold of AI agents may be on the B-end.
Bill Gates believes that agents, as the next platform, will influence both how people use software and how it is written. Agents are better at finding and summarizing information for users and will be able to find them the best deals; they will replace search and e-commerce sites, as well as word processors, spreadsheets, and other productivity applications. In addition, search advertising, social networking with advertising, shopping, and productivity software, which are separate businesses today, will become one business. Agents will revolutionize the way applications are accessed.
Before these changes arrive, how to build an agent deserves more attention than the impact of agents themselves.
On an agent construction platform, enterprises may be able to build their own RPA, CRM, office OA, and other management software, and software vendors can also build software on the platform to serve enterprises.
For players who are in, or about to enter, the AI agent field, finding an entry point and a sound business model is crucial.
In the future, AI agents will not be limited to a single intelligence but will extend to linkage with the intelligence of things and with robots.
From the perspective of swarm intelligence, the to-C side may form larger, community-based virtual organizations in which each person's agent is connected through virtual data. The to-B side, meanwhile, may form virtual organizations and enterprises in which different companies and employees are integrated into the network through agents.
Eventually, the whole of society will become a huge network combining the virtual and the real, an "intelligent network". In it, different agents will deliver greater productivity and reshape production relations, raising the productivity of society as a whole.
Therefore, the development prospects of AI agents are very broad, and they will continue to expand their application scope and influence, bringing great changes and opportunities for the future social development.
To this day, although AI agents inspire a great deal of imagination, many doubts remain. The road of technological development is always full of doubts and criticism; technological change is an opportunity for every enterprise and individual, and the key is how to seize it.