NetEase Shufan products.
This product was submitted by NetEase Shufan as an entry for the "Data Ape Annual Golden Ape Planning Activity - 2023 China Big Data Industry Annual Innovative Service Product List Award".
Youshu ChatBI is a conversational data assistant launched by NetEase Shufan in 2023 and built on NetEase's self-developed large model. Its goal is "conversation as analysis": users obtain trustworthy data through everyday conversation, which greatly lowers the barrier to data consumption and leads a new paradigm of data analysis. Youshu ChatBI is a brand-new solution in the field of data analysis, and it is also the first natural language conversational product in China built on a self-developed private model. Interacting with the BI platform in natural language is like talking to another person, so even business operations staff who do not understand data can quickly retrieve data and carry out self-service data query and analysis.
Because current large models cannot be 100% accurate, ChatBI introduces the large model in a trustworthy way that counters AI hallucination and instability, making it usable in production.
1. Retrieval augmentation to improve the model's adaptability
Field names and field value definitions vary across data tables and scenarios. Common LLM-based NL2SQL solutions inject table information into the model by writing the table creation statements (DDL) into the prompt as context. However, with only field names and types, the large model's understanding of the data table is still incomplete, and it easily selects the wrong field or produces field values in the wrong format.
NetEase Shufan adopts retrieval augmentation. It takes advantage of the BI system's fast and convenient table lookup to splice more relevant metadata into the prompt for each question, which significantly improves the model's understanding of the data table. This strategy gives the large model a wider "field of view" of the data table and lets it adapt to different tables.
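The idea of splicing question-specific metadata into the prompt can be sketched as follows. This is a minimal illustration, not NetEase Shufan's implementation: the `Column` fields, the word-overlap `relevance` score, the table `orders`, and the prompt wording are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """Metadata for one column; every field here is illustrative."""
    name: str
    dtype: str
    description: str = ""
    sample_values: list[str] = field(default_factory=list)


def relevance(question: str, column: Column) -> int:
    """Count the distinct question words that occur in the column's metadata."""
    haystack = " ".join([column.name, column.description, *column.sample_values]).lower()
    return sum(1 for word in set(question.lower().split()) if word in haystack)


def build_prompt(question: str, table: str, columns: list[Column], top_k: int = 2) -> str:
    """Splice metadata for the top_k most relevant columns into the prompt."""
    ranked = sorted(columns, key=lambda c: relevance(question, c), reverse=True)[:top_k]
    lines = [f"Table: {table}"]
    for col in ranked:
        samples = ", ".join(col.sample_values) or "n/a"
        lines.append(f"- {col.name} ({col.dtype}): {col.description}; samples: {samples}")
    lines.append(f"Question: {question}")
    lines.append("Write one SQL query that answers the question.")
    return "\n".join(lines)


# Hypothetical table metadata, as a BI system's catalog might expose it.
columns = [
    Column("pay_channel", "string", "payment channel", ["alipay", "wechat"]),
    Column("dt", "string", "partition date", ["2023-08-01"]),
    Column("amount", "decimal", "order amount in yuan"),
]
print(build_prompt("total amount by payment channel", "orders", columns))
```

Compared with a DDL-only prompt, the model here also sees descriptions and sample values, which is the kind of extra context that helps it pick the right field and value format. A production system would likely replace the word-overlap score with a proper retriever.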
2. Personalized knowledge configuration to adapt to customized questions
Different business teams have their own industry jargon and domain knowledge. For example, the operations staff of cloud ** often check the data of the "nearest partition", a term the large model does not understand. The term can be configured as a prompt entry, nearest partition = yesterday, so that the large model can interpret "nearest partition" in the user's question.
To improve the adaptability of large models to such customized questions, NetEase Shufan provides personalized knowledge configuration and the corresponding adaptation algorithms. Customers can configure their internally accumulated knowledge and questions according to their own needs and, without retraining, create a personalized ChatBI product.
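A minimal sketch of how a configured term such as nearest partition = yesterday could be attached to a question before it reaches the model. The function name `add_business_terms`, the glossary format, and the sample question are hypothetical; the source does not describe the actual adaptation algorithm.

```python
def add_business_terms(question: str, glossary: dict[str, str]) -> str:
    """Append the definition of every configured term found in the question."""
    lowered = question.lower()
    hits = [
        f"{term} = {meaning}"
        for term, meaning in glossary.items()
        if term.lower() in lowered
    ]
    if not hits:
        return question  # nothing configured applies; leave the question as-is
    return f"{question}\nBusiness terms: " + "; ".join(hits)


# The glossary entry comes from the example in the text; the question is made up.
glossary = {"nearest partition": "yesterday"}
print(add_business_terms("Show play counts for the nearest partition", glossary))
```

Because the knowledge lives in configuration rather than in model weights, a new term takes effect on the next question without any retraining.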
3. Model self-learning: the more it is used, the smarter it becomes
One of the major characteristics of LLMs such as ChatGPT is that the model can recognize and correct its own mistakes when they are pointed out in conversation. Inspired by this, NetEase Shufan designed a model self-learning process in which the ChatBI administrator records and corrects questions that were not answered well. The next time a similar question is asked, the model can generate the correct SQL from the corrected content, so the more it is used, the smarter it becomes.
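One way to picture this loop is a store of administrator-corrected question/SQL pairs that are retrieved as examples when a similar question arrives. The sketch below is an assumption about how such a process might look, not the product's design: `CorrectionStore`, the Jaccard word-overlap similarity, the 0.5 threshold, and the sample SQL are all illustrative.

```python
class CorrectionStore:
    """Keeps administrator-corrected (question, SQL) pairs for reuse as examples."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold
        self.examples: list[tuple[str, str]] = []

    def record(self, question: str, corrected_sql: str) -> None:
        """Store a question that was answered badly together with its corrected SQL."""
        self.examples.append((question, corrected_sql))

    @staticmethod
    def _similarity(a: str, b: str) -> float:
        """Jaccard overlap of lowercase word sets (a stand-in for a real retriever)."""
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

    def similar(self, question: str) -> list[tuple[str, str]]:
        """Return stored corrections whose question resembles the new one."""
        return [
            (q, sql)
            for q, sql in self.examples
            if self._similarity(q, question) >= self.threshold
        ]


def prompt_with_corrections(question: str, store: CorrectionStore) -> str:
    """Prepend matching corrections to the prompt as worked examples."""
    lines = [f"Example question: {q}\nCorrect SQL: {sql}" for q, sql in store.similar(question)]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


store = CorrectionStore()
store.record(
    "revenue for yesterday by channel",
    "SELECT pay_channel, SUM(amount) FROM orders WHERE dt = '2023-08-01' GROUP BY pay_channel",
)
print(prompt_with_corrections("revenue for yesterday by region", store))
```

Each recorded correction widens the set of questions the model can answer from a known-good example, which is one plausible reading of "the more it is used, the smarter it becomes".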
Youshu ChatBI has become a powerful tool for promoting a data culture in which "everyone uses data, all the time". "Everyone" refers to the scope of data use, which is no longer limited to a few managers but extends to every front-line employee involved in running the enterprise. "All the time" refers to the frequency of data use: data is no longer looked at only during the month-end assessment, but is a tool relied on constantly in daily work.
For example:
1) HR team: In the past, the team had only an IT system. When they needed data for employee benefits and care activities, they either asked IT staff to pull it or maintained large Excel spreadsheets of their own, so either the turnaround was too slow to meet the need or the data lagged behind. With Youshu ChatBI as a data assistant, the HR team can handle fragmented, ad hoc, and urgent requests well, such as care activities and talent reviews.
2) Business leader: A business leader can track opportunities, contracts, revenue, and payments, understand the state of the business more quickly, and help the business formulate sales strategies and adjust them in time. When discussing business direction or product planning, the leader can also analyze data trends across combinations of dimensions and from different perspectives, which supports timely decisions without relying on fixed reports.
3) Finance team: Finance and internal audit are skeptical of any data they did not produce themselves, especially the analysis data in quarterly reports that investors pay attention to, which goes through multiple rounds of review. This layer-by-layer review and cross-verification is very time-consuming, so they use ChatBI to perform one round of review, which greatly improves efficiency.
With the help of NetEase's self-developed large model, Youshu ChatBI meets the analysis needs of ordinary users with a low barrier to entry, high efficiency, and intelligence. It achieves "dialogue is data", lowers the barrier to data use, and lets everyone use data.
1) Lower barrier: With the natural language understanding of large models, users only need to converse with the AI assistant to obtain data, which makes data more convenient to use.
2) Better efficiency: With the help of large models, the product understands user needs and converts dialogue into database table queries and visual charts, improving users' analysis efficiency.
3) Intelligence: By moving from manually designed rules and models to automatically learned rules, it can handle more complex and in-depth data analysis tasks.
The product panorama is as follows:
From the perspective of enterprise adoption, large AI models suffer from a serious "hallucination" problem: the answers given by AI are not necessarily 100% accurate, and this cannot be avoided entirely. Data analysis, however, is a rigorous scenario with extremely high accuracy requirements. To address model hallucination, NetEase Shufan has made four major innovations in the product design to guarantee the "trustworthiness" of Youshu ChatBI:
1. Requirements can be understood
Youshu ChatBI uses large model capabilities to accurately understand users' natural language questions and the business data, so that questions can be answered accurately.
2. The process can be verified
The query process is made as transparent as possible. The product clearly tells the user which data table the current result is based on, transforms the complex SQL query into a 100% correct structured expression, and explains in plain language what logic was used to obtain the data. Even novice users who do not understand SQL can see at a glance whether the logic is correct and can verify the accuracy of the result.
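The structured expression and its plain-language explanation might look something like the sketch below. The `StructuredQuery` fields and the wording produced by `explain` are assumptions for illustration; the source does not specify the product's actual representation.

```python
from dataclasses import dataclass, field


@dataclass
class StructuredQuery:
    """A simplified, hypothetical structured form of one generated SQL query."""
    table: str
    metrics: list[str]
    dimensions: list[str] = field(default_factory=list)
    filters: list[str] = field(default_factory=list)


def explain(query: StructuredQuery) -> str:
    """Describe the query in plain language so a non-SQL user can check its logic."""
    parts = [f"From table '{query.table}', compute {', '.join(query.metrics)}"]
    if query.dimensions:
        parts.append(f"grouped by {', '.join(query.dimensions)}")
    if query.filters:
        parts.append(f"where {' and '.join(query.filters)}")
    return ", ".join(parts) + "."


query = StructuredQuery(
    table="orders",
    metrics=["sum(amount)"],
    dimensions=["pay_channel"],
    filters=["dt = yesterday"],
)
print(explain(query))
# From table 'orders', compute sum(amount), grouped by pay_channel, where dt = yesterday.
```

Because the explanation is rendered deterministically from the structured form rather than generated by the model, it always matches the query that was actually run.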
3. Users can intervene
If the answer given by the AI is wrong, the user can intervene directly by switching to the correct data table and correcting the structured query conditions.
4. The product can be operated
Users can give direct feedback to the large model by marking bad cases in the query results, which are then used to optimize and iteratively improve the model, so that it becomes smarter the more it is used. In addition, common questions for each business can be preset in the back end to build the business's own knowledge base.
The total number of customers or people using the product:
1) Within NetEase, several businesses such as NetEase Cloud** use ChatBI, covering non-technical staff in product, operations, marketing, and finance.
2) Following the successful internal rollout at NetEase, the release of the ChatBI product attracted dozens of external customers, such as Zhenyun Technology, to try it.
1. Business value.
Taking NetEase Cloud** as an example, before the launch of ChatBI, ad hoc data requests from business users generally had to be taken on by a data analyst and a dedicated data warehouse developer, and were scheduled into a queue. This consumed a great deal of manpower, requests were not answered promptly, and the lag in data also affected business strategy to some extent.
After ChatBI was launched, users who need data can simply get what they want through conversation, with zero barrier to entry and responses within seconds. This creates at least three major kinds of business value:
1) Natural language retrieval greatly improves the efficiency of data queries and of the people serving them. With 12,000+ data retrievals in total, and assuming each Q&A saves 0.2 man-days, this saves the business 2,000+ man-days;
2) With the help of intelligent data query, non-technical staff in product, operations, and marketing can explore and analyze data themselves, which empowers more business personnel and gives everyone a dedicated intelligent data analyst;
3) Freeing data developers from high-frequency ad hoc data retrieval requests helps them focus on more core work, and the cloud data warehouse team can accumulate a large number of data assets.
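As a quick check of the man-day estimate in item 1), the arithmetic can be reproduced as follows, assuming a saving of 0.2 man-days per Q&A and taking 12,000 as the lower bound of the "12,000+" retrievals. Both inputs are assumptions made for this check rather than additional measured data.

```python
from fractions import Fraction

queries = 12_000                    # lower bound of the "12,000+" retrievals
saved_per_query = Fraction(2, 10)   # assumption: 0.2 man-days saved per Q&A
total_saved = queries * saved_per_query

print(total_saved)  # 2400
```

Under these assumptions the saving is about 2,400 man-days, which is consistent with the "2,000+ man-days" stated above.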
2. Socio-economic benefits.
Following the successful internal rollout at NetEase, the product attracted dozens of external customers to try it after its release. Customers in various industries use ChatBI to revitalize large volumes of existing enterprise data assets. From CEOs down to front-line sales staff, everyone can freely obtain, analyze, and explore data, which opens up new ideas for enterprise digital construction and new solutions for putting intelligent digitalization into practice.
3. Technological influence.
The product is the first natural language conversational product in China built on a self-developed private model. Project team members participated in drafting the technical standard "Large Model-driven Intelligent Data Analysis Tools" of the Academy of Information and Communications Technology, and the team has applied for 3 patents, with another 5 patents under application. NetEase Shufan held a product launch conference in August 2023 to officially release the online ChatBI product. After the launch, members of the project team shared the practice of ChatGPT at external events such as the 112th China Computer Federation Technology Frontier (CCF TF), the Big Data Technology Salon jointly organized by NetEase and CSDN, and the NetEase Big Data Technology Seminar in Shanghai.
Product owner: NetEase Shufan.
Relying on NetEase's more than 20 years of accumulated Internet technology, NetEase Shufan has launched self-developed products free of vendor lock-in, spanning cloud native, big data, artificial intelligence, and intelligent development, and provides customers with full-process digital intelligence services by building an open digital intelligence industry ecosystem.
To date, it has served more than 400 leading enterprises in finance, manufacturing, the state-owned sector, and other industries, providing customized digital transformation solutions that help customers build their own digital intelligence competitiveness in the era of comprehensive digital intelligence.