Artificial intelligence (AI) has been developing for more than 70 years since it was first proposed in the 1950s, but only in the last decade has it become a foundational technology that concerns the whole of society and a key strategic direction to which countries around the world attach great importance. In this short decade, AI technology has developed exponentially, and its accelerating evolution has driven all walks of life to speed up their leap toward intelligence.
The rise of modern artificial intelligence began with perceptual intelligence based on deep learning, which we call the AI 1.0 era. Perceptual intelligence means giving machines visual, auditory, tactile, and other perception capabilities comparable to those of humans. It underlies the most successful applications of AI in the security field, such as the detection and analysis of specific objects of interest. People are clearly not satisfied, however, with AI that only sees, hears, and processes information. They expect machines to learn, think, and reason like humans, and this is the realm of cognitive intelligence. In the past two years, with the rise of large models, the progress of AI toward cognitive intelligence has accelerated greatly. We expect the AI 2.0 era to be a dividing line: before 2.0, the field relied on traditional deep learning models, while from 2.0 onward large-model technologies continue to emerge and will gradually become dominant. Of course, AI 2.0 is not the end point of artificial intelligence. Looking ahead, we hope that machines can truly become artificial brains that take over human decision-making and judgment, which is what we call AI 3.0, the era of decision-making intelligence. AI technology still has a long way to go to reach this peak.
The development of modern artificial intelligence has brought new vitality and opportunities to security manufacturers like Keda. Keda launched its first perceptual camera in 2014, which marked our entry into the modern AI track by way of perceptual intelligence. In 2017, Keda put forward the strategy of "security + AI". In 2020, Keda took the lead in the industry in proposing the concepts of AI pixel-level inference, AI ultra-low-light, and AI ISP, leading the industry's technology trend. In the past two years, Keda has actively invested in the research and development of large models. Keda's nine years of technical accumulation in modern artificial intelligence have greatly benefited the company's security products, business services, and solutions. One star product after another has emerged, creating economic value for Keda while also safeguarding public safety and producing good social benefits.
Perceptual intelligence refers to using cutting-edge technologies such as AI algorithms to map signals from the physical world into the digital information world, structure data of many kinds, and communicate and interact in ways familiar to humans. This goal was finally made achievable by the emergence of deep learning algorithms a decade ago, which brought about what we call the AI 1.0 technological revolution.
Deep learning algorithms learn features by training on large amounts of data and continuously optimizing the model. They can handle complex data structures and nonlinear problems and achieve high-precision classification. Their biggest advantage is that features no longer need to be designed and selected by hand: as long as there is data, the machine automatically learns the best features of the target. This greatly reduces manual intervention in machine perception and provides the most important foundation for machines to achieve perceptual intelligence.
In the security monitoring industry, the most common use of deep learning algorithms is the analysis of people, vehicles, objects, and other targets of interest. Examples include portrait recognition for people, vehicle damage analysis for vehicles, identification and attribute analysis of non-motor vehicles, and analysis of other specific targets such as ships, animals, and text. These deep-learning-based perceptual intelligence algorithms have become indispensable tools in many security applications. They greatly reduce the human workload, and they allow machines to see more clearly and analyze more accurately than humans. As a result, the perceptual intelligence algorithms of AI 1.0 occupy an important position in security applications.
Perception is only one of the basic human abilities, and people hope that AI can also acquire the most important human ability, cognition, so that it can do more for us. Taking security applications as an example, we are not satisfied with AI that only analyzes a target of interest in the scene. We hope that the machine also has some understanding of the environment surrounding that target, that is, that it has preliminary cognitive intelligence.
Cognitive intelligence is a technical discipline modeled on the human cognitive system. Its two main research directions are the deep understanding of perceptual information and the deep understanding of natural language information. As shown in Figure 1, both research directions have very direct applications in the security industry.
First, let us look at how the deep understanding of perceptual information applies to security. Unlike the AI 1.0 era, which analyzed specific targets such as people, vehicles, and objects, the AI 2.0 era brings many applications in pan-monitoring scenarios, such as the analysis of crowd situations, security events, data parameters, and even the health environment in traffic, politics and law, urban management, campus, construction site, and other settings. These applications place more emphasis on the relationships among the objects in a scene and between those objects and their surroundings, so the AI needs a certain ability to understand and recognize, which we call scene image understanding. Typical examples include traffic incident analysis, such as pedestrians crossing the road, obstacle detection, and illegal parking; road safety incident analysis, such as flooding, snow, fog, and fire; and municipal governance event analysis, such as crowds gathering, road occupation, littering, and smoky vehicles.
It is very difficult to complete these pan-security scene understanding tasks with traditional deep learning algorithms. The diversity of the scenarios and the variability of the tasks cause many problems: data is hard to collect and hard to label, labeling is expensive, algorithms generalize poorly and lack robustness, delivery times are long, and scalability is poor. The perceptual intelligence algorithms of the AI 1.0 era are therefore poorly suited to the in-depth understanding of scenes.
The other direction of cognitive intelligence in the security field is the understanding of industry knowledge based on natural language information. To provide industry solutions to users, security manufacturers must have a deep understanding of industry knowledge, which is traditionally described in natural language text and carried by knowledge graphs generated from it. For example, a public security knowledge graph extracts people, things, places, institutions, virtual identities, and other entities in public security applications by means of data analysis and text semantic analysis. It then links these entities according to their attributes, time and space, semantics, features, and location connections to build a multi-dimensional, multi-layer relationship network. Another example is the judicial knowledge graph, which systematically organizes the entities, attributes, and relationships of the legal field, establishes logical associations among them, and uses data mining to help legal workers better understand and apply legal knowledge. Security manufacturers usually have to master the knowledge graphs of the relevant industries in order to provide professional industry solutions.
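To make the idea of a relationship network concrete, the following is a minimal sketch of a knowledge graph stored as (subject, relation, object) triples. It is only an illustration under our own assumptions: the class name TripleStore, the entity labels such as "person:A", and the relation names are all hypothetical and do not describe any real public security system or Keda product.

```python
from collections import defaultdict


class TripleStore:
    """A toy knowledge graph made of (subject, relation, object) triples."""

    def __init__(self):
        # Index triples in both directions so links can be followed either way.
        self._by_subject = defaultdict(set)
        self._by_object = defaultdict(set)

    def add(self, subject, relation, obj):
        self._by_subject[subject].add((relation, obj))
        self._by_object[obj].add((relation, subject))

    def neighbors(self, entity):
        """Entities exactly one relation away, regardless of direction."""
        outgoing = {o for _, o in self._by_subject.get(entity, ())}
        incoming = {s for _, s in self._by_object.get(entity, ())}
        return outgoing | incoming

    def related(self, start, max_hops=2):
        """Breadth-first search: everything reachable within max_hops."""
        seen = {start}
        frontier = {start}
        for _ in range(max_hops):
            frontier = {n for e in frontier for n in self.neighbors(e)} - seen
            seen |= frontier
        return seen - {start}


# Hypothetical entities and relations, for illustration only.
kg = TripleStore()
kg.add("person:A", "owns", "vehicle:X")
kg.add("vehicle:X", "seen_at", "place:P1")
kg.add("person:B", "works_at", "institution:I1")
kg.add("person:B", "seen_at", "place:P1")

print(sorted(kg.related("person:A", max_hops=2)))
print(sorted(kg.related("person:A", max_hops=3)))
```

Even in this toy form, every entity type and relation name has to be chosen by hand before any data can be loaded, which hints at why building real industry knowledge graphs depends so heavily on manual experience.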
However, building and applying a knowledge graph involves many challenges, which makes it difficult to promote. First, it is necessary to establish a high-quality construction model, define clear entities and relationships, and choose appropriate data sources and knowledge representations, all of which rely heavily on manual experience and judgment. Second, a knowledge graph needs a large amount of knowledge as its foundation, so its construction requires large-scale automatic knowledge acquisition. In addition, the knowledge graph must be continuously updated and iterated: knowledge itself is not closed but constantly expanding and changing, and knowledge graphs have difficulty handling incomplete or dynamically changing knowledge. These difficulties mean that a new way must be found to make knowledge accessible to machines. Obviously, this is beyond the realm of perceptual intelligence.
It can be seen that for both of these deep-understanding directions of cognitive intelligence in security applications, the algorithms of AI 1.0 are no longer enough. So what is our answer? It is the GPT large model. A GPT large model is a generative model (the "G" in GPT) based on the Transformer architecture (the "T"), with a very large number of parameters (OpenAI's GPT-4, for example, is reported to have nearly a trillion), that has been pre-trained (the "P") on a very large corpus of text or images. After intensive research and development, Suzhou Keda has launched KD-GPT for the security industry. It contains three types of large models: a multi-modal large model, an industry large model, and an AIGC image large model, as shown in Figure 2.
As a generative AI model, GPT has many advantages, of which the two most basic are as follows. First, large models are multi-tasking. With earlier deep learning, one model corresponded to one task, whereas a single large model can now handle multiple downstream tasks. Second, thanks to the structural characteristics of the Transformer, large models already have the ability to retrieve and understand information, a capability that traditional deep learning models lack.
The KD-GPT multi-modal large model (see Figure 3) takes information from multiple modalities as input and fuses it inside the model, which addresses the data difficulties described above. For example, to detect whether there is smoke in a picture, we only need to enter the prompt "smoke" together with the image to be detected, and the large model can output the detection result directly, without collecting thousands of smoke images for training. Similarly, if the text input is "flame", the large model can automatically detect the presence of flame in the image without any special data training.
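The article does not describe how KD-GPT performs this detection internally. One common way to realize prompt-driven detection in multi-modal models is to place text and images in a shared embedding space and compare them by similarity, and the sketch below illustrates only that general idea. Everything in it is an assumption made for illustration: the three-dimensional vectors are hand-made stand-ins for the outputs of real text and image encoders, and the names PROMPT_EMBEDDINGS, detect, and the threshold of 0.8 are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made vectors standing in for a text encoder's output (assumption).
PROMPT_EMBEDDINGS = {
    "smoke": [0.9, 0.1, 0.0],
    "flame": [0.1, 0.9, 0.0],
}


def detect(image_embedding, prompt, threshold=0.8):
    """Return (is_present, score) for a text prompt against one image."""
    score = cosine(image_embedding, PROMPT_EMBEDDINGS[prompt])
    return score >= threshold, score


# Hand-made vector standing in for an image encoder's output (assumption).
smoky_image = [0.8, 0.2, 0.1]

print(detect(smoky_image, "smoke"))
print(detect(smoky_image, "flame"))
```

The point of the sketch is that adding a new target only requires a new prompt, not a new labeled dataset and a new training run, which is the property the paragraph above attributes to the multi-modal large model.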
The KD-GPT industry large model (see Figure 4) adopts the approach of "general large model + industry data + training and tuning". This allows the model to handle with ease the tasks that building an industry knowledge graph requires, and once knowledge reasoning and quality evaluation are done well, the industry large model can completely replace the function of the industry knowledge graph.
Keda's AIGC image large model, the third type, can in turn generate large amounts of training data for artificial intelligence algorithms.
From cognitive intelligence to decision intelligence, important progress needs to be made in data, learning, multimodal data processing, decision tree models, and personalization algorithms. These advances will help usher in the era of a fully intelligent society.
Large models are one of the important tools to realize decision intelligence, because they have efficient data processing capabilities and strong feature engineering capabilities.
However, we must also face up to a number of problems with large models:
1. Impartiality
A large model is built by pre-training on a large amount of data and then continuously adjusted through further training. How can we ensure that it is not biased by the pre-training data or by that subsequent tuning?
To ensure the impartiality of a large model, we need to pre-train it on multiple types of data, which keeps the model from relying too heavily on any one type and so improves its generalization ability. At the same time, the selected training data must be manually cleaned and labeled to ensure its quality and reliability.
2. Transparency
Large models are essentially neural networks, and the problems of transparency and interpretability in neural networks have not yet been effectively solved. How can the decision-making process of the model be evaluated and monitored?
The academic community has been actively working on the interpretability of large models and neural networks. One approach is to train a helper model to evaluate the performance of the main model. Another is to analyze the output of each layer of the large model in detail in order to find common patterns.
3. Inclusion
Whether in training or in deployment, the cost of large models is very high. How can security companies like Keda afford to build them? And for Keda's customers of different sizes, how can we deliver a large model that everyone can use?
For an individual enterprise or user, reducing the cost of large models means considering pre-trained models, selecting appropriate model architectures, and using distributed computing. More importantly, enterprises and all sectors of society should work together so that more people and enterprises can easily obtain and use large-model computing resources. This can be done by building computing infrastructure, creating computing service platforms, establishing computing power sharing mechanisms, promoting large-model research and development, and formulating incentive policies for large models.
4. Friendliness
How can we prevent large models from providing harmful information (such as instigating crimes)? And how can we protect the intellectual property of humans, or of a large model itself, from being stolen and infringed by other large models?
When designing and using large models, we should first follow ethical and legal norms to avoid adverse consequences for society and individuals. The state has already formulated and promulgated preliminary laws and policies on generative AI. These regulate its development and use, limit the scope and manner in which large models are used, prevent them from being abused to infringe on the rights and interests of others, and ensure that they serve the public interest. In addition, respecting and protecting intellectual property rights is a foundation for the development of large models, and the state should strengthen intellectual property protection for large models to encourage innovation and technological progress.
To sum up, although the emergence of large models has been called the second revolution of AI, large models still have a long way to go before they reach maturity on the technology maturity curve. Moreover, large models will not be the only key technology needed to achieve real decision-making intelligence in the security field. Artificial intelligence must continue to develop and innovate, and the road ahead is long.
With the rapid development of science and technology, technological change has become an important force driving the progress of the security industry. From analog to digital, from standard definition to HD, from wired to 5G, and from functional to intelligent, every technological change has brought major new growth to the security industry. It is foreseeable that, as technology continues to develop, the security industry will retain ample momentum and a broad future. We should actively embrace technological changes of every kind, including artificial intelligence, make full use of the advantages of new technologies, improve the efficiency and quality of security work, and jointly promote social harmony and stability. Suzhou Keda is willing to join hands with peers in the field, as well as with upstream and downstream enterprises, to work together for the future of the security industry.
Wen Zhangyong, Suzhou Keda Technology Co., Ltd.