Zhigong News | China's first open-source model in the industrial field is here!

Mondo Technology Updated on 2024-01-31

Intelligent industrial model

On December 29, China Industrial Internet (Beijing) Technology Group announced the open-sourcing of Zhigong, a lightweight large model with 1.6 billion parameters, making it the first open-source large language model in China's industrial field. The move marks China's continued innovation and progress in the field of large models, offering the industry more choices and possibilities.

Zhigong (zhigong-1.6b) is a lightweight open-source model for the industrial field, trained on 32T of high-quality corpus data.

The base model is oriented toward edge computing and intelligent terminals; its 1.6B parameter count keeps the model lightweight. It provides a highly flexible pre-training framework that can be extended to industrial equipment, smart devices, and industrial products, delivering more efficient computing performance for industrial application scenarios.

The base model uses byte-pair encoding (BPE) to segment the data and implements language enhancement for 18 languages, including Chinese, English, French, Russian, Spanish, Cambodian, Czech, Japanese, and Korean. In addition, an extra 10,000 tokens are introduced for the languages of the "Belt and Road" countries other than English.
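Byte-pair encoding learns a vocabulary by repeatedly merging the most frequent adjacent symbol pair in the corpus. The following is a minimal illustrative sketch of that merge procedure; the toy corpus and merge count are made up for demonstration and are unrelated to Zhigong's actual tokenizer.

```python
# Minimal sketch of byte-pair encoding (BPE) merge learning.
# Toy data only; not the Zhigong tokenizer's real vocabulary.
from collections import Counter

def bpe_merges(words, num_merges):
    """Learn BPE merge rules from a word -> frequency dict."""
    vocab = {tuple(w): f for w, f in words.items()}  # word as symbol tuple
    merges = []
    for _ in range(num_merges):
        # Count all adjacent symbol pairs, weighted by word frequency
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent pair
        merges.append(best)
        # Apply the merge everywhere it occurs
        new_vocab = {}
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] = freq
        vocab = new_vocab
    return merges, vocab

# Toy corpus: word -> frequency
corpus = {"low": 5, "lower": 2, "lowest": 3}
merges, vocab = bpe_merges(corpus, 3)
print(merges)  # learned merge rules, most frequent first
```

On this toy corpus the first merges fuse the shared prefix ("l"+"o", then "lo"+"w"), which is exactly why BPE compresses frequent subwords well across many languages.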

Thanks to its lightweight design, the base model can handle specific tasks in the industrial sector. According to the head of R&D for the Zhigong industrial large model, compared with traditional large models, the lightweight design aims to improve performance while reducing resource requirements, making the model better suited to deployment and application in industrial scenarios.

For industrial scenarios, the R&D team rebuilt the data screening process. Zhigong's 1.6 billion parameter lightweight model was trained on 32T of high-quality Chinese and English data, with a greatly increased proportion of books and other high-quality data.

In a number of benchmark evaluations, Zhigong's 1.6 billion parameter lightweight large model showed excellent performance. The model excels in industrial text understanding, task processing, and professional domain Q&A, providing a more efficient solution for industrial applications.

MMLU stands for Massive Multitask Language Understanding, a benchmark proposed by Hendrycks et al. in 2020 to evaluate language models across 57 subjects spanning STEM, the humanities, the social sciences, and more. With its broad task coverage and unified measurement, it is currently recognized as one of the most authoritative evaluation benchmarks for language understanding.

In the design of the open-source large model, China Industrial Internet focused on lightweight technological innovation. This not only makes the model better suited to edge devices and resource-constrained environments, but also improves real-time performance in scenarios such as industrial production lines. In testing, the model could even complete tasks with the phone in airplane mode (fully offline).
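A back-of-envelope calculation shows why a 1.6B-parameter model is plausible on a phone: the raw weight storage at reduced precision fits in a few GiB or less. The figures below are illustrative arithmetic only, not official deployment numbers from Zhigong.

```python
# Rough weight-storage footprint of 1.6B parameters at common precisions.
# Illustrative arithmetic; actual on-device size also depends on the
# quantization scheme, activations, and runtime overhead.
PARAMS = 1.6e9

def weights_gib(bits_per_param):
    """Raw weight storage in GiB at the given bits per parameter."""
    return PARAMS * bits_per_param / 8 / 2**30

for name, bits in [("fp32", 32), ("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: {weights_gib(bits):.2f} GiB")
```

At fp16 the weights alone come to roughly 3 GiB, and 4-bit quantization brings that under 1 GiB, which is why lightweight models of this size are candidates for offline, on-device inference.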

The open source of Zhigong's 1.6 billion parameter lightweight model will further promote the digital transformation of China's industrial field. Based on this model, companies can develop custom industrial applications to improve productivity and intelligence. Small and medium-sized enterprises and research institutes will also benefit from this open source initiative, with more opportunities to apply large models in the industrial sector. At the same time, this innovation also provides useful experience and inspiration for other enterprises and research institutions when designing large models.

The company has open-sourced model parameters, configuration files, tokenizers, etc. on Hugging Face.

According to the head of R&D of the Zhigong industrial large model, model installation requires Python 3.8 or later and PyTorch 1.13 or later, with CUDA 11.4 or above recommended. The intermediate checkpoints of the model are being organized and uploaded and will be available soon. In the future, CAIC will further open source multimodal series model bases with 7 billion, 13 billion, 30 billion, and 70 billion parameters to provide more foundational support for research on industrial large models.
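The stated minimums (Python 3.8+, PyTorch 1.13+, CUDA 11.4+) can be verified before installation with a small version comparison. The helper below is a generic sketch and not part of any official Zhigong tooling; only the version numbers come from the article.

```python
# Sketch of a pre-install environment check against the article's stated
# requirements: Python >= 3.8, PyTorch >= 1.13, CUDA >= 11.4.
# meets_minimum() is a hypothetical helper, not official Zhigong tooling.
def parse_version(v):
    """'1.13.1+cu117' -> (1, 13, 1): keep leading numeric components only."""
    parts = []
    for piece in v.split("+")[0].split("."):
        if piece.isdigit():
            parts.append(int(piece))
        else:
            break
    return tuple(parts)

def meets_minimum(installed, required):
    """Tuple comparison handles '3.10' > '3.8' correctly (unlike strings)."""
    return parse_version(installed) >= parse_version(required)

print(meets_minimum("3.10.12", "3.8"))        # Python version check
print(meets_minimum("1.13.1+cu117", "1.13"))  # PyTorch version check
print(meets_minimum("11.2", "11.4"))          # CUDA too old here
```

Comparing parsed tuples rather than raw strings matters: as a string, "3.10" would sort below "3.8" and wrongly fail the check.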

Model Address 1: Model Address 2:
