Stable Code 3B is a lightweight programming assistant that runs natively without a GPU

Updated on 2024-02-04

Stability AI recently released Stable Code 3B, a lightweight programming assistance model that brings together several innovative techniques. Despite its small size, it delivers performance comparable to larger models such as Code Llama 7B, and its light footprint allows it to run in GPU-free environments, greatly broadening its range of applications.

At the core of Stable Code 3B, a 3-billion-parameter programming assistance model, is its ability to run locally on a laptop without a dedicated GPU. This not only lowers the barrier to entry but also gives developers more flexibility. Compared with larger models such as Code Llama 7B, Stable Code 3B is 60% smaller, yet shows comparable performance across a variety of programming tasks.
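For illustration, here is a minimal sketch of running the model on a plain CPU via the Hugging Face transformers library. The model ID "stabilityai/stable-code-3b" and the generation settings are assumptions; consult the official model card for the exact usage.

```python
# Minimal sketch: load Stable Code 3B and generate on CPU only.
# Assumption: the model is published as "stabilityai/stable-code-3b"
# on the Hugging Face Hub and a recent transformers release is installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stable-code-3b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# float32 on CPU: no GPU or half-precision kernels required.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)
model.eval()

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64,
                         pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```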

Advanced techniques and strategies were used to train Stable Code 3B. The model builds on Stable LM 3B, which was itself trained on as many as 4 trillion tokens, and is further trained on software-engineering-specific data, making it more accurate and effective on programming-related tasks. Architecturally, Stable Code 3B adopts a decoder-only Transformer similar to the LLaMA architecture, but with some key tweaks. For example, rotary position embeddings are applied to the first 25% of the head embedding dimensions to improve throughput, and a modified version of the GPT-NeoX tokenizer, extended with special tokens, is used to train the fill-in-the-middle (FIM) capability.
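To show what FIM means in practice, here is a sketch of fill-in-the-middle prompting. It assumes StarCoder-style sentinel tokens (<fim_prefix>, <fim_suffix>, <fim_middle>) and the same hypothetical model ID as above; the exact token names should be verified against the model card.

```python
# Sketch of fill-in-the-middle (FIM) prompting: the model is given the
# code before and after a gap, and generates the code that fills it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stable-code-3b"  # assumption, see model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

prefix = "def average(xs):\n    "
suffix = "\n    return total / len(xs)\n"
# Assumption: StarCoder-style FIM sentinel tokens.
fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(fim_prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32,
                         pad_token_id=tokenizer.eos_token_id)
# The text generated after <fim_middle> is the code filling the gap.
print(tokenizer.decode(out[0], skip_special_tokens=True))
```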

The training set of Stable Code 3B consists of multiple large-scale open-source datasets, such as Falcon RefinedWeb and CommitPackFT. Training ran on a Stability AI cluster of 256 NVIDIA A100 40GB GPUs, using the GPT-NeoX codebase together with techniques such as FlashAttention and SwiGLU. In terms of performance, Stable Code 3B achieves state-of-the-art results among models of similar size on the MultiPL-E benchmark across multiple programming languages, including Python, C++, and JavaScript. This performance is owed to the innovative techniques and optimization strategies used during training.
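For readers unfamiliar with SwiGLU, it is a gated feed-forward variant commonly used in modern Transformers. Below is a minimal PyTorch sketch of the idea, with hypothetical dimensions rather than the model's actual sizes.

```python
# Minimal SwiGLU feed-forward block: a SiLU-gated branch multiplied by
# a linear "value" branch, then projected back to the model width.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_hidden, bias=False)  # gate branch
        self.w_up = nn.Linear(d_model, d_hidden, bias=False)    # value branch
        self.w_down = nn.Linear(d_hidden, d_model, bias=False)  # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU(x) = (SiLU(x W_gate) * (x W_up)) W_down
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))

# Example: batch of 2 sequences, length 8, model width 64 (hypothetical).
x = torch.randn(2, 8, 64)
print(SwiGLU(64, 256)(x).shape)  # torch.Size([2, 8, 64])
```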

The launch of Stable Code 3B is a major step forward in the field of programming assistance. It succeeds in its lightweight design while remaining comparable in performance to larger models, which means developers can get efficient, convenient programming assistance even in resource-constrained environments. Its release signals the rise of lightweight models in the AI field and lays a solid foundation for future development.
