A big move before the Spring Festival: Alibaba open-sources Tongyi Qianwen Qwen1.5

Updated on 2024-02-08

On February 6, Alibaba released Tongyi Qianwen Qwen1.5, which consists of models in six sizes. "Qwen" refers to the base language model, while "Qwen-Chat" refers to the chat model trained with post-training techniques such as SFT (supervised fine-tuning) and RLHF (reinforcement learning from human feedback).

Model overview

In this Qwen1.5 release, base and chat models at six different scales have been open-sourced: 0.5B, 1.8B, 4B, 7B, 14B, and 72B, and the corresponding quantized models for each scale have been released as always.

Here are some highlights from this update:

Support for 32K context length;
Checkpoints for both base and chat models are released;
The models can be run locally with Transformers;
GPTQ Int4/Int8, AWQ, and GGUF quantized weights are released at the same time (they load through the same interface as the full-precision checkpoints; see the sketches after the Transformers example below).

Performance

Basic capabilities

Qwen1.5 demonstrates strong performance across multiple benchmarks, whether in language understanding, generation, and reasoning, or in multilingual capability and alignment with human preferences.

Qwen1.5-72B far outperforms Llama 2-70B on all benchmarks, demonstrating superior capabilities in language understanding, reasoning, and math.

Multilingual ability

Twelve languages from Europe, East Asia, and Southeast Asia were selected to comprehensively evaluate the multilingual capabilities of the base models. The Qwen1.5 base models perform well across all 12 languages, with strong results on assessments covering exams, comprehension, translation, and math, and can support downstream applications such as translation, language understanding, and multilingual chat.

Human preference alignment

Although it still lags behind GPT-4-Turbo, the largest Qwen1.5 model, Qwen1.5-72B-Chat, shows good results on both MT-Bench and AlpacaEval v2, outperforming Claude-2.1, GPT-3.5-Turbo-0613, Mixtral-8x7B-Instruct, and Tulu 2 DPO 70B, and performing on par with Mistral Medium.

Another highlight is Qwen1.5's integration with the Hugging Face Transformers library. Starting from version 4.37.0, you can use Qwen1.5 natively with the Transformers library, without loading any custom code (that is, without specifying the trust_remote_code option). Load the model like this:

from transformers import AutoModelForCausalLM

# This is what we previously used
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True)

# This is what you can use now
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-7B-Chat", device_map="auto")
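For context, here is a minimal generation sketch built on the snippet above. It assumes Transformers 4.37.0 or later; the system message, prompt, and max_new_tokens value are illustrative choices, not part of the announcement.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-7B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build the prompt with the tokenizer's built-in chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to large language models."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and decode only the newly produced tokens
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))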
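The quantized weights listed in the highlights load through the same from_pretrained call; only the checkpoint name changes. The repository name below is an assumption based on the naming pattern used on the Hugging Face Hub, loading a GPTQ checkpoint additionally requires the optimum and auto-gptq packages, and the GGUF files target llama.cpp-style runtimes rather than Transformers.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository name for the Int4 GPTQ chat checkpoint;
# the AWQ variant follows an analogous naming pattern.
quantized_id = "Qwen/Qwen1.5-7B-Chat-GPTQ-Int4"

tokenizer = AutoTokenizer.from_pretrained(quantized_id)
model = AutoModelForCausalLM.from_pretrained(quantized_id, device_map="auto")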

Project address: GitHub
