The non-Transformer large model RWKV-5-World 7B was open sourced on January 28; its English performance is on par with Llama 2

Mondo Science Updated on 2024-02-01

Reporter Xiao Yulin.

On January 28, 2024, the RWKV open-source project announced the open sourcing of the RWKV-5-World 7B model. It is the 7B-parameter model of RWKV's fifth-generation architecture, and the strongest multilingual open-source RWKV model to date. According to the published evaluation data, the model is 100% attention-free and, despite being trained on only 1.1T tokens, the RWKV-5 7B's multilingual performance exceeds Mistral's, while its English performance is on par with Llama 2.

Developers and researchers around the world can now access and use the RWKV-5-World 7B model through Hugging Face or the WiseModel platform.
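As a rough illustration, a model published on Hugging Face this way can typically be loaded with the transformers library. This is a minimal sketch, not official usage from the announcement; in particular, the repository id "RWKV/rwkv-5-world-7b" is an assumption for illustration and should be checked against the actual model card.

```python
# Minimal sketch of loading an RWKV world model via Hugging Face transformers.
# NOTE: "RWKV/rwkv-5-world-7b" is a hypothetical repo id used for illustration;
# verify the real identifier on the official model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RWKV/rwkv-5-world-7b"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Question: What is RWKV?\n\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
# Generate a short continuation; the model's custom code manages the
# recurrent state internally instead of a growing attention KV cache.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

RWKV world models ship custom modeling and tokenizer code, which is why `trust_remote_code=True` is passed here.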

According to the published test results, the RWKV-5 7B leads in multilingual performance among models at the same 7B parameter scale, and its English performance has improved substantially. To push its English performance past the Llama 2 line and toward the Mistral line, the RWKV team said it will invest an additional 1T tokens of corpus to continue training the RWKV-5 model.

According to RWKV's public materials, RWKV is an innovative deep learning network architecture that combines the respective strengths of the Transformer and the RNN. It achieves highly parallel training together with efficient inference, and its time complexity is linear in sequence length, giving it the potential to outperform the Transformer in long-sequence inference scenarios.
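For intuition, here is a toy sketch of the linear-recurrence idea behind RWKV-style models. It is deliberately simplified and is not the exact RWKV-5 formulation: each token updates a fixed-size state instead of attending over the whole history, so per-token cost is O(1) and total cost is linear in sequence length. All names and values below are illustrative.

```python
import numpy as np

d = 8                       # toy feature dimension
decay = np.full(d, 0.9)     # per-channel decay (learned in RWKV; fixed here)
state = np.zeros((d, d))    # fixed-size recurrent state (outer-product memory)
rng = np.random.default_rng(0)

for t in range(16):                   # sequence of 16 tokens
    k = rng.standard_normal(d)        # key for token t
    v = rng.standard_normal(d)        # value for token t
    r = rng.standard_normal(d)        # "receptance" gate, sigmoid-activated
    # Read: output depends only on the current fixed-size state, not on
    # all past tokens, so the cost per step does not grow with t.
    out = (1.0 / (1.0 + np.exp(-r))) * (state @ k)
    # Write: decay the old memory, then accumulate the new key/value pair.
    state = decay[:, None] * state + np.outer(v, k)
```

Because the state has a fixed size, memory use during inference stays constant as the sequence grows, in contrast to a Transformer's attention cache, which grows with every token.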

Power Plant" learned that RWKV Yuan Shi Intelligent Company has completed a seed round of financing on January 16, and one of the investors is the Miracle Forum founded by Lu Qi. The RWKV model was originally designed by Bloomberg, and the main computing power was donated by institutions such as Stability AI and AI Eleuther. Today, RWKV has been donated to the Linux Foundation AI &Data as an incubation project.

"RWKV combines the advantages of both the Transformer and the RNN; its main features include high, constant inference efficiency and low, constant GPU memory usage," said Luo Xuan, co-founder and COO of Yuanshi Intelligence. "The efficiency of today's Transformer limits the development and industrial adoption of AI, and the birth of RWKV can reverse, and is already reversing, this situation. Over the past few decades, the open sourcing of Linux drove the boom of the internet; RWKV will follow the open-source spirit of Linux. Both Transformer-based infrastructure and applications are worth redoing with RWKV."
