Groq is a startup founded by former members of Google's TPU team, and it has launched a new type of self-developed chip, the LPU (Language Processing Unit), built to accelerate large-model inference. The impressive part: Groq claims this chip runs inference up to 20 times faster than NVIDIA GPUs while cutting cost to one-tenth. The Groq chip is manufactured on a 14nm process, carries 230 MB of on-chip SRAM, and delivers on-chip memory bandwidth of 80 TB/s.
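To see why that 80 TB/s number matters, here is a rough back-of-envelope sketch (my own illustration, not from Groq's materials): if decoding is memory-bandwidth-bound and each generated token has to stream all model weights through memory once, then bandwidth sets a hard ceiling on tokens per second. Note that a single chip's 230 MB of SRAM cannot hold a whole model, so real deployments shard weights across many chips; this calculation is only a ceiling, not a measured result.

```python
# Back-of-envelope: memory-bandwidth ceiling on decode speed.
# Assumption (not from the article): each generated token requires
# streaming all model weights through memory exactly once.

def max_tokens_per_sec(params_billion: float, bytes_per_param: int, bw_tb_s: float) -> float:
    """Upper bound on tokens/s if decoding is purely bandwidth-bound."""
    model_bytes = params_billion * 1e9 * bytes_per_param
    bandwidth = bw_tb_s * 1e12  # bytes per second
    return bandwidth / model_bytes

# Llama 2 7B in FP16 against Groq's quoted 80 TB/s SRAM bandwidth,
# vs. a typical ~2 TB/s HBM GPU (illustrative figure, not from the article).
print(f"SRAM @ 80 TB/s : {max_tokens_per_sec(7, 2, 80):,.0f} tokens/s ceiling")
print(f"HBM  @  2 TB/s : {max_tokens_per_sec(7, 2, 2):,.0f} tokens/s ceiling")
```

The ~40x gap between those two ceilings is the intuition behind keeping weights in SRAM instead of off-chip HBM.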
At its core, Groq's LPU is designed to overcome the two main bottlenecks of large language model (LLM) inference: compute density and memory bandwidth. Groq says its solution delivers faster LLM inference than other cloud platform vendors. Groq already supports models such as Mixtral 8x7B and Llama 2 7B and 70B, and provides API access and demos. Its stated goal is to surpass NVIDIA within three years. It's ridiculously fast!! There's just no stopping it!! Let's look at the test results below!!
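Since the article mentions API access, here is a minimal sketch of calling it. This assumes Groq's OpenAI-compatible chat-completions endpoint and the `mixtral-8x7b-32768` model id as published in Groq's docs at the time; both may have changed, so treat them as placeholders.

```python
# Minimal sketch: one chat-completion request to Groq's
# OpenAI-compatible REST API (endpoint and model id are assumptions
# from Groq's public docs, not details given in the article).
import os
import requests

resp = requests.post(
    "https://api.groq.com/openai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
    json={
        "model": "mixtral-8x7b-32768",
        "messages": [
            {"role": "user", "content": "Explain what an LPU is in one paragraph."}
        ],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
# Dividing the response's completion-token count by the elapsed wall time
# gives the tokens-per-second figures quoted in comparisons like the one below.
```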
GPT-3.5 generates about 40 tokens per second, while GPT-4 and Gemini come in around 80. Watch how long each takes to finish a simple debugging problem: they are already quite quick, but Groq completely crushes GPT-4 and Gemini, with output roughly 10 times faster than Gemini and 20 times faster than GPT-4. I can't believe it!! Is there anything faster???
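To make those throughput figures concrete, here is a quick calculation of how long a fixed-length answer takes at each quoted speed (the 300-token answer length is my own illustrative choice, not a number from the article):

```python
# Time to finish a fixed-length answer at a given decode speed.
def completion_seconds(answer_tokens: int, tokens_per_sec: float) -> float:
    return answer_tokens / tokens_per_sec

# Throughput figures quoted above; the answer length is illustrative.
for name, tps in [("GPT-3.5", 40), ("GPT-4 / Gemini", 80)]:
    print(f"{name}: {completion_seconds(300, tps):.1f} s")
# At 10-20x those rates, Groq would finish the same answer in well under a second.
```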
Huawei's Ascend chips and Cambricon's AI ASICs are expected to challenge NVIDIA's dominant GPUs and overtake them on the bend; Google's ASIC approach has already shown the advantage of specialized silicon, so displacing NVIDIA's GPUs down the road is a real possibility.
Lao Huang (Jensen Huang) would love to sell $7 trillion worth of GPUs, but a market that big is bound to attract challengers, and now he faces not only his "friendly rivals" abroad but also domestic chipmakers such as Huawei on this side of the ocean! I hope our domestic players can ride the wind and waves and claim a share of this revolution!
If you enjoyed this article, click "Follow"!