Soldiers are fast. In the battleground of large models, Google is always one step behind, so it has been repeatedly ridiculed for "getting up early in the morning and catching up late". But yesterday, Google made a big move and released an open model, Gemma, claiming that it is the "most advanced" open model series in the lightweight class, surpassing the original strongest Mistral 7B.
The launch of this model, which bears the Latin word for "jewel", is quite meaningful at this time. First, according to Google's official website, GEMMA uses the same research and technology used to create the Gemini model. But compared to the closed Gemini, the opening of GEMMA will undoubtedly attract more people; Second, there has been news that Meta may release the latest version of its LLAMA open-source model series in the near future.
So how exactly does GEMMA perform? To what extent is it "open"? What are the implications for the future development of open source models? Let's go through them one by one.
Currently, the GEMMA open model is divided into two versions, namely "GEMMA 2B" and "GEMMA 7B", i.e., 2 billion parameters and 7 billion parameters, to meet the different needs of developers. Both versions provide pre-trained models and instruction tuning variants.
Users now have access to Kaggle, Colab, and Google Cloud, as well as the option to infer and fine-tune GEMMA with multiple frameworks such as Hugging Face Transformers.
In addition, first-time Google Cloud users can get a $300 credit. Researchers can also apply for up to $500,000 in Google Cloud credits to accelerate their projects.
So, can the GEMMA model run natively like the open source models Llama2, Mistral, etc.? The development team members also answered this question in the affirmative.
In addition to supporting the Python neural network framework and GGML as options, we also provide a standalone version of the C++ implementation that you can run in ** and run locally. ”
To better support developer innovation, Google also offers a "Responsible Generative AI Toolkit" that goes with the model. The toolkit contains key tools to guide and support developers in building more secure AI applications with GEMMA.
According to Google's official blog, some other key details to keep an eye on include:
GEMMA via native keras 30, providing a toolchain for inference and supervised fine-tuning (SFT) in all major frameworks (Jax, PyTorch, and TensorFlow).
The release also includes ready-to-use Colab and Kaggle notebooks, as well as integrations with popular tools such as Hugging Face, MaxText, NVIDIA Nemo, Tensorrt-LLM, and more.
Pre-trained and instruction-optimized GEMMA models run on a variety of platforms, from laptops and workstations to Google Cloud, and can be easily deployed on Vertex AI and Google Kubernetes Engine (GKE).
Optimized for NVIDIA GPUs and Google Cloud TPUs for industry-leading performance.
The Terms of Use allow all organizations, regardless of size, to use and distribute responsibly for business.
Overall, GEMMA has been released at a close pace with Gemini 15。The context window of the latter expands to 1 million tokens. In just one week, the Gemini Ultra 10、gemini 1.5 Pro and GEMMA debuted for the first time, and such a fast release cycle can't help but make people pay more attention to Google's technological progress and product strategy.
Google claims that GEMMA has "the most advanced performance at its scale" and that "GEMMA significantly outperforms the larger model on key benchmarks". The basis for this claim is that GEMMA has outperformed LLAMA 2 in multiple benchmarks.
Source: Google Blog.
As you can see in the chart above, GEMMA has achieved better results than Llama 2 on important evaluation criteria including MMLU, Hellaswag and Humaneval.
Fran Ois Chollet, author of the deep learning framework Keras and AI researcher at Google, posted a more detailed comparison chart on X.
Source: Taking MMLU (Massive Multitasking Language Understanding) as an example, as shown in the figure, GEMMA 7B not only surpassed Llama 7B and LLAMA 13B, but also beat the popular fried chicken Mistral 7B.
In addition, in a dedicated technical report, GEMMA 7B evaluates different capabilities in terms of language understanding and generation performance compared to open models of comparable scale. The standard academic benchmark tests were divided into four groups: Q&A, Reasoning, Mathematical Science, and Coding according to their respective ability categories, and the average of the corresponding scores was calculated.
Source: Google Tech Report.
It can be seen that in the two groups of mathematics and coding, GEMMA 7B has obvious advantages; In terms of reasoning, GEMMA 7B narrowly wins; In terms of Q&A, GEMMA 7B is slightly inferior to LLAMA 13B.
Google's official blog post attributed Gemma's performance in terms of performance to the fact that "the GEMMA model shares technology and infrastructure components with Gemini, the largest and most powerful AI model we have in use today. This allows the GEMMA 2B and 7B to achieve best-in-class performance in their size compared to other open models. GEMMA is able to run directly on a developer's laptop or desktop computer. Notably, GEMMA significantly outperforms the larger model on key benchmarks, while adhering to our strict standards for safe and responsible output. ”
The release of GEMMA has generated a lot of discussion. Developers are generally concerned that GEMMA is an open model that does not seem to be licensed under a "open source" license in the true sense of the word.
Despite being called 'open source', the weights of the GEMMA model are actually released under a license that does not match the definition of open source. It has more in common with the source-available software, so it should be called the "weight-available model". This means that users can access and use the model's weighted files, but they may be restricted and do not comply with the principles of free distribution and modification under traditional open source licenses. ”
So what exactly does the "openness" of this kind of open model refer to, and to what extent?
As we all know, the weights of the mistral model are based on Apache 20 licenses, which means they follow open source principles. But the weight of the MLAMA 2 model led by Meta is released through a proprietary license that uses very targeted licensing: if the number of monthly active users exceeds 700 million, companies must apply for a license from Meta, and Meta will impose strict restrictions on such licenses, which means that large companies such as Amazon, Apple, Google, and ByteDance are restricted.
There are lessons learned from the Llama 2, so many people question the openness of the gemma. Google's wording in this regard is: "The Terms of Use allow all organizations, regardless of size, to use and distribute responsibly for business." ”
There has been speculation that this is Google's cautious approach to avoid repeating the mistakes of the past, influenced by restrictive clauses like this in the LLAMA 2 license.
It is clear that there is a significant divergence in the understanding and practice of "openness" in the field of AI. Some projects claim to be "open source", but in fact impose specific restrictions on users. This may be motivated by considerations such as intellectual property protection, market competition strategies, and avoiding technology abuse, but it also sparks discussions about how to define and implement true openness and sharing.
Interestingly, Google's release of GEMMA coincided with an article on its open source blog titled "Building Open Models Responsibly in the Gemini Era". As mentioned in the article, the open source license gives users complete creative autonomy. This is a strong guarantee that developers and end users have access to technology. But in the hands of malicious actors, the lack of restrictions can increase the risk.
Against this backdrop, "true openness and transparency, especially when it comes to training**, datasets, and unrestricted access to and use of model resources, remain goals that the AI community needs to strive for."
What does it mean that Google is releasing the GEMMA model as an "open model"? "Open models have free access to model weights, but terms of use, redistribution, and variant ownership vary depending on the model's specific terms of use, which may not be based on open source licenses. ”
The terms of use of the GEMMA model make it free for individual developers, researchers, and commercial users to access and redistribute. Users are also free to create and publish model variants. When using the GEMMA model, developers agree to avoid harmful use, reflecting our commitment to developing AI responsibly while increasing access to this technology. ”
Google says the definition of "open source" is invaluable for computing and innovation. However, existing open source concepts are not always directly applied to AI systems, which raises the question of how open source licenses can be used with AI. "It's important that we promote the principles of openness that make possible the great things we've experienced with AI, while clarifying the concept of open source AI and addressing concepts such as derivative works and attribution of authors. ”