Microsoft has announced Phi-2, a 2.7-billion-parameter language model, claiming its performance matches or exceeds that of models up to 25 times larger. According to Microsoft, Phi-2 "demonstrates outstanding reasoning and language understanding, showing state-of-the-art performance among base language models with fewer than 13 billion parameters."
Benchmark results show that, with just 2.7 billion parameters, Phi-2 outperforms the 7B and 13B versions of Mistral and Llama 2 across a range of aggregated benchmarks. Compared with the 25x larger Llama-2-70B model, Phi-2 achieves better performance on multi-step reasoning tasks such as coding and math.
In addition, Phi-2 performs on par with, or even better than, the recently released Google Gemini Nano 2.
Phi-2 also exhibits less toxicity and bias in its responses than existing open-source models that have undergone alignment fine-tuning.
Previously, Google's Gemini demo showed the model solving a complex physics problem and correcting a student's mistake. Microsoft researchers gave Phi-2 the same test and said that, using the same prompts, it likewise answered the question correctly and corrected the student's error.
Phi-2 is the latest release in Microsoft's family of small language models (SLMs). The first version, Phi-1, has 1.3 billion parameters and specializes in basic Python coding tasks. In September, the company expanded its focus to common-sense reasoning and language understanding, launching Phi-1.5, another 1.3-billion-parameter model with performance comparable to models five times its size.
Microsoft says Phi-2's efficiency makes it an ideal platform for researchers exploring areas such as AI safety, interpretability, and the ethical development of language models. Phi-2 is now available through the Model Catalog in Microsoft Azure AI Studio.