IT Home reported on December 21 that Apple's artificial intelligence researchers say they have made a major breakthrough, successfully deploying large language models (LLMs) on Apple devices with limited memory through an innovative flash memory utilization technique. The result is expected to bring a more powerful Siri, real-time language translation, and cutting-edge AI capabilities for photography and augmented reality to future iPhones.
In recent years, LLM chatbots like ChatGPT and Claude have taken the world by storm. They can hold fluent conversations and write text in many different styles, demonstrating impressive language comprehension and generation skills. These models have an Achilles' heel, however: they are so hungry for data and memory that ordinary phones simply cannot meet their operational needs.
To break through this bottleneck, Apple researchers blazed a new trail by turning to the flash memory that is ubiquitous in mobile phones, where apps and photos are stored. In a paper titled "LLM in a Flash: Efficient Large Language Model Inference with Limited Memory," the researchers propose an ingenious flash utilization technique that stores the LLM's model data in flash memory. The authors point out that flash memory offers far more capacity on mobile devices than the RAM traditionally used to run LLMs.
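To make the core idea concrete, here is a minimal, self-contained Python sketch based only on the description above, not on Apple's implementation: a weight matrix lives in a file (standing in for flash), is memory-mapped rather than loaded wholesale, and only the rows needed at the moment are pulled into RAM. The file name, matrix shape, and the 5% "active rows" fraction are assumptions for illustration.

```python
import numpy as np

WEIGHTS_PATH = "demo_ffn_weights.bin"  # hypothetical weight file standing in for flash
ROWS, COLS = 8192, 1024                # hypothetical feed-forward layer dimensions

# One-time setup for the demo: write random FP16 weights to disk.
w_init = np.memmap(WEIGHTS_PATH, dtype=np.float16, mode="w+", shape=(ROWS, COLS))
w_init[:] = np.random.randn(ROWS, COLS).astype(np.float16)
w_init.flush()

# mode="r" maps the file read-only; nothing is copied into DRAM up front.
weights = np.memmap(WEIGHTS_PATH, dtype=np.float16, mode="r", shape=(ROWS, COLS))

def gather_active_rows(active_rows: np.ndarray) -> np.ndarray:
    """Read only the requested rows from storage into RAM.

    Fancy indexing on a memmap touches just the pages backing those rows,
    which is the crux of serving a model larger than available memory.
    """
    return np.asarray(weights[active_rows], dtype=np.float16)

# Suppose only ~5% of rows are needed for the current computation.
active = np.sort(np.random.choice(ROWS, size=ROWS // 20, replace=False))
print(gather_active_rows(active).shape)  # (409, 1024) resident in RAM; the rest stays on disk
```

Memory mapping alone would not deliver the speedups reported in the paper; the two techniques described next target exactly the cost of those on-demand reads.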
IT Home notes that their approach cleverly uses two key techniques to get around this limitation, minimizing data transfer and maximizing flash memory throughput:
Windowing: Think of this as a recycling method. Instead of loading new data every time, the AI model reuses some of the data it has already processed. This reduces the need for constant memory fetches, making the whole process faster and smoother.
Row-column bundling: This technique is like reading a book not word by word but paragraph by paragraph. By grouping data more efficiently, larger chunks can be read from flash memory at once, accelerating the model's ability to understand and generate language. A rough sketch of both ideas follows below.
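The following Python sketch illustrates both ideas as described above; it is not Apple's code. The layer dimensions, the five-token window, the random stand-in for a sparsity predictor, and the in-memory array standing in for flash are all assumptions made for the demo.

```python
import numpy as np

HIDDEN, FFN = 1024, 8192   # hypothetical layer dimensions
WINDOW = 5                 # reuse neuron data loaded for the last 5 tokens

# Row-column bundling: for neuron i, the i-th row of the up projection and the
# i-th column of the down projection are stored back-to-back (length 2*HIDDEN),
# so one contiguous read fetches everything that neuron needs.
bundled_flash = np.random.randn(FFN, 2 * HIDDEN).astype(np.float16)  # stand-in for flash

class WindowedCache:
    """Keeps bundled weights of neurons that were active in the last WINDOW tokens."""

    def __init__(self) -> None:
        self.cache: dict[int, np.ndarray] = {}
        self.history: list[set[int]] = []   # active neuron ids, one set per recent token

    def fetch(self, active: set[int]) -> dict[int, np.ndarray]:
        # Only neurons not already resident require a (simulated) flash read.
        for i in active:
            if i not in self.cache:
                self.cache[i] = bundled_flash[i]   # one contiguous read per neuron

        # Slide the window forward and evict neurons no recent token used.
        self.history.append(active)
        if len(self.history) > WINDOW:
            self.history.pop(0)
            still_needed = set().union(*self.history)
            for stale in [k for k in self.cache if k not in still_needed]:
                del self.cache[stale]

        return {i: self.cache[i] for i in active}

cache = WindowedCache()
for step in range(10):
    # Pretend a sparsity predictor picked ~5% of FFN neurons for this token.
    active = set(np.random.choice(FFN, FFN // 20, replace=False).tolist())
    resident = cache.fetch(active)
    print(f"token {step}: {len(resident)} active neurons, cache holds {len(cache.cache)}")
```

The intended payoff of the window is that consecutive tokens tend to need overlapping sets of neurons, so most of what the current token requires is already resident and only the difference has to be read from flash.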
This technique allows AI models to run at up to twice the size of the iPhone's available memory. With it, LLMs run 4-5 times faster on Apple M1 Max CPUs and 20-25 times faster on GPUs. "This breakthrough is critical for deploying advanced LLMs in resource-constrained environments, greatly expanding their applicability and accessibility," the researchers wrote.
These breakthroughs in AI efficiency open up new possibilities for future iPhones, such as more advanced Siri capabilities, real-time language translation, and sophisticated AI-driven features in photography and augmented reality. The technology also lays the groundwork for iPhones to run sophisticated AI assistants and chatbots on-device, something Apple is reportedly already working on.
The generative AI Apple is developing may eventually be integrated into its Siri voice assistant. In February 2023, Apple held an AI summit and briefed employees on its large language model work. According to Bloomberg, Apple's goal is to build a smarter Siri that is deeply integrated with AI. Apple plans to update the way Siri interacts with messaging apps so that users can handle complex questions and auto-complete sentences more effectively. Beyond that, Apple is also rumored to be planning to add AI to as many of its apps as possible.
Apple is reportedly working on its own generative AI model, codenamed "AJAX," designed to compete with OpenAI's GPT-3 and GPT-4 and running on 200 billion parameters, suggesting a high level of sophistication and strong capabilities in language understanding and generation. Known internally as "Apple GPT," AJAX aims to unify machine learning development across Apple, suggesting the company is integrating AI more deeply into its ecosystem.
According to the latest reports, AJAX is believed to be more capable than the earlier ChatGPT 3.5. However, some sources have also pointed out that OpenAI's newer models may have surpassed AJAX's capabilities.
Both The Information and analyst Jeff Pu claim that Apple will offer some form of generative AI capability on the iPhone and iPad around the end of 2024, when iOS 18 is released. Pu said in October that Apple was building hundreds of AI servers in 2023, with more to come in 2024. Apple will reportedly offer a solution that combines cloud-based AI with on-device AI processing.