When ChatGPT was released in November 2022, it was only accessible via the cloud because the model behind it was simply too massive to run anywhere else.
Today, I am running a similarly capable AI program on a MacBook Air, and it isn't even warm. The shrinkage shows how rapidly researchers are refining AI models to make them leaner and more efficient. It also suggests that ever-increasing scale is not the only way to make machines significantly smarter.
The AI model now running on my laptop, producing ChatGPT-like responses, is called Phi-3-mini. It is part of a family of smaller AI models recently released by researchers at Microsoft. Although it is compact enough to run on a smartphone, I tested it on a laptop and also reached it from an iPhone through an app called Enchanted, which provides a chat interface similar to the official ChatGPT app.
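For readers who want to try something similar, one common route, and the one the Enchanted app is built around, is the open-source Ollama runtime, which downloads a quantized copy of Phi-3-mini and exposes it through a local HTTP API. The exact setup used here isn't specified, so treat the following as a minimal sketch: it assumes Ollama is installed and `ollama pull phi3` has already fetched the model, and it queries Ollama's standard `/api/generate` endpoint from Python.

```python
import requests

# Ollama serves a local HTTP API on port 11434 by default.
# Assumes `ollama pull phi3` has already downloaded the Phi-3-mini weights.
OLLAMA_URL = "http://localhost:11434/api/generate"

def ask_phi3(prompt: str) -> str:
    """Send a prompt to the locally running Phi-3-mini model and return its reply."""
    response = requests.post(
        OLLAMA_URL,
        json={"model": "phi3", "prompt": prompt, "stream": False},
        timeout=120,
    )
    response.raise_for_status()
    # With stream=False, Ollama returns the full completion in one JSON object.
    return response.json()["response"]

if __name__ == "__main__":
    # Everything here runs on the laptop itself; no cloud service is involved.
    print(ask_phi3("Explain why smaller language models can run on a laptop."))
```

Because the model and the API both live on the machine, the request never leaves the laptop, which is precisely what makes the responsiveness and privacy benefits discussed below possible.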
In a research paper describing the Phi-3 model family, Microsoft’s researchers report that the model I used measures up favorably to GPT-3.5, the OpenAI model behind the first release of ChatGPT, as judged by its scores on several standard AI benchmarks that test common sense and reasoning. In my own testing, it seems every bit as capable.
At Build, its annual developer conference, Microsoft unveiled a new “multimodal” Phi-3 model that can handle audio, video, and text. The announcement followed OpenAI’s and Google’s promotion of new AI assistants built on top of multimodal models accessed via the cloud.
Microsoft’s family of small AI models suggests it is becoming possible to build all sorts of handy AI applications that do not depend on the cloud, which could unlock new use cases by allowing them to be more responsive and more private. (Offline algorithms are a key piece of the Recall feature Microsoft announced, which uses AI to make everything you do on a PC searchable.)
But the Phi family also reveals something about the nature of modern AI, and perhaps how it can be improved. Sébastien Bubeck, a Microsoft researcher involved in the project, tells me the models were built to test whether being more selective about what an AI system is trained on could provide a way to fine-tune its abilities.
Typical large language models, like OpenAI’s GPT-4 or Google’s Gemini, which power chatbots and other services, are generally fed huge amounts of text culled from books, websites, and just about any other accessible source. Although the practice has raised legal questions, OpenAI and others have found that increasing the amount of text fed to these models, along with the computing power used in training, can unlock new capabilities.