Microsoft's desktop AI development environment offers an early preview that lets you build small language models that run on PCs and mobile devices.
Microsoft used the developer-focused sessions of its Ignite 2023 event to introduce a range of AI development tools. Azure AI Studio supports large-scale AI development for cloud-hosted applications using Azure OpenAI models or other models, while Copilot Studio extends the legacy Power Virtual Agents low-code AI tools with OpenAI-powered enhancements.
Microsoft also announced a third tool, though it took a little longer to arrive on developers' PCs. That tool is Windows AI Studio, which is now available in preview. Let's take a look.
Windows AI Studio aims to bring a library of AI models from Microsoft and its partners to PCs, running on GPUs now and eventually on onboard AI accelerators such as the Arm and Intel NPUs in Microsoft's latest Surface hardware. Those NPUs first shipped in the Surface Laptop Studio 2, the machine on which I'm writing this column. With DirectML support for the Intel NPUs in these and other devices due to arrive in early 2024, this option should appeal to developers and users alike.
Windows AI Studio is designed to help you train and customize a model so it's ready for use in your applications. Once trained, you can use the cross-platform ONNX (Open Neural Network Exchange) runtime to convert models for use in desktop and mobile applications. Available as a Visual Studio Code extension, Windows AI Studio puts many different tools and AI models in one place, working alongside the rest of your toolchain, so you can refine your model as you build your .NET application.
Windows AI Studio offers an interesting mix of Windows and Linux tools that run on CPUs and GPUs, using the Windows Subsystem for Linux (WSL) to host and run models. This approach does require powerful hardware: ample memory and a recent GPU. You won't be able to use Windows AI Studio without a discrete GPU, which can be a workstation-class card or an external GPU connected over Thunderbolt.
Windows AI Studio Installation and Prerequisites
Windows AI Studio is very simple to install. You can download it from the Visual Studio Marketplace, where you'll also find quickstart instructions. Note that by default, the Visual Studio Marketplace view in Visual Studio Code is set to install the release version, so you may need to switch the view to the pre-release version. Once you've made that change, installation is quick and easy.
There are some important prerequisites. You'll need an Nvidia GPU and WSL running at least Ubuntu 18.04 as the default Linux distribution. Once Windows AI Studio is installed, it checks whether Conda and CUDA are set up in the WSL environment so it can use the GPU. If they're not, Windows AI Studio offers a one-click option to ensure that all the required libraries are in place.
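The extension performs these checks itself, but the shape of them is easy to see. The sketch below is illustrative only, using the presence of `wsl`, `nvidia-smi`, and `conda` on the PATH as rough proxies for a working setup; the extension's own logic is more thorough.

```python
import shutil

def check_prerequisites():
    """Rough sketch of the kind of environment checks Windows AI Studio
    performs at install time; illustrative, not the extension's actual code."""
    return {
        # WSL must be installed, with Ubuntu 18.04 or later as the default.
        "wsl": shutil.which("wsl") is not None,
        # nvidia-smi on the PATH is a quick proxy for a working Nvidia driver.
        "nvidia_driver": shutil.which("nvidia-smi") is not None,
        # Conda (and CUDA) inside WSL are set up by the one-click installer
        # if they're missing.
        "conda": shutil.which("conda") is not None,
    }

missing = [name for name, ok in check_prerequisites().items() if not ok]
print("Missing prerequisites:", missing or "none")
```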
This uses Visual Studio Code's remote server option to load and run the installation script. If you want to see how it's progressing, open Visual Studio Code's built-in terminal and switch to its Output view. Installation may take a while, as it downloads and installs the relevant libraries. Expect it to take at least five minutes, longer on an older computer. Windows AI Studio documentation is currently only on GitHub; Microsoft Learn displays only placeholder pages.
Once installed, Windows AI Studio adds a new chip-like icon to the Visual Studio Code extension sidebar. Click this to launch the Windows AI Studio development environment. At startup, it checks that your development environment still meets the necessary prerequisites. Once the check passes, and any updates have been made to the WSL configuration, the extension loads a what's-new page and populates its actions pane with its current set of features. The latest preview shows four different actions, with more planned. However, only one works at the moment: model fine-tuning.
Other planned options include retrieval-augmented generation (RAG), a playground that works with Microsoft's Phi-2 foundation model, and access to a library of ready-made models from services like Hugging Face. Using Phi-2 will let you build and train your own small language models without relying on cloud-hosted services like Azure OpenAI.
RAG support will allow you to take an existing large language model and use it as the basis for your own custom LLM without having to completely retrain it on your own data. RAG uses prompt engineering techniques to give an LLM more context, so it delivers more accurate answers. With RAG, you can feed more domain-specific or up-to-date data into the LLM, using external data sources, including your own business-specific information, as part of the prompt.
Adding the RAG tool to Windows AI Studio should help you build and test vector indexes and embeddings of your data. Once you have these, you can start developing search-driven pipelines that ground your LLM application and limit its responses to your own domain, using tools like TypeChat, Prompt Flow, and Semantic Kernel.
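The RAG pattern described above fits in a few lines. This toy sketch stands in a bag-of-words "embedding" and cosine similarity for a real embedding model and vector index, and the document snippets and prompt template are invented for illustration.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real pipeline would use a trained
    # embedding model and a vector index.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Business-specific snippets the base model was never trained on.
documents = [
    "Contoso support hours are 9am to 5pm, Monday to Friday.",
    "Contoso warranty claims must be filed within 30 days.",
]

def build_prompt(question, k=1):
    # Retrieve the k most relevant snippets...
    ranked = sorted(documents,
                    key=lambda d: cosine(embed(question), embed(d)),
                    reverse=True)
    context = "\n".join(ranked[:k])
    # ...and push them into the prompt as extra context for the LLM.
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("When are support hours?"))
```

The grounding happens in the prompt itself: the model is asked to answer only from the retrieved context, which is how RAG limits responses to your own domain.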
For now, though, this early preview focuses on fine-tuning existing AI models, preparing them for conversion to ONNX and embedding in WinML projects. This feature is worth using on its own, as it's a key requirement for any custom machine learning product where you want your model to run on local hardware rather than in the cloud.
To set up a model-tuning environment, first select a local folder, then select a model. The initial selection is small, with five open-source text-generation models available from Microsoft, Hugging Face, Mistral AI, and Meta. Here Microsoft is using the QLoRA tuning method: quantized low-rank adaptation, a technique developed at the University of Washington that has already shown impressive results. The original paper described a family of models that delivered 99.3% of ChatGPT's performance with just 24 hours of tuning on a single GPU.
If we're going to bring custom AI to our PCs and handheld devices, this is the approach we need. We don't need the complexity (or size) of a large language model; instead, we need comparable performance on our own data from a small language model. QLoRA and similar techniques are a way to build these custom AIs on top of open-source foundation models.
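A rough sense of why low-rank adaptation makes single-GPU tuning feasible comes from counting trainable parameters. Instead of updating a full d×d weight matrix, a LoRA-style adapter trains two thin matrices of rank r; the dimensions below are illustrative, not figures from the paper. (QLoRA adds 4-bit quantization of the frozen base weights, shrinking memory needs further.)

```python
def trainable_params(d_model, rank):
    """Compare full fine-tuning of one d x d weight matrix with a
    low-rank adapter of rank r (two matrices: d x r and r x d)."""
    full = d_model * d_model        # every weight updated
    adapter = 2 * d_model * rank    # only the two thin adapter matrices
    return full, adapter

# Illustrative sizes for one attention weight matrix.
full, adapter = trainable_params(d_model=4096, rank=16)
print(f"full: {full:,}  adapter: {adapter:,}  ratio: {full / adapter:.0f}x")
```

With these example numbers the adapter trains 128 times fewer parameters than full fine-tuning for that matrix, which is what brings tuning within reach of a single workstation GPU.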
Once you have selected a model, click Configure Project to start setting up the project in Windows and WSL. Before you can use the model, you may need to enter a Hugging Face access token or register for access. Windows AI Studio presents a set of tuning parameters that you will use to optimize your model's performance. For initial testing, simply accept the defaults and wait for the model to be generated. You can also choose other datasets to improve tuning.
Once the model is generated, you'll be prompted to relaunch the Visual Studio Code window in your Windows AI Studio workspace. This switches you from Windows to WSL, ready to use the tools installed during setup. As part of the initial workspace configuration, Windows AI Studio installs the Prompt Flow extension.
Once you have opened the model workspace, you can use the Visual Studio Code terminal to launch the Conda environment used for tuning your model. You can now run Olive, using QLoRA, on the default content or on your own datasets. This may take some time, so be prepared to wait: even on relatively high-end graphics cards, tuning can take several hours.
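Under the hood, the workspace drives Olive from that Conda environment. The sketch below shows the general shape of launching an Olive run; the config file name is hypothetical, and the exact command Windows AI Studio generates for you may differ.

```python
# Sketch of launching an Olive fine-tuning run from the tuning Conda
# environment inside WSL. "qlora_config.json" is a hypothetical name;
# the workspace generates its own configuration for you.
def build_olive_command(config_path):
    return ["python", "-m", "olive.workflows.run", "--config", config_path]

cmd = build_olive_command("qlora_config.json")
print(" ".join(cmd))
# To actually run it inside the tuning environment:
# import subprocess; subprocess.run(cmd, check=True)
```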
When the tuning process is complete, you can use a simple Gradio web interface to test your trained model before packaging it and using it in your application. It's an interesting little tool that is worth running before and after tuning so you can see how the process affects interactions.
It's important to remember that this is a complex tool released very early. Microsoft has done a lot of work to simplify working with AI models and tuning tools, but you still need to know what you want from the language model you're building. There are a lot of variables you can adjust as part of the tuning process, and it's worth knowing what each one controls and how they affect the final model.
At the moment, Windows AI Studio is likely to be a tool for AI experts. However, it shows a lot of promise. As it evolves and more features are added, it could easily become an important part of the Windows development workflow, especially if AI accelerators become a common component of next-generation PCs.