Microsoft's AI Toolkit for VS Code Explained In 5 Minutes

This article explores an extension for Visual Studio Code called the AI Toolkit, developed by Microsoft. It provides a powerful environment for experimenting with various AI models from providers like OpenAI, Anthropic, and others. The toolkit allows you to work with models hosted on GitHub, within your own Azure AI environment, or even local models running on your machine. This flexibility enables you to experiment with a diverse range of model types.

Key Features at a Glance

The AI Toolkit comes packed with several powerful features:

  • Playground: A space to compare the outputs of different models for a given prompt, helping you assess their performance side-by-side.
  • Agent Builder: Enables the creation of AI agents by defining a system prompt and integrating various tools, including MCP (Model Context Protocol) tools.
  • Bulk Run: Allows you to test a prompt across multiple models simultaneously, streamlining the evaluation process.
  • Model Evaluation: Provides metrics to determine which models perform best for your specific use cases.
  • Fine-Tuning & Conversion: Offers capabilities for fine-tuning models and performing model conversions.

In this guide, we will walk through practical examples of the Playground and Agent Builder features, utilizing models from both a GitHub account and an Azure AI Foundry account.

Setting Up Your Models

Within the AI Toolkit interface, you can connect to various model sources. For this demonstration, multiple models were loaded by connecting to both a GitHub account and Azure AI Foundry.

The toolkit's catalog allows you to enable models from different publishers, including:

  • GitHub
  • Azure AI Foundry
  • Local instances

For instance, after adding the GPT-4 model from GitHub, it appears in your list of available models. Similarly, you can add others, and they will be reflected in the interface. To use models from Azure AI Foundry, you simply connect your account, and any deployed models become accessible.

Hands-On Demo: The Playground

One of the standout features is the Playground, which is excellent for comparing model outputs. You can select two or more models to see how they respond to the same prompt.

Let's compare GPT-4 via GitHub with GPT-4 hosted on Azure AI Foundry. We'll use the following prompt:

write a FastAPI template with authentication

Upon sending the prompt, the GPT-4 model on Azure AI Foundry generated a response detailing the necessary packages to install and the required code structure. The GPT-4 model from GitHub produced a similar result. The Playground interface allows you to easily review and select the most suitable answer for your needs.
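
For reference, here is a minimal sketch of the kind of template such a prompt produces. The bearer-token scheme, route names, and hard-coded token below are illustrative choices, not the models' actual output:

```python
# Minimal FastAPI app with bearer-token authentication (illustrative only).
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI()
security = HTTPBearer()

# A real template would load the secret from configuration, not hard-code it.
API_TOKEN = "change-me"


def verify_token(credentials: HTTPAuthorizationCredentials = Depends(security)) -> None:
    # Reject any request whose bearer token does not match the expected value.
    if credentials.credentials != API_TOKEN:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or missing token",
        )


@app.get("/public")
def public_route():
    return {"message": "No authentication required"}


@app.get("/private", dependencies=[Depends(verify_token)])
def private_route():
    return {"message": "Authenticated request succeeded"}
```

A template like this runs with the usual pip install fastapi uvicorn followed by uvicorn main:app --reload.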

Building an Agent with Agent Builder

The Agent Builder is another powerful feature that lets you create a sophisticated agent by providing a system prompt and a set of tools.

Note: A great aspect of the toolkit is its transparency. When you interact with models in the Playground or Agent Builder, you can also view the underlying code. To get started, you typically need to install the Azure AI Inference library, which facilitates connections to models from GitHub, Azure OpenAI, and other sources.
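
As an illustration, a minimal call through the azure-ai-inference package (pip install azure-ai-inference) looks roughly like this. The endpoint, environment variable name, and model ID are assumptions based on a GitHub Models setup, not values taken from the toolkit:

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# GitHub Models are served through an Azure-hosted inference endpoint and accept
# a GitHub personal access token as the key; both values below are assumptions.
client = ChatCompletionsClient(
    endpoint="https://models.inference.ai.azure.com",
    credential=AzureKeyCredential(os.environ["GITHUB_TOKEN"]),
)

response = client.complete(
    model="gpt-4o",  # illustrative model ID; use whichever model you enabled
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="write a FastAPI template with authentication"),
    ],
)
print(response.choices[0].message.content)
```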

Let's explore a pre-built 'Web Scraper' agent use case. The goal is to have the agent visit a website and summarize its content.

First, we select a model—in this case, GPT-4o from Azure. The agent is configured with a system prompt that defines its role.

Here is the system prompt: You are a web exploration assistant.

Next, we provide a user prompt with a specific task. We'll modify the default to target a different site: Go to CNN.com and perform the following task: give a summary.

This setup uses a template connected to an MCP server for Playwright, providing the necessary web-scraping tools. With the model, system prompt, and user prompt configured, we can run the agent.
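
Outside the toolkit, the equivalent connection can be sketched with the official MCP Python SDK (pip install mcp) and the @playwright/mcp package. The snippet below only launches the server and lists the tools it exposes, and assumes Node.js and npx are available:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the Playwright MCP server as a local subprocess over stdio
# (npx downloads @playwright/mcp on first run).
server = StdioServerParameters(command="npx", args=["@playwright/mcp@latest"])


async def list_playwright_tools() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:
                print(f"{tool.name}: {tool.description}")


if __name__ == "__main__":
    asyncio.run(list_playwright_tools())
```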

The initial attempt with GPT-4o failed with a 'too many requests' error, likely because the large CNN.com homepage pushed the request past the model's rate or token limits.

To resolve this, we can switch to a different model. Let's try GPT-4o mini. Running the agent again with this model successfully scraped the site and generated a summary. The output provided a concise 'Homepage Summary' as requested. This demonstrates the practical application of the Agent Builder and the importance of model selection.

From Agent to Application: Code Generation

As mentioned, the toolkit can generate the necessary code to run your agent independently. You can choose between the Azure AI Inference SDK and the Semantic Kernel SDK.

Selecting the Azure AI Inference SDK provides a Python script. This script includes all the logic for connecting to the model and the MCP server. You simply need to provide your API keys (preferably as environment variables), and you can deploy this agent anywhere. The generated code is a complete, runnable application.
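
The generated code differs per agent, but its overall shape is roughly the following sketch: connect to the model with the Azure AI Inference SDK, launch the Playwright MCP server, expose its tools to the model via function calling, and loop until the model stops requesting tools. The environment variable names and the model ID here are placeholders, not the toolkit's generated values:

```python
import asyncio
import json
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import (
    AssistantMessage,
    ChatCompletionsToolDefinition,
    FunctionDefinition,
    SystemMessage,
    ToolMessage,
    UserMessage,
)
from azure.core.credentials import AzureKeyCredential
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Placeholder names; the generated script defines its own configuration.
ENDPOINT = os.environ["AZURE_AI_ENDPOINT"]
API_KEY = os.environ["AZURE_AI_API_KEY"]

SERVER = StdioServerParameters(command="npx", args=["@playwright/mcp@latest"])


async def run_agent(user_prompt: str) -> str:
    client = ChatCompletionsClient(endpoint=ENDPOINT, credential=AzureKeyCredential(API_KEY))

    async with stdio_client(SERVER) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Expose every MCP tool to the model as a function-calling tool.
            mcp_tools = (await session.list_tools()).tools
            tools = [
                ChatCompletionsToolDefinition(
                    function=FunctionDefinition(
                        name=t.name,
                        description=t.description or "",
                        parameters=t.inputSchema,
                    )
                )
                for t in mcp_tools
            ]

            messages = [
                SystemMessage(content="You are a web exploration assistant."),
                UserMessage(content=user_prompt),
            ]

            # Let the model call tools until it produces a final answer.
            while True:
                response = client.complete(model="gpt-4o-mini", messages=messages, tools=tools)
                choice = response.choices[0]
                if not choice.message.tool_calls:
                    return choice.message.content

                messages.append(AssistantMessage(tool_calls=choice.message.tool_calls))
                for call in choice.message.tool_calls:
                    result = await session.call_tool(
                        call.function.name, json.loads(call.function.arguments or "{}")
                    )
                    messages.append(
                        ToolMessage(tool_call_id=call.id, content=str(result.content))
                    )


if __name__ == "__main__":
    print(asyncio.run(run_agent("Go to CNN.com and perform the following task: give a summary.")))
```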

Conclusion

The AI Toolkit for Visual Studio Code is a comprehensive tool for developers working with large language models. From comparing models in the Playground to building and deploying complex agents with the Agent Builder, it streamlines numerous AI development tasks. Future explorations for this publication could delve into its model evaluation and fine-tuning capabilities.