A Guide to Assigning Specific Claude Models to Your AI Agents

By the 10xdev team | July 13, 2025

In this article, we'll explore how you can now assign different Claude models to your agents. This significant update could fundamentally change how you manage and configure agents, enabling optimized setups that save both time and tokens.

Within an agent's configuration, you can now access an "Edit Model" setting. This presents several options to choose from, including Haiku, Opus, and Sonnet.

This article details a few tests conducted with agents built using these three models to showcase the practical differences and potential applications.

A Simple Test: The Dice Roller Agent

To begin, let's conduct a straightforward test to observe the difference in speed and performance between the models. We'll start in a blank directory and create a new project-based agent.

The agent's description is simple: This agent should roll a die numerous times and return the results.

Upon creation, Claude generates the necessary configuration file for the sub-agent, which follows a specific format including name, description, and other parameters. For this test, no external tools are required.

The agent creation process now includes a new step for model selection. While all models are available, the introduction of Haiku is notable. It's described as a fast and efficient model for simple tasks, which should be perfect for our dice-rolling agent.
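To make the format concrete, here is a sketch of what the generated sub-agent file might look like: in Claude Code, project-level agents live as Markdown files with YAML frontmatter under `.claude/agents/`. The field names below follow the format described above, but treat the exact wording of the body as illustrative rather than the file Claude actually generated:

```markdown
---
name: dice-roller
description: Rolls a die numerous times and returns the results.
model: haiku
---

You are a dice-rolling agent. When invoked, roll a standard
six-sided die the requested number of times and report each
result along with a brief summary.
```

Swapping the `model` field between `haiku`, `sonnet`, and `opus` is all that's needed to rerun the same agent on a different model.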

To establish a baseline, we'll conduct the test with each model sequentially: Opus, then Sonnet, and finally Haiku.

Comparing Model Performance

With the "Dice Roller" agent created, we can see which model is assigned to it directly in the agent management interface. The agent's configuration is stored in a local file, which will remain unchanged throughout the tests to ensure consistency.

The primary agent is triggered to delegate the task by a simple instruction: "Run the dice roller agent."

Here are the results of running the same task with each model:

  • Opus: The task completed in approximately 1 minute and 5 seconds. The model approached the task by generating and executing a Python script.
  • Sonnet: The task was significantly faster, finishing in just 17 seconds.
  • Haiku: This model was the fastest, completing the task in a remarkable 7.5 seconds.

While not a rigorous scientific benchmark, this simple test clearly demonstrates a substantial performance gap between the models on a straightforward task. The most interesting observation was that Opus took a more elaborate route, writing and executing a Python script, whereas Sonnet and Haiku appear to have used a more direct approach.
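We can't see the exact script Opus produced, but a dice roller of the kind it wrote would look something like the following. This is a hypothetical sketch, not the agent's actual output, and the function name `roll_dice` is our own:

```python
import random

def roll_dice(rolls: int, sides: int = 6) -> list[int]:
    """Roll a die with `sides` faces `rolls` times and return each result."""
    return [random.randint(1, sides) for _ in range(rolls)]

if __name__ == "__main__":
    results = roll_dice(100)
    # Report the individual rolls and a quick summary statistic.
    print(f"Rolls: {results}")
    print(f"Rolled {len(results)} times; mean: {sum(results) / len(results):.2f}")
```

Writing and executing a script like this is more machinery than the task needs, which is likely why the script-based approach cost Opus most of its extra runtime.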

A More Complex Workflow: Building a Bitcoin Price Tracker

To explore a more practical scenario, let's set up a multi-agent system to build a simple web application. This system will consist of three distinct agents, each assigned a model best suited for its role:

  1. The Planner Agent (Opus): Responsible for planning the project and breaking it down into sprints. Opus is chosen for its powerful reasoning and planning capabilities.
  2. The Execution Agent (Sonnet): Tasked with executing the instructions from the planner. Sonnet provides a good balance of performance and capability for coding tasks.
  3. The Documentation Agent (Haiku): Responsible for writing the README and other documentation. Haiku is selected for its speed and efficiency in handling simpler text-generation tasks.

With the agents configured, we can issue a single prompt to orchestrate the entire workflow.

Note on the process: The goal is to create a simple HTML application that uses the CoinGecko API to fetch the current Bitcoin price and display it within a dark-themed interface. The agents will be chained: the Planner creates a plan, the Execution agent builds the app, and the Documentation agent writes the docs.
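The orchestrating prompt itself isn't reproduced verbatim here, but a single prompt along these lines would produce the chained behavior described (the wording is illustrative):

```
Use the planner agent to plan a simple Bitcoin price tracker:
a single HTML file that fetches the current price from the
CoinGecko API and displays it in a dark-themed interface.
Then have the execution agent build the app from that plan,
and finally have the documentation agent write the README.
```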

The process begins by delegating the task to the Planner agent (Opus). A key observation is how context is passed between agents. Initially, the main agent's context window was minimal (around 27 tokens). After the Planner agent completed its work, it passed over 4,500 tokens of context to the next agent in the chain, though it generated over 14,000 tokens in total to create the plan.

The system then seamlessly chained to the Executor agent (Sonnet), which used the provided sprint plan to build the Bitcoin price tracker application.

The Final Application

The Sonnet-powered Execution agent successfully built the application. Here is the code it generated:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Bitcoin Price Tracker</title>
    <style>
        body {
            background-color: #121212;
            color: #e0e0e0;
            font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif;
            display: flex;
            justify-content: center;
            align-items: center;
            height: 100vh;
            margin: 0;
            text-align: center;
        }
        .container {
            border: 1px solid #333;
            padding: 40px;
            border-radius: 10px;
            background-color: #1e1e1e;
            box-shadow: 0 4px 20px rgba(0, 0, 0, 0.5);
        }
        h1 {
            color: #f7931a;
            margin-bottom: 20px;
        }
        #price {
            font-size: 2.5em;
            margin: 20px 0;
        }
        #timestamp, #last_24h {
            font-size: 0.9em;
            color: #888;
        }
    </style>
</head>
<body>
    <div class="container">
        <h1>Bitcoin Price Tracker</h1>
        <div id="price">Loading...</div>
        <div id="timestamp"></div>
        <div id="last_24h"></div>
    </div>

    <script>
        async function fetchBitcoinPrice() {
            try {
                const response = await fetch('https://api.coingecko.com/api/v3/simple/price?ids=bitcoin&vs_currencies=usd&include_24hr_change=true');
                const data = await response.json();
                const bitcoinData = data.bitcoin;
                const price = bitcoinData.usd;
                const change24h = bitcoinData.usd_24h_change;

                document.getElementById('price').innerText = `$${price.toLocaleString()}`;
                document.getElementById('timestamp').innerText = `Last updated: ${new Date().toLocaleTimeString()}`;
                document.getElementById('last_24h').innerText = `24h Change: ${change24h.toFixed(2)}%`;

            } catch (error) {
                document.getElementById('price').innerText = 'Error fetching price';
                console.error('Error:', error);
            }
        }

        fetchBitcoinPrice();
        setInterval(fetchBitcoinPrice, 60000); // Auto-refresh every 60 seconds
    </script>
</body>
</html>

Generating Documentation with Haiku

After the application was built, the workflow automatically triggered the Documentation agent. Drawing on the project's context, the Haiku model quickly generated a README.md file. Using the powerful Opus model for such a simple task would have been wasteful; Haiku proved a perfect fit.

Here is the documentation it produced:


README.md

Project Overview

This project is a simple, real-time Bitcoin price tracker web application. It fetches the current price of Bitcoin in USD from the CoinGecko API and displays it on a clean, dark-themed webpage. The application also shows the percentage change in the last 24 hours and includes an auto-refresh feature.

Features

  • Real-Time Price: Displays the current price of Bitcoin.
  • 24-Hour Change: Shows the price fluctuation over the last 24 hours.
  • Auto-Refresh: The price automatically updates every minute.
  • Dark Theme: A modern and easy-on-the-eyes user interface.

Conclusion

This multi-agent workflow was a success. We used Opus for high-level planning, Sonnet for the core development work, and Haiku for fast and efficient documentation.

This experiment demonstrates a powerful new way to think about structuring AI agent workflows. By carefully assigning the right model to each agent based on the complexity of its task, you can create systems that are not only more efficient but also more cost-effective. This is a feature worth exploring further to leverage its full potential in saving both time and tokens.

Join the 10xdev Community

Subscribe and get 8+ free PDFs that contain detailed roadmaps with recommended learning periods for each programming language or field, along with links to free resources such as books, YouTube tutorials, and courses with certificates.
