Automating WhatsApp Support with AI, MCP, and 11 Labs Voice
Many businesses use WhatsApp, and plenty of local stores offer customer support through it. Most rely on canned automated messages, and even with WhatsApp Business those messages rarely help much. AI has improved a lot: we now have voice models and language models that can read and understand a customer's messages. That can greatly reduce the need for a human support agent. Normally, WhatsApp support ends with an option to call a real person, but combining the WhatsApp MCP with the 11 Labs MCP lets you handle much of that automatically.
A WhatsApp MCP server already exists, and in this article I'll show you how to install it. It's a solid service that connects to any MCP client and lets you send and receive messages easily. The best part is that it runs entirely on your own machine, so the bridge itself doesn't send your chats to a third-party server. Keep in mind that anything the MCP client reads through the tools is still passed to the model you are using. I'll walk you through the setup.
11 Labs (ElevenLabs) has also released its own MCP server, so you can now send requests to their models from an MCP client. In my opinion, these are the best models for text-to-speech and audio generation. Nothing else sounds as real as 11 Labs. They are a bit expensive, so keep that in mind.
Setup and Prerequisites
Let's move on to the setup. To use MCP servers, you need an MCP client. The most common ones right now are Claude, Cursor, and Windsurf. Cursor and Windsurf are coding IDEs, which don't really fit this use case, so I'll be using Claude Desktop for this guide. You can use any client you prefer. There are also new open MCP clients showing up, and these can work too if you want to try them; how well they work depends on the agent used in the client.
I've installed three MCP servers named Blender, WhatsApp, and 11 Labs. WhatsApp and 11 Labs are already set up.
Note: Both the 11 Labs MCP and the WhatsApp MCP require uvx to run. Make sure the uv package manager, which provides uvx, is installed on your system.
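Before going further, it can help to confirm that both commands are actually on your PATH. The following is a minimal sketch; require_tool is a hypothetical helper name used only for illustration, not something shipped with uv or either MCP server.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a command is on PATH, fail otherwise.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check both commands; print a hint instead of aborting if one is absent.
for tool in uv uvx; do
    require_tool "$tool" || echo "install uv, then re-run this check"
done
```

The same helper can be pointed at any other prerequisite, for example require_tool go.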
Configuration
To configure them, just click "Edit Config" and open the file. I won't show mine here because it contains API keys. Open the file, paste the keys, and you'll be ready for the next steps.
Once that's done, go to the GitHub repo. First, copy the template from the repo and paste it into your Claude configuration. Then, copy the next command and paste it into your terminal. This will clone the repository and move into the cloned folder.
git clone <repository_url> && cd <repository_folder>
Next, copy the following command and run it. It moves into the whatsapp-bridge folder and runs the main.go file.
cd whatsapp-bridge && go run main.go
When you run it, you'll get a QR code. Scan it with your mobile to connect, the same way you would with WhatsApp Web or WhatsApp Desktop. This links your WhatsApp to the client and allows the MCP server to communicate with it.
Note: You also need the Go programming language installed on your system for this to work.
Once that's ready, fill in the template you pasted into the Claude configuration. The server is launched with uv, so the config needs the full path to it. Run which uv in any directory and it will return that path. Paste it where the template asks for it, along with the path to your working directory, which you can get by running pwd inside the cloned repository. If the WhatsApp MCP folder name ends up repeated in the resulting path, only write it once.
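To make the two paths concrete, here is a rough sketch that prints what the finished WhatsApp entry might look like. The helper name whatsapp_mcp_entry, the whatsapp-mcp-server subfolder, the main.py entry point, and the example paths are all assumptions for illustration; the template in the repo is the authority on the exact layout.

```shell
#!/bin/sh
# Hypothetical helper: print a WhatsApp entry for the MCP client config.
#   $1 = output of `which uv`
#   $2 = output of `pwd` inside the cloned repository
# ASSUMPTION: the server lives in a whatsapp-mcp-server subfolder and starts
# from main.py; follow the repo template if it says otherwise.
whatsapp_mcp_entry() {
    cat <<EOF
{
  "mcpServers": {
    "whatsapp": {
      "command": "$1",
      "args": ["--directory", "$2/whatsapp-mcp-server", "run", "main.py"]
    }
  }
}
EOF
}

# Example paths only; substitute your own values.
whatsapp_mcp_entry "/usr/local/bin/uv" "/home/me/whatsapp-mcp"
```

Because the pwd output already ends with the repository folder, the sketch appends only the server subfolder, which is why the folder name appears once in the final path.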
When all this is done, go back to Claude. It will now be able to access and respond to your WhatsApp chats.
11 Labs MCP Setup
You will need to install the 11 Labs MCP as well. It only requires one environment variable, the 11 Labs API key. If you have a paid 11 Labs account, you can get the key directly from your account settings. Once you have it, paste it into the config block from the 11 Labs MCP instructions, then copy the entire block into your MCP config for Claude. If you want to integrate it into Cursor or Windsurf instead, the process is the same: Cursor now uses a JSON file for MCP configuration instead of the command method, so all three main MCP clients follow this format.
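As a rough illustration of that block, the sketch below prints an entry with the key filled in. The helper name elevenlabs_mcp_entry is hypothetical, and the uvx elevenlabs-mcp command and ELEVENLABS_API_KEY variable name are stated here as assumptions; confirm both against the 11 Labs MCP instructions before relying on them.

```shell
#!/bin/sh
# Hypothetical helper: print an 11 Labs entry for the MCP client config.
#   $1 = your API key
# ASSUMPTION: the server is started with `uvx elevenlabs-mcp` and reads the
# key from ELEVENLABS_API_KEY; check the 11 Labs MCP README for exact names.
elevenlabs_mcp_entry() {
    cat <<EOF
{
  "mcpServers": {
    "ElevenLabs": {
      "command": "uvx",
      "args": ["elevenlabs-mcp"],
      "env": { "ELEVENLABS_API_KEY": "$1" }
    }
  }
}
EOF
}

# Use the key from the environment if set, otherwise a visible placeholder.
elevenlabs_mcp_entry "${ELEVENLABS_API_KEY:-YOUR_API_KEY}"
```

If your config already contains an mcpServers object, add only the inner "ElevenLabs" entry to it rather than creating a second mcpServers object.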
Windows Compatibility
Windows compatibility is also worth mentioning. The 11 Labs MCP works on Windows without any changes. For the WhatsApp MCP, Windows users need to enable CGO: run one extra command and make sure a C compiler is installed. With those two steps done, it works on Windows as well.
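The extra command is most likely Go's own setting for CGO. The sketch below assumes a POSIX-style shell on Windows (for example Git Bash) and that Go and a C compiler such as gcc are already installed; enable_cgo is a hypothetical helper name, and the WhatsApp MCP repo remains the reference for the exact steps.

```shell
#!/bin/sh
# Hypothetical helper: persist CGO_ENABLED=1 in Go's environment config.
# ASSUMPTION: Go and a C compiler such as gcc are already installed.
enable_cgo() {
    command -v go >/dev/null 2>&1 || { echo "go not found on PATH" >&2; return 1; }
    # `go env -w` stores the value so later go commands pick it up.
    go env -w CGO_ENABLED=1 && go env CGO_ENABLED
}

enable_cgo || echo "CGO was not enabled; check your Go installation"

# Afterwards, start the bridge as before:
#   cd whatsapp-bridge && go run main.go
```

Persisting the value with go env -w means it does not have to be set again in every new terminal session.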
Testing the Integration
Now that both MCP servers are added, it's time to test them. I'll ask it to get the latest messages from my chats. Since my dad has two contacts, I'll specify which one. The contact I'm waiting for is labeled as "dad". Allow the request, and it will start querying the messages.
You can see that we got a reply from the chat. Now we can respond with a message saying "where it worked".
Next, I'll tell it to send a voice message using 11 Labs. You can see it calling the 11 Labs tool, along with the text it's going to speak. It has successfully sent a voice message in Brian's voice from 11 Labs to my dad on WhatsApp.
Let's take a look. You can see that I've opened the chat in WhatsApp and received a message from my dad. I replied to it using the 11 Labs MCP. When I play the message, you can see that it works. The message says:
"I am at work"
The voice does feel a bit off since it's not our own and sounds a little unnatural, but there is something you can do about that.
Improving Authenticity with Voice Cloning
11 Labs also offers voice cloning, which is very realistic, so you can create a clone of your own voice. There are two types available, instant and professional voice cloning. Check them out, then prompt the MCP to use your cloned voice. This way, everything is automated and sounds more authentic.
Advanced Usage and Future Possibilities
If you want to use an open MCP client, that's possible too. There's a hosted version, and you can also run it locally. If you want to build a proper tool or agent, you'll need to create your own setup: build your agents and assign each tool to them individually. There's a lot of potential to create amazing things, and I'll be exploring this further. I also have a Swift app in mind where you can set up each chat as a business with separate parameters. This would allow you to run multiple businesses on WhatsApp independently and add a voice assistant to each. That's going to be pretty cool.