
26 posts tagged with "echokit30days"


Day 16: Dynamic Personality for EchoKit | The First 30 Days with EchoKit

· 2 min read

In previous instalments we explored switching LLM providers and giving EchoKit different personalities through system prompts. Today let's learn a powerful new feature — dynamic system prompt loading.

Why dynamic system prompts?

A system prompt sets EchoKit’s tone, role and behaviour. Thanks to the growing ecosystem of open‑source prompts, you can choose from thousands of prebuilt personalities—sites like LLMs.txt offer extensive collections. Previously, changing EchoKit’s character required editing a local file and restarting the server. Now the server can fetch a system prompt from a remote URL, insert it into the context and cache it. This lets you:

  • Update behaviour remotely. Change the text at the URL and EchoKit adopts a new persona on the next restart.
  • Experiment without redeploying. Quickly swap prompts or test new conversation flows without editing code.
  • Iterate on demos. Focus on creativity rather than configuration while your EchoKit responds in new ways.

How to use a remote prompt

Open your config.toml and find the [[llm.sys_prompts]] section. Instead of embedding the full text, wrap a plain‑text URL in double braces:

[[llm.sys_prompts]]
role = "system"
content = """
{{ https://raw.githubusercontent.com/alabulei1/echokit-dynamic-prompt/refs/heads/main/prompt.txt }}
"""

On startup, EchoKit will:

  1. Fetch the content from that URL.
  2. Insert it as the system prompt.
  3. Cache it for later use.

Want to give it a try? GitHub raw files are convenient hosts: they're free and they return plain text.
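Before pointing EchoKit at a URL, you can confirm it really serves plain text. A quick check against the example prompt file used above:

curl -s https://raw.githubusercontent.com/alabulei1/echokit-dynamic-prompt/refs/heads/main/prompt.txt

If you see your prompt printed verbatim (not an HTML page), the URL is ready to use.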

When does EchoKit reload the prompt?

Dynamic prompts are fetched only during a full restart:

  • When you power the device off and back on.
  • When you press the RST hardware button.

Interrupting a conversation with the K0 button or a temporary Wi‑Fi reconnection will not reload the prompt. This ensures ongoing sessions remain consistent while still giving you the freedom to change behaviour by updating the remote file.

Summary

Dynamic system prompt loading opens up a new level of flexibility for EchoKit. You no longer need to edit local files or touch the server to change your agent’s behaviour; update the prompt hosted on the web, restart the device, and EchoKit adopts its new persona.


Want to get your EchoKit Device and make it unique?

Join the EchoKit Discord to share your creative welcome voices and see how others are personalizing their Voice AI agents!

Day 15: EchoKit × MCP — Search the Web with Your Voice | The First 30 Days with EchoKit

· 4 min read

Over the past few days in The First 30 Days with EchoKit, we’ve explored how EchoKit connects to various LLM providers—OpenAI, OpenRouter, Groq, Grok and even local models. But switching models only affects how smart EchoKit is.

Next, we showed how changing the system prompt can transform EchoKit’s personality without touching any code—turning it into a coach, a cat, or a Shakespearean actor. Today, we’re going to extend what EchoKit can do by plugging into the broader ecosystem of tools through the Model Context Protocol (MCP).

Recent industry news makes this especially timely: on December 9, 2025, Anthropic donated MCP to the Linux Foundation and co‑founded the Agentic AI Foundation (AAIF) with Block and OpenAI. MCP is now joined by Block’s Goose agent framework and OpenAI’s AGENTS.md spec as the founding projects of the AAIF.

🧠 What is MCP?

MCP acts like a “USB‑C port” for AI agents. It defines a client–server protocol that lets models call external tools, databases or APIs through standardised actions. MCP servers wrap services—such as file systems, web searches or device controls—behind simple JSON‑RPC endpoints. MCP Clients (like EchoKit or Anthropic’s Claude Code) connect to one or more MCP servers and dynamically discover available tools. When the model needs information or wants to perform an action, it sends a tool request; the server executes the tool and returns results for the model to use.

MCP’s adoption has been rapid: within a year of its release there were over 10,000 public MCP servers and more than 97 million SDK downloads. It’s been integrated into major platforms like ChatGPT, Claude, Cursor, Gemini, Microsoft Copilot and VS Code. By placing MCP under the AAIF, Anthropic and its partners ensure that this crucial infrastructure remains open, neutral and community‑driven.

🔧 Connect EchoKit to an MCP Server

To make EchoKit call external tools, we simply point it to an MCP server. Add a section like the following to your config.toml:

[[llm.mcp_server]]
server = "MCP_SERVER_URL"
type = "http_streamable"

server – the URL of the MCP server (replace this with the server you want to use).

type – the transport type; http_streamable and SSE modes are supported.

Once configured, EchoKit will automatically maintain a connection to the MCP server. When the LLM decides it needs to call a tool, it issues a request via MCP and merges the tool's response back into the conversation. This means that if you want to use an MCP server, the LLM you use must support tool calling. Here are some recommendations:

  • Open-source models: Qwen3, GPT-OSS, Llama 3.1
  • Closed-source models: Gemini, OpenAI, Claude

🌐 Example: Adding a Web Search Tool

To demonstrate, let’s connect EchoKit to a web‑search MCP server. Many open‑source servers provide a search tool that scrapes public search engine results—often without requiring API keys.

Add the server to your configuration. Here I use the GPT-OSS-120B model hosted on Groq and the Tavily MCP server:

[llm]
llm_chat_url = "https://api.groq.com/openai/v1/chat/completions"
api_key = "YOUR API KEY"
model = "openai/gpt-oss-120b"
history = 5

[[llm.mcp_server]]
server = "http://eu.echokit.dev:8011/mcp"
type = "http_streamable"

After that, save the file and restart EchoKit as usual.
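If you want to confirm the MCP endpoint is reachable before restarting, you can poke it with a raw JSON-RPC request from the shell. This is only a connectivity check; streamable-HTTP servers typically expect an initialize handshake first, so even a protocol error in the response tells you the server is listening:

curl -X POST http://eu.echokit.dev:8011/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'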

Ask: “Tell me the latest update of MCP.”

Under the hood, EchoKit’s LLM recognises that it needs up‑to‑date information. It invokes the search tool on your MCP server, passing your query.

The MCP server performs the web search and returns structured results (titles, URLs and snippets). EchoKit then synthesises a natural‑language answer, summarising the findings and citing the sources.

You can also use other MCP servers: a Google Calendar MCP server to add and edit events, a Slack MCP server to send messages to a channel, or a Home Assistant MCP server to control home devices. All of these tools become accessible through your voice.

📌 Why This Matters

Integrating MCP gives EchoKit access to a rapidly expanding tool ecosystem. You’re no longer limited to predetermined voice commands; your agent can search the web, read files, run code, query databases or control smart devices—all through a voice interface. The AAIF’s stewardship of MCP ensures that these capabilities remain open and interoperable, so EchoKit can continue to evolve alongside the broader agentic AI community.


Want to explore more or share what you’ve built with MCP servers?

Ready to get your own EchoKit?

Start building your own voice AI agent today.

Day 14: Give EchoKit a New Personality with System Prompt | The First 30 Days with EchoKit

· 3 min read

Over the past few days, we explored how EchoKit connects to different LLM providers — OpenAI, OpenRouter, Groq, Grok and even fully local models like Qwen3.

But switching the model only decides how smart EchoKit is.

Today, we’re doing something much more fun: we’re changing who EchoKit is.

With one simple system prompt, you can turn EchoKit into a cat, a coach, a tired office worker, a sarcastic companion, or a dramatic Shakespeare actor. No code. No firmware change. Just one text block in your configuration.

Let’s make EchoKit come alive.

What Is a System Prompt, and Why Does It Matter?

A system prompt is the personality, behavior guideline, and “soul” you give your LLM.

It defines:

  • How EchoKit speaks
  • What role it plays
  • Its tone and attitude
  • How it should respond in different situations

The system prompt is incredibly powerful. Change it, and the same model can behave like a completely different agent.

Where the System Prompt Lives in EchoKit

In your config.toml, under the [[llm.sys_prompts]] section, you’ll find:

[[llm.sys_prompts]]
role = "system"
content = """
(your prompt goes here)
"""

Just edit this text, save the file, and restart the EchoKit server.

If your WiFi and EchoKit server settings didn't change, press the RST button on the device to make the new system prompt take effect.

5 Fun and Hilarious Prompt Ideas You Can Try Today

Below are ready-to-use system prompts. Copy, paste, enjoy.

1. The “Explain Like I’m Five” Tutor

You explain everything as if you're teaching a five-year-old. 
Simple, patient, cute, and crystal clear.

2. The Shakespearean AI

You speak like a dramatic Shakespeare character, 
as if every mundane question is a matter of cosmic destiny.

3. The Confused but Hardworking AI Intern

You are a slightly confused intern who tries extremely hard. 
Sometimes you misunderstand things in funny ways, but you stay cheerful.

4. The Cat That Doesn’t Understand Human Problems

You are a cat. 
You interpret all human activities through a cat’s perspective.
Add 'meow' occasionally.
You don't truly understand technology.

5. The Absurd Metaphor Philosopher

You must include at least one ridiculous metaphor in every reply. 
Be philosophical but humorous.

Have fun — EchoKit becomes a completely different creature depending on what you choose.
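If you'd rather do it from the shell, here's a minimal sketch that appends persona #4 to config.toml (it assumes you first remove the existing [[llm.sys_prompts]] entry, so you don't end up with two competing system prompts):

# Append the cat persona to the EchoKit server config
cat >> config.toml <<'EOF'
[[llm.sys_prompts]]
role = "system"
content = """
You are a cat. You interpret all human activities through a cat's perspective.
Add 'meow' occasionally. You don't truly understand technology.
"""
EOF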

Prompt Debugging Tips

If your character “breaks,” try adding:

  • “Stay in character.”
  • “Keep responses short.”
  • “If unsure, make up a fun explanation.”
  • “Use a consistent tone.”

Prompt tuning is an art. A few careful sentences can reshape the entire interaction.

Try giving your EchoKit different personalities now.


Want to explore more or share what you’ve built?

Ready to get your own EchoKit?

Start building your own voice AI agent today.

Day 13 — Running an LLM Locally for EchoKit | The First 30 Days with EchoKit

· 3 min read

Over the last few days, we explored several cloud-based LLM providers — OpenAI, OpenRouter, and Grok. Each offers unique advantages, but today we’re doing something completely different: we’re running the open-source Qwen3-4B model locally and using it as EchoKit’s LLM provider.

There’s no shortage of great open-source LLMs—Llama, Mistral, DeepSeek, Qwen, and many others—and you can pick whichever model best matches your use case.

Likewise, you can run a local model in several different ways. For today’s walkthrough, though, we’ll focus on a clean, lightweight, and portable setup: Qwen3-4B (GGUF) running inside a WASM LLM server powered by WasmEdge. This setup exposes an OpenAI-compatible API, which makes integrating it with EchoKit simple and seamless.

Run the Qwen3-4B Model Locally

Step 1 — Install WasmEdge

WasmEdge is a lightweight, secure WebAssembly runtime capable of running LLM workloads through the LlamaEdge extension.

Install it:

curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s

Verify the installation:

wasmedge --version

You should see a version number printed.

Step 2 — Download Qwen3-4B in GGUF Format

We’ll use a quantized version of Qwen3-4B, which keeps memory usage manageable while delivering strong performance.

curl -Lo Qwen3-4B-Q5_K_M.gguf https://huggingface.co/second-state/Qwen3-4B-GGUF/resolve/main/Qwen3-4B-Q5_K_M.gguf

Step 3 — Download the LlamaEdge API Server (WASM)

This small .wasm application loads GGUF models and exposes an OpenAI-compatible chat API, which EchoKit can connect to directly.

curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-api-server.wasm

Step 4 — Start the Local LLM Server

Now let’s launch the Qwen3-4B model locally and expose the /v1/chat/completions endpoint:

wasmedge --dir .:. \
--nn-preload default:GGML:AUTO:Qwen3-4B-Q5_K_M.gguf \
llama-api-server.wasm \
--model-name Qwen3-4B \
--prompt-template qwen3-no-think \
--ctx-size 4096

If everything starts up correctly, the server will be available at:

http://localhost:8080
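Before wiring EchoKit up, you can sanity-check the endpoint with a quick curl request. Note that the model name must match the --model-name flag passed above:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen3-4B", "messages": [{"role": "user", "content": "Say hello in one sentence."}]}'

If a JSON chat completion comes back, the local server is ready.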

Connect EchoKit to Your Local LLM

Open your EchoKit server’s config.toml and update the LLM settings:

[llm]
llm_chat_url = "http://localhost:8080/v1/chat/completions"
api_key = "N/A"
model = "Qwen3-4B"
history = 5

Save the file and restart your EchoKit server.

Next, pair your EchoKit device and connect it to your updated server.

Now try speaking to your device:

“EchoKit, what do you think about running local models?”

Watch your terminal — you should see EchoKit sending requests to your local endpoint.

Your EchoKit is now fully powered by a local Qwen3-4B model.

Today we reached a major milestone: EchoKit can now run entirely on your machine, with no external LLM provider required.


This tutorial is only one small piece of what EchoKit can do. If you want to build your own voice AI device, try different LLMs, or run fully local models like Qwen — EchoKit gives you everything you need in one open-source kit.

Want to explore more or share what you’ve built?

  • Join the EchoKit Discord
  • Show us your custom models, latency tests, and experiments — the community is growing fast.

Ready to get your own EchoKit?

Start building your own voice AI agent today.

Day 12 — Switching EchoKit to Grok (with Built-in Web Search) | The First 30 Days with EchoKit

· 3 min read

Over the past days, we’ve been exploring how EchoKit’s ASR → LLM → TTS pipeline works. We learned how to replace different ASR providers, and this week we shifted our focus to the LLM — the part that thinks, reasons, and decides how EchoKit should reply.

We have connected EchoKit to OpenAI and OpenRouter. Today, we’re trying something different: Grok — a super-fast LLM with built-in web search.

Why Grok?

Grok, developed by xAI, stands out for a few practical reasons:

  • ⚡ Extremely fast inference: great for voice AI agents like EchoKit.

  • 🔍 Built-in web search: your device can answer questions using fresh information from the internet.

  • 🔌 OpenAI-compatible API: minimal changes needed; EchoKit can talk to it just like it talks to OpenAI.

For a small device that depends on fast responses, Grok is an excellent option.

How to Use Grok as Your LLM in EchoKit

All you need to do is update the config.toml of your EchoKit server. No code changes, no rewriting your server — just swap URLs and keys.

1. Set Grok as the LLM provider

In your config.toml, make sure the [llm] section points to Grok:

[llm]
llm_chat_url = "https://api.x.ai/v1/chat/completions"
api_key = "YOUR_API_KEY"
model = "grok-4-1-fast-non-reasoning"
history = 5

You can find your Grok API key in your xAI account dashboard. You will need to buy credits before using the Grok API.

Don't rush to close the config.toml window.

This is the special part.

Add the following section in the config.toml file:

[llm.extra]
search_parameters = { mode = "auto" }

mode = "auto" allows Grok to decide when it should fetch information from the web. Ask anything news-related, trending, or timely — Grok will search when needed.

2. Restart the EchoKit server

After that, save these changes, and restart your EchoKit server.

If your server is outdated, you'll need to recompile it from source. Support for Grok with built-in web search was added in a commit on December 5, 2025.

Try It Out

Press the K0 button to chat with EchoKit and try these prompts:

  • “What’s the latest news in AI today?”
  • “How’s the Bitcoin price right now?”
  • “What's the current time in San Francisco?”

If everything is configured correctly, you’ll notice Grok pulling fresh information in its responses. It feels different — the answers are more grounded in what’s happening right now.

Switching EchoKit to Grok was surprisingly simple — just a few lines in a config file. Now my device can do real-time search when a question needs up-to-date info.


If you want to share your experience or see what others are building with EchoKit + Grok:

  • Join the EchoKit Discord
  • Or share your latency tests, setups, and experiments — we love seeing them

Want to get your own EchoKit device?

Day 11: Switching EchoKit’s LLM to Groq — And Experiencing Real Speed | The First 30 Days with EchoKit

· 3 min read

Over the past few days, we’ve been exploring how flexible EchoKit really is — from changing the welcome voice and boot screen to swapping between different ASR providers like Groq Whisper, OpenAI Whisper, and local models.

This week, we shifted our focus to the LLM part of the pipeline. After trying OpenAI and OpenRouter, today we’re moving on to something exciting — Groq, known for its incredibly fast inference.

Why Groq? Speed. Real, noticeable speed.

Groq runs Llama and other open source models on its LPU™ hardware, which is built specifically for fast inference. When you pair Groq with EchoKit:

  • Responses feel snappier
  • Interactions become smoother

If you want your EchoKit to feel ultra responsive, Groq is one of the best providers to try.

How to Use Groq as Your EchoKit LLM Provider

Just like yesterday’s setup, all changes happen in the config.toml of your EchoKit server.

Step 1 — Update your LLM section

Locate the [llm] section and replace the existing LLM provider with something like:

[llm]
llm_chat_url = "https://api.groq.com/openai/v1/chat/completions"
api_key = "YOUR_GROQ_API_KEY"
model = "openai/gpt-oss-120b"
history = 5

Replace the LLM endpoint URL, API key and model name. The production models from Groq are llama-3.1-8b-instant, llama-3.3-70b-versatile, meta-llama/llama-guard-4-12b, openai/gpt-oss-120b, and openai/gpt-oss-20b.
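You can verify your key and list the model IDs available to your account via Groq's OpenAI-compatible models endpoint (jq is optional, just for readable output):

curl -s https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer YOUR_GROQ_API_KEY" | jq -r '.data[].id'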

Step 2 — Restart your EchoKit server

After editing the config.toml, you will need to restart your EchoKit server.

Docker users:

docker run --rm \
-p 8080:8080 \
-v $(pwd)/config.toml:/app/config.toml \
secondstate/echokit:latest-server-vad &

Or restart the Rust binary if you’re running it locally.

# Enable debug logging
export RUST_LOG=debug

# Run the EchoKit server in the background
nohup target/release/echokit_server &

Then return to the setup page and pair the device if needed. You should immediately feel the speed difference — especially on follow-up questions.


A Few Tips for Groq Users

  • Groq works best with Llama models
  • You can experiment with smaller or larger models depending on your device’s use case
  • For learning or exploring, the default Groq Llama models are a great starting point

Groq is known for ultra-fast inference, and pairing it with EchoKit makes conversations feel almost instant.

If you’re building a responsive voice AI agent, Groq is definitely worth trying.


If you want to share your experience or see what others are building with EchoKit + Groq:

  • Join the EchoKit Discord
  • Or share your latency tests, setups, and experiments — we love seeing them

Want to get your own EchoKit device?

Day 10: Using OpenRouter as Your EchoKit LLM Provider | The First 30 Days with EchoKit

· 3 min read

Over the past two weeks, we’ve explored many moving parts inside the ASR → LLM → TTS pipeline. We’ve changed the welcome voice, updated the boot screen, switched between multiple ASR providers, and learned how to run the EchoKit server both via Docker and from source.

This week, we shifted our focus to the LLM, the part of the pipeline that interprets what you say and decides how EchoKit should respond.

Yesterday, we used OpenAI as the LLM provider. Today, we’re going to try something more flexible — OpenRouter.

What Is OpenRouter?

OpenRouter is a unified API gateway that gives you access to many different LLMs without changing your code structure. Its text-generation API is fully OpenAI-compatible, which means EchoKit can work with it right away.

Some reasons I like OpenRouter:

  • You can choose from a wide selection of open source LLMs: Qwen, Llama, DeepSeek, Mistral, etc.
  • Switching models doesn’t require code changes — just update the model name.
  • Often more cost-effective and more customizable.
  • Great for exploring different personalities and response styles for EchoKit.

How to Use OpenRouter as Your LLM Provider

1. Get Your OpenRouter API Key

Go to your OpenRouter dashboard and generate an API key. Keep it private — it works just like an OpenAI API key.

2. Update config.toml

Open your EchoKit server configuration file and locate the [llm] section:

[llm]
provider = "openrouter"
llm_chat_url = "https://openrouter.ai/api/v1/chat/completions"
api_key = "YOUR_API_KEY_HERE"
model = "qwen/qwen3-14b"
history = 5

You can replace the model with any supported model on OpenRouter.
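To browse the model IDs from your terminal instead of the website, OpenRouter exposes a public models endpoint (no API key needed for this particular call; jq is optional):

curl -s https://openrouter.ai/api/v1/models | jq -r '.data[].id'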

3. Restart Your EchoKit Server

If you’re running from the Rust source code, after saving the updated config.toml:

# Enable debug logging
export RUST_LOG=debug

# Run the EchoKit server in the background
nohup target/release/echokit_server &

Or using Docker:

docker run --rm \
-p 8080:8080 \
-v $(pwd)/config.toml:/app/config.toml \
secondstate/echokit:latest-server-vad &

Then return to the setup page, pair the device if needed, and EchoKit will now respond using OpenRouter. That's it.

Connecting EchoKit to OpenRouter feels like I unlocked a new layer of creativity. OpenAI gives you a clean and reliable default, but OpenRouter opens the door to experimenting with different model behaviors, tones, and personalities — all without changing your application logic.

If you enjoy tweaking, tuning, and exploring how different models shape your EchoKit’s “brain”, OpenRouter is one of the best tools for that.


If you want to share your experience or see what others are building with EchoKit + OpenRouter:

  • Join the EchoKit Discord
  • Or share your latency tests, setups, and experiments — we love seeing them

Want to get your own EchoKit device?

Day 9: Use OpenAI as Your EchoKit LLM Provider | The First 30 Days with EchoKit

· 4 min read

(And today, you’ll see how easy it is to use OpenAI as your LLM provider.)

Hey everyone, and welcome back! We've covered a ton of ground over the past two weeks in "The First 30 Days with EchoKit." Seriously, look how much we've accomplished.

If you remember, everything inside EchoKit so far runs through that simple yet incredibly powerful pipeline: ASR → LLM → TTS.

Each piece plays a crucial part in the voice AI loop:

  • ASR (The Ears): Converts your spoken words into text.
  • LLM (The Brain): Interprets that text, thinks about it, and decides what the perfect response should be.
  • TTS (The Mouth): Turns the final text answer back into speech.

Last week, we were all about replacing Whisper and swapping out the "ears." For the next few days, we're putting the spotlight squarely on the middle piece: the LLM.

And today, we’re starting with the most common and powerful choice out there—OpenAI!

⭐ What Exactly Does the LLM Do in the EchoKit Server? (It's the Mastermind!)

The LLM is, quite literally, the mastermind of your entire setup. It's the engine that:

  • Instantly grasps what the user actually wants.
  • Processes all the conversational history (context).
  • Generates those helpful, natural, and human-like responses.
  • Controls how your EchoKit behaves during a conversation.
  • And, yes, it calls the necessary MCP servers to get things done!

EchoKit proudly supports any provider that uses an OpenAI-compatible LLM API.

Step 1 — Get Your Key Ready

Open up your trusted config.toml file and find the [llm] section. Replace it with this block:

[llm]
llm_chat_url = "https://api.openai.com/v1/chat/completions"
api_key = "YOUR_OPENAI_KEY" # Don't forget to replace this!
model = "gpt-5-mini-2025-08-07" # Choose your favorite model here (e.g., gpt-3.5-turbo)
history = 5

Here's the quick rundown on those settings, just so you know what you're tuning:

  • [llm]: We're configuring the Large Language Model section.
  • llm_chat_url: OpenAI’s chat completions endpoint.
  • api_key: Get your key from the OpenAI API platform.
  • model: Which OpenAI model should power your EchoKit's thoughts? Up to you!
  • history: How many previous turns of the conversation should your EchoKit remember for context?
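If you'd like to confirm the key and model work before rebooting anything, a one-off curl against the same endpoint does the trick, using the same URL, key, and model as the config above:

curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_OPENAI_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-5-mini-2025-08-07", "messages": [{"role": "user", "content": "Hello!"}]}'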

Step 2 — Time for a Quick Reboot!

Whether you’re running your EchoKit server via Docker or from the Rust code, go ahead and restart it right now. That’s it! You're completely done with the server configuration. Told you it was easy!

Step 3 — Connect the New Brain to Your Device

The grand finale! Time to link up your physical EchoKit device to the server with its shiny new OpenAI brain:

  1. Head over to https://echokit.dev/setup/ and reconnect the server if you need to.
  2. Pro Tip: If you only changed your LLM configuration and nothing else (URL, WiFi), you can just hit the RST button on your EchoKit device. It will restart and sync the new settings instantly!
  3. If your server URL or WiFi setup changed, you'll need to reconfigure them through the setup page, just like you did on Day 1.

Next, press that K0 button and start speaking. Every clever thing your EchoKit says back to you is now being powered by OpenAI!


If you want to share your experience or see what others are building with EchoKit + OpenAI:

  • Join the EchoKit Discord
  • Or share your latency tests, setups, and experiments — we love seeing them

Want to get your own EchoKit device?

Day 8: Run Whisper Locally on Your Machine | The First 30 Days with EchoKit

· 3 min read

(And Today You’ll See How Easy It Is to Run an ASR Service Locally)

Up to now, your EchoKit has worked with Whisper via Groq and Whisper via OpenAI.

Today, we’re taking a major step forward—your EchoKit will run fully local ASR using Whisper + WasmEdge.

No cloud requests. No latency spikes. No API keys. Everything runs on your own machine, giving you full control over privacy, performance, and cost.

Whisper is an amazing ASR model. Let’s get your local Whisper server running and connect it to EchoKit.

You can also use other tools to run Whisper locally, as long as the API server is OpenAI-compatible.

Run the Whisper model locally

1. Install WasmEdge

Open your terminal and run:

curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s

This installs WasmEdge along with all base components.

2. Install the Whisper Plugin (wasi-nn-whisper)

My computer is a Mac with Apple Silicon, so I'll download the Whisper plugin using the following commands:

# Download the whisper plugin
curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasi_nn-whisper-0.14.1-darwin_arm64.tar.gz

# Extract into the WasmEdge plugin directory
tar -xzf WasmEdge-plugin-wasi_nn-whisper-0.14.1-darwin_arm64.tar.gz -C $HOME/.wasmedge/plugin

For other platforms, please refer to Quick Start with Whisper and LlamaEdge

3. Download the Portable Whisper API Server

This app is just a .wasm file — lightweight (3.7 MB) and cross-platform.

curl -LO https://github.com/LlamaEdge/whisper-api-server/releases/download/0.3.9/whisper-api-server.wasm

4. Download a Whisper Model

You can browse models here:

https://huggingface.co/ggerganov/whisper.cpp/tree/main

Today we’ll use the medium model:

curl -LO https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-medium.bin

5. Start the Whisper API Server

Run Whisper locally:

wasmedge --dir .:. whisper-api-server.wasm -m ggml-medium.bin

You’ll see:

Server started on http://localhost:8080

This server is OpenAI API compatible, so EchoKit can use it directly.
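You can test it directly before involving EchoKit. Assuming you have a short test.wav in your working directory, send it to the transcription endpoint:

curl http://localhost:8080/v1/audio/transcriptions \
  -F file=@test.wav

The response should contain the transcribed text as JSON.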

Connect EchoKit to Your Local Whisper Server

Update your config.toml and locate the asr section:

[asr]
provider = "http://localhost:8080/v1/audio/transcriptions"
api_key = "sk-xxxx" # placeholder; the local server doesn't require a real key
lang = "en"
model = "whisper"

Yes, you only need to replace the endpoint.

Restart the EchoKit server, pair your device, connect the EchoKit server to the device, and speak.

If you want to share your experience or see what others are building with EchoKit + local whisper:

  • Join the EchoKit Discord
  • Or share your latency tests, setups, and experiments — we love seeing them

Want to get your own EchoKit device?

Day 7: Use OpenAI Whisper as Your ASR Provider | The First 30 Days with EchoKit

· 3 min read

(And Today You’ll See How Easy It Is to Switch ASR Providers in EchoKit)

Over the past few days, we’ve powered up EchoKit, run the EchoKit server locally, customized the boot screen, crafted a custom welcome voice, and connected it to Groq Whisper for fast speech recognition.

Today, we’re switching things up — literally.

We’ll configure EchoKit to use Whisper from OpenAI as the ASR provider.

Not because one is “better,” but because EchoKit is designed to be modular, letting you plug in different ASR backends depending on your workflow, API preferences, or costs.

What's the difference between OpenAI Whisper and Groq Whisper?

Groq Whisper and OpenAI Whisper are based on the same open-source Whisper model.

What differs is the hosting:

  • Groq runs Whisper on its custom LPU hardware (very fast inference).
  • OpenAI runs Whisper on their internal infrastructure with its own rate limits and pricing.
  • The two may return slightly different results based on their pipeline design and updates.

This isn’t a “which is better” comparison. It’s about understanding your options, and EchoKit makes switching between them smooth and flexible.

And many developers already use OpenAI for other AI tasks, so trying its Whisper API can be convenient. EchoKit adopts a multi-provider ASR architecture.

Today’s goal is simple: 👉 See how easy it is to switch providers while keeping the same Whisper model.

How to Use OpenAI Whisper

Now let’s switch EchoKit’s ASR provider.

Open your config.toml and locate the [asr] section. Replace it with:

[asr]
provider = "https://api.openai.com/v1/audio/transcriptions"
api_key = "sk-xxxx"
lang = "en"
model = "whisper-1"

A quick breakdown:

  • [asr] — we’re configuring the ASR section
  • provider — OpenAI’s Whisper endpoint for transcriptions
  • lang — your preferred language (en, zh, ja, etc.)
  • api_key — the key obtained from the OpenAI API platform
  • model — one of OpenAI’s supported ASR models (whisper-1, gpt-4o-transcribe, or gpt-4o-mini-transcribe)
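A quick way to validate your key before restarting the server is to transcribe a sample file directly (again assuming a short test.wav in your working directory):

curl https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-xxxx" \
  -F file=@test.wav \
  -F model=whisper-1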

Save → restart your EchoKit server with Docker or from the source code → done.

EchoKit is now using OpenAI Whisper for real-time speech-to-text. The rest of your pipeline (LLM → TTS) stays the same.

You can follow the same process to reconnect the server and your EchoKit device.

EchoKit’s ASR system was built to support any OpenAI-compatible provider — so feel free to try different providers, compare results, and find what works best for your setup.

If you want to share your experience or see what others are building with EchoKit + OpenAI:

  • Join the EchoKit Discord
  • Or share your latency tests, setups, and experiments — we love seeing them

Want to get your own EchoKit device?