
3 posts tagged with "voice-ai"


My Coding Assistant Lives in a Box Now

· 7 min read

It was 2 AM. I was deep in a coding session, fingers flying across the keyboard, completely in the zone. Then I hit a bug. I needed to run the tests.

Which meant breaking my flow. Switching windows. Typing the command. Waiting. Switching back.

I thought: What if I could just say it?

Not into my phone. Not unlocking an app. Just speak—to a device sitting on my desk.

A Small Device, Big Idea

That moment sparked an experiment. What if my AI coding assistant wasn't trapped in a terminal window, but lived in a small device on my desk? What if I could speak to it like a pair programmer sitting next to me?

Not voice typing—I hate that. But voice commands. Like having a junior developer who actually does things, not just suggests them.

So I built it.

Today, I'm excited to share how EchoKit became a voice remote control for Claude Code. And why this changes everything about how I work.

It Started with a Problem

Claude Code is amazing. It writes code, fixes bugs, runs tests, explains errors.

Yes, Claude Code now has an official Remote Control feature for mobile and web access. But it's designed for phones and browsers—not for hands-free voice control or physical devices. You still need to look at a screen and tap buttons.

I wanted something different. Something that felt like... magic.

The Missing Piece

I had EchoKit—my open-source voice AI device sitting on my desk. It can hear me, think, and respond. But it couldn't control my code editor.

I needed a bridge.

That bridge is called echokit_pty.

What is echokit_pty? It's Claude Code made accessible over the network, with one superpower: a WebSocket interface.

See, Claude Code was designed as a CLI tool. You run it in your terminal, type commands, get responses. That's great for terminal workflows. But for voice control? For remote access? For building anything on top of Claude Code?

You need something more.

echokit_pty is that "more."

How echokit_pty Changed Everything

Here's what echokit_pty does: it takes Claude Code and exposes it through a WebSocket server. Suddenly, Claude Code isn't just a terminal app—it's a service that anything can talk to.

My EchoKit device can send commands. A web app could send commands. A mobile app. A game controller. Anything that speaks WebSocket.

But here's the beautiful part: it's still Claude Code. All the capabilities, all the intelligence, everything that makes Claude Code amazing—just accessible through a clean, simple interface.
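To make "anything that speaks WebSocket" concrete, here's a minimal sketch of what a client for echokit_pty might look like. The JSON message shapes here (a `type`/`prompt` command frame and a `text` response field) are assumptions for illustration only; check the echokit_pty repository for the actual wire protocol.

```python
import json

# NOTE: the message shapes below are hypothetical — consult the
# echokit_pty repository for the real protocol.

def build_command(prompt: str) -> str:
    """Serialize a command frame for the (assumed) echokit_pty protocol."""
    return json.dumps({"type": "command", "prompt": prompt})

def parse_response(raw: str) -> str:
    """Pull the text payload out of an (assumed) response frame."""
    return json.loads(raw).get("text", "")

async def send_command(prompt: str, url: str = "ws://localhost:3000/ws") -> str:
    """Send one command and return the first reply frame's text."""
    import websockets  # third-party: pip install websockets
    async with websockets.connect(url) as ws:
        await ws.send(build_command(prompt))
        return parse_response(await ws.recv())
```

With echokit_pty running locally, something like `asyncio.run(send_command("Run the tests"))` would round-trip one command, again assuming this message shape.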

The Setup: Three Pieces, One Experience

Now my coding setup looks like this:

1. echokit_pty runs on my machine — Starts a WebSocket server (ws://localhost:3000/ws)

2. EchoKit Server connects to it — Handles speech recognition and text-to-speech

3. EchoKit Device sits on my desk — Listens for my voice, speaks back responses

My voice: "Run the tests"
  → EchoKit Device (hears me)
  → EchoKit Server (transcribes speech)
  → echokit_pty (WebSocket connection)
  → Claude Code (executes the command)
  → Tests run, results stream back
  → EchoKit speaks: "142 tests passed, 3 failed"

All while I keep typing. No window switching. No flow breaking.

A Day in the Life

Let me show you what this actually feels like.

Morning: I sit down with coffee. "EchoKit, run the full test suite." I start reading emails while tests run in the background. Five minutes later: "Tests complete. Two failures in the auth module."

Afternoon: I'm stuck on a bug. "EchoKit, why is the login failing?" It explains the issue while I'm looking at the code. "Can you fix it?" "Done. Want me to run the tests?" "Yes."

Evening: I'm tired, don't want to type. "EchoKit, create a new feature branch called dark-mode." "Deploy staging." "Check if the build passed." Each command happens while I'm leaning back in my chair.

It feels like having a coding companion. Not a tool—a teammate.

Why This Matters

I know what you're thinking: Voice control for coding? Sounds weird. And doesn't Claude Code have Remote Control now?

You're right—it is weird at first. But here's the thing: Claude Code's Remote Control is great for mobile access, but EchoKit isn't your phone. It's a dedicated device that sits on your desk. Always on. Always listening. No unlocking, no apps, no picking it up.

Here's what I discovered:

It's not about voice typing. I'm not dictating code. That would be terrible.

It's about having a physical device. Think of it like a smart speaker for coding. It just sits there, ready to help. No screens to tap, no apps to open, no phone to find.

The magic is the always-there presence. The device lives on my desk. It's part of my workspace. I don't need to grab anything or unlock anything. I just speak.

It keeps me in the flow. That's the biggest one. I can stay focused on coding while EchoKit handles tasks in the background. It's like having a second pair of hands.

The Tech Behind the Magic

If you're curious how echokit_pty works technically, here's the short version:

PTY stands for "pseudo-terminal"—a Unix concept that lets a program control a terminal as if a user were typing. echokit_pty uses this to create a bridge between:

  • WebSocket clients → send JSON commands
  • Claude Code CLI → executes the commands
  • Response streaming → sends results back
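The pseudo-terminal trick itself is easy to demonstrate with Python's standard library. This is not echokit_pty (that's Rust), just a minimal illustration of the PTY concept it builds on: the child process believes it's talking to a real terminal.

```python
import os
import pty
import subprocess

def run_under_pty(cmd):
    """Run cmd with its stdio attached to a pseudo-terminal and capture output.

    Because the child sees a terminal on the other end, interactive CLI
    tools behave exactly as if a user were typing at them."""
    master, slave = pty.openpty()
    proc = subprocess.Popen(cmd, stdin=slave, stdout=slave,
                            stderr=slave, close_fds=True)
    os.close(slave)  # the parent keeps only the master end
    chunks = []
    while True:
        try:
            data = os.read(master, 1024)
        except OSError:  # EIO once the child exits and closes its end
            break
        if not data:
            break
        chunks.append(data)
    proc.wait()
    os.close(master)
    return b"".join(chunks).decode(errors="replace")
```

Anything written to the master end shows up as "user keystrokes" to the child, which is exactly the bridge a WebSocket server needs to drive a CLI tool.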

It's built with Rust, runs locally, and is completely open source. No cloud required. Your code never leaves your machine.

But here's what I care about: it just works.

What You Can Do

So what does this actually look like in practice?

"Create a web page for me" → Claude Code generates the HTML, EchoKit confirms when done

"Run the tests" → Tests execute, EchoKit tells me the results

"Explain this error" → Claude Code analyzes, EchoKit reads the explanation

"Deploy to staging" → Deployment triggers, EchoKit confirms when complete

"Create a new branch" → Git command executes, no typing required

I can speak from across the room. Keep my hands on the keyboard while EchoKit works in the background. Get voice feedback without breaking my flow.
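Conceptually, the voice layer only has to turn a transcript into a prompt for Claude Code. Here's a toy sketch of that idea; the phrase table and keyword routing are invented for illustration and are not EchoKit's actual pipeline (which can simply hand the raw transcript to the model):

```python
# Hypothetical phrase → prompt table, purely illustrative.
INTENTS = {
    "run the tests": "Run the test suite and summarize pass/fail counts.",
    "explain this error": "Explain the most recent error in the terminal output.",
    "deploy to staging": "Run the staging deployment and report the result.",
}

def route(transcript: str) -> str:
    """Return the prompt for the first intent phrase found in the transcript.

    Unrecognized commands fall through to the assistant verbatim."""
    lowered = transcript.lower()
    for phrase, prompt in INTENTS.items():
        if phrase in lowered:
            return prompt
    return transcript
```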

Building Your Own

This is the part I'm most excited about: everything here is open source.

  • EchoKit — Open hardware, Rust firmware, fully customizable
  • echokit_pty — Open source WebSocket interface for Claude Code
  • EchoKit Server — Rust-based voice AI server

You can build this yourself. Or modify it. Or extend it.

Want to add custom voice commands? Go ahead. Want to integrate with other tools? echokit_pty makes it possible. Want to build a completely different interface? The WebSocket is waiting.

The Future

This experiment showed me something: AI coding assistants can take many forms beyond screens and apps.

Claude Code's Remote Control solved mobile access. But what about specialized hardware? What about completely hands-free experiences? What about devices that do one thing perfectly?

echokit_pty is the bridge that makes these experiments possible. And EchoKit is just one example.

Imagine what else we could build:

  • Voice-controlled development environments
  • Specialized devices for specific workflows
  • Educational tools that feel like magic
  • Assistive technology for developers with disabilities

All built on top of echokit_pty's open WebSocket interface.

Try It Yourself

Ready to turn your AI assistant into a physical device?

Full Documentation: Remote Control Claude Code with Your Voice

EchoKit Hardware:

echokit_pty Repository: github.com/second-state/echokit_pty

Join the Community: EchoKit Discord

Build something cool. Then tell me about it.


PS: The first time I heard EchoKit say "Tests passed" while I was making coffee? That's when I knew this wasn't just a cool experiment. This was how I wanted to work from now on.

Introducing EchoKit Box

· 3 min read

A bigger screen. A cleaner design. A more powerful EchoKit.

We’re excited to introduce EchoKit Box, the newest member of the EchoKit family — built for makers, educators, and anyone exploring voice AI agents.

EchoKit Box keeps everything people love about EchoKit, but elevates the hardware, polish, and usability in every way.

Full-Front 2.4-inch OLED Display

One of the most visible upgrades in EchoKit Box is its large full-front screen.

The entire front of the device is a high-contrast 2.4-inch OLED display, perfect for:

  • System information
  • Voice activity visualization
  • Playing videos stored on the TF card
  • Displaying graphics and custom UI
  • MCP-driven animations

Compared with the previous EchoKit generation, the visual feedback is clearer and more interactive, making the device suitable for both teaching and advanced AI projects.

Clearly Labeled Buttons (Including K0 and Reset)

Many users struggled to find the K0 button and reset button on the previous EchoKit DIY model. EchoKit Box solves this by placing integrated, clearly labeled buttons at the top of the device.

Clear hardware labeling = less confusion and faster development.

TF Card Slot for Media and Local AI Workflows

At the bottom of the device, you’ll find a TF card slot. You can store:

  • Music
  • Videos
  • Offline content
  • Custom datasets

And here’s where the fun begins:

You can ask the large language model to generate MCP actions that play music or video stored on the TF card — directly on the device.

That means you can say: “Play the music on my memory card.” And the device will play it through the speaker.
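Under the hood, an MCP action is just a JSON-RPC request in the Model Context Protocol's `tools/call` shape. The tool name `play_media` and its `path` argument below are hypothetical — the actual tool set depends on the firmware — but the envelope follows the MCP spec:

```python
import json

def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request in the MCP tools/call shape.

    The tool name and arguments used in the example below are
    hypothetical; the real tools are defined by the device firmware."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Hypothetical action for "Play the music on my memory card."
request = build_tool_call(1, "play_media", {"path": "/sdcard/music/track01.mp3"})
```

The LLM's job is to pick the tool and fill in the arguments from your spoken request; the device then executes the call locally.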

More Connectors for Additional Modules

On the side of the EchoKit Box, you’ll find two colored connectors (blue and red). These are expansion ports for sensors and modules, such as:

  • Temperature sensors
  • Cameras
  • LED light modules
  • GPIO-based sensors
  • Custom peripherals

Using MCP actions, the large language model can control these modules:

  • “Turn on the camera and take a picture.”
  • “Read temperature from the blue port sensor.”
  • “Switch on the LEDs.”

EchoKit Box becomes your modular AI platform, not just a single device.

Transparent Back With Visible Electronics

The back of EchoKit Box features a clear, transparent cover, allowing you to see:

  • The ESP32 CPU
  • PCB and circuitry
  • Speaker
  • Microphone
  • Components such as power regulators and drivers

Makers, students, and hardware enthusiasts love this design because it shows exactly how the AI device works internally.

This is especially useful for:

  • STEM education
  • AI education
  • AI Hardware demos
  • AI workshops
  • DIY repair and customization
  • Special gifts for developers

Why We Love the New EchoKit Box

After months of iteration, we truly believe EchoKit Box is the most advanced EchoKit we’ve ever built:

  • Bigger 2.4-inch display
  • Better enclosure and build quality
  • Clear hardware labeling
  • TF card slot
  • More connectors for sensors and modules
  • Transparent, geek-style back that’s great for education
  • Dual USB ports for firmware flashing
  • Great speaker/mic setup
  • Fully open-source and ESP32 powered
  • Works perfectly with local LLMs and MCP actions

It’s a hackable voice AI device that’s also polished enough for demos, classrooms, hackathons, and real projects.

Final Thoughts

We’re really proud of the new EchoKit Box, and we think you’ll love building with it.

Whether you’re experimenting with conversational AI, creating an embedded chatbot, teaching students about LLMs, or building robotics projects with sensors, this device gives you everything you need.

Stay tuned — more updates, tutorials, and expansion modules are coming soon.

Try EchoKit’s fun AI Voices Free

· 2 min read

Have you ever wondered what your AI would sound like with a Southern drawl or a confident Texas accent?

Until now, these premium voices were paid add-ons — but now you can try them for free on the EchoKit web demo.

We’ve added diverse, natural accents including Southern, Asian, African-American, New York, and Texas English, bringing more authenticity and cultural depth to your conversations.

Each voice is expressive and warm, built to sound like a real person rather than a robotic assistant.

No installation or payment needed — just open the EchoKit web demo and start exploring: https://funvoice.echokit.dev/

How to Play 🎤

  1. Open https://funvoice.echokit.dev/ in your browser.
  2. Choose the accent you want to try from Cowboy, Diana, Asian, Pinup, or Stateman.
  3. Allow the website to access your microphone when prompted.
  4. Click on Start Listening.
  5. Once you see “WebSocket connected successfully”, start talking to the character — it will respond in the selected voice!
  6. If you just want to listen, click Stop Listening to pause microphone input.

How Did We Make It 🎛️

Want something truly personal?

EchoKit is an open-source voice AI agent that lets you customize every aspect of both the hardware and software stack. One of the most popular features is Voice Clone — you can even clone your own voice!

Ready to create a truly personal AI voice? Learn how to do it here: Voice Cloning Guide.

From Browser to Device

Once you’ve experimented in the browser, you can take it even further.
EchoKit lets you play with these voices locally, on-device, even using your own voice.
Perfect for makers, educators, and AI hobbyists who want full control and real-time interaction.

🎧 Try the voices → https://funvoice.echokit.dev/
🛠️ Get your own EchoKit device → https://echokit.dev/

EchoKit — the open-source Voice AI Agent that sounds just like you.

Have any questions? Join our Discord community