
7 posts tagged with "echokit"


End-to-End vs. ASR-LLM-TTS: Which One Is the Right Choice for Building a Voice AI Agent?

· 5 min read

The race to build the perfect Voice AI Agent has primarily split into two lanes: the seamless, ultra-low latency End-to-End (E2E) model (like Gemini Live), and the highly configurable ASR-LLM-TTS modular pipeline. While the speed and fluidity of the End-to-End approach have garnered significant attention, we argue that for enterprise-grade applications, the modular ASR-LLM-TTS architecture provides the strategic advantage of control, customization, and long-term scalability.

This is not simply a technical choice; it is a business decision that determines whether your AI Agent will be a generic tool or a highly specialized, branded extension of your operations.

The Allure of the Integrated Black Box (Low Latency, High Constraint)

End-to-End models are technologically impressive. By integrating the speech-to-text (ASR), large language model (LLM), and text-to-speech (TTS) functions into a single system, they achieve significantly lower latency compared to pipeline systems. The resulting conversation feels incredibly fluid, with minimal pauses—an experience that is highly compelling in demonstrations.

However, this integration creates a “black box”. Once the user's voice enters the system, you lose visibility and the ability to intervene until the synthesized voice comes out. For general consumer-grade assistants, this simplification works. But for companies with specialized vocabulary, unique brand voices, and strict compliance needs, simplicity comes at the cost of surgical control.

Lessons Learned from the Front Lines: The EchoKit Experience

Our understanding of this architectural divide was forged through experience building complex, scalable voice platforms. In the early days of advanced voice interaction—with systems like EchoKit—we tackled the challenge of delivering functional, high-quality, and reliable Voice AI using the available modular components.

These pioneering efforts, long before current E2E models were mainstream, taught us a crucial lesson: The ability to inspect, isolate, and optimize each stage (ASR, NLU/LLM, TTS) is non-negotiable for achieving enterprise-level accuracy and customization. We realized that while assembling the pipeline was complex, the resulting control over domain-specific accuracy, language model behavior, and distinct voice output ultimately delivered superior business results and a truly unique brand experience.

More importantly, EchoKit is open source, which ensures complete transparency and adaptability.

The Power of the Modular Pipeline: Control and Precision (Higher Latency, Full Control)

The ASR-LLM-TTS pipeline breaks the Voice AI process down into three discrete, controllable stages. While this sequential process often results in higher overall latency compared to E2E solutions, this modularity is a deliberate architectural choice that grants businesses the power to optimize every single touchpoint.

  1. ASR (Acoustic and Language Model Fine-tuning): You can specifically train the ASR component on your industry jargon, product names, or regional accents. This is crucial in sectors like finance, healthcare, or manufacturing, where misrecognition of a single term can be disastrous. The pipeline allows you to correct ASR errors before they even reach the LLM, ensuring higher fidelity input.
  2. LLM (Knowledge Injection and Logic Control): This is the brain. You have the flexibility to swap out the LLM (whether it's GPT, Claude, or a custom model) and deeply integrate your proprietary knowledge bases (RAG), MCP servers, business rules, and specific workflow logic. You maintain complete control over the reasoning path and ensure responses are accurate and traceable.
  3. TTS (Brand Voice and Emotional Context): This is the face and personality of your brand. You can select, fine-tune, or even clone a unique voice that perfectly matches your brand identity, adjusting emotional tone and pacing. Your agent should sound like your company, not a generic robot.
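
To make this modularity concrete, here is a minimal Rust sketch of the pipeline pattern. The trait names and stub implementations are our own illustration, not EchoKit's actual API; the point is that each stage sits behind its own interface, so any component can be inspected, swapped, or fine-tuned without touching the others.

```rust
/// Speech-to-text stage: swap in any ASR engine (cloud API, Whisper, etc.).
trait Asr {
    fn transcribe(&self, audio: &[u8]) -> String;
}

/// Reasoning stage: swap in any LLM backend (hosted or local).
trait Llm {
    fn respond(&self, prompt: &str) -> String;
}

/// Speech synthesis stage: swap in any TTS voice, including a cloned one.
trait Tts {
    fn synthesize(&self, text: &str) -> Vec<u8>;
}

/// The pipeline depends only on the three interfaces, so each component
/// can be inspected, replaced, or fine-tuned independently.
struct Pipeline<A: Asr, L: Llm, T: Tts> {
    asr: A,
    llm: L,
    tts: T,
}

impl<A: Asr, L: Llm, T: Tts> Pipeline<A, L, T> {
    fn handle_turn(&self, audio_in: &[u8]) -> Vec<u8> {
        let text = self.asr.transcribe(audio_in); // inspect/correct ASR output here
        let reply = self.llm.respond(&text);      // inject RAG, rules, workflows here
        self.tts.synthesize(&reply)               // apply the brand voice here
    }
}

// Stub implementations so the sketch runs end to end.
struct StubAsr;
impl Asr for StubAsr {
    fn transcribe(&self, _audio: &[u8]) -> String {
        "what is my order status".to_string()
    }
}

struct StubLlm;
impl Llm for StubLlm {
    fn respond(&self, prompt: &str) -> String {
        format!("You asked: '{prompt}'. Your order shipped yesterday.")
    }
}

struct StubTts;
impl Tts for StubTts {
    fn synthesize(&self, text: &str) -> Vec<u8> {
        text.as_bytes().to_vec() // stand-in for real audio frames
    }
}

fn main() {
    let pipeline = Pipeline { asr: StubAsr, llm: StubLlm, tts: StubTts };
    let audio_out = pipeline.handle_turn(&[0u8; 160]);
    println!("synthesized {} bytes of audio", audio_out.len());
}
```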

Voice AI Architecture Comparison: E2E vs. ASR-LLM-TTS

The choice boils down to a fundamental trade-off between Latency vs. Customization.

| Feature | End-to-End (E2E) Model (e.g., Gemini Live) | ASR-LLM-TTS Pipeline (Modular) |
|---|---|---|
| Primary Advantage | Ultra-low latency and fluidity. Excellent for fast, generic conversation. | Maximum customization and control. Optimized for business value. |
| Latency | Significantly lower. Integrated processing minimizes delays. | Generally higher. Sequential processing introduces latency between stages. |
| Architecture | Integrated black box. All components merged. | Three discrete modules: ASR → LLM → TTS. |
| Customization | Low. Limited ability to adjust individual components or voices. | High. Each module can be independently trained and swapped. |
| Brand Voice | Limited. Locked to the vendor's available TTS options. | Full control. Can implement custom voice cloning and precise emotion tagging. |
| Optimization Path | All-or-nothing. Optimization requires waiting for the vendor to update the entire model. | Component-specific. Allows precise fixes and continuous improvement on any single module. |
| Strategic Lock-in | High. Tightly bound to a single End-to-End vendor/platform. | Low. Flexibility to integrate best-of-breed components from different vendors. |

The Verdict: Choosing a Strategic Asset

While the ultra-low latency of an End-to-End agent is undoubtedly attractive, it is crucial to ask: Does speed alone deliver business value?

For most enterprise use cases—where the Agent handles critical customer service, sales inquiries, or technical support—the ability to be accurate, on-brand, and deeply integrated is far more valuable than shaving milliseconds off the response time.

The ASR-LLM-TTS architecture, validated by our experience with systems like EchoKit, is the strategic choice because it treats the Voice AI Agent not as a simple conversational tool, but as a controllable, customizable, and continuously optimizable business asset. By opting for modularity, you retain the control necessary to adapt to market changes, ensure data compliance, and, most importantly, deliver a unique and expert-level experience that truly reflects your brand.

Which solution delivers the highest long-term ROI and the strongest brand experience? The answer is clear: Control is the key to enterprise Voice AI.

EchoKit Update in November: Firmware & Server Improvements

· 3 min read

We’re excited to share the latest November updates to EchoKit, our open-source voice AI kit for makers, developers, and students. These updates introduce new features in both the firmware and the server, making it easier than ever to set up your device and customize its behavior.

Firmware Update

The latest firmware brings several user-friendly improvements:

  1. One-Click Wi-Fi & Server Setup: All configuration options—including Wi-Fi credentials and the server URL—are now bundled into a single setup interface when connecting the EchoKit Server to your device. Click the Save Configurations button, and your device will automatically save the settings, restart, and apply the new configuration. See details here.

  2. Version Display: You can now easily check your EchoKit firmware version on the device, helping you keep track of updates.

  3. EchoKit Box Volume Adjustment: Adjust the volume directly on your EchoKit Box for a better audio experience without extra steps.

    • K2 to lower the volume
    • K1 to increase the volume

Server Update

The EchoKit server has also received key improvements:

  1. Dynamic Prompt Loading via URL

    Prompts define how the AI responds, and with the growing ecosystem of open-source LLM prompts, there’s a wealth of ready-to-use content. For example, websites like LLMs.txt host thousands of prompts for various AI models and use cases. With dynamic prompt loading, you can point EchoKit to these URLs and experiment with different personalities, knowledge bases, or conversation styles in seconds.

    You can now load prompts dynamically from a URL, allowing you to:

    • Update the AI’s behavior remotely
    • Test new conversation flows without restarting the server
    • Quickly iterate on experiments and demos

    Learn more from the doc: https://echokit.dev/docs/server/dynamic-system. (A minimal code sketch of the idea follows this list.)

  2. Add a Wait Message for MCP Tools: When calling MCP tools, a “please wait” message now appears, providing clear feedback while operations are in progress.
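
To illustrate the general technique behind dynamic prompt loading, here is a minimal Rust sketch. It is not EchoKit's actual implementation: it simply fetches the system prompt text from a URL at runtime, which is what lets you change the AI's behavior without restarting the server. The URL and helper name are hypothetical.

```rust
// Assumes reqwest = { version = "0.12", features = ["blocking"] } in Cargo.toml.

/// Fetch the system prompt from a URL. Re-fetching on each call (or on a
/// timer) lets you update the AI's behavior remotely, with no restart.
fn fetch_prompt(url: &str) -> Result<String, reqwest::Error> {
    reqwest::blocking::get(url)?.text()
}

fn main() {
    // Hypothetical URL; point it at any hosted prompt file you control.
    let url = "https://example.com/prompts/helpful-tutor.txt";
    match fetch_prompt(url) {
        Ok(prompt) => println!("loaded system prompt:\n{prompt}"),
        Err(e) => eprintln!("failed to load prompt: {e}"),
    }
}
```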

How to Get These New Features

Firmware Update

  1. Download the latest firmware from EchoKit Firmware Page
  2. Flash the firmware to your device using the ESP32 Launchpad or the command line
  3. Your device will now support one-click setup, version display, and volume adjustment for EchoKit Box

Server Update

  1. Get the latest EchoKit server: https://github.com/second-state/echokit_server/releases
  2. Run the latest EchoKit server with Docker or from the Rust source code
  3. You’ll get dynamic prompt loading and wait messages for MCP tools

Once your device and server are updated, all new features will be immediately available.

These updates are part of our ongoing effort to make EchoKit more user-friendly, flexible, and powerful. Whether you’re a maker experimenting with AI at home or a developer building advanced voice interactions, these improvements make it easier to focus on what matters: creating amazing experiences.

Stay tuned for more updates, and happy tinkering with EchoKit!

Introducing EchoKit Box

· 3 min read

A bigger screen. A cleaner design. A more powerful EchoKit.

We’re excited to introduce EchoKit Box, the newest member of the EchoKit family — built for makers, educators, and anyone exploring voice AI agents.

EchoKit Box keeps everything people love about EchoKit, but elevates the hardware, polish, and usability in every way.

Full-Front 2.4-inch OLED Display

One of the most visible upgrades in EchoKit Box is its large full-front screen.

The entire front of the device is a high-contrast 2.4-inch OLED display, perfect for:

  • System information
  • Voice activity visualization
  • Playing videos stored on the TF card
  • Displaying graphics and custom UI
  • MCP-driven animations

Compared with the previous EchoKit generation, the visual feedback is clearer and more interactive, making this device suitable for both teaching and advanced AI projects.

Clearly Labeled Buttons (Including K0 and Reset)

Many users struggled to find the K0 button and reset button on the previous EchoKit DIY model. EchoKit Box solves this by placing integrated, clearly labeled buttons at the top of the device.

Clear hardware labeling = less confusion and faster development.

TF Card Slot for Media and Local AI Workflows

At the bottom of the device, you’ll find a TF card slot. You can store:

  • Music
  • Videos
  • Offline content
  • Custom datasets

And here’s where the fun begins:

You can ask the large language model to generate MCP actions that play music or video stored on the TF card — directly on the device.

That means you can say: “Play the music on my memory card.” And the device will play it through the speaker.

More Connectors for Additional Modules

On the side of the EchoKit Box, you’ll find two colored connectors (blue and red). These are expansion ports for sensors and modules, such as:

  • Temperature sensors
  • Cameras
  • LED light modules
  • GPIO-based sensors
  • Custom peripherals

Using MCP actions, the large language model can control these modules:

  • “Turn on the camera and take a picture.”
  • “Read temperature from the blue port sensor.”
  • “Switch on the LEDs.”

EchoKit Box becomes your modular AI platform, not just a single device.
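
To give a feel for how such commands could translate into device actions, here is an illustrative Rust sketch of a tool dispatcher. The tool names and the Action enum are hypothetical and simplified; in EchoKit, tool calls flow through MCP rather than a hand-rolled match like this.

```rust
/// Hypothetical device actions the server could trigger.
enum Action {
    PlayFromTfCard { path: String },
    ReadSensor { port: String },
    SetLeds { on: bool },
}

/// Map a tool call emitted by the LLM onto a device action.
fn dispatch(tool: &str, arg: &str) -> Option<Action> {
    match tool {
        "play_media" => Some(Action::PlayFromTfCard { path: arg.to_string() }),
        "read_sensor" => Some(Action::ReadSensor { port: arg.to_string() }),
        "set_leds" => Some(Action::SetLeds { on: arg == "on" }),
        _ => None, // unknown tools are rejected, keeping control on the server
    }
}

fn main() {
    // e.g. "Play the music on my memory card" becomes a tool call like:
    match dispatch("play_media", "/music/song.mp3") {
        Some(Action::PlayFromTfCard { path }) => println!("playing {path} from the TF card"),
        Some(Action::ReadSensor { port }) => println!("reading sensor on {port} port"),
        Some(Action::SetLeds { on }) => println!("LEDs on: {on}"),
        None => println!("unknown tool"),
    }
}
```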

Transparent Back With Visible Electronics

The back of EchoKit Box features a clear, transparent cover, allowing you to see:

  • The ESP32 CPU
  • PCB and circuitry
  • Speaker
  • Microphone
  • Components such as power regulators and drivers

Makers, students, and hardware enthusiasts love this design because it shows exactly how the AI device works internally.

This is especially useful for:

  • STEM education
  • AI education
  • AI Hardware demos
  • AI workshops
  • DIY repair and customization
  • Special gifts for developers

Why We Love the New EchoKit Box

After months of iteration, we truly believe EchoKit Box is the most advanced EchoKit we’ve ever built:

  • Bigger 2.4-inch display
  • Better enclosure and build quality
  • Clear hardware labeling
  • TF card slot
  • More connectors for sensors and modules
  • Transparent geek-style back, great for education
  • Dual USB ports for firmware flashing
  • Great speaker/mic setup
  • Fully open-source and ESP32 powered
  • Works perfectly with local LLMs and MCP actions

It’s a hackable voice AI device that’s also polished enough for demos, classrooms, hackathons, and real projects.

Final Thoughts

We’re really proud of the new EchoKit Box, and we think you’ll love building with it.

Whether you’re experimenting with conversational AI, creating an embedded chatbot, teaching students about LLMs, or building robotics projects with sensors, this device gives you everything you need.

Stay tuned — more updates, tutorials, and expansion modules are coming soon.

Try EchoKit’s fun AI Voices Free

· 2 min read

Have you ever wondered what your AI would sound like with a Southern drawl or a confident Texas accent?

Until now, these premium voices were paid add-ons — but now you can try them for free on the EchoKit web demo.

We’ve added diverse, natural accents including Southern, Asian, African-American, New York, and Texas English, bringing more authenticity and cultural depth to your conversations.

Each voice is expressive and warm, built to sound like a real person rather than a robotic assistant.

No installation or payment needed — just open the EchoKit web demo and start exploring: https://funvoice.echokit.dev/

How to Play 🎤

  1. Open https://funvoice.echokit.dev/ in your browser.
  2. Choose the accent you want to try from Cowboy, Diana, Asian, Pinup, or Stateman.
  3. Allow the website to access your microphone when prompted.
  4. Click on Start Listening.
  5. Once you see “WebSocket connected successfully”, start talking to the character — it will respond in the selected voice!
  6. If you just want to listen, click Stop Listening to pause microphone input.

How Did We Make It 🎛️

Want something truly personal?

EchoKit is an open-source voice AI agent that lets you customize every aspect of both the hardware and software stack. One of the most popular features is Voice Clone — you can even clone your own voice!

Ready to create a truly personal AI voice? Learn how to do it here: Voice Cloning Guide.

From Browser to Device

Once you’ve experimented in the browser, you can take it even further.
EchoKit lets you play with these voices locally, on-device, even using your own voice.
Perfect for makers, educators, and AI hobbyists who want full control and real-time interaction.

🎧 Try the voices → https://funvoice.echokit.dev/
🛠️ Get your own EchoKit device → https://echokit.dev/

EchoKit — the open-source Voice AI Agent that sounds just like you.

Have any questions? Join our Discord community

New EchoKit Update: Button Interrupt and Volume Control Are Here!

· 2 min read

We’ve just released new versions of EchoKit Server (0.1.2) and EchoKit Firmware, bringing you more natural voice interactions than before.

Button Interrupt

You can now interrupt EchoKit’s speech with a simple press of the K0 button, located on the left side of the EchoKit device.

This makes your voice assistant feel more responsive — no need to wait until it finishes talking. Just press the button and start speaking right away!

Adjustable Volume

Need to make EchoKit quieter? The speaker used to be so loud we couldn’t even test it at night!

You can now adjust the speaker volume directly on the device, giving you full control of your experience.

The volume buttons are located on the right side of the device:

  • The top button increases the volume.
  • The bottom button lowers the volume.

This makes it easy to get the perfect sound level, anytime.

🚀 How to Update

  1. Download the latest version of Firmware from our ESP32 LaunchPad.
  2. Download the latest version of the server from EchoKit GitHub release page and rerun it.
  3. Flash the firmware to your device.
  4. Reconnect the server and device.
    • If you’re using the pre-set server provided by the EchoKit team, there’s nothing extra you need to do — the official server has already been updated to the latest version.
  5. You’re ready to go — enjoy your new interactive voice experience!

Have any questions? Join our Discord community

EchoKit Now Supports ElevenLabs for High-Quality Voice Generation

· 2 min read

We’re excited to share a new update — EchoKit now supports ElevenLabs, one of the most advanced voice synthesis platforms in the world. This means your EchoKit can now speak with natural, expressive, and human-like voices in multiple languages and styles.

What’s New

With ElevenLabs integration, EchoKit users can:

  • Generate lifelike speech with rich tone and emotion
  • Choose from dozens of AI voices or create your own
  • Support multi-language and multilingual voice output
  • Combine with local AI models for smarter, private conversations

Whether you’re building a smart home assistant, a talking robot, or an AI tutor, ElevenLabs voices make your EchoKit sound more alive and engaging.

How It Works

Using ElevenLabs voices with EchoKit is simple! All you need to do is configure your TTS parameters in the config.toml file.

  1. Get your API key from ElevenLabs.
  2. Choose a voice model from ElevenLabs and note its Voice ID.
  3. Update your config.toml file like this:

```toml
[tts]
platform = "Elevenlabs"
token = "YOUR_API_KEY_HERE"
voice = "VOICE_ID_HERE"
```

  4. Save the file and rerun your EchoKit server.
  5. Reconnect your device to the server.

Why It Matters

EchoKit’s mission is to help everyone build and own their own AI voice agent. With the power of ElevenLabs, you can now customize the voice with ease.

Try It Today

Update your EchoKit server to the latest version and experience the new generation of AI voice synthesis. If you haven’t tried EchoKit yet, get one now to build your own voice AI agent at home.

Introducing EchoKit: Build, Learn, and Play with AI

· 4 min read

Artificial intelligence is no longer science fiction—it’s part of everyday life. From classrooms to workplaces, AI tools like ChatGPT and Gemini are being used by millions. But here’s the challenge: most people only interact with these systems as black boxes.

If we want to not just use AI, but to understand, customize, and innovate with it, we need tools that make AI tangible.

That’s why we created EchoKit — an open-source voice AI toolkit that makes learning AI as hands-on as building with LEGO.

What is EchoKit?

EchoKit is an open-source hardware and software toolkit for building and understanding modern AI voice agents.

  • Out of the box, EchoKit is a functional voice AI device—a companion you can talk to immediately.
  • But its real value lies in what’s inside: a modular hardware kit, open-source firmware, and an extensible AI server that together let you learn and experiment with every layer of the system.

With EchoKit, learners and educators can:

  • Explore modular hardware design, from microphones and speakers to ESP32-based processors.

  • Customize firmware written in Rust and re-flash the device to change how it behaves.

  • Run an AI server that connects to OpenAI, Gemini, or local open-source models for speech recognition, text generation, and voice synthesis.

  • Experiment with speech-to-text (ASR), large language models (LLMs), and text-to-speech (TTS) pipelines in a real system.

  • Build and integrate MCP tools (e.g., knowledge bases, search, or smart-home control) so that the AI agent can perform meaningful actions.

  • Learn how voice cloning, accents, and fine-tuned TTS models work, and try personalizing your own agent’s voice.

  • Set up local and private AI inference to understand how open-source models like Whisper and Llama can run on your own computer.

  • Follow structured guides that gradually explain AI concepts—from neural networks and embeddings to real-time systems—while encouraging experimentation.

In other words, EchoKit is not just a gadget—it is a practical curriculum in a box, designed to bring AI education to life.

Who is it For?

EchoKit is designed for a wide range of learners:

  • Students — Gain hands-on experience with AI that goes far beyond using apps. Build systems, break them apart, and learn how they work.
  • Teachers & Schools — Bring AI into the classroom with a platform that combines hardware, software, and clear documentation.
  • Parents — Provide your children with a meaningful project that blends fun, creativity, and real technical skills.
  • Technologists & Hobbyists — Experiment with AI voice agents as if they were Lego blocks. Modify, extend, and integrate EchoKit into your own projects.
  • Entrepreneurs — Prototype AI-powered products quickly, on top of a fully customizable and open-source foundation.

Why It Matters

According to a 2025 Pew survey, over 80% of American students already use large language models (LLMs) for schoolwork. Yet few understand how these systems actually function.

As Nvidia’s Jensen Huang put it:

“You won’t lose your job to AI—you’ll lose your job to somebody who uses AI.”

We believe the future belongs to those who don’t just use AI, but who can build and shape it. EchoKit helps bridge that gap by making AI education hands-on, practical, and open-source.

Join Us on Indiegogo

EchoKit is more than a device—it’s a platform for learning, creating, and teaching AI in a way that is open, transparent, and fun.

We’re now in our prelaunch phase on Indiegogo. By joining, you’ll:

  • Be among the first to access EchoKit when it launches.

  • Receive an exclusive 48% discount

👉 Join our Discord server and be part of the journey to bring hands-on AI education to everyone.