EchoKit server config options

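The EchoKit server orchestrates multiple AI services to turn user voice input into voice responses. It generally takes one of two approaches: a classic pipeline that chains separate speech recognition (ASR), language model (LLM), and text-to-speech (TTS) services, or a single end-to-end model that handles voice input and output directly.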

The pipeline approach offers greater flexibility and customization: you can choose any voice, control costs by mixing providers, integrate external knowledge, and run components locally for privacy. End-to-end models can reduce latency, but the classic pipeline gives you full control over each component.

You can configure how those AI services work together through EchoKit server's config.toml file.

Prerequisites

  • A running EchoKit server. Follow the quick start guide if needed.
  • API keys for your favorite AI API providers (OpenAI, Groq, xAI, OpenRouter, ElevenLabs, Gemini, etc.)

Configure server address and welcome audio

addr = "0.0.0.0:8080"
hello_wav = "hello.wav"
  • addr: The server's listening address and port
    • Use 0.0.0.0 to accept connections from any network interface
    • Make sure that your firewall allows incoming connections to the port (8080 in this example)
  • hello_wav: Optional welcome audio file played when a device connects
    • Supports 16kHz WAV format
    • Make sure that the file is in the same folder as config.toml
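
For local testing, a variation of these settings might look like the sketch below. It assumes the server accepts any standard IP:port pair in addr. The loopback address and port 9090 are illustrative choices, not values from the EchoKit documentation. Since hello_wav is described as optional, it is left out here.

```toml
# Listen only on the loopback interface, so only programs on this machine can connect
# (127.0.0.1 and 9090 are example values)
addr = "127.0.0.1:9090"
# hello_wav is omitted because it is optional
```

With this binding, a device elsewhere on the network cannot reach the server, so switch back to 0.0.0.0 once a physical device needs to connect.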

Configure AI services

The rest of the config.toml specifies how to use different AI services. Each service will be covered in its own chapter.

Note that each service section has the following fields.

  • A platform field that designates the service protocol. A common example is openai for OpenAI compatible API endpoints.
  • A url field for the service URL endpoint. It is typically an https:// or wss:// URL. The latter is a WebSocket address, used for streaming services.
  • Optional fields that are specific to the platform. That includes api_key, model, and others.
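
As a rough sketch of that common shape, a service section might look like the following. The [asr] section name, the Groq URL, and the model name are taken from the complete example in this guide. The api_key value is a placeholder, and which optional fields apply depends on the platform.

```toml
[asr]
# Service protocol: "openai" means an OpenAI-compatible API endpoint
platform = "openai"
# Service endpoint: an https:// URL here; streaming services would use wss:// instead
url = "https://api.groq.com/openai/v1/audio/transcriptions"
# Platform-specific optional fields
api_key = "your_api_key_here"  # placeholder, replace with your own key
model = "whisper-large-v3-turbo"
```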

Complete configuration example

The following example uses Groq for all three services (ASR, LLM, and TTS), so you will need a free API key from Groq.

# Server settings
addr = "0.0.0.0:8080"
hello_wav = "hello.wav"

# Speech recognition using the OpenAI transcriptions API, but hosted by Groq (instead of OpenAI)
[asr]
platform = "openai"
url = "https://api.groq.com/openai/v1/audio/transcriptions"
lang = "en"
api_key = "gsk_your_api_key_here"
model = "whisper-large-v3-turbo"

# Language model using the OpenAI chat completions API, but hosted by Groq (instead of OpenAI)
[llm]
platform = "openai_chat"
url = "https://api.groq.com/openai/v1/chat/completions"
api_key = "gsk_your_api_key_here"
model = "gpt-oss-20b"
history = 10

# Text-to-speech using the OpenAI speech API, but hosted by Groq (instead of OpenAI)
[tts]
platform = "openai"
url = "https://api.groq.com/openai/v1/audio/speech"
api_key = "gsk_your_api_key_here"
model = "playai-tts"
voice = "Cooper-PlayAI"

# System personality
[[llm.sys_prompts]]
role = "system"
content = """
Your name is EchoKit, a helpful AI assistant. Provide clear, concise responses and maintain a friendly, professional tone. Keep answers brief but informative.
"""