Clara Desktop — Real-Time Voice Requirements

Date: 2026-03-22
Author: Granville (Architect)
Priority: P0 (most important piece of functionality)

What Mo Wants

Open Clara Desktop → talk to agents → real conversation (1-5 sec latency)
Quik gets the same experience on his machine
Conference mode — Mo and Quik both talking to agents
5-4-3-2-1 countdown audio while AI processes (no dead air)
Locked build for Quik (no DevTools, no source access)
Local server for Mo (zero network latency), EC2 for everyone else

Current State

Electron app EXISTS at /Volumes/X10-Pro/Native-Projects/apps/clara-desktop/
Voice server EXISTS at infrastructure/voice/server/
/voice-direct endpoint works (HTTP batch: upload audio → process → return)
Pipecat streaming pipeline in bot.py (switched to Groq, not deployed yet)
CSP blocks localhost connections (needs fix)
Dev mode works but shows security warnings

Architecture

Voice Flow (Mo — Local)

Mic → Electron app → localhost:7860/voice-direct → Deepgram STT → Groq LLM → MiniMax TTS → Speakers
         ^                                                                                    |
         |_________________________ 5,4,3,2,1 countdown plays while waiting __________________|

Voice Flow (Quik — Remote)

Mic → Electron app → EC2 voice server /voice-direct → Deepgram STT → Groq LLM → MiniMax TTS → Speakers
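
The two flows differ only in which server the app calls. A minimal sketch of that choice, assuming the EC2 server is reached over HTTPS at a configurable host. VoiceTarget, resolveVoiceDirectUrl, and the example host are hypothetical names, not taken from the Clara Desktop source.

```typescript
// Hypothetical helper: choose the /voice-direct URL for Mo (local) or Quik (remote).
type VoiceTarget = { kind: "local" } | { kind: "remote"; host: string };

const LOCAL_PORT = 7860; // port shown in the local voice flow

function resolveVoiceDirectUrl(target: VoiceTarget): string {
  const base =
    target.kind === "local"
      ? `http://localhost:${LOCAL_PORT}`
      : `https://${target.host}`; // assumption: the EC2 server is served over HTTPS
  return `${base}/voice-direct`;
}

// Example: the remote host below is a placeholder, not the real EC2 address.
const moUrl = resolveVoiceDirectUrl({ kind: "local" });
const quikUrl = resolveVoiceDirectUrl({ kind: "remote", host: "voice.example.com" });
```

Keeping the choice in one function means the rest of the voice code does not need to know which machine it is running on.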

Conference Mode

Mo's mic ──→ Daily.co room ←── Quik's mic
                 ↓
           Pipecat pipeline
      (Deepgram → Groq → MiniMax)
                 ↓
           Agents respond in room
                 ↓
Mo's speakers ←──────────→ Quik's speakers

TDD Approach — Test Before Code

Test 1: CSP allows localhost

Fix CSP in index.html AND forge.config.ts devContentSecurityPolicy
Verify: fetch('http://localhost:7860/agents') succeeds in renderer
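
One way to keep the two CSP locations in sync is to build the policy string once and use it in both. The sketch below is an assumption about what the dev policy could look like, not the project's actual policy. buildCsp and devCsp are hypothetical names, and the media-src and ws:// entries are guesses (blob playback and the webpack dev server, respectively).

```typescript
// Hypothetical helper: assemble a CSP string from a directive map.
type CspDirectives = Record<string, string[]>;

function buildCsp(directives: CspDirectives): string {
  return Object.keys(directives)
    .map((name) => `${name} ${directives[name].join(" ")}`)
    .join("; ");
}

// Assumed dev policy: allows the renderer to reach the local voice server.
const devCsp = buildCsp({
  "default-src": ["'self'", "'unsafe-inline'", "data:"],
  "script-src": ["'self'", "'unsafe-eval'", "'unsafe-inline'", "data:"],
  // http://localhost:* covers the voice server on port 7860;
  // ws://localhost:* is an assumption for dev-server hot reload.
  "connect-src": ["'self'", "http://localhost:*", "ws://localhost:*"],
  // Assumption: response audio is played from a Blob URL.
  "media-src": ["'self'", "blob:", "data:"],
});
```

In forge.config.ts the resulting string would go into the webpack plugin's devContentSecurityPolicy option, and the same string into the meta tag in index.html.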

Test 2: Mic capture works

Verify: MediaRecorder captures audio chunks
Verify: Blob size > 500 bytes after 3 seconds
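
The size check can be kept separate from the browser APIs so it is testable on its own. A minimal sketch, assuming chunks arrive through MediaRecorder's dataavailable event. ChunkCollector and SizedChunk are hypothetical names.

```typescript
// Anything with a byte size; a Blob from MediaRecorder satisfies this shape.
interface SizedChunk {
  size: number;
}

// Hypothetical collector for recorded audio chunks.
class ChunkCollector {
  private chunks: SizedChunk[] = [];

  add(chunk: SizedChunk): void {
    // Skip empty chunks so they do not count as captured audio.
    if (chunk.size > 0) this.chunks.push(chunk);
  }

  count(): number {
    return this.chunks.length;
  }

  totalBytes(): number {
    return this.chunks.reduce((sum, c) => sum + c.size, 0);
  }

  // Test 2 criterion: at least one chunk and more than 500 bytes in total.
  passes(minBytes: number = 500): boolean {
    return this.count() > 0 && this.totalBytes() > minBytes;
  }
}
```

In the renderer, each event's data would be passed to add(), and passes() checked after 3 seconds of recording.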

Test 3: Voice-direct round trip

Send audio blob to /voice-direct
Verify: Response has transcript + response + audio_base64
Measure: Total latency < 5 seconds
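
The response check and the latency check can be written as one pure function. This is a sketch under the assumption that /voice-direct returns JSON with exactly the three fields named above as strings. isVoiceDirectResponse and checkRoundTrip are hypothetical names.

```typescript
// Field names come from Test 3; treating all three as strings is an assumption.
interface VoiceDirectResponse {
  transcript: string;
  response: string;
  audio_base64: string;
}

interface RoundTripResult {
  ok: boolean;
  elapsedMs: number;
  reason?: string;
}

function isVoiceDirectResponse(value: unknown): value is VoiceDirectResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const audio = v["audio_base64"];
  return (
    typeof v["transcript"] === "string" &&
    typeof v["response"] === "string" &&
    typeof audio === "string" &&
    audio.length > 0 // an empty string would mean there is nothing to play
  );
}

// Hypothetical check: body shape plus the 5 second budget from Test 3.
function checkRoundTrip(
  body: unknown,
  startedAtMs: number,
  finishedAtMs: number,
  budgetMs: number = 5000,
): RoundTripResult {
  const elapsedMs = finishedAtMs - startedAtMs;
  if (!isVoiceDirectResponse(body)) {
    return { ok: false, elapsedMs, reason: "missing or empty field" };
  }
  if (elapsedMs >= budgetMs) {
    return { ok: false, elapsedMs, reason: "over latency budget" };
  }
  return { ok: true, elapsedMs };
}
```

The caller would record Date.now() just before sending the audio and again after the JSON body is parsed, then pass both timestamps in.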

Test 4: Audio playback

Decode base64 audio → Blob → Audio element
Verify: Audio plays without errors
Verify: Mic pauses during playback, resumes after
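
The pause and resume rule can be isolated in a small gate object. A minimal sketch, assuming the mic is driven by something with pause() and resume() methods (MediaRecorder has both). PlaybackMicGate and MicControl are hypothetical names.

```typescript
// Minimal mic surface; a MediaRecorder instance satisfies this shape.
interface MicControl {
  pause(): void;
  resume(): void;
}

// Hypothetical gate: pauses the mic while response audio plays.
class PlaybackMicGate {
  private mic: MicControl;
  private playing = false;

  constructor(mic: MicControl) {
    this.mic = mic;
  }

  playbackStarted(): void {
    if (this.playing) return; // ignore duplicate start events
    this.playing = true;
    this.mic.pause();
  }

  playbackEnded(): void {
    if (!this.playing) return; // ignore end events without a matching start
    this.playing = false;
    this.mic.resume();
  }

  isPlaying(): boolean {
    return this.playing;
  }
}
```

In the renderer, playbackStarted() would be called when the Audio element starts playing, and playbackEnded() on both its ended and error events so a failed playback does not leave the mic paused.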

Test 5: Countdown plays during wait

Verify: Web Speech API starts “Got it. 5, 4, 3, 2, 1” immediately
Verify: Countdown cancels when response arrives
Verify: No overlap between countdown and response audio
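
The three checks above amount to an ordering rule: speak immediately, cancel on arrival, and only then play the response. A minimal sketch of that rule with the speech engine injected, so it can be tested without a browser. CountdownGate and Speaker are hypothetical names.

```typescript
// Minimal speech surface. In the renderer this could wrap the Web Speech API:
// speak -> speechSynthesis.speak(new SpeechSynthesisUtterance(text)),
// cancel -> speechSynthesis.cancel().
interface Speaker {
  speak(text: string): void;
  cancel(): void;
}

const COUNTDOWN_TEXT = "Got it. 5, 4, 3, 2, 1";

// Hypothetical controller for the countdown that fills the wait.
class CountdownGate {
  private speaker: Speaker;
  private counting = false;

  constructor(speaker: Speaker) {
    this.speaker = speaker;
  }

  // Call as soon as the audio has been sent to the server.
  start(): void {
    if (this.counting) return;
    this.counting = true;
    this.speaker.speak(COUNTDOWN_TEXT);
  }

  // Call when the response arrives. The countdown is cancelled before
  // playResponse runs, so the two never overlap.
  responseReady(playResponse: () => void): void {
    if (this.counting) {
      this.speaker.cancel();
      this.counting = false;
    }
    playResponse();
  }
}
```

Routing every response through responseReady() keeps the cancel-then-play order in one place instead of scattered across event handlers.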

Test 6: Locked mode

Build with CLARA_LOCKED=true
Verify: DevTools cannot open (Cmd+Option+I, Cmd+Shift+I, Ctrl+Shift+I, and F12 are blocked)
Verify: Right-click disabled
Verify: Source code not visible in .app bundle
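
The shortcut blocking can be expressed as a pure predicate over the key input that Electron passes to a before-input-event handler. A sketch only: isDevToolsShortcut and isLockedBuild are hypothetical names, and how CLARA_LOCKED reaches the running app (build-time define or runtime environment) is an assumption.

```typescript
// Subset of the fields on Electron's keyboard input object.
interface KeyInput {
  key: string;
  code: string;
  control: boolean;
  shift: boolean;
  alt: boolean;
  meta: boolean;
}

// Hypothetical predicate: true for F12 and for Cmd/Ctrl + Shift/Option + I.
function isDevToolsShortcut(input: KeyInput): boolean {
  if (input.key.toLowerCase() === "f12") return true;
  // Check the physical key too, since Option can change the reported character.
  const isKeyI = input.code === "KeyI" || input.key.toLowerCase() === "i";
  if (!isKeyI) return false;
  const cmdOrCtrl = input.meta || input.control;
  return cmdOrCtrl && (input.shift || input.alt);
}

// Assumption: the flag is visible as an environment-style variable.
function isLockedBuild(env: Record<string, string | undefined>): boolean {
  return env["CLARA_LOCKED"] === "true";
}
```

In a locked build, the main process would call event.preventDefault() in the before-input-event handler whenever isDevToolsShortcut returns true, and create the window with the devTools web preference set to false so a missed shortcut still cannot open the tools.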

Test 7: Conference mode

Two clients join same Daily.co room
Agent responds to both participants
Both hear the response

Known Issues to Fix

CSP in index.html already allows http://localhost:*, but the Forge webpack plugin's dev CSP overrides it
forge.config.ts devContentSecurityPolicy needs updating
Python 3.14 can’t build Pipecat (llvmlite) — use 3.12 venv
Voice server .env needs all API keys (added: Groq, Deepgram, MiniMax)
renderer.ts only logs a wave emoji; it needs real voice logic, unless index.html handles voice instead

Files to Modify

apps/clara-desktop/forge.config.ts — CSP fix
apps/clara-desktop/src/index.html — Voice UI + countdown
apps/clara-desktop/src/index.ts — Locked mode
infrastructure/voice/server/bot.py — Groq + thinking phrases (DONE)
infrastructure/voice/server/server.py — No changes needed
infrastructure/voice/server/requirements.txt — Updated (DONE)

Deploy Checklist

Fix CSP (forge.config.ts + index.html)
Test voice-direct round trip locally
Test countdown audio
Test locked mode build
Deploy updated bot.py + requirements to EC2
Package locked build for Quik
Test conference mode with Daily.co
