Hermes — the call surface
Hermes carries the telephony, manages turn-taking, handles consent at call start, records the audio with regional-jurisdiction handling, provides a sentiment signal alongside the transcript, and exposes the entire call surface to the operator container via webhooks and a real-time stream.
ElevenLabs — the voice
ElevenLabs synthesises the operator's voice from a calibrated brand-tone profile. We pre-cache common phrases (greetings, confirmations, escalation handoffs) so the marginal cost of synthesis tracks unique spoken content rather than total speech.
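A minimal sketch of the phrase cache idea, assuming a cache keyed on the voice profile plus whitespace-normalised text. `PhraseCache`, `fake_tts`, and the profile name are hypothetical stand-ins for illustration, not the ElevenLabs API; a real deployment would call the synthesis service where `fake_tts` is used.

```python
import hashlib
from typing import Callable, Dict


class PhraseCache:
    """Caches synthesised audio keyed by voice profile and normalised text."""

    def __init__(self, synthesise: Callable[[str, str], bytes]) -> None:
        # Hypothetical TTS callable: (profile, text) -> audio bytes.
        self._synthesise = synthesise
        self._audio: Dict[str, bytes] = {}
        self.synth_calls = 0  # counts only cache misses

    @staticmethod
    def _key(profile: str, text: str) -> str:
        # Assumption: collapsing whitespace is a safe normalisation for speech.
        normalised = " ".join(text.split())
        return hashlib.sha256(f"{profile}\x00{normalised}".encode("utf-8")).hexdigest()

    def get(self, profile: str, text: str) -> bytes:
        key = self._key(profile, text)
        if key not in self._audio:
            self.synth_calls += 1
            self._audio[key] = self._synthesise(profile, text)
        return self._audio[key]


def fake_tts(profile: str, text: str) -> bytes:
    # Stand-in for a real synthesis request; returns placeholder bytes.
    return f"[{profile}] {text}".encode("utf-8")


cache = PhraseCache(fake_tts)
cache.get("brand-warm", "Thanks for calling.")
cache.get("brand-warm", "Thanks for  calling.")     # same phrase after normalisation
cache.get("brand-warm", "Your balance is $42.10.")  # unique content, synthesised once
```

With this shape, `synth_calls` grows with the number of distinct phrases rather than with the number of times a phrase is spoken.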
Anthropic Claude — reasoning
Claude does the planning and makes the decisions. We pin the model version per workflow. Claude's predictable tool use across multi-turn workflows is the practical reason it is our default for voice: in production, an unpredictable reasoning core makes voice calls feel broken even when the surface technology is good.
Glama — the tool gateway
Every downstream tool (the CRM, the billing system, the scheduling platform, the ticketing system) is exposed to the operator over MCP through Glama. Glama holds the scoped credentials, applies rate limits, and writes a tool-call audit feed straight into MongoDB.
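The three gateway responsibilities (scoping, rate limiting, auditing) can be sketched in a few lines. This is an illustration under stated assumptions, not Glama's implementation: `ToolGateway`, `TokenBucket`, and the tool names are hypothetical, and the audit feed is an in-memory list standing in for the MongoDB collection.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Set


class TokenBucket:
    """Simple token bucket: `capacity` burst, refilled at `refill_per_sec`."""

    def __init__(self, capacity: int, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class ToolGateway:
    """Hypothetical gateway: checks scope, then rate limit, and audits every call."""

    def __init__(self, scopes: Set[str], bucket: TokenBucket) -> None:
        self.scopes = scopes
        self.bucket = bucket
        self.tools: Dict[str, Callable[..., Any]] = {}
        self.audit: List[Dict[str, Any]] = []  # stand-in for the audit feed

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self.tools[name] = fn

    def call(self, call_id: str, name: str, **args: Any) -> Optional[Any]:
        result = None
        if name not in self.scopes or name not in self.tools:
            outcome = "denied_scope"   # denied calls do not consume a token
        elif not self.bucket.allow():
            outcome = "rate_limited"
        else:
            result = self.tools[name](**args)
            outcome = "ok"
        self.audit.append({"call_id": call_id, "tool": name, "args": args, "outcome": outcome})
        return result


# Refill rate of 0 keeps the example deterministic: exactly two calls are allowed.
gateway = ToolGateway(scopes={"crm.lookup"}, bucket=TokenBucket(capacity=2, refill_per_sec=0.0))
gateway.register("crm.lookup", lambda account_id: {"account_id": account_id})
gateway.register("billing.refund", lambda amount: {"refunded": amount})

gateway.call("call_001", "crm.lookup", account_id="acct_1")
gateway.call("call_001", "crm.lookup", account_id="acct_2")
gateway.call("call_001", "crm.lookup", account_id="acct_3")  # bucket is empty
gateway.call("call_001", "billing.refund", amount=500)       # outside the granted scope
```

Denied and rate-limited calls are audited as well as successful ones, so the feed records what the operator attempted and not only what went through.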
Stripe — money movement
When a workflow involves payment, the operator acts by creating a Stripe PaymentIntent. Each creation request carries an idempotency key, so a retry never doubles a charge. The intent ID lives in the MongoDB audit record next to the call ID, so finance reconciliation is a plain join on the intent ID.
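A sketch of the retry-safety and reconciliation pattern, assuming the idempotency key is derived deterministically from the call ID and the action. `FakePayments` is an in-memory stand-in that mimics idempotency-key behaviour; it is not the Stripe client, and the IDs and amounts are made up for illustration.

```python
import hashlib
import itertools
from typing import Dict, List


class FakePayments:
    """In-memory stand-in for a payments API that honours idempotency keys."""

    def __init__(self) -> None:
        self._by_key: Dict[str, Dict] = {}
        self._ids = itertools.count(1)

    def create_intent(self, amount: int, currency: str, idempotency_key: str) -> Dict:
        # A repeated key returns the original intent instead of creating a new one.
        if idempotency_key in self._by_key:
            return self._by_key[idempotency_key]
        intent = {"id": f"pi_test_{next(self._ids)}", "amount": amount, "currency": currency}
        self._by_key[idempotency_key] = intent
        return intent


def idempotency_key(call_id: str, action: str) -> str:
    # Assumption: one payment action per (call, action) pair, so the same
    # retry always produces the same key.
    return hashlib.sha256(f"{call_id}:{action}".encode("utf-8")).hexdigest()


payments = FakePayments()
key = idempotency_key("call_001", "collect_invoice_77")
first = payments.create_intent(4210, "usd", key)
retry = payments.create_intent(4210, "usd", key)  # e.g. after a network timeout

# Audit rows keep the intent ID next to the call ID.
audit: List[Dict] = [
    {"call_id": "call_001", "intent_id": first["id"], "action": "collect_invoice_77"}
]
# Hypothetical settlement export from the payments side.
settled: List[Dict] = [{"intent_id": first["id"], "amount": 4210}]

# Reconciliation reduces to a keyed join on intent_id.
audit_by_intent = {row["intent_id"]: row for row in audit}
reconciled = [
    {**row, "call_id": audit_by_intent[row["intent_id"]]["call_id"]} for row in settled
]
```

Deriving the key from stable identifiers, rather than generating a random one per attempt, is what makes the retry land on the same intent.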
Supabase — memory
Per-account state — payer tier, prior promises, communication preferences, last contact — lives in Supabase. The operator reads it at call open and writes back at call close.
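The read-at-open, write-at-close pattern maps naturally onto a context manager. The sketch below uses a plain dictionary in place of the Supabase table; `call_session`, the field names, and the sample values are assumptions for illustration only.

```python
import copy
from contextlib import contextmanager
from datetime import datetime, timezone
from typing import Dict, Iterator

# Stand-in for the per-account table.
ACCOUNTS: Dict[str, Dict] = {
    "acct_1": {
        "payer_tier": "standard",
        "prior_promises": [],
        "preferred_channel": "voice",
        "last_contact": None,
    }
}


@contextmanager
def call_session(store: Dict[str, Dict], account_id: str) -> Iterator[Dict]:
    # Call open: one read, copied so mid-call edits stay local to the call.
    state = copy.deepcopy(store[account_id])
    try:
        yield state
    finally:
        # Call close: one write-back, even if the call ended with an error.
        state["last_contact"] = datetime.now(timezone.utc).isoformat()
        store[account_id] = state


with call_session(ACCOUNTS, "acct_1") as state:
    state["prior_promises"].append("pay by Friday")
    # The store is untouched until the call closes.
    mid_call_snapshot = list(ACCOUNTS["acct_1"]["prior_promises"])
```

Writing back in `finally` is a design choice: it keeps `last_contact` accurate for dropped calls, at the cost of persisting partial state from a call that failed midway.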
MongoDB — audit
MongoDB holds an append-only record of every action, the reasoning that produced it, and a reference to the audio the action came from. The audit log is the single artifact that lets the workflow owner trust the operator over time.
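One possible shape for an audit entry, sketched with an in-memory log in place of the MongoDB collection. The field names, the `AuditLog` class, and the audio reference format are hypothetical; the point is that entries are immutable once written and the log exposes append and read only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AuditEntry:
    """Immutable record: what was done, why, and where the audio evidence is."""

    call_id: str
    action: str
    reasoning: str
    audio_ref: str  # hypothetical pointer into the recording, not a real URI scheme


class AuditLog:
    """Append-only: no update or delete methods are offered."""

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []

    def append(self, entry: AuditEntry) -> int:
        self._entries.append(entry)
        return len(self._entries) - 1  # position of the new entry

    def entries(self) -> Tuple[AuditEntry, ...]:
        # Return a tuple so callers cannot mutate the underlying list.
        return tuple(self._entries)


log = AuditLog()
entry = AuditEntry(
    call_id="call_001",
    action="create_payment_intent",
    reasoning="Caller agreed to pay the open invoice in full.",
    audio_ref="recordings/call_001.wav#t=12.4",
)
position = log.append(entry)
```

In a real store, the same guarantee would come from database permissions (insert-only credentials) rather than from the class interface alone.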