Two configuration sources are supported: config.yaml takes precedence and .env is the fallback.
The recommended pattern keeps secrets in .env and references them from config.yaml with ${VAR}
(the loader expands them), so all configuration lives in one declarative file.
Each config.yaml key has a matching env var (e.g. realtime.azure_endpoint ↔
TEAMS_VOICE_AZURE_ENDPOINT); where both are set, the config.yaml value is used.
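The precedence and ${VAR} expansion described above can be sketched in a few lines of Python. This is only an illustration, not the plugin's actual loader: the function names `expand` and `resolve` are hypothetical, and leaving an unknown ${VAR} untouched is an assumption about how a missing variable might be handled.

```python
import re

# Matches ${NAME} where NAME looks like an environment variable name.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand(value, env):
    """Replace each ${VAR} in a string with env[VAR].

    Non-string values (ports, booleans) pass through unchanged.
    Unknown names are left as-is (an assumption, not confirmed behaviour).
    """
    if not isinstance(value, str):
        return value
    return _VAR.sub(lambda m: env.get(m.group(1), m.group(0)), value)


def resolve(yaml_value, env_name, env, default=None):
    """config.yaml wins when the key is set; otherwise use the env var, then the default."""
    if yaml_value is not None:
        return expand(yaml_value, env)
    return env.get(env_name, default)


# Hypothetical values standing in for os.environ / the parsed .env file.
env = {
    "TEAMS_VOICE_SHARED_SECRET": "s3cret",
    "TEAMS_VOICE_AZURE_ENDPOINT": "https://env.example",
}

# Secret referenced from config.yaml and expanded from .env.
print(resolve("${TEAMS_VOICE_SHARED_SECRET}", "TEAMS_VOICE_SHARED_SECRET", env))
# Both sources set: the config.yaml value is used.
print(resolve("https://yaml.example", "TEAMS_VOICE_AZURE_ENDPOINT", env))
# Key absent from config.yaml: the env var is the fallback.
print(resolve(None, "TEAMS_VOICE_AZURE_ENDPOINT", env))
```

In a real setup the `env` mapping would typically be `os.environ` after the .env file has been loaded.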
config.yaml
%LOCALAPPDATA%\hermes\config.yaml (Windows) / ~/.hermes/config.yaml:
plugins:
  enabled:
    - teams_voice
  entries:
    teams_voice:
      config:
        shared_secret: ${TEAMS_VOICE_SHARED_SECRET} # must equal the secret set in StandIn
        host: 127.0.0.1
        port: 8443
        # Attach meeting-minutes .docx to the Teams chat (needs Graph
        # Sites.ReadWrite.All on the bot app); omit for text-only minutes.
        share_point_site_id: ${TEAMS_SHAREPOINT_SITE_ID}
        realtime:
          backend: azure
          azure_endpoint: https://<your-azure-resource>.cognitiveservices.azure.com
          azure_deployment: gpt-realtime
          azure_api_version: 2025-04-01-preview
          voice: cedar
          api_key: ${AZURE_FOUNDRY_API_KEY} # secret stays in .env
          vad_threshold: 0.5
          prefix_padding_ms: 300
          silence_duration_ms: 500
.env
%LOCALAPPDATA%\hermes\.env / ~/.hermes/.env - the secret store (used directly, or referenced above).
A fully env-only setup works too:
TEAMS_VOICE_SHARED_SECRET=... # must equal the secret set in StandIn
AZURE_FOUNDRY_API_KEY=... # realtime key (also used by the gateway)
# SharePoint (OneDrive) site for attaching files/minutes - host,siteGuid,webGuid
# (the bot AAD app needs Graph Sites.ReadWrite.All, admin-consented):
TEAMS_SHAREPOINT_SITE_ID=contoso.sharepoint.com,<siteGuid>,<webGuid>
# non-secret settings, for an env-only setup:
TEAMS_VOICE_HOST=127.0.0.1
TEAMS_VOICE_PORT=8443
TEAMS_VOICE_REALTIME_BACKEND=azure
TEAMS_VOICE_AZURE_ENDPOINT=https://<your-azure-resource>.cognitiveservices.azure.com
TEAMS_VOICE_AZURE_DEPLOYMENT=gpt-realtime
TEAMS_VOICE_AZURE_API_VERSION=2025-04-01-preview
TEAMS_VOICE_REALTIME_VOICE=cedar
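The TEAMS_SHAREPOINT_SITE_ID value in the .env example is a comma-separated triple (host,siteGuid,webGuid). A minimal sketch of validating that shape before use might look like the following; `SiteId` and `parse_site_id` are hypothetical names for illustration, not part of the plugin.

```python
from typing import NamedTuple


class SiteId(NamedTuple):
    """The three comma-separated parts of TEAMS_SHAREPOINT_SITE_ID."""
    host: str
    site_guid: str
    web_guid: str


def parse_site_id(raw: str) -> SiteId:
    """Split 'host,siteGuid,webGuid' and reject anything that is not exactly three non-empty parts."""
    parts = [p.strip() for p in raw.split(",")]
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected host,siteGuid,webGuid")
    return SiteId(*parts)


# Placeholder GUIDs, for illustration only.
site = parse_site_id("contoso.sharepoint.com,1111-aaaa,2222-bbbb")
print(site.host)
```

Failing early on a malformed value gives a clearer error than a rejected Graph request later on.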
backend: openai uses public OpenAI instead of Azure - set the OpenAI key in place of the Azure
endpoint/deployment fields.
For the hosted StandIn bridge to connect, set allow_remote_worker: true and bind a
reachable address (host: 0.0.0.0 or a tunnel) - the default 127.0.0.1 only accepts a local bridge.
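The interaction between host and allow_remote_worker is easy to get wrong, so a small sanity check can help. The sketch below uses Python's standard `ipaddress` module; the helper names are hypothetical, and treating any non-IP hostname other than localhost as remotely reachable is an assumption.

```python
import ipaddress


def is_loopback(host: str) -> bool:
    """True when the bind address only accepts connections from this machine."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a tunnel hostname); assumed reachable remotely.
        return False


def bind_problems(host: str, allow_remote_worker: bool) -> list:
    """Return human-readable warnings for inconsistent host / allow_remote_worker pairs."""
    problems = []
    loopback = is_loopback(host)
    if allow_remote_worker and loopback:
        problems.append("allow_remote_worker is true but host is loopback-only")
    if not allow_remote_worker and not loopback:
        problems.append("host is remotely reachable but allow_remote_worker is false")
    return problems


print(bind_problems("127.0.0.1", True))   # hosted bridge cannot reach a loopback bind
print(bind_problems("0.0.0.0", True))     # consistent: no warnings
```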
Key reference
Hermes implements the same feature set as the OpenClaw plugin; keys are snake_case and each has a
matching TEAMS_VOICE_* env var.
| Key | Meaning |
|---|---|
| shared_secret | HMAC secret - must match the secret set in StandIn (keep in .env) |
| host / port / path | media WebSocket the StandIn bridge connects to (default 127.0.0.1 : 8443, /voice/msteams/stream) |
| allow_remote_worker | true for hosted StandIn - accept a non-localhost bridge (default false) |
| require_recording_status | only engage once Teams reports recording active (default true) |
| allowlist / allowlist_allow_names | allowlisted caller AAD ids (closed when set); also match by display name |
| wake_phrases | group-call “speak only when addressed” wake words (default assistant, hermes) |
| max_vision_per_minute | vision spend cap (default 30, 0 = unlimited) |
| session_scope | per-call \| per-thread \| per-aad agent-memory scope |
| meeting_recap | post end-of-meeting minutes (default false) |
| share_point_site_id | SharePoint site for .docx minutes attach (needs Sites.ReadWrite.All) |
| worker_base_url / tenant_id | outbound “call me back” - StandIn’s outbound endpoint + your tenant |
| realtime.{backend,azure_endpoint,azure_deployment,voice,api_key,vad_threshold,…} | realtime backend (azure \| openai); key stays in .env |
| hmac_window_ms / max_connections / max_connections_per_ip | handshake window + connection caps |
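To make the max_vision_per_minute semantics concrete (a per-minute cap where 0 means unlimited), here is one way such a cap could be enforced with a sliding 60-second window. This is an illustrative sketch only: the class name `PerMinuteCap` is hypothetical and the plugin may count requests differently.

```python
from collections import deque


class PerMinuteCap:
    """Sliding-window cap: at most `limit` events in any 60-second span; 0 disables the cap."""

    def __init__(self, limit: int):
        self.limit = limit
        self._stamps = deque()  # timestamps (seconds) of recently allowed events

    def allow(self, now: float) -> bool:
        if self.limit == 0:
            return True  # 0 = unlimited
        # Drop events that have aged out of the 60-second window.
        while self._stamps and now - self._stamps[0] >= 60.0:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


cap = PerMinuteCap(limit=30)  # 30 is the documented default
allowed = sum(cap.allow(now=float(i)) for i in range(40))
print(allowed)  # 40 attempts within 40 seconds, capped at 30
```

Passing the clock in as `now` keeps the sketch deterministic; production code would more likely read `time.monotonic()`.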
Run
hermes teams-voice status # show config + readiness
hermes teams-voice serve --handler realtime # or: streaming (needs ffmpeg)
# or standalone:
python -m hermes_teams_voice.bridge_server
Register your plugin’s WebSocket (ws://<host>:8443/voice/msteams/stream) and a matching shared
secret in your StandIn dashboard.
Microsoft Graph permissions
The bot’s Azure AD app needs these application permissions (admin-consented):
| Permission | Enables |
|---|---|
| Calls.JoinGroupCall.All | answer / join Teams calls and meetings |
| Calls.AccessMedia.All | access real-time Teams call audio/video media |
| Chat.Read.All | resolve chat / thread ids and read message context |
| ChatMessage.Read.Chat | read messages in chats the bot is installed in |
| Sites.ReadWrite.All | upload files / minutes to SharePoint (OneDrive) for chat attachments |
| Calls.InitiateGroupCall.All | outbound “call me back” (skip if unused) |
The shared_secret must byte-match the secret set in StandIn, or the HMAC handshake fails and no
call connects.
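To show why the secret must match byte for byte, here is a minimal sketch of a timestamped HMAC check using Python's standard `hmac` module. The message format (a millisecond timestamp signed with HMAC-SHA256), the function names, and the 30-second default window are all assumptions for illustration; the real StandIn handshake may differ.

```python
import hashlib
import hmac
import time


def sign(secret: bytes, timestamp_ms: int) -> str:
    """HMAC-SHA256 over the timestamp (assumed message format, for illustration)."""
    return hmac.new(secret, str(timestamp_ms).encode(), hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp_ms: int, signature: str,
           window_ms: int = 30_000, now_ms=None) -> bool:
    """Accept only a fresh timestamp whose signature matches under our secret."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    if abs(now_ms - timestamp_ms) > window_ms:
        return False  # outside the handshake window (cf. hmac_window_ms)
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(sign(secret, timestamp_ms), signature)


ts = 1_700_000_000_000
sig = sign(b"s3cret", ts)
print(verify(b"s3cret", ts, sig, now_ms=ts))     # same bytes: accepted
print(verify(b"s3cret\n", ts, sig, now_ms=ts))   # stray newline: rejected
```

A single stray character, such as a trailing newline or a surrounding quote copied into .env, changes the digest completely, which is why a near-match fails just like a wrong secret.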