Config lives under plugins.entries."msteams-voice".config in your OpenClaw config. sharedSecret must match the secret you set in StandIn for the hosted bridge that connects to this plugin’s media WebSocket.

Mode selection

Set mode to "realtime" or "streaming". If omitted, the runtime auto-selects realtime when a realtime provider resolves, else streaming. Both modes honor the inbound allowlist, outbound call-backs, recording gate, and sessionScope agent memory. See Modes.
{
  "plugins": {
    "entries": {
      "msteams-voice": {
        "config": {
          "enabled": true,
          "mode": "realtime",
          "port": 9442,
          "path": "/voice/msteams/stream",
          "sharedSecret": "<same secret as in your StandIn dashboard>",
          "requireRecordingStatus": true,
          "inboundPolicy": "allowlist",
          "allowFrom": ["<caller AAD object id or phone number>"],
          "inboundGreeting": "Hello, this is the assistant.",
          "maxConcurrentCalls": 4,
          "maxDurationSeconds": 3600,
          "groupCall": {
            "requireAddress": true,
            "wakePhrases": ["assistant"],
            "followUpWindowMs": 8000
          },
          "maxVisionPerMinute": 30,
          "meetingRecap": true,
          "bilingual": true,
          "realtime": {
            "provider": "openai",
            "providers": {
              "openai": { "apiKey": "<key>", "model": "gpt-realtime" }
            },
            "instructions": "You are a helpful Teams meeting assistant.",
            "toolPolicy": "safe-read-only",
            "suppressInputDuringPlayback": true,
            "echoSuppressionWindowMs": 250,
            "echoBargeInRms": 0.02
          }
        }
      }
    }
  }
}
In streaming mode, TTS and the agent/model come from your OpenClaw configuration. STT uses a live transcription session, selected by stt.provider / stt.providers if set, otherwise by your OpenClaw-configured transcription provider; if neither resolves, it falls back to VAD-segmented file transcription. The realtime.* block is ignored except for the echo-guard knobs (suppressInputDuringPlayback, echoSuppressionWindowMs, echoBargeInRms), which apply in both modes.
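The selection rules described above can be sketched roughly as follows. This is an illustrative model only, not the plugin's actual code: the type and function names (VoiceConfig, resolveMode, resolveSttSource) are hypothetical, and the sketch assumes that a provider that is set also resolves.

```typescript
// Hypothetical types and names: illustrative only, not the plugin's API.
type Mode = "realtime" | "streaming";
type SttSource = "plugin-stt" | "openclaw-stt" | "vad-file-fallback";

interface VoiceConfig {
  mode?: Mode;
  stt?: { provider?: string };
}

// An explicit mode wins; otherwise prefer realtime when a realtime
// provider resolves, else fall back to streaming.
function resolveMode(config: VoiceConfig, realtimeProviderResolves: boolean): Mode {
  if (config.mode !== undefined) return config.mode;
  return realtimeProviderResolves ? "realtime" : "streaming";
}

// Streaming-mode STT: the plugin-level stt.provider first, then the
// OpenClaw-configured provider, then VAD-segmented file transcription.
// Assumption: "set" is treated as "resolves" to keep the sketch short.
function resolveSttSource(config: VoiceConfig, openclawSttConfigured: boolean): SttSource {
  if (config.stt?.provider) return "plugin-stt";
  if (openclawSttConfigured) return "openclaw-stt";
  return "vad-file-fallback";
}
```

For example, under these assumptions a config with no mode and no resolvable realtime provider runs in streaming mode, and with no STT provider configured anywhere it ends up on the file-transcription fallback.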

Outbound call-backs (optional, either mode)

"outbound": {
  "enabled": true,
  "workerBaseUrl": "https://<your-standin-endpoint>",
  "tenantId": "<aad-tenant-id>",
  "answerTimeoutMs": 120000,
  "defaultMode": "notify"        // "notify" delivers a message then ends; "conversation" opens a turn
}
placeCall(userObjectId, { message, mode }) is implemented on the runtime; the outbound block enables it. Calls that are declined or go unanswered resolve to voicemail or no-answer.
workerBaseUrl is StandIn’s outbound API URL from your dashboard, not a server you host.
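As a rough illustration of the call shape, the sketch below wraps placeCall in a small helper. Only the placeCall(userObjectId, { message, mode }) form comes from the text above; the VoiceRuntime interface, the notifyUser helper, the RecordingRuntime stand-in, and the Promise return type are assumptions made so the example can run on its own.

```typescript
// Hypothetical shapes: only the placeCall(userObjectId, { message, mode })
// call form is taken from the documentation.
type CallMode = "notify" | "conversation";

interface PlaceCallOptions {
  message: string;
  mode: CallMode;
}

interface VoiceRuntime {
  // The real return type is not documented here, so it is left as unknown.
  placeCall(userObjectId: string, options: PlaceCallOptions): Promise<unknown>;
}

// Deliver a one-way message and let the call end ("notify" mode).
function notifyUser(
  runtime: VoiceRuntime,
  userObjectId: string,
  message: string
): Promise<unknown> {
  return runtime.placeCall(userObjectId, { message, mode: "notify" });
}

// Stand-in runtime that records calls, so the sketch runs without a live bridge.
class RecordingRuntime implements VoiceRuntime {
  calls: Array<{ userObjectId: string; options: PlaceCallOptions }> = [];

  placeCall(userObjectId: string, options: PlaceCallOptions): Promise<unknown> {
    this.calls.push({ userObjectId, options });
    return Promise.resolve(undefined);
  }
}
```

The helper passes mode explicitly instead of relying on outbound.defaultMode, since the text does not spell out how an omitted mode is handled.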

Key reference

| Key | Applies | Meaning |
| --- | --- | --- |
| enabled | both | master on/off |
| mode | both | "realtime" or "streaming" (auto if omitted) |
| port / bindAddress / path | both | media WebSocket server the StandIn bridge connects to |
| sharedSecret | both | HMAC secret; must match the secret set in StandIn (secret input) |
| requireRecordingStatus | both | only engage once Teams reports recording active |
| inboundPolicy | both | disabled, allowlist, pairing, or open; enforced on inbound |
| allowFrom | both | allowlisted caller ids (Teams aadId or phone digits) |
| inboundGreeting | both | opening line |
| sessionScope | both | per-phone, per-call, or per-thread agent-memory scope |
| maxConcurrentCalls / maxDurationSeconds / staleCallReaperSeconds | both | capacity + reaper |
| groupCall.{requireAddress,wakePhrases,followUpWindowMs} | both | speak-only-when-addressed gating |
| maxVisionPerMinute | both | vision spend cap |
| meetingRecap / bilingual | both | post-call minutes / Arabic-English |
| realtime.{provider,providers,instructions,toolPolicy,…} | realtime (echo knobs: both) | realtime voice provider + behavior; provider key is a secret input |
| stt.{provider,providers} | streaming | live transcription provider (else OpenClaw STT / file fallback); provider key is a secret input |
| outbound.{enabled,workerBaseUrl,tenantId,answerTimeoutMs,defaultMode} | both | outbound call-backs / voicemail |
Treat sharedSecret and all provider apiKey values as secrets, and keep them out of source control.
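One generic way to keep these values out of the repository is to read them from the environment when the config is generated. The sketch below is a minimal example under that assumption: the variable names MSTEAMS_VOICE_SHARED_SECRET and OPENAI_API_KEY and the helper functions are hypothetical, and how the result is merged into your OpenClaw config depends on your setup.

```typescript
// Hypothetical variable names and helpers: illustrative only.
type Env = Record<string, string | undefined>;

// Fail fast instead of starting with an empty or missing secret.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Builds only the secret-bearing part of the msteams-voice config.
function buildSecretConfig(env: Env) {
  return {
    sharedSecret: requireEnv(env, "MSTEAMS_VOICE_SHARED_SECRET"),
    realtime: {
      provider: "openai",
      providers: {
        openai: {
          apiKey: requireEnv(env, "OPENAI_API_KEY"),
          model: "gpt-realtime",
        },
      },
    },
  };
}
```

In a Node.js script you would pass process.env as the argument. The function takes the environment as a parameter so it can be exercised with a plain object.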

Microsoft Graph permissions

You bring your own Teams bot: register an Azure AD app + Azure Bot resource in your tenant, admin-consent the application permissions below, and point its calling webhook at StandIn (the URL is shown in your StandIn dashboard).
| Permission | Enables |
| --- | --- |
| Calls.JoinGroupCall.All | answer / join Teams calls and meetings |
| Calls.AccessMedia.All | access real-time Teams call audio/video media |
| Chat.Read.All | resolve chat / thread ids and read message context |
| ChatMessage.Read.Chat | read messages in chats the bot is installed in |
| Sites.ReadWrite.All | upload files / minutes to SharePoint (OneDrive) for chat attachments |
| Calls.InitiateGroupCall.All | outbound “call me back” (skip if unused) |