AI Voice Agent — Intelligent Call Orchestration

What We Built

An end-to-end voice AI integration that bridges an AI agent to a live agent through AWS infrastructure. A caller dials into AWS Connect and is routed to the AI agent via Chime SMA. When a transfer is needed, the AI agent hands the call back to a live agent queue, all within the same PSTN call.

LEG-1 — Inbound to AI Agent

The AWS Connect flow plays "Transfer you to Willow", then forwards the call to +13124420260 (Chime SMA). Lambda processes the SIP event, logs it in DynamoDB, and bridges the call to the LiveKit SIP server, where the AI agent answers.

AWS Connect → Transfer to +13124420260 → Chime SMA → Lambda + DDB → LiveKit SIP → AI Agent

LEG-2 — Transfer to Live Agent

When the AI agent determines a transfer is needed, it triggers a call transfer via the LiveKit SIP API. Lambda receives the event and uses CallAndBridge to connect the caller to the AWS Connect live agent queue. The caller then hears "Hello, This is Prince William County 311 Team. How can I help you today?"

AI Agent → Call Transfer → Chime SMA → Lambda + DDB → CallAndBridge → AWS Connect (Live Agent)

Baseline Performance Metrics

Measured in the integration test environment. The measured values are the current baselines; the targets are the goals for production deployment.

Metric	Measured	Target
TTFA (LEG-1)	< 3s	under 1.5s
CTD (LEG-2)	< 5s	under 2s
E2E (full call cycle)	~20s	none listed
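
As a rough illustration of how these three numbers could be derived from per-call timestamps, the sketch below follows the definitions in the Abbreviation Reference (TTFA: call answered to first audio; CTD: transfer request to LEG-2 connected; E2E: call answered to call ended). The event names FIRST_AUDIO and LEG2_CONNECTED are hypothetical; the journal may record these moments under different names.

```python
from datetime import datetime
from typing import Dict, Optional


def _seconds_between(start: Optional[datetime], end: Optional[datetime]) -> Optional[float]:
    """Return end - start in seconds, or None if either timestamp is missing."""
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


def compute_call_metrics(events: Dict[str, datetime]) -> Dict[str, Optional[float]]:
    """Derive TTFA, CTD and E2E for one call from its timestamped events.

    Keys such as "FIRST_AUDIO" and "LEG2_CONNECTED" are assumed names,
    not taken from the actual journal schema.
    """
    return {
        "TTFA": _seconds_between(events.get("ANSWER"), events.get("FIRST_AUDIO")),
        "CTD": _seconds_between(events.get("TRANSFER"), events.get("LEG2_CONNECTED")),
        "E2E": _seconds_between(events.get("ANSWER"), events.get("HANGUP")),
    }
```

A metric comes back as None when one of its two timestamps is missing, for example CTD on a call that was never transferred.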

Roadmap — Next Steps

DONE End-to-end call flow — AI agent answers, transfers to live agent via Chime SMA + Lambda + DDB
DONE Call log journal in DDB — Every call event timestamped: INVITE, ANSWER, TRANSFER, HANGUP
DONE Performance baseline — TTFA, CTD, E2E measured and tracked per call
NEXT Hours of Operation — Check AWS Connect calendar before transfer; after-hours calls handled fully by AI
NEXT Live Agent Availability — Query CCP agent status (Active/Inactive) before transfer; inform caller if no agents available
FUTURE Agent routing logic — Skills-based routing, queue priority, estimated wait time announced by AI
NEXT Define concurrent-call capacity — Size Production vs UAT quotas (UAT currently capped at 10); model peak-hour volume and set per-environment limits
NEXT Raise AWS quota tickets — Request increases for AWS Connect active-call limits and Chime SDK SMA concurrent-call limits for both Production and UAT accounts
NEXT Chime SDK call guardrails — Implement total-active-call tracking in DDB; reject or queue new SMA calls when at capacity; evaluate CloudWatch Metric Alarms (CMA) for threshold alerts
NEXT Automated health-check probes — Run integration test as a scheduled probe every N hours; verify end-to-end call flow, alert on failure via SNS/PagerDuty
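
One possible shape for the planned active-call guardrail is a conditional counter update in DDB. The sketch below only builds the arguments for a boto3 `Table.update_item` call; the counter item key `ACTIVE_CALL_COUNTER` and the `ActiveCalls` attribute are hypothetical and do not come from the existing table design.

```python
from typing import Any, Dict

COUNTER_KEY = "ACTIVE_CALL_COUNTER"  # hypothetical reserved item holding the counter


def build_admission_update(max_active_calls: int) -> Dict[str, Any]:
    """Build update_item kwargs that claim one call slot if capacity remains.

    The condition makes the increment atomic: the write is rejected once the
    counter has reached the cap, so two concurrent calls cannot share the
    last slot.
    """
    return {
        "Key": {"TransactionId": COUNTER_KEY},
        "UpdateExpression": "ADD ActiveCalls :one",
        "ConditionExpression": "attribute_not_exists(ActiveCalls) OR ActiveCalls < :max",
        "ExpressionAttributeValues": {":one": 1, ":max": max_active_calls},
    }


def build_release_update() -> Dict[str, Any]:
    """Build update_item kwargs that give a slot back when a call ends."""
    return {
        "Key": {"TransactionId": COUNTER_KEY},
        "UpdateExpression": "ADD ActiveCalls :minus_one",
        "ConditionExpression": "ActiveCalls > :zero",
        "ExpressionAttributeValues": {":minus_one": -1, ":zero": 0},
    }
```

When the condition fails, boto3 raises a `ClientError` whose error code is `ConditionalCheckFailedException`; the Lambda could treat that as "at capacity" and reject or queue the new call.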

Lambda — Call Orchestration Engine

The Lambda function is the brain of the call flow. It processes every Chime SMA event and decides what action to take.

NEW_INBOUND_CALL — Accept the call, optionally play ringback, bridge to LiveKit SIP
ACTION_SUCCESSFUL (CallAndBridge) — Log that LEG-1 or LEG-2 is connected
CALL_UPDATE_REQUESTED — Agent wants to transfer → initiate CallAndBridge to AWS Connect
HANGUP — Clean up DDB record, determine which leg ended and whether to terminate the other
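
The event handling above can be sketched as a small dispatcher that maps each SMA event type to a list of actions. This is a minimal sketch under stated assumptions, not the deployed function: the `config` keys (`livekit_endpoint`, `connect_endpoint`, `transfer_caller_id`) are hypothetical, DDB logging is omitted, and tearing down the existing LiveKit leg before a transfer is not shown.

```python
from typing import Any, Dict, List

Action = Dict[str, Any]


def call_and_bridge(caller_id: str, endpoint: Dict[str, str]) -> Action:
    """Build a CallAndBridge action toward a single endpoint."""
    return {
        "Type": "CallAndBridge",
        "Parameters": {
            "CallTimeoutSeconds": 30,
            "CallerIdNumber": caller_id,
            "Endpoints": [endpoint],
        },
    }


def hangup(call_id: str) -> Action:
    """Build a Hangup action for one call leg."""
    return {"Type": "Hangup", "Parameters": {"CallId": call_id, "SipResponseCode": "0"}}


def handle_sma_event(event: Dict[str, Any], config: Dict[str, Any]) -> Dict[str, Any]:
    """Decide which actions to return for one Chime SMA invocation."""
    event_type = event.get("InvocationEventType")
    participants: List[Dict[str, Any]] = event.get("CallDetails", {}).get("Participants", [])
    actions: List[Action] = []

    if event_type == "NEW_INBOUND_CALL":
        # LEG-1: bridge the inbound caller to the LiveKit SIP server.
        actions.append(call_and_bridge(participants[0]["From"], config["livekit_endpoint"]))
    elif event_type == "CALL_UPDATE_REQUESTED":
        # LEG-2: the agent asked for a transfer, so bridge to AWS Connect.
        actions.append(call_and_bridge(config["transfer_caller_id"], config["connect_endpoint"]))
    elif event_type == "HANGUP":
        # One leg ended: hang up any leg that is still connected.
        actions.extend(hangup(p["CallId"]) for p in participants if p.get("Status") == "Connected")
    # ACTION_SUCCESSFUL and any other event: nothing to change, return no actions.

    return {"SchemaVersion": "1.0", "Actions": actions}
```

Keeping the decision logic in a pure function like this makes it testable without AWS; the real Lambda entry point would wrap it and write the journal record before returning.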

DynamoDB — Call State + Journal

DDB serves three critical roles:

Call State Tracking — TransferInProgress, PendingTransferToPstn fields tell Lambda where each call is in the flow
Call Log Journal — Every event is timestamped: LastEventType, LastEventTime, TransactionId
Active Call Monitoring — Scan DDB to get real-time counts of SMA active, Connect active, and in-flight calls

Table: CallForwardingLiveKitOnlyStack-SmaCallTracking

PK: TransactionId (Chime call ID)

Fields: LastEventType, LastEventTime, TransferInProgress, PendingTransferToPstn, LegA/LegB CallId
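
A journal write against this table might look like the sketch below, which builds the arguments for a boto3 `Table.update_item` call keyed on TransactionId. The ISO-8601 timestamp format and the helper name `build_journal_update` are assumptions; the actual stored format is not specified here.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_journal_update(
    transaction_id: str,
    event_type: str,
    now: Optional[datetime] = None,
    **state_flags: bool,
) -> Dict[str, Any]:
    """Build update_item kwargs that record the latest event for one call.

    Extra keyword arguments (e.g. TransferInProgress=True) are written as
    additional attributes alongside LastEventType and LastEventTime.
    """
    now = now or datetime.now(timezone.utc)
    names = {"#type": "LastEventType", "#time": "LastEventTime"}
    values: Dict[str, Any] = {":type": event_type, ":time": now.isoformat()}
    assignments = ["#type = :type", "#time = :time"]

    # Sort the flags so the generated expression is deterministic.
    for i, (field, flag) in enumerate(sorted(state_flags.items())):
        names[f"#f{i}"] = field
        values[f":f{i}"] = flag
        assignments.append(f"#f{i} = :f{i}")

    return {
        "Key": {"TransactionId": transaction_id},
        "UpdateExpression": "SET " + ", ".join(assignments),
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }
```

For example, `build_journal_update(tx_id, "CALL_UPDATE_REQUESTED", TransferInProgress=True)` would mark a call as mid-transfer in the same write that journals the event.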

LEG-1 Architecture — Inbound

Caller (PSTN) → AWS Connect → Transfer +13124420260

↓

Chime Voice Connector → Chime SMA → Lambda

↓ DDB: write NEW_INBOUND_CALL

LiveKit SIP Server → LiveKit Room → AI Agent (TTS)

LEG-2 Architecture — Transfer

AI Agent → LiveKit SIP API → Chime SMA

↓ CALL_UPDATE_REQUESTED

Lambda → CallAndBridge → AWS Connect

↓ DDB: update PendingTransferToPstn

Live Agent Queue → "Hello, This is PWC 311 Team..."

Production Additions

NEXT Hours of Operation Check — Lambda queries the AWS Connect Hours of Operation API before CallAndBridge. If closed → AI agent handles the call or offers a callback.
NEXT Agent Availability Check — Lambda queries Connect GetCurrentUserData API for agent status in CCP. If no agents Active → AI informs caller of wait time or offers callback.
FUTURE DDB Journal Analytics — Dashboard for call volume, avg TTFA/CTD, transfer success rate, after-hours volume.
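
The two pre-transfer checks could be combined roughly as follows. The `config` list mirrors the shape of the `Config` entries returned by Connect's DescribeHoursOfOperation (a `Day` plus `StartTime` and `EndTime` with `Hours` and `Minutes`); the function names and the returned labels are hypothetical. The sketch assumes `now_local` has already been converted to the schedule's time zone and does not handle slots that run past midnight.

```python
from datetime import datetime, time
from typing import Any, Dict, List

DAYS = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY", "FRIDAY", "SATURDAY", "SUNDAY"]


def is_within_hours(config: List[Dict[str, Any]], now_local: datetime) -> bool:
    """Return True if now_local falls inside any slot configured for that day."""
    day = DAYS[now_local.weekday()]
    current = now_local.time()
    for slot in config:
        if slot["Day"] != day:
            continue
        start = time(slot["StartTime"]["Hours"], slot["StartTime"]["Minutes"])
        end = time(slot["EndTime"]["Hours"], slot["EndTime"]["Minutes"])
        if start <= current < end:
            return True
    return False


def decide_transfer(is_open: bool, available_agents: int) -> str:
    """Pick the next step before CallAndBridge (labels are illustrative only)."""
    if not is_open:
        return "AI_HANDLES_CALL"
    if available_agents == 0:
        return "OFFER_CALLBACK"
    return "TRANSFER"
```

The available-agent count would come from the agent status query described above; it is passed in as a plain integer here so the decision logic stays independent of the API response shape.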

LEG-0 WebRTC Sim Call

WebRTC → Call Transfer → Voice Connector Direct → AWS Connect

(Flow to Transfer to Willow)

LEG-1 Inbound to Agent

AWS Connect → Chime SMA (Lambda + DDB) → LiveKit SIP/Server → Agent

LEG-2 Transfer to Live Agent

Agent → Call Transfer → Chime SMA (Lambda + DDB) → AWS Connect (Live Agent Simulation)

ACTIVE CALL COUNT

Live counters, polled every 10s: LK Rooms, SIP Calls, Web Users, Connect Active, SMA Active, DDB In-Flight

Abbreviation Reference

Quick-reference glossary for every abbreviation used across this application.

Call Flow & Telephony

SIP	Session Initiation Protocol — signaling protocol for establishing, modifying, and terminating voice/video calls
PSTN	Public Switched Telephone Network — the global circuit-switched phone network
RTP	Real-time Transport Protocol — carries audio/video media payloads over UDP
SRTP	Secure RTP — encrypted version of RTP used in WebRTC and secure VoIP
DTMF	Dual-Tone Multi-Frequency — touch-tone signals (0-9, *, #) sent in-band or via SIP INFO/RTP events
IVR	Interactive Voice Response — automated phone menu system ("Press 1 for…")
DID	Direct Inward Dialing — a phone number that routes directly to a specific destination without a switchboard
LEG-1	First call segment: Caller → AWS Connect → Chime SMA → LiveKit SIP → AI Agent
LEG-2	Second call segment: AI Agent → Chime SMA → AWS Connect → Live Agent (silent transfer)
CCP	Contact Control Panel — AWS Connect agent desktop for handling calls

AWS Services

SMA	SIP Media Application — AWS Chime SDK component that invokes Lambda on each call event
SDK	Software Development Kit — AWS Chime SDK provides programmable voice/video APIs
VC	Voice Connector — AWS Chime component that bridges SIP trunks to/from external networks
DDB	DynamoDB — AWS NoSQL database used for call-state tracking and event journaling
SNS	Simple Notification Service — AWS pub/sub messaging for alerts (email, SMS, PagerDuty)
CMA	CloudWatch Metric Alarms — threshold-based alerts on AWS metrics (e.g. active call count)
PK	Partition Key — the primary key used to distribute and look up items in DynamoDB

Performance Metrics

TTFA	Time To First Audio — elapsed time from SIP call answered to first TTS audio heard by caller
CTD	Call Transfer Delay — elapsed time from agent transfer request to LEG-2 connected
E2E	End-to-End Duration — total call time measured on the phone system (SIP answered → call ended)

WebRTC & Media

WebRTC	Web Real-Time Communication — browser API for peer-to-peer audio/video/data
ICE	Interactive Connectivity Establishment — protocol that finds the best network path between peers
STUN	Session Traversal Utilities for NAT — discovers a host's public IP for NAT traversal
TURN	Traversal Using Relays around NAT — relay server for when direct peer connection fails
NAT	Network Address Translation — maps private IPs to public IPs at the router
SFU	Selective Forwarding Unit — LiveKit's media server that routes audio/video tracks between participants

AI & Agent

TTS	Text-to-Speech — converts agent text responses into spoken audio (Deepgram)
STT	Speech-to-Text — transcribes caller speech into text for the LLM
LLM	Large Language Model — the AI model that generates agent responses (e.g. GPT-4o)
VAD	Voice Activity Detection — detects when a speaker starts/stops talking for turn-taking

Infrastructure & Operations

UAT	User Acceptance Testing — pre-production environment for validation (current call limit: 10)
API	Application Programming Interface — programmatic endpoints for service interaction
JWT	JSON Web Token — signed token used for LiveKit room authentication
OTEL	OpenTelemetry — observability framework for distributed traces, metrics, and logs
CI/CD	Continuous Integration / Continuous Deployment — automated build, test, and deploy pipeline
