Running Edge 2024.4.1 on bare metal Dell R750 nodes. Dual NIC setup handles WAN and local LAN. WAN link flapped last Tuesday. Edge automatically flipped to survivability mode. Local SIP phones registered fine. The AI voice bot flow in Architect started dropping audio packets immediately. Console shows the media bridge timing out. Bot session stays active in the cloud but the local node can’t push the PCM stream back.
Checked the Edge logs. The WebSocket connection to the bot orchestration endpoint keeps resetting.
{
"timestamp": "2024-08-12T09:14:22Z",
"component": "media-bridge",
"level": "ERROR",
"message": "Bot audio stream handshake failed. WSS connection reset by peer.",
"error_code": "BOT_MEDIA_TIMEOUT_502",
"session_id": "bs_88a7c2d1",
"edge_node": "tokyo-edge-01"
}
BIOS settings are locked to default. Network routing table looks clean. You don’t need to touch the firewall rules. The bot flow uses the standard connector. Audio works perfectly when WAN is stable. Once the local node takes over, the bot media path breaks doing jack all. Switching the bot to a plain SIP trunk bypasses the issue. The mic stays hot but the bot just stutters. Logs repeat that handshake error every two seconds. No idea where the media bridge is routing the packets.
Look, I don’t do much on-prem SIP, but if you’re trying to debug this from the API side, you’re probably hitting a wall with the media bridge metrics. The .NET SDK doesn’t expose the low-level PCM stream status directly in the conversation object. It just shows “active” or “connected”.
You need to query the analytics endpoint for that specific time window to see where the packets actually stopped flowing. If the Edge is in survivability mode, the media path changes completely. The bot session stays in the cloud, but the media might be trying to route through a path that’s now dead.
Here’s how I’d pull the details in C# to see if it’s a network timeout or a bot logic error:
var client = new PureCloudPlatformClientV2();
client.AuthSettings.AccessToken = "your_token"; // or use oauth flow
var analyticsApi = new AnalyticsApi(client);
var query = new QueryConversationDetailsRequest
{
View = "summary",
TimeFilter = new TimeFilter
{
Type = "relative",
RelativeTo = "now",
RangeStart = "-PT1H" // last hour
},
Selectors = new List<ConversationSelector>
{
new ConversationSelector
{
Type = "conversation",
Id = "your_conversation_id" // get this from your active call list
}
}
};
var response = await analyticsApi.PostAnalyticsConversationsDetailsQuery(query);
Check the mediaEvents in the response. Look for media_timeout or media_restart. If you see those, it’s not the bot, it’s the Edge media server losing the handshake with the cloud orchestrator.
Don’t ignore the mediaServer field in the analytics response. It tells you which Edge node handled the media. If that node changed mid-call, you’ll see a handoff event. If the handoff failed, the audio dies.
I usually wrap this in an Azure Function that triggers on v2.conversation.media.event. That way you get notified in real-time when the media path breaks, instead of digging through analytics later. The docs say: “Media events are emitted when the media state changes.” But they don’t tell you that the event payload is sparse during survivability failovers. You’ll have to cross-reference with the Edge system logs.