WebRTC socket drops at 2k concurrent users in Genesys Cloud

I'm trying to understand why WebSocket connections in our softphone load test drop at random once we reach about 2,000 concurrent users. The environment is Genesys Cloud (PureCloud), EU-West. We use the latest JavaScript SDK (v2.5.1) inside a custom React wrapper that JMeter drives via WebDriver. The setup is a simple inbound flow with no complex routing, just direct agent assignment. The browser console shows intermittent "WebSocket closed abnormally" errors, after which the SDK attempts to reconnect. The reconnect logic works, but each reconnection adds a latency spike of about 800 ms, which ruins our CSAT simulation metrics.
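For anyone who wants to reproduce the measurement, here is a minimal sketch of how the close events and reconnect gaps could be logged. It uses only standard WebSocket close codes; `ReconnectTracker` and `describeCloseCode` are hypothetical helper names, not part of the Genesys SDK, and how you get at the SDK's underlying socket events is an assumption left out of the sketch.

```typescript
// A few of the close codes defined in RFC 6455. Browsers report 1006 when the
// connection ends without a close frame, which matches "closed abnormally".
const CLOSE_CODE_NAMES: Record<number, string> = {
  1000: "normal closure",
  1001: "going away",
  1006: "abnormal closure (no close frame received)",
  1009: "message too big",
  1011: "internal server error",
};

function describeCloseCode(code: number): string {
  return CLOSE_CODE_NAMES[code] ?? `unrecognized code ${code}`;
}

interface ReconnectGap {
  code: number;       // close code that started the gap
  closedAt: number;   // ms timestamp of the close event
  reopenedAt: number; // ms timestamp of the next open event
  gapMs: number;      // time spent disconnected
}

// Hypothetical helper: pairs each close with the next open to measure the gap.
class ReconnectTracker {
  readonly gaps: ReconnectGap[] = [];
  private pendingClose: { at: number; code: number } | null = null;

  onClose(code: number, at: number = Date.now()): void {
    // Keep only the first close if several arrive before a reopen.
    if (this.pendingClose === null) {
      this.pendingClose = { at, code };
    }
  }

  onOpen(at: number = Date.now()): void {
    if (this.pendingClose === null) {
      return; // initial connect, nothing to measure
    }
    const { at: closedAt, code } = this.pendingClose;
    this.gaps.push({ code, closedAt, reopenedAt: at, gapMs: at - closedAt });
    this.pendingClose = null;
  }
}

// Wiring against a socket you control (assumption: you have a handle on it):
//   ws.addEventListener("close", (e) => tracker.onClose(e.code));
//   ws.addEventListener("open", () => tracker.onOpen());
```

Logging the close code alongside the gap should show whether every drop is a 1006 or whether some closures are initiated cleanly by the server.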

We have verified that we are not hitting the API rate limits: there are no 429 errors in our logs. The issue appears to be strictly on the media plane. We use the STUN servers provided by Google and have configured TURN servers through Twilio for NAT traversal. The JMeter test runs from a single EC2 instance in Singapore to simulate a localized BPO environment. The WebSocket handshake completes successfully, and the initial media negotiation (SDP exchange) works fine. However, after approximately 15 minutes of steady state, the connections start failing. We use the genesys-cloud-webrtc npm package directly.

Has anyone seen similar behavior with high-concurrency WebRTC tests? We tried increasing the maxReconnectAttempts in the SDK config, but that just delays the failure rather than preventing it. We also checked the Genesys Cloud activity logs and see no corresponding errors on the backend. The issue persists even when we reduce the video resolution to 240p to minimize bandwidth usage. We suspect it might be related to how the SDK handles heartbeat intervals under load, but we are not sure. Any pointers on debugging the WebSocket lifecycle in the SDK or known limits on concurrent WebRTC sessions per org would be helpful. We are currently blocked on our capacity planning report due to this instability.
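One way to test the heartbeat suspicion is to check whether the drops cluster around the same connection age. If most sockets die at roughly 15 minutes regardless of load, an idle timeout somewhere along the path looks more likely than a concurrency limit. The sketch below assumes you have already collected each connection's lifetime in milliseconds; `summarizeLifetimes` and the 30-second tolerance are hypothetical choices, not anything from the SDK.

```typescript
interface LifetimeSummary {
  count: number;
  medianMs: number;
  fractionNearMedian: number; // share of lifetimes within the tolerance of the median
}

// Summarizes how tightly connection lifetimes cluster around their median.
function summarizeLifetimes(
  lifetimesMs: number[],
  toleranceMs: number = 30000,
): LifetimeSummary {
  if (lifetimesMs.length === 0) {
    return { count: 0, medianMs: NaN, fractionNearMedian: 0 };
  }
  // Copy before sorting so the caller's array is not mutated.
  const sorted = [...lifetimesMs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const medianMs =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const near = sorted.filter((ms) => Math.abs(ms - medianMs) <= toleranceMs).length;
  return { count: sorted.length, medianMs, fractionNearMedian: near / sorted.length };
}

// Example: four connections die near 15 minutes, one dies early.
const summary = summarizeLifetimes([895000, 900000, 905000, 910000, 300000]);
console.log(summary.medianMs, summary.fractionNearMedian);
```

A high `fractionNearMedian` points toward a fixed timer; a wide spread is more consistent with resource pressure on the load generator or the network.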

The documentation actually says WebRTC relies on UDP for media, so checking WebSocket stability won’t solve the drop issue. Focus on your carrier’s RTP timeout settings and ensure the BYOC trunk isn’t dropping idle sessions prematurely.
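To separate the two planes, it may help to sample inbound RTP counters and see whether media stops flowing before the socket closes. This is only a sketch: `inbound-rtp`, `kind`, `packetsReceived`, and `timestamp` are standard WebRTC stats fields, but `pickInboundAudio` and `mediaStalled` are hypothetical helpers, and obtaining the `RTCPeerConnection` from the SDK is assumed rather than shown.

```typescript
// Minimal shape of the stats entries this sketch cares about.
interface StatLike {
  type: string;
  kind?: string;
  packetsReceived?: number;
  timestamp: number;
}

// Finds the inbound audio RTP entry among the entries of a stats report.
function pickInboundAudio(stats: StatLike[]): StatLike | undefined {
  return stats.find((s) => s.type === "inbound-rtp" && s.kind === "audio");
}

// True when time has advanced between two samples but no new packets arrived.
function mediaStalled(prev: StatLike, curr: StatLike): boolean {
  const before = prev.packetsReceived ?? 0;
  const after = curr.packetsReceived ?? 0;
  return curr.timestamp > prev.timestamp && after === before;
}

// In a browser, the entries could be collected like this (assumption: `pc` is
// the active RTCPeerConnection):
//   const entries: StatLike[] = [];
//   (await pc.getStats()).forEach((s) => entries.push(s as StatLike));
//   const audio = pickInboundAudio(entries);
```

If `mediaStalled` turns true a few seconds before each WebSocket closure, the RTP path deserves the first look; if audio keeps flowing right up to the close, the signaling socket is the more likely culprit.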