Just noticed that our load test scripts are failing hard when we push past 5000 concurrent sessions in a single Architect flow. The WebSocket connections to the media server seem to be getting dropped randomly with a 1006 close code after about 30 seconds of silence. We are using the standard platform API to inject calls via JMeter, and the flow is super basic, just a transfer step. But once the volume hits that threshold, the latency spikes and then the connections just vanish. I checked the server logs and see no explicit 503 or 429 errors from the API gateway itself. Is there a hard limit on WebSocket connections per flow or per organization that we are hitting? The documentation mentions rate limits for REST calls but says very little about WebSocket capacity planning for high concurrency. Also seeing some intermittent 504s on the analytics endpoint when trying to query real-time stats during the spike. Anyone else see this behavior when testing high-volume IVR flows? We need to know if this is a config issue or a platform limit before our production rollout next week. Running in the APAC region, by the way.
Have you tried adjusting the idle_timeout in your SIP trunk configuration? With 15 BYOC trunks, carrier-specific keep-alive intervals often conflict with Genesys default settings. Check if session-timers are enabled and if the refresher role matches your carrier’s expectations to prevent premature 1006 closures.
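If you can export close events from the load test, one quick way to test the idle-timeout theory is to bucket the 1006 closes by how long each connection was silent before it dropped. Here is a minimal sketch, assuming a made-up record shape of (last_activity_ts, close_ts, close_code) with timestamps in seconds. The function name and the sample data are hypothetical, not anything the platform or JMeter emits:

```python
from collections import Counter


def idle_gap_histogram(events, bucket_seconds=5):
    """Count abnormal (1006) closes by idle time before the drop.

    events: iterable of (last_activity_ts, close_ts, close_code) tuples,
    timestamps in seconds. This record shape is an assumption for the
    sketch, not a platform log format.
    """
    buckets = Counter()
    for last_activity, closed_at, code in events:
        if code != 1006:
            continue  # only abnormal closures are of interest here
        idle = closed_at - last_activity
        # Round the idle time down to the start of its bucket.
        bucket = int(idle // bucket_seconds) * bucket_seconds
        buckets[bucket] += 1
    return dict(sorted(buckets.items()))


# Hypothetical sample: three drops near 30 s of silence, one early drop,
# and one normal (1000) close that is ignored.
sample = [
    (0.0, 30.2, 1006),
    (10.0, 41.0, 1006),
    (5.0, 36.9, 1006),
    (0.0, 12.0, 1000),
    (2.0, 9.5, 1006),
]
print(idle_gap_histogram(sample))  # {5: 1, 30: 3}
```

A tall spike in one bucket (around 30 s in this sample) would point at a timer somewhere in the path, while a flat spread would point more toward capacity or network trouble.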
It’s worth looking at how schedule publishing impacts media server resources. While this is a WFM thread, high concurrency during Tuesday 09:00 CST publishing often strains infrastructure.
- Verify if load tests align with schedule publish windows.
- Check WFM API rate limits causing downstream bottlenecks.
- Ensure agent self-service isn’t triggering unnecessary reconnections.
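The first check can be done with a simple interval comparison. This is a rough sketch, assuming all timestamps are already in the same time zone and that a publish run lasts about 30 minutes. That duration, the function name, and the example dates are placeholders rather than measured values:

```python
from datetime import datetime, timedelta


def overlaps_publish_window(test_start, test_end, publish_start,
                            publish_duration=timedelta(minutes=30)):
    """Return True if the load test interval intersects the publish window.

    The 30-minute default is an assumed placeholder; substitute the
    duration you actually observe for schedule publishing.
    """
    publish_end = publish_start + publish_duration
    # Two intervals overlap when each one starts before the other ends.
    return test_start < publish_end and publish_start < test_end


# Hypothetical example: Tuesday 2024-06-04, publish at 09:00.
publish = datetime(2024, 6, 4, 9, 0)
during = overlaps_publish_window(
    datetime(2024, 6, 4, 8, 45), datetime(2024, 6, 4, 9, 15), publish)
after = overlaps_publish_window(
    datetime(2024, 6, 4, 10, 0), datetime(2024, 6, 4, 10, 30), publish)
print(during, after)  # True False
```

If the failing runs consistently land inside the window and the clean runs do not, that is worth ruling out before blaming a platform limit.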