Architect Bot 503 Service Unavailable under High Concurrency Load Test

Has anyone seen Architect bots return a 503 error when concurrent call volume spikes?

We are running a performance test for our new AI-driven support flow. The environment is a Genesys Cloud org with a BYOC Edge in AWS us-east-1. We are using JMeter to simulate 200 concurrent inbound calls triggering the same bot flow. The bot uses a simple “Greeting → Intent Classification → Transfer to Agent” pattern.

When the concurrent session count hits around 150, we start seeing 503 Service Unavailable errors on the WebSocket connection for the bot interactions. The error occurs before the bot even sends the first message. The calls that do not fail are handled correctly, but the failed ones drop to the IVR fallback.

Here is the error snippet from our logs:

WebSocket Error: 1006 Abnormal Closure
HTTP Response: 503 Service Unavailable
Payload: {"error": "service_unavailable", "message": "Bot service temporarily overloaded"}

We are using the latest Architect version. We have checked the API rate limits, but this seems to be a connection limit issue rather than a REST API throttling issue. The WebSocket connection limit for our org is set to 1000, so we are well within the documented limits.

Is this a known issue with Architect bot capacity under high concurrency? Are there any specific configuration settings for bot flows that we are missing? We need to stabilize this before our go-live date. Any help is appreciated.

Hit the exact same 503s during a recent JMeter run on a similar bot flow. The issue usually isn’t the bot logic itself. More likely the platform can’t set up WebSocket connections for real-time media streams fast enough when concurrency spikes that quickly, even if the total connection count stays well under your org limit.

At 150+ concurrent sessions, the edge might be struggling to maintain state for all active WebSocket connections simultaneously. Try adding a pause element right after the greeting in your Architect flow. Even a 1-second delay helps stagger the session initialization and reduces the instantaneous load on the media server.

Also, check your JMeter config. If you are using a standard Thread Group with all threads starting at once, switch to a Concurrency Thread Group (from the Custom Thread Groups plugin) and give it a ramp-up time. Spreading the 200 calls over 30 seconds, instead of launching them all at the same instant, works out to roughly 6 to 7 new calls per second and usually keeps you under the transient capacity ceiling. The 503s often clear up once the connection rate smooths out.
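To put numbers on the ramp-up suggestion, here is a rough Python sketch that compares how many calls start in the busiest second with and without a 30-second ramp. It does not touch JMeter or Genesys Cloud at all. The function names `ramp_schedule` and `peak_starts_per_second` are made up for illustration, and it assumes calls are spaced evenly across the ramp, which real load tools only approximate.

```python
def ramp_schedule(total_calls: int, ramp_seconds: float) -> list[float]:
    """Return evenly spaced start offsets (in seconds) for each simulated call.

    Hypothetical helper: assumes a perfectly linear ramp, which is a
    simplification of what a real load generator does.
    """
    if total_calls <= 0:
        return []
    if ramp_seconds <= 0:
        # No ramp-up: every call starts at t = 0.
        return [0.0] * total_calls
    interval = ramp_seconds / total_calls
    return [i * interval for i in range(total_calls)]


def peak_starts_per_second(offsets: list[float]) -> int:
    """Return the largest number of call starts that fall in any one-second bucket."""
    buckets: dict[int, int] = {}
    for offset in offsets:
        second = int(offset)  # bucket by whole second
        buckets[second] = buckets.get(second, 0) + 1
    return max(buckets.values(), default=0)


if __name__ == "__main__":
    burst = ramp_schedule(200, 0)    # all 200 calls at once
    ramped = ramp_schedule(200, 30)  # 200 calls spread over 30 seconds
    print("peak starts/sec, no ramp:  ", peak_starts_per_second(burst))
    print("peak starts/sec, 30 s ramp:", peak_starts_per_second(ramped))
```

With no ramp, the busiest second sees all 200 connection attempts. With the 30-second ramp it sees 7. If the 503s disappear at that rate, the bottleneck is probably connection setup rate rather than the total concurrent session count.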