Struggling to understand why the Genesys Cloud Architect IVR flow returns HTTP 503 Service Unavailable when simulating 200 concurrent inbound calls via JMeter 5.4.1. The setup involves a simple IVR menu that routes calls to a queue after a 5-second delay. Below are the details of the environment and the steps taken so far.
Background
The goal is to validate the call-handling capacity of our Genesys Cloud instance (US-East region) under load. The environment uses the default WebRTC softphone client for testing, but the load generator pushes SIP INVITE requests directly to the Genesys Cloud edge via a test tenant configured for 500 concurrent calls. The Architect flow is minimal: Answer Call → Play Prompt (5s) → Route to Queue. The queue has a standard strategy with no complex routing rules. JMeter is configured to ramp up to 200 threads over 60 seconds and then hold the load steady for 5 minutes. The test script initiates calls programmatically through the Genesys Cloud REST API to avoid browser-based overhead, using the /api/v2/architect/flows endpoint to trigger the flow execution context.
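For reference, the ramp profile above can be worked out as a rough sketch. It assumes JMeter starts threads at an even rate across the ramp-up period, and the function names are made up for illustration. It shows roughly when each concurrency level is reached, which helps line up error timestamps with the thread count.

```python
# Rough model of a linear JMeter ramp-up: 200 threads over 60 seconds.
# Assumption: threads start at an even rate; real start times can drift slightly.

def active_threads(t_seconds, total_threads=200, ramp_seconds=60.0):
    """Approximate number of started threads t_seconds into the test."""
    if t_seconds <= 0:
        return 0
    if t_seconds >= ramp_seconds:
        return total_threads
    return int(total_threads * t_seconds / ramp_seconds)


def seconds_to_reach(target_threads, total_threads=200, ramp_seconds=60.0):
    """Approximate time at which target_threads have been started."""
    return target_threads * ramp_seconds / total_threads


for target in (50, 100, 150, 200):
    print(f"{target} threads reached at ~{seconds_to_reach(target):.0f}s")
```

Under this model, 150 threads are running about 45 seconds into the ramp, so errors that begin around that mark are consistent with the spike described below.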
Issue
During the 200-thread run, the error rate spikes at approximately 150 concurrent sessions. Instead of connecting to the queue, the calls fail immediately with a 503 error. The response body contains a generic message: “Service temporarily unavailable. Please try again later.” The JMeter logs show that the failure occurs right after the initial INVITE is accepted but before the SDP answer is exchanged. This suggests the issue might be at the WebSocket layer or in the Architect flow engine itself, rather than at the SIP trunk. The errors persist even when the concurrent session count is reduced to 100, so the actual threshold appears to lie below 150, or the cause is a configuration limitation in the test tenant rather than a pure concurrency ceiling.
Troubleshooting
- Verified that the test tenant has sufficient call capacity and no active maintenance windows.
- Checked the Architect flow logs; no errors are logged for the failed calls, suggesting the flow engine never processes them.
- Tested with a smaller load of 50 concurrent calls, which completed with no errors.
- Confirmed that the API credentials used by JMeter have the necessary permissions to initiate calls and access the flow.
- Reviewed the WebSocket connection limits; the test script maintains persistent connections, but no explicit limit is being hit according to the dashboard.
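To narrow down where the failures start, the JMeter results can be grouped by active thread count. The sketch below assumes the results were saved in JMeter's CSV format with the `allThreads` and `responseCode` columns enabled. The function names and the bucket size of 25 are illustrative choices, not part of the original test.

```python
import csv
from collections import defaultdict


def error_rate_by_concurrency(rows, bucket_size=25):
    """Return the share of 503 responses per bucket of active threads.

    rows: iterable of dicts with 'allThreads' and 'responseCode' keys,
    e.g. as produced by csv.DictReader over a JMeter CSV results file.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for row in rows:
        # Round the active thread count down to the start of its bucket.
        bucket = (int(row["allThreads"]) // bucket_size) * bucket_size
        totals[bucket] += 1
        if row["responseCode"] == "503":
            failures[bucket] += 1
    return {bucket: failures[bucket] / totals[bucket] for bucket in sorted(totals)}


def load_jtl(path):
    """Read a JMeter CSV results file that has a header row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))
```

Calling `error_rate_by_concurrency(load_jtl("results.jtl"))` (the file name is a placeholder) gives a mapping from bucket start to 503 rate. The first bucket with a non-zero rate gives a rough estimate of where the threshold sits between the passing 50-call run and the failing 100-call run.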
Any insights into why the 503 errors occur at this specific concurrency level would be appreciated. Is there a hidden rate limit on flow executions or WebSocket handshakes that is not documented?