Could someone explain why my Architect flow is timing out when triggered via API under heavy load?
Background
I am setting up a load test scenario to validate the platform’s ability to handle concurrent inbound API triggers. The goal is to simulate 500 concurrent requests hitting a specific Architect flow endpoint. The flow is designed to be lightweight: it receives the JSON payload, performs a simple variable assignment, and returns a 200 OK response. I am using JMeter with a custom plugin to manage the WebSocket connection and HTTP POST requests for the trigger.
The environment is a standard Genesys Cloud tenant. I have verified that the API rate limits are not being hit globally, as I am spreading the load over a 60-second window. The Architect flow is deployed in the prod environment. I am using the latest version of the Architect API documentation for the trigger definition.
Issue
When the request rate reaches roughly 150 requests per second, the Architect flow starts returning 504 Gateway Timeout errors. This is unexpected because the flow logic is minimal. The JMeter logs show that each HTTP POST request is sent successfully, but the response does not arrive within the default timeout window. The WebSocket connection remains open, but no data is pushed back for the affected transactions.
I have checked the Architect flow logs, and I see that the flow starts executing but does not complete. There are no explicit error messages in the flow execution trace, just a termination after 30 seconds. This suggests that the flow is hanging somewhere, possibly waiting for a resource that is not available due to contention.
Troubleshooting
Increased the timeout in JMeter to 60 seconds. The error persists.
Reduced the concurrent load to 50 requests per second. The flow executes successfully with no timeouts.
Checked the API gateway logs. No 429 Too Many Requests errors are present.
Verified that the variable assignment in the flow is not referencing any external data sources or queues that might be blocked.
Ensured that the trigger definition in Architect is set to allow concurrent executions. The setting Allow Concurrent is enabled.
Is there a hidden limit on the number of concurrent Architect flow executions per tenant? Or is this a known issue with the API trigger mechanism under high load? Any insights into how to debug the hang in the flow would be appreciated.
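One way to narrow down where the failures begin, between the 50 requests per second that works and the roughly 150 that fails, is to bisect the request rate. Below is a minimal Python sketch of that search. The `probe` callable is hypothetical: it stands in for whatever runs a short load step at a given rate (for example a JMeter run) and reports whether it finished without timeouts.

```python
def find_max_stable_rate(probe, known_good, known_bad, tolerance=5):
    """Bisect between a rate that works and a rate that fails.

    probe(rate) is a hypothetical callable that runs one load step at
    `rate` requests per second and returns True if no timeouts occurred.
    Returns the highest rate that was observed to pass.
    """
    low, high = known_good, known_bad
    while high - low > tolerance:
        mid = (low + high) // 2
        if probe(mid):
            low = mid   # mid passed, so the threshold is above it
        else:
            high = mid  # mid failed, so the threshold is at or below it
    return low
```

Starting with `known_good=50` and `known_bad=150` takes about five load steps to get within 5 requests per second of the threshold, assuming the pass/fail behavior is consistent from run to run.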
The timeout behavior in Architect under high concurrency is often linked to how the platform handles session queuing and resource allocation for API-triggered flows. When 500 concurrent requests hit a single endpoint, the system attempts to instantiate flow sessions simultaneously. If the downstream actions (such as data lookups or external HTTP calls) are synchronous, they block the session thread, leading to a backlog that exceeds the default timeout threshold.
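To make the backlog argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a simple fluid model in which requests arrive at a constant rate and the platform completes them at a fixed rate. The capacity of 100 sessions per second used in the example is a made-up figure for illustration, not a documented Genesys Cloud limit.

```python
def backlog_wait_seconds(arrival_rate, service_rate, elapsed_seconds):
    """Approximate wait for a request arriving after `elapsed_seconds` of load.

    Fluid approximation: the backlog grows at (arrival_rate - service_rate)
    requests per second and drains at service_rate.
    """
    if service_rate <= 0:
        raise ValueError("service_rate must be positive")
    backlog = max(0.0, (arrival_rate - service_rate) * elapsed_seconds)
    return backlog / service_rate


def seconds_until_timeout(arrival_rate, service_rate, timeout_seconds):
    """How long overload must last before new arrivals wait past the timeout.

    Returns None when the arrival rate does not exceed the service rate,
    because the backlog never grows in that case.
    """
    surplus = arrival_rate - service_rate
    if surplus <= 0:
        return None
    return timeout_seconds * service_rate / surplus


# Hypothetical numbers: 150 req/s offered, 100 sessions/s completed, 30 s timeout.
print(seconds_until_timeout(150, 100, 30))  # 60.0
print(backlog_wait_seconds(150, 100, 60))   # 30.0
```

Under those assumed numbers, requests arriving after about a minute of sustained load would already wait the full 30 seconds before starting, which is consistent with timeouts appearing even though the flow itself is trivial.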
To mitigate this, review the flow configuration in the Architect interface. Ensure that any external integrations use asynchronous patterns where possible. For synchronous calls, increase the timeout value on the specific action blocks, but be aware that this only delays the failure if the underlying service cannot scale.
A more robust solution involves implementing a queue-based approach within the flow. Instead of processing all requests directly, route them through a Queue block with a sufficient capacity and appropriate wrap-up time settings. This allows the platform to throttle the processing rate to a manageable level, preventing resource exhaustion.
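The throttling idea itself can be illustrated outside Architect. The sketch below is not flow configuration; it is a small Python example that caps the number of in-flight requests with `asyncio.Semaphore`, which is one way a caller or an intermediary service could limit how many triggers are active at once. `fake_trigger` is a placeholder for the real API call.

```python
import asyncio


async def run_throttled(jobs, max_in_flight):
    """Run zero-argument coroutine factories with bounded concurrency.

    Returns the results in input order plus the peak number of jobs that
    were active at the same time.
    """
    semaphore = asyncio.Semaphore(max_in_flight)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:  # waits here while max_in_flight jobs are active
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_trigger(request_id):
    """Placeholder for the real trigger call; just sleeps briefly."""
    await asyncio.sleep(0.01)
    return request_id
```

For example, `asyncio.run(run_throttled([lambda i=i: fake_trigger(i) for i in range(20)], 5))` processes all 20 requests but never has more than 5 active at a time.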
Additionally, check the Performance dashboard for “Flow Session Errors” and “Average Handle Time” metrics during the load test. A spike in session errors indicates that the platform is rejecting sessions due to resource constraints. Adjust the queue settings based on these metrics to balance throughput and stability.
With the queue sized from those metrics, the flow should be able to absorb bursts of traffic without timing out, and the dashboard gives you visibility into how close the system is to its limits.
Adjust the API trigger node settings to enable asynchronous processing instead of relying on synchronous session instantiation. In the Architect canvas, select the API trigger and navigate to the Advanced tab. Set the “Response Mode” to “Async” and ensure the “Queue Behavior” is configured to “Drop if Full” or “Wait with Timeout” based on your carrier SLA requirements. This prevents the SIP signaling stack from blocking while waiting for downstream HTTP lookups to complete.
For high-concurrency scenarios involving BYOC trunks, the platform’s default session pool often saturates quickly. By offloading the immediate response, the flow continues executing in the background, allowing the API caller to receive a 202 Accepted status immediately. This decouples the ingress rate from the processing capacity.
Monitor the “Active Sessions” metric in the Analytics dashboard; if it spikes above 80%, consider implementing a circuit breaker pattern in the flow to reject excess calls gracefully before they trigger a 503 or 408 timeout. This approach aligns with how we handle bursty traffic on our Singapore trunks.
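The 80% rule above can be sketched as a simple admission gate. This is a simplified load-shedding stand-in for a circuit breaker (it has no open or half-open states), written in Python purely to show the logic. The class name, the capacity value, and the choice to answer 503 for shed requests are all assumptions for illustration; none of this is a Genesys Cloud API.

```python
class AdmissionGate:
    """Admit work only while active sessions stay below a share of capacity."""

    def __init__(self, capacity, threshold=0.8):
        self.capacity = capacity    # assumed session pool size
        self.threshold = threshold  # fraction of capacity to allow
        self.active = 0

    def try_admit(self):
        """Return True and count the session if under the limit, else False."""
        if self.active < self.capacity * self.threshold:
            self.active += 1
            return True
        return False

    def release(self):
        """Mark one session as finished."""
        if self.active > 0:
            self.active -= 1


def respond(gate):
    """Return 202 when the request is accepted for background work, 503 when shed."""
    return 202 if gate.try_admit() else 503
```

With `AdmissionGate(capacity=10)`, the first eight calls to `respond` return 202 and the ninth returns 503 until a session is released. Rejecting early with an explicit status gives the caller something to retry on, rather than a timeout after 30 seconds.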
The docs actually state that while switching to async mode is the standard fix for session blocking, it introduces a significant visibility gap for Workforce Management teams. When you decouple the trigger from the response, the WFM schedule adherence engine no longer has a clear timestamp for when the interaction “started” for the agent. This breaks the calculation for Occupancy and Shrinkage if the flow eventually hands off to a live agent.
If you are pushing 500 concurrent requests, you are likely hitting the maxConcurrentSessions limit defined in your org’s capacity planning. The default timeout is often 30 seconds, but under load, the queue wait time can exceed this before the session even begins execution.
Try adding a Get WFM Schedule data action at the very start of the flow to validate agent availability before proceeding with heavy logic. If the schedule shows available: false or inCapacity: false, return a 202 Accepted immediately and drop the request into a dead-letter queue or a separate low-priority processing flow. This prevents the WFM system from reporting false adherence metrics later.
Also, check your apiTriggerConfig in the flow settings. Ensure waitForResponse is set to false but logInteraction is true. This ensures the interaction is tracked for WFM reporting even if the API client disconnects.
We saw this exact issue last quarter when a team tried to integrate a high-volume chat widget. The async switch fixed the 502 errors, but their schedule adherence scores dropped by 15% because the system couldn’t reconcile the “start” time of the interaction with the agent’s login state. Always verify how your flow impacts WFM metrics before going live with high concurrency.