Architect flow WebSocket timeout at 5000 concurrent threads

Is it possible to increase the Architect flow WebSocket keep-alive threshold during high concurrency? Running JMeter with 5000 concurrent users against ap-southeast-1 causes immediate 1006 close codes on /api/v2/architect/flows. The load pattern spikes CPU on the edge nodes, but the API throughput drops to zero without any 429 rate limit errors. Need a way to sustain the connection pool for load validation.

According to the docs, Architect flow WebSocket connections are managed by the platform’s internal load balancers, and no keep-alive threshold is exposed through the public APIs. A 1006 close code under high concurrency usually means the edge nodes are dropping idle connections to preserve resources, not that a specific timeout setting has been hit.
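It may help to confirm what the load test is actually recording before changing anything. Per RFC 6455, 1006 is a reserved code that is never sent on the wire: the client library reports it when the connection ends without a close frame, which is consistent with a drop at the edge and explains why no reason or 429 accompanies it. A minimal sketch for tallying close codes from a test log follows; the function name and the input list are hypothetical, and the labels are paraphrased from the RFC and the IANA registry.

```python
# Labels paraphrased from RFC 6455 section 7.4.1 and the IANA close code registry.
CLOSE_CODES = {
    1000: "normal closure",
    1001: "endpoint going away",
    1006: "abnormal closure (no close frame received)",
    1008: "policy violation",
    1009: "message too big",
    1011: "internal server error",
    1013: "try again later",
}


def summarize_close_codes(codes):
    """Count close codes from a load-test log and attach a readable label."""
    counts = {}
    for code in codes:
        counts[code] = counts.get(code, 0) + 1
    return {
        code: (count, CLOSE_CODES.get(code, "unregistered or application-specific"))
        for code, count in sorted(counts.items())
    }
```

If nearly every entry is 1006, the connections are being cut below the WebSocket layer, so tuning anything at the application level is unlikely to change the result.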

For load validation against ServiceNow integrations, it is better to simulate the actual Data Action webhook payloads rather than holding open architect connections. This approach tests the REST API throughput directly, which is where the real bottleneck often lies.

Component                Recommendation
Architect WebSockets     Do not attempt to tune; use for UI/Flow logic only
ServiceNow Integration   Test via Data Actions with realistic payload sizes
Concurrency Strategy     Use async HTTP requests instead of persistent WS

A common fix is to structure the JMeter test to mimic the POST /api/v2/analytics/icd/events or the specific ServiceNow REST endpoint instead. This validates the actual integration path without stressing the architect engine unnecessarily. The documentation suggests focusing on the Data Action execution limits rather than connection persistence.
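Outside JMeter, the async HTTP strategy recommended above could be sketched roughly as below. This is an illustration under assumptions, not a tested harness: `run_bounded` and `fake_send` are hypothetical names, `fake_send` stands in for a real async HTTP POST, and the `max_in_flight` default is an arbitrary starting point rather than a documented platform limit.

```python
import asyncio


async def run_bounded(send, payloads, max_in_flight=100):
    """Call `send(payload)` for every payload, at most `max_in_flight` at once."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def worker(payload):
        async with semaphore:
            try:
                return await send(payload)
            except Exception as exc:
                # Record the failure as a result instead of aborting the whole run.
                return exc

    # gather() preserves input order, so results line up with payloads.
    return await asyncio.gather(*(worker(p) for p in payloads))


async def fake_send(payload):
    """Hypothetical stand-in for an HTTP POST; swap in a real async client call."""
    await asyncio.sleep(0)
    return 200
```

A run would look like `results = asyncio.run(run_bounded(fake_send, payloads))`. Capping the number of requests in flight lets you ramp towards 5000 virtual users gradually and see where throughput starts to degrade, instead of opening everything at once.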

Check your Architect flow configuration for long-running Data Actions or excessive logging statements that may be holding connections open unnecessarily.

The platform manages WebSocket lifecycles automatically to ensure stability across the region. Forcing a higher threshold is not supported and may violate service level agreements. Instead of simulating raw WebSocket persistence, structure the test to mimic actual customer journeys. Use the Interaction Detail View to monitor how flows handle rapid succession requests. If the goal is capacity planning, focus on the maximum concurrent interactions metric in the Performance dashboard rather than connection longevity. This approach aligns with how the system processes real traffic. Adjusting the flow logic to minimize idle wait times often resolves these 1006 errors without requiring infrastructure changes.

You should probably look at the architectural constraints of the Architect API when dealing with high-concurrency WebSocket connections, as the platform is not designed to sustain 5000 simultaneous interactive flow definitions. The previous point about internal load balancer management is accurate, but from a partner integration perspective, this behavior is usually a protective measure against resource exhaustion rather than a misconfiguration. When building scalable integrations, relying on persistent WebSocket connections for bulk operations is generally discouraged because of the latency and connection overhead involved. Instead, consider shifting the workload to asynchronous endpoints that can handle queueing and batch processing more efficiently. For example, if you are trying to validate flow logic or trigger updates, the standard REST API with proper exponential backoff is far more reliable than holding WebSocket channels open.

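import time

import requests


def update_flow_async(flow_id, payload, retries=3):
    """PUT a flow update, retrying with exponential backoff on HTTP 429."""
    # Replace the host with the API host for your org's region.
    url = f"https://api.mypurecloud.com/api/v2/architect/flows/{flow_id}"
    headers = {
        "Authorization": "Bearer YOUR_ACCESS_TOKEN",
        "Content-Type": "application/json",
    }

    for attempt in range(retries):
        try:
            # A timeout keeps a stalled request from hanging the test run.
            response = requests.put(url, json=payload, headers=headers, timeout=30)
            if response.status_code == 200:
                print(f"Flow {flow_id} updated successfully.")
                return True
            elif response.status_code == 429:
                # Rate limited: wait 1s, 2s, 4s, ... before the next attempt.
                wait_time = 2 ** attempt
                print(f"Rate limited. Retrying in {wait_time} seconds...")
                time.sleep(wait_time)
            else:
                # Any other status is treated as non-retryable.
                print(f"Error: {response.status_code} - {response.text}")
                return False
        except requests.RequestException as e:
            # Network-level failure (DNS, reset, timeout): pause briefly and retry.
            print(f"Request failed: {e}")
            time.sleep(1)
    # All attempts were used up without a successful response.
    return False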

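Retrying over REST with backoff like this should avoid the immediate connection drops you are seeing while still letting you validate API throughput under load. Note that the function itself is synchronous; to generate real concurrency, call it from a thread pool or port it to an async HTTP client.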
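One refinement worth considering for the retry loop above: with thousands of clients, a fixed 1s, 2s, 4s schedule makes every client retry at the same moment. A common mitigation is to cap the delay and randomize it ("full jitter"), and to prefer a Retry-After value when the server supplies one. The helper below is a sketch; `backoff_delay` and its parameters are hypothetical, and whether this API returns a Retry-After header on 429 is an assumption to verify against the actual responses.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Return seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Honor a server-provided delay given as a number of seconds.
            return min(cap, max(0.0, float(retry_after)))
        except (TypeError, ValueError):
            pass  # Not a plain number (e.g. an HTTP date); fall back to backoff.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick a uniform delay between 0 and the exponential ceiling.
    return rng() * ceiling
```

In the loop, this would replace `wait_time = 2 ** attempt` with something like `wait_time = backoff_delay(attempt, response.headers.get("Retry-After"))`.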

The easiest fix here is to stop treating WFM scheduling logic like a high-throughput API endpoint. We hit a similar wall when bulk-publishing schedules. Make sure your agent_schedule_preferences aren’t triggering long-running lookups. The platform drops idle connections to protect the scheduler. Optimize the flow first.