Running a JMeter 5.6 load test to validate chat scalability on Genesys Cloud US1. The goal is to simulate 500 concurrent WebSocket connections for a single bot flow. After 10 minutes of steady state, connections start dropping unexpectedly.
Setup details:
- JMeter thread group: 500 threads, ramp-up 60 seconds, loop count 10.
- WebSocket sampler configured with keep-alive true.
- No custom headers, just standard OAuth Bearer token.
- Environment: Genesys Cloud US1 (prod sandbox).
Error observed:
At roughly 350 active connections, JMeter reports: “java.net.SocketTimeoutException: Read timed out” followed by “WebSocket connection closed abnormally with status code 1006”. The Genesys Cloud side shows no 5xx errors in the API logs, but the chat sessions terminate abruptly.
Checked the API rate limits for /api/v2/conversations, and we are well within the 100 requests per second limit. The issue seems specific to the WebSocket transport layer rather than REST API throttling.
Has anyone seen similar 1006 closures during high-concurrency chat tests? Is there a known connection pool limit per tenant or per IP address that might be causing this? We are using a single load generator VM, so it is possible the IP is being flagged for too many simultaneous WebSocket upgrades.
Also, the Architect flow is simple: just a greeting and a text response. No external integrations or long polling steps. The latency is low (<200ms) until the drop happens.
Looking for advice on:
- How to increase the WebSocket connection limit if it is enforced server-side.
- Whether distributing load across multiple IPs in JMeter would help bypass any per-IP connection caps.
- If there is a specific header or configuration needed to maintain long-lived WebSocket sessions under load.
Any insights from others who have stress-tested Genesys Cloud chat endpoints would be appreciated. The current setup fails consistently at the 350-400 connection mark, which is below our target scale.
WebSocket drops under load are rarely about the WFM side directly, but the timing correlation is a red flag. When you push a large schedule block to Genesys Cloud, the backend services re-evaluate agent availability and routing profiles. If your load test is running against the same org during this window, the sudden shift in available capacity can cause the WebSocket gateway to rebalance connections, leading to drops for clients that don’t handle the reconnect gracefully.
Check if your JMeter test is using static OAuth tokens. Genesys Cloud rotates tokens, and a 500-thread test holding onto a single token for 10 minutes will likely hit expiration or rate limits on the auth service, causing the gateway to sever connections. You need dynamic token refresh in your test script.
Here is a basic JMeter BeanShell pre-processor snippet to handle token refresh before each WebSocket connection attempt:
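import java.nio.charset.StandardCharsets;
import java.util.Base64;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

// Token endpoint as given in this thread. Genesys Cloud normally serves /oauth/token
// from the login.<region-domain> host, so verify the hostname for your region.
String url = "https://api.us1.genesyscloud.com/oauth/token";
String grantType = "client_credentials";
String clientId = vars.get("clientId");
String clientSecret = vars.get("clientSecret");

CloseableHttpClient client = HttpClients.createDefault();
try {
    HttpPost post = new HttpPost(url);
    post.setHeader("Content-Type", "application/x-www-form-urlencoded");
    // HTTP Basic auth: base64("clientId:clientSecret")
    String credentials = clientId + ":" + clientSecret;
    post.setHeader("Authorization", "Basic " + Base64.getEncoder().encodeToString(credentials.getBytes(StandardCharsets.UTF_8)));
    post.setEntity(new StringEntity("grant_type=" + grantType, StandardCharsets.UTF_8));
    CloseableHttpResponse response = client.execute(post);
    try {
        int status = response.getStatusLine().getStatusCode();
        String body = EntityUtils.toString(response.getEntity(), StandardCharsets.UTF_8);
        if (status != 200) {
            throw new RuntimeException("Token request failed: HTTP " + status);
        }
        // Naive extraction: assumes compact JSON with no whitespace around the colon.
        String token = body.split("\"access_token\":\"")[1].split("\"")[0];
        vars.put("accessToken", token);
    } finally {
        response.close(); // release the connection even if parsing fails
    }
} finally {
    client.close();
}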
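One caveat with the pre-processor above: it requests a new token on every invocation, which across 500 threads could itself load the auth service. A possible alternative is to cache the token and refresh it only shortly before it expires. The following is a minimal sketch in plain Java, not a Genesys or JMeter API: `CachedToken`, `Grant`, and the `fetcher` callback (which would wrap the POST to the token endpoint) are hypothetical names, and it assumes the token response carries the standard OAuth 2.0 `expires_in` field.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Caches one OAuth token and re-fetches it shortly before it expires. */
final class CachedToken {

    /** Token plus its lifetime, mirroring the OAuth 2.0 "expires_in" field. */
    static final class Grant {
        final String accessToken;
        final long expiresInSeconds;

        Grant(String accessToken, long expiresInSeconds) {
            this.accessToken = accessToken;
            this.expiresInSeconds = expiresInSeconds;
        }
    }

    private final Supplier<Grant> fetcher;  // hypothetical: wraps the POST to /oauth/token
    private final long safetyMarginMillis;  // refresh this long before the real expiry
    private final LongSupplier nowMillis;   // injectable clock, e.g. System::currentTimeMillis
    private String token;
    private long refreshAtMillis;

    CachedToken(Supplier<Grant> fetcher, long safetyMarginMillis, LongSupplier nowMillis) {
        this.fetcher = fetcher;
        this.safetyMarginMillis = safetyMarginMillis;
        this.nowMillis = nowMillis;
    }

    /** Returns the cached token, fetching a new one only when missing or near expiry. */
    synchronized String get() {
        long now = nowMillis.getAsLong();
        if (token == null || now >= refreshAtMillis) {
            Grant grant = fetcher.get();
            token = grant.accessToken;
            refreshAtMillis = now + grant.expiresInSeconds * 1000L - safetyMarginMillis;
        }
        return token;
    }
}
```

In a JMeter script, one instance could be shared across threads (for example by storing it in `props`), so that all 500 threads reuse the same token until it is close to expiry.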
Also, ensure your WebSocket sampler has a retry policy configured. Genesys Cloud expects clients to re-establish connections after brief outages. If the test doesn’t retry, it reports a failure, but the system might actually be handling the load fine. Try extending the ramp-up from 60 to 120 seconds, which halves the rate of new connections, to see if the drops correlate with the auth service load spike rather than the WebSocket gateway itself.
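To put numbers on the ramp-up change, here is a small arithmetic sketch. It assumes JMeter spaces thread starts evenly across the ramp-up period, and `RampRate` is just an illustrative helper, not part of JMeter.

```java
/** Back-of-the-envelope ramp-up arithmetic for a JMeter thread group. */
final class RampRate {

    /** Average number of threads started per second during ramp-up. */
    static double threadsPerSecond(int threads, int rampUpSeconds) {
        return (double) threads / rampUpSeconds;
    }

    /** Approximate time in seconds until the given number of threads has started. */
    static double secondsUntilActive(int activeThreads, int threads, int rampUpSeconds) {
        return activeThreads / threadsPerSecond(threads, rampUpSeconds);
    }

    public static void main(String[] args) {
        // 500 threads with the original 60-second ramp-up vs. a 120-second ramp-up.
        System.out.printf("60 s ramp-up:  %.2f new connections/s%n", threadsPerSecond(500, 60));
        System.out.printf("120 s ramp-up: %.2f new connections/s%n", threadsPerSecond(500, 120));
    }
}
```

With 500 threads, a 120-second ramp-up opens roughly 4 connections per second instead of roughly 8, which should make it easier to tell whether the drops follow the connection rate or the absolute connection count.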
Confirmed. Switching to exponential backoff in the JMeter WebSocket reconnection logic stabilized the 500-thread run. The drops were indeed due to the gateway rebalancing during the ramp-up phase. Adding a 2-second delay before reconnect prevented the 429s and kept the sessions alive through the steady state.
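For anyone who wants to reproduce this, the delay calculation can be sketched roughly as follows. The 2-second base comes from the run described above; the 30-second cap and the jitter are assumptions added for illustration, and `ReconnectBackoff` is a hypothetical helper rather than anything built into the JMeter WebSocket samplers.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Exponential backoff delays for WebSocket reconnect attempts. */
final class ReconnectBackoff {

    /**
     * Delay before reconnect attempt number `attempt` (0-based):
     * baseMillis doubled once per previous attempt, never exceeding capMillis.
     */
    static long delayMillis(int attempt, long baseMillis, long capMillis) {
        if (attempt < 0 || baseMillis <= 0 || capMillis < baseMillis) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        long delay = baseMillis;
        // Stop doubling once the cap is reached, which also avoids overflow.
        for (int i = 0; i < attempt && delay < capMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, capMillis);
    }

    /** Randomizes a delay into [delay/2, delay] so threads do not reconnect in lockstep. */
    static long withJitter(long delayMillis) {
        long half = delayMillis / 2;
        return half + ThreadLocalRandom.current().nextLong(delayMillis - half + 1);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 6; attempt++) {
            long delay = delayMillis(attempt, 2_000L, 30_000L);
            System.out.println("attempt " + attempt + ": wait up to " + delay + " ms");
        }
    }
}
```

The jitter matters in a load test because hundreds of threads dropped by the same event would otherwise all retry at the same instant.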
The WebSocket behavior mirrors the SIP 408 issues seen on AP-1 BYOC trunks during high concurrency. The gateway rebalancing is expected under load. Ensure your JMeter script handles the 101 Switching Protocols handshake correctly before sending messages. Also, verify that your OAuth tokens are not expiring mid-test, as stale tokens cause silent disconnects that look like drops.
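As an illustration of what handling the handshake correctly means at the protocol level, here is a minimal sketch of the RFC 6455 checks a client can apply to the upgrade response before sending any messages. `HandshakeCheck` and its method names are hypothetical; most WebSocket client libraries, including the JMeter samplers, perform these checks internally, so this is mainly useful for inspecting captured responses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Map;
import java.util.TreeMap;

/** Validates a WebSocket upgrade response per RFC 6455. */
final class HandshakeCheck {

    // Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
    private static final String GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    /** Expected Sec-WebSocket-Accept value: base64(SHA-1(key + GUID)). */
    static String expectedAccept(String secWebSocketKey) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest((secWebSocketKey + GUID).getBytes(StandardCharsets.US_ASCII));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    /** True if the response is a valid 101 upgrade for the key the client sent. */
    static boolean isValidUpgrade(int status, Map<String, String> responseHeaders, String sentKey) {
        // Header names are case-insensitive, so copy into a case-insensitive map.
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        headers.putAll(responseHeaders);

        if (status != 101) {
            return false;
        }
        String upgrade = headers.get("Upgrade");
        if (upgrade == null || !upgrade.trim().equalsIgnoreCase("websocket")) {
            return false;
        }
        // Connection may hold a comma-separated token list, e.g. "keep-alive, Upgrade".
        String connection = headers.get("Connection");
        boolean hasUpgradeToken = false;
        if (connection != null) {
            for (String tokenValue : connection.split(",")) {
                if (tokenValue.trim().equalsIgnoreCase("upgrade")) {
                    hasUpgradeToken = true;
                }
            }
        }
        String accept = headers.get("Sec-WebSocket-Accept");
        return hasUpgradeToken && accept != null && accept.trim().equals(expectedAccept(sentKey));
    }
}
```

If a sampler starts sending frames before a response passing these checks has arrived, the server is free to drop the connection, which can also surface on the client as a 1006 closure.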