Predictive Routing Queue Depth Spike with JMeter Load

I’m completely stumped as to why the predictive routing queue depth spikes to over 500 agents during a controlled load test, even though the api/v2/analytics/queue/conversations endpoint shows stable throughput.

The environment is US1 with JMeter 5.4 simulating 200 concurrent inbound calls via WebSocket. The Architect flow is straightforward: inbound → predict → queue. No complex data actions. However, once the concurrent volume hits 150, the predicted answer time jumps from 2s to 45s. The WebSocket connections remain stable, and no 429 Too Many Requests errors are logged on the client side.
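One way to double-check the client-side picture described above is to summarize the JMeter results file directly. The sketch below is not part of the original test plan. It assumes the results were saved in JMeter's CSV (JTL) format with the `elapsed`, `responseCode`, and `allThreads` columns enabled, and the function name `summarize_jtl` and the bucket size of 50 are illustrative choices. It tallies response codes (so any 429 or 5xx would show up) and reports the median elapsed time per concurrency bucket, which should make a knee around 150 threads visible if it exists in the sampler data.

```python
import csv
from collections import Counter, defaultdict
from statistics import median


def summarize_jtl(csv_file, bucket_size=50):
    """Tally response codes and median elapsed ms per concurrency bucket.

    csv_file: an open text file (or file-like object) holding a JMeter CSV
    results file with a header row. Assumed columns: elapsed, responseCode,
    allThreads.
    """
    codes = Counter()
    elapsed_by_bucket = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Response codes are kept as strings because non-HTTP samplers
        # can report non-numeric codes.
        codes[row["responseCode"]] += 1
        threads = int(row["allThreads"])
        bucket = (threads // bucket_size) * bucket_size
        elapsed_by_bucket[bucket].append(int(row["elapsed"]))
    medians = {b: median(v) for b, v in sorted(elapsed_by_bucket.items())}
    return codes, medians


if __name__ == "__main__":
    import sys

    # Usage: python summarize_jtl.py results.jtl
    if len(sys.argv) > 1:
        with open(sys.argv[1], newline="") as f:
            codes, medians = summarize_jtl(f)
        print("response codes:", dict(codes))
        for bucket, med in medians.items():
            print(f"{bucket}+ threads: median elapsed {med} ms")
```

If the response codes are all successful but the median elapsed time still steps up sharply in the 150+ bucket, that would suggest the delay is happening server-side rather than in dropped or throttled client requests. It would not by itself distinguish a routing limit from a load balancer issue.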

Is there a specific rate limit on the predictive routing engine itself that differs from the standard API limits? The api/v2/predictiverouting/settings endpoint returns default values, but I suspect the load balancer might be dropping connections before they reach the routing logic. I need to know whether this is a configuration issue or a platform capacity ceiling for predictive queues.
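To help separate standard API throttling from a routing-side limit, it may be worth logging the rate-limit headers on each API response during the test. The sketch below is a hedged illustration: the header names `inin-ratelimit-count`, `inin-ratelimit-allowed`, and `inin-ratelimit-reset` are assumed from the platform's public API rate-limit documentation and should be verified, and the helper name `rate_limit_headroom` is hypothetical. It only reflects the public API limit and says nothing about any internal limit in the routing engine.

```python
def rate_limit_headroom(headers):
    """Summarize API rate-limit usage from a response's headers.

    headers: a mapping of header name to value (any letter case).
    Returns a dict, or None if the assumed headers are missing or malformed.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        used = int(lowered["inin-ratelimit-count"])       # assumed header name
        allowed = int(lowered["inin-ratelimit-allowed"])  # assumed header name
        reset_s = int(lowered["inin-ratelimit-reset"])    # assumed header name
    except (KeyError, ValueError):
        return None
    fraction = used / allowed if allowed else None
    return {
        "used": used,
        "allowed": allowed,
        "reset_s": reset_s,
        "fraction_used": fraction,
    }
```

If `fraction_used` stays well below 1.0 while the predicted answer time climbs, the standard API limit is unlikely to be the bottleneck, which would be consistent with the absence of 429 responses.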

“Check your WFM schedule adherence rules during the test window. Predictive routing relies heavily on accurate agent availability data.”

“Check the SIP trunk region alignment. Mismatched endpoints often cause silent 500 errors during media server handoff, which predictive routing interprets as queue depth spikes.”