I’m trying to figure out why the Genesys Cloud AI Bot integration starts returning 503 Service Unavailable errors once roughly 15 to 20 threads are active. I’m running a JMeter test from our Singapore node against the Architect flow endpoint. The setup is straightforward: a simple intent classification flow using the latest platform API.
The test ramp-up is linear: 1 thread per second, with a 60-second test duration. The first 15 requests succeed with 200 OK. From thread 16 onwards, every request gets a 503. The response body says either "Gateway timeout" or "Service temporarily unavailable". This is happening on the /api/v2/ai/bots/{botId}/interactions endpoint.
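To pin down exactly where the failures start, something like the sketch below could tally the JMeter results file. It assumes the results were saved as CSV with a header row and the usual JTL column names (timeStamp, elapsed, responseCode, threadName, allThreads). The sample rows are made up for illustration and are not real output from this test.

```python
import csv
import io
from collections import Counter


def summarize_jtl(rows):
    """Tally response codes and report the first non-200 sample.

    `rows` is an iterable of dicts keyed by JTL CSV column names.
    """
    counts = Counter()
    first_failure = None
    # Order samples by their recorded timestamp (ms since epoch)
    for row in sorted(rows, key=lambda r: int(r["timeStamp"])):
        code = row["responseCode"]
        counts[code] += 1
        if first_failure is None and code != "200":
            first_failure = {
                "thread": row["threadName"],
                "code": code,
                "elapsed_ms": int(row["elapsed"]),
                "active_threads": int(row["allThreads"]),
            }
    return counts, first_failure


# Hypothetical sample rows, only to show the expected input shape
SAMPLE = """timeStamp,elapsed,label,responseCode,threadName,allThreads
1000,420,bot,200,Thread Group 1-1,1
2000,450,bot,200,Thread Group 1-2,2
3000,10004,bot,503,Thread Group 1-3,3
"""

counts, first = summarize_jtl(csv.DictReader(io.StringIO(SAMPLE)))
print(dict(counts))
print(first)
```

For a real run, the same function would take csv.DictReader(open("results.jtl", newline="")), where results.jtl is a placeholder file name. The elapsed_ms of the first failure is worth checking: if it sits right at the response timeout, the 503 may be a timeout upstream rather than an immediate rejection.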
I checked the Architect flow logs. The flow itself seems fine. It just hangs before the intent classification node. Is there a hidden limit on concurrent bot interactions per org? Or is the WebSocket connection pool exhausted?
Here is the JMeter HTTP Request sampler config:
request:
  method: POST
  path: /api/v2/ai/bots/bot_abc123/interactions
  headers:
    Content-Type: application/json
    Authorization: Bearer <token>
  body:
    text: "What is the weather?"
    channel: "web"
    userId: "user_${__threadNum}"
  timeout:
    connect: 5000
    response: 10000
thread_group:
  threads: 50
  ramp_up: 50
  loop_count: 1
  scheduler: true
  duration: 60000
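As a rough sanity check on how many requests are actually in flight with this config, here is a small sketch using Little's law (in-flight = arrival rate x time per request). It assumes each thread sends a single request and then exits (loop_count: 1) and that the timeouts are in milliseconds. The function name and the example latencies are placeholders, not measured values.

```python
def estimated_in_flight(arrival_rate_per_s, response_time_s, max_threads):
    """Estimate steady-state in-flight requests via Little's law.

    L = lambda * W, capped at the number of threads in the thread group.
    """
    return min(max_threads, arrival_rate_per_s * response_time_s)


THREADS = 50
RAMP_UP_S = 50
ARRIVAL_RATE = THREADS / RAMP_UP_S   # 1 new thread (one request) per second
RESPONSE_TIMEOUT_S = 10000 / 1000    # response timeout from the sampler config

# Illustrative latencies: a fast reply, a slow reply, and a full timeout
for latency_s in (0.5, 2.0, RESPONSE_TIMEOUT_S):
    in_flight = estimated_in_flight(ARRIVAL_RATE, latency_s, THREADS)
    print(f"latency {latency_s}s: about {in_flight} requests in flight")
```

If those assumptions hold, the number of simultaneous requests is bounded by the per-request latency rather than by the thread number, so "thread 16" may not mean 16 concurrent interactions.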
The token is fresh. Rate limit headers show X-RateLimit-Remaining is high. So it’s not a 429. It’s a hard 503.
We are on the Genesys Cloud platform. Region is ap-southeast-1.
Any ideas why the bot service drops connections under light load? 20 concurrent chats should be nothing. Is there a capacity planning doc for AI Bot throughput? The standard API docs don’t mention bot interaction limits.
Also, should I be using the WebSocket API instead of HTTP for load testing? The docs suggest HTTP for simplicity, but maybe WebSocket is more stable under load?
Please advise. I’ve been stuck on this for two days. JMeter shows no client-side errors; the 503s are coming from the server.