We are trying to understand the exact rate limit behavior of the /api/v2/analytics/conversations/details/query endpoint under high concurrency in our Singapore BYOC environment. We are running a JMeter script that simulates 200 concurrent bot interactions to validate connection capacity and API throughput under load. The test setup is a simple Architect bot flow with no complex integrations, just basic intent matching and response generation. However, we hit persistent 429 Too Many Requests errors once we reach only 50 concurrent connections. The response body is:
{ "errors": [ { "code": "rate_limit_exceeded", "message": "You have exceeded the rate limit for this endpoint. Please try again later.", "moreInfo": "https://developer.genesys.cloud/rate-limits" } ] }
This is unexpected, given that the official documentation suggests a higher threshold for platform API queries. We have verified that our OAuth tokens are valid and are not being refreshed excessively.

The JMeter configuration is straightforward: a Thread Group with 200 threads, a ramp-up period of 10 seconds, and a loop count of 10. Each thread makes a single API call to query conversation details. The latency before the 429 error is consistently around 200 ms, which suggests the issue is not network-related. We are using the latest version of the Genesys Cloud Java SDK for the API calls.

We have also tried staggering the API calls with a 1-second delay between threads. The 429 errors persist, albeit less frequently, which suggests that the rate limit is quite strict.

Has anyone else encountered similar rate limiting on the conversation details endpoint during high-concurrency load tests? We are particularly interested in whether BYOC environments have different rate limits from public cloud instances, and whether this is a hard limit or a temporary increase can be requested for load testing purposes. We are also looking for best practices or official guidance on handling high-concurrency scenarios with the Genesys Cloud platform API.

The impact on our project timeline is significant, as we need to validate our bot’s performance under peak load conditions before the next release. Any insights or workarounds would be greatly appreciated.
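
For anyone answering, here is a minimal sketch of the kind of client-side 429 handling under discussion: wait for the period given in the standard HTTP Retry-After header when the response carries one, otherwise fall back to exponential backoff with jitter. It is written in Python for brevity rather than against the Java SDK. The names `retry_delay`, `call_with_retry`, and the `(status, headers, body)` response tuple are hypothetical and only for illustration, and whether this endpoint actually returns Retry-After on a 429 is an assumption that should be checked against the real responses.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status_code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 0.5, cap: float = 30.0) -> float:
    """Return the seconds to wait before retrying after attempt `attempt` (0-based)."""
    # Assumption: the header key is spelled exactly "Retry-After"; a real HTTP
    # client usually exposes headers through a case-insensitive mapping.
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds: fall back to backoff
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_retry(send: Callable[[], Response],
                    max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response[0] != 429:
            return response
        if attempt < max_attempts - 1:
            sleep(retry_delay(response[1], attempt))
    return response  # still 429 after the final attempt
```

In a load test, retries like this change what is being measured: the result is sustained throughput within the limit rather than raw concurrency, so retried calls are worth counting separately from first-attempt successes.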