Analytics API 503 Service Unavailable during JMeter Load Test on US1

We are trying to understand the rate limiting behavior of the analytics export endpoints under high concurrency. The environment is Genesys Cloud US1. We are using JMeter 5.6 to validate the capacity of the reporting engine before moving to production. The goal is to determine the maximum sustainable throughput for generating interaction summary reports.

The setup involves a service account with full admin permissions. We are targeting the endpoint /api/v2/analytics/interactions/summary/queued. The test plan uses 50 concurrent threads with a ramp-up period of 10 seconds. Each thread sends a POST request with a specific date range filter.
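To make the load profile concrete, here is a small sketch of the thread start times implied by those settings. It assumes JMeter spreads thread starts evenly across the ramp-up period, and the function name thread_start_offsets is just ours for illustration.

```python
from typing import List


def thread_start_offsets(threads: int, ramp_up_seconds: float) -> List[float]:
    """Approximate start offset (in seconds) of each thread.

    Assumption: thread starts are spread evenly over the ramp-up period,
    so thread i starts at i * (ramp_up_seconds / threads).
    """
    if threads <= 0:
        raise ValueError("threads must be positive")
    step = ramp_up_seconds / threads
    return [i * step for i in range(threads)]


# Our test plan: 50 threads, 10 second ramp-up.
offsets = thread_start_offsets(50, 10.0)
print(f"gap between thread starts: {offsets[1] - offsets[0]:.2f}s")
print(f"last thread starts at:     {offsets[-1]:.2f}s")
```

Under that assumption a new thread starts roughly every 0.2 seconds, and all 50 threads are running concurrently from just before the 10 second mark.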

The issue starts immediately after the ramp-up phase. Approximately 60% of the requests return HTTP 503 Service Unavailable. The error message in the response body indicates that the service is temporarily overloaded. We have verified that the service account has the correct roles and permissions. The same payload works fine when sent sequentially.

We suspect this is related to the internal queue depth of the analytics service or to a specific rate limit on export jobs. We have not seen any 429 Too Many Requests responses, which is what we would have expected if an ordinary API rate limit were being hit. The 503 errors are spread across threads with no apparent pattern.

Here is the JSON payload we are sending in the JMeter request body:

{
  "date_from": "2023-10-01T00:00:00.000Z",
  "date_to": "2023-10-01T01:00:00.000Z",
  "groupBy": ["queueId"],
  "filters": {
    "queueIds": [
      "queue-id-1",
      "queue-id-2"
    ]
  },
  "metrics": ["queueWaitTime", "handleTime"],
  "interval": "PT1H"
}

Successful requests take around 200ms. Failed requests return in less than 50ms, which suggests they are rejected before any report processing starts. We are not using any caching mechanisms. The JMeter server is located in the same region as the US1 environment.

We want to know if there is a hard limit on concurrent export jobs per account. We also need to know whether this limit can be increased, or whether we should implement a retry mechanism with exponential backoff instead. Any insights on the internal throttling logic would be appreciated. We are trying to accurately model the capacity available to our reporting dashboard backend.
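For reference, this is roughly the retry logic we have in mind if backoff turns out to be the right answer. It is only a sketch: send is a hypothetical callable that wraps our HTTP client and returns (status, headers, body), the defaults for base and cap are guesses, and we are assuming that 503 and 429 are the only statuses worth retrying. We do not know whether the API sends a Retry-After header on these responses; the code uses it only if it is present as a whole number of seconds.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, response body)
Response = Tuple[int, Mapping[str, str], bytes]

# Assumption: only these statuses indicate a transient condition.
RETRYABLE_STATUSES = {429, 503}


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Exponential backoff with full jitter.

    The upper bound doubles with each attempt (base, 2*base, 4*base, ...)
    up to cap, and the actual delay is a random fraction of that bound.
    """
    return rng() * min(cap, base * (2 ** attempt))


def send_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    base: float = 0.5, cap: float = 30.0,
                    sleep: Callable[[float], None] = time.sleep,
                    rng: Callable[[], float] = random.random) -> Response:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        status, headers, _body = response
        is_last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUSES or is_last_attempt:
            return response
        # Prefer a server-provided hint when it is a plain number of seconds.
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = min(cap, float(retry_after))
        else:
            delay = backoff_delay(attempt, base, cap, rng)
        sleep(delay)
    return response  # not reached; the loop always returns
```

The jitter is there so that 50 threads rejected at the same moment do not all retry at the same moment. If there is a documented concurrency limit, we would rather cap the number of in-flight requests on our side and keep this only as a fallback.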