Why does this setup trigger rate limiting so quickly when testing concurrent user updates? I am running a JMeter script that simulates a large workforce logging in and updating availability simultaneously. The goal is to check whether the system can handle 200 availability-update requests per second.
The environment is Genesys Cloud Production. The load test uses the WFM Scheduling API endpoint /api/v2/wfm/scheduling/users/{userId}/availability. I set the JMeter thread count to 200 with a ramp-up time of 10 seconds. After about 5 seconds, the response time spikes, and I start seeing 429 Too Many Requests errors. The error message in the response body indicates that the rate limit for this tenant has been exceeded.
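For reference, here is a rough Python sketch of the ramp-up arithmetic. JMeter starts threads evenly across the ramp-up period, so 200 threads over 10 seconds means about 20 new threads per second. The per-thread request rate (`req_per_thread_per_sec`) is an assumption on my part, not a measured value. With roughly one request per thread per second, the test would be sending about 100 requests per second at the 5-second mark, which is about when the 429 errors begin.

```python
def active_threads(elapsed_s: float, threads: int = 200, ramp_up_s: float = 10.0) -> int:
    """Threads started so far, assuming JMeter's linear ramp-up."""
    if elapsed_s <= 0:
        return 0
    # Fraction of the ramp-up completed, capped at 100%.
    fraction = min(elapsed_s / ramp_up_s, 1.0)
    return int(threads * fraction)


def approx_request_rate(elapsed_s: float, req_per_thread_per_sec: float = 1.0) -> float:
    """Approximate total requests per second at a given point in the test."""
    # req_per_thread_per_sec is hypothetical; read the real value from the
    # JMeter summary report (it depends on response time and timers).
    return active_threads(elapsed_s) * req_per_thread_per_sec


for t in (5, 10):
    print(t, active_threads(t), approx_request_rate(t))
```

If the real per-thread rate is higher than one request per second, the test crosses any given threshold even earlier in the ramp-up.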
I checked the documentation, and it says the limit is 10 requests per second per user for this endpoint. However, I am distributing the load across 50 unique user IDs, so in theory I should be able to send 500 requests per second (10 req/sec * 50 users). Instead, the system starts rejecting requests at roughly 100 requests per second in total.
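To make the mismatch concrete, here is a small Python sketch of how the effective ceiling would behave if a second, shared limit (per token or per org) sat on top of the per-user limit. The shared cap of 100 requests per second is only what I observe in this test, not a documented value, and `effective_ceiling` is a hypothetical helper.

```python
from typing import Optional


def effective_ceiling(per_user_limit: float, user_count: int,
                      shared_cap: Optional[float] = None) -> float:
    """Highest sustainable request rate when two limits apply at once."""
    per_user_total = per_user_limit * user_count
    if shared_cap is None:
        # Only the per-user limit applies.
        return per_user_total
    # Whichever limit is lower is the one that triggers the 429s.
    return min(per_user_total, shared_cap)


print(effective_ceiling(10, 50))       # per-user limit only: 500
print(effective_ceiling(10, 50, 100))  # with an assumed shared cap: 100
```

If this model is right, adding more user IDs would not help, because the shared cap is reached long before the per-user total.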
Here is the JSON payload I am sending in the POST request body:
{
  "startTime": "2023-10-25T09:00:00.000Z",
  "endTime": "2023-10-25T17:00:00.000Z",
  "type": "available",
  "comment": "Load test update"
}
The headers include the correct Authorization: Bearer <token> and Content-Type: application/json. I am using the same access token for all requests in this test phase to simplify the setup, but I suspect this might be causing the issue. Is the rate limit applied per tenant globally, or is it per user ID? If it is per user, why am I hitting the limit so early?
I also noticed that the Retry-After header is sometimes missing from the 429 response, which makes it hard to implement a proper backoff strategy in JMeter. Can someone clarify how the WFM API rate limits work in a high-concurrency scenario? I need to know if I should be using multiple access tokens or if there is a different endpoint for bulk updates that has higher throughput limits.
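As a stopgap for the missing header, this is the kind of fallback logic I have in mind, sketched in Python rather than as a JMeter script: honor Retry-After when it is present and numeric, otherwise fall back to capped exponential backoff with jitter. The function name, base delay, and cap are placeholders I picked; none of them come from the Genesys documentation.

```python
import random
from typing import Mapping, Optional


def backoff_delay(headers: Mapping[str, str], attempt: int,
                  base_s: float = 0.5, cap_s: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before retrying after a 429 response."""
    rng = rng or random
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {k.lower(): v for k, v in headers.items()}
    retry_after = normalized.get("retry-after", "").strip()
    # Only the delta-seconds form is handled here; an HTTP-date value
    # falls through to the exponential backoff below.
    if retry_after.isdigit():
        return float(retry_after)
    # Full jitter: pick a random delay up to the capped exponential ceiling.
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


print(backoff_delay({"Retry-After": "7"}, attempt=3))  # header wins
print(backoff_delay({}, attempt=3))                    # random, at most 4 seconds
```

The jitter is there so that 200 threads rejected at the same moment do not all retry at the same moment.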