WFM Capacity API 429 errors during high-concurrency JMeter load test in US1

Could someone clarify why the Workforce Management Capacity API is returning 429 Too Many Requests when JMeter 5.6 simulates 100 concurrent requests to the /api/v2/wfm/scheduling/groups endpoint? We are running this load test in the US1 region to validate our capacity planning integrations before a major deployment. The goal is to understand how the API handles burst traffic during peak scheduling updates.

The test setup uses a simple thread group with 100 threads, a ramp-up period of 10 seconds, and a loop count of 5. Each thread sends a GET request to fetch scheduling groups for a specific team. The request includes a valid OAuth2 access token with the wfm:scheduling:read scope. We are using the Genesys Cloud SDK v2.4 for Java to handle authentication, but the actual HTTP requests are made directly via JMeter’s HTTP Request sampler to isolate network and API performance.

The issue starts appearing after the first 30 seconds of the test. Initially, all requests return 200 OK. But as the concurrency ramps up, we start seeing 429 responses with a Retry-After header set to 5 seconds. The error message in the response body says “Rate limit exceeded for this client.” According to the API documentation, the default rate limit for this endpoint is 100 requests per minute per client. Our test averages about 600 requests per minute across all threads, so it would be within this limit only if the limit were applied per thread; counted against a single client, it is roughly six times over.
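To make the arithmetic explicit, here is a minimal Python sketch comparing the two figures quoted above. The function name `rate_limit_ratio` is made up for illustration, and the sketch assumes the documented limit is counted once per client rather than once per thread, which we have not confirmed.

```python
def rate_limit_ratio(observed_per_minute: float, limit_per_minute: float) -> float:
    """Return the observed request rate as a multiple of the rate limit."""
    if limit_per_minute <= 0:
        raise ValueError("limit_per_minute must be positive")
    return observed_per_minute / limit_per_minute


# Figures from this test: about 600 requests/minute observed across all threads,
# 100 requests/minute documented per client.
ratio = rate_limit_ratio(600, 100)
print(f"Observed rate is {ratio:.1f}x the documented per-client limit")
```

A ratio above 1.0 means the combined traffic exceeds the limit whenever all threads are counted as one client.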

We have also verified that the access token is not being invalidated or rotated during the test. The same token is used for all requests. We have tried adding a 100 ms delay between requests, but the 429 errors persist. We are also seeing increased latency on successful requests, with average response times jumping from 200 ms to over 2 seconds.
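For comparison, here is a rough sketch of the pacing that would be needed if all 100 threads share a single budget of 100 requests per minute. That is an assumption on our part, based on the same token being used everywhere. The helper name `min_interval_per_thread` is hypothetical, and the result is the spacing between the start of one request and the start of the next on the same thread.

```python
def min_interval_per_thread(threads: int, limit_per_minute: int) -> float:
    """Seconds between request starts on each thread so that the combined
    rate of all threads stays at or below the shared per-minute limit."""
    if threads <= 0 or limit_per_minute <= 0:
        raise ValueError("threads and limit_per_minute must be positive")
    return threads * 60.0 / limit_per_minute


# 100 JMeter threads sharing an assumed 100 requests/minute client budget.
interval = min_interval_per_thread(threads=100, limit_per_minute=100)
print(f"Each thread could send at most one request every {interval:.0f} s")
```

Under that assumption, the spacing comes out far larger than the delay we tried, which would be consistent with the 429 responses persisting.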

Has anyone else encountered this issue during WFM API load testing? Are there specific headers or parameters we need to include to avoid rate limiting during high-concurrency tests? We are trying to ensure our integration can handle sudden spikes in scheduling updates without failing.
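One client-side mitigation would be to honor the Retry-After header before retrying. Below is a minimal sketch of that idea, not a confirmed recommendation for this API. The `send` callable is a hypothetical stand-in for the real HTTP call and is assumed to return a `(status, headers, body)` tuple. Only a numeric Retry-After value is handled; the HTTP-date form falls back to a default wait.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def get_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    default_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for _ in range(max_attempts - 1):
        status, headers, _body = response
        if status != 429:
            break
        # Header names are case-insensitive, so compare in lower case.
        retry_after = next(
            (v for k, v in headers.items() if k.lower() == "retry-after"), None
        )
        try:
            wait = float(retry_after) if retry_after is not None else default_wait
        except ValueError:
            wait = default_wait  # e.g. an HTTP-date value, not parsed in this sketch
        sleep(max(wait, 0.0))
        response = send()
    return response
```

If every attempt is rate limited, the last 429 response is returned, so the caller still has to check the status.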

Thanks for the help.