Context:
I am running a load test against Genesys Cloud Digital Messaging (SMS/Chat) using JMeter 5.6.2 from our Singapore staging environment. The goal is to validate throughput for a campaign expecting bursts of 200 concurrent inbound messages per minute.
The test script simulates users sending messages via the /api/v2/conversations/messages endpoint. I am using the platform API with OAuth2 token authentication. The JMeter thread group is configured with 50 threads, a ramp-up of 5 seconds, and a loop count of 10, for 500 requests in total. Each thread sends a message payload containing standard text and metadata.
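To make the offered load explicit, here is a quick back-of-the-envelope calculation of what this thread group generates. The thread, ramp-up, and loop values come from the configuration above; the per-request latency is an assumption for illustration, not something I measured.

```python
# Thread group settings from the test plan described above.
THREADS = 50
RAMP_UP_SECONDS = 5
LOOPS = 10

# ASSUMPTION: average response time per request; not measured here.
ASSUMED_LATENCY_SECONDS = 0.5


def total_requests(threads: int, loops: int) -> int:
    """Total samples the thread group sends over the whole run."""
    return threads * loops


def thread_start_interval(threads: int, ramp_up: float) -> float:
    """Seconds between successive thread starts during ramp-up."""
    return ramp_up / threads


def steady_state_rate_per_minute(threads: int, latency: float) -> float:
    """Approximate request rate once all threads run back to back with no timers."""
    return threads / latency * 60


if __name__ == "__main__":
    print(total_requests(THREADS, LOOPS))                    # 500
    print(thread_start_interval(THREADS, RAMP_UP_SECONDS))   # 0.1
    print(steady_state_rate_per_minute(THREADS, ASSUMED_LATENCY_SECONDS))  # 6000.0
```

Under that assumed latency the script would offer roughly 30 times the 200-per-minute campaign target, which may be part of why the limit trips so early.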
The test runs smoothly for the first 100 requests. Around request 101, the error rate spikes sharply. The majority of failures are 429 Too Many Requests with the following response body:
{
  "errors": [
    {
      "code": "TOO_MANY_REQUESTS",
      "message": "Rate limit exceeded. Please retry after 60 seconds."
    }
  ]
}
I have checked the Retry-After header, which also tells the client to wait before retrying, but in a real-world scenario, dropping 50% of the initial burst traffic is unacceptable. I tried adding a Constant Timer in JMeter to space out requests by 2 seconds, but this reduces the concurrency significantly and doesn’t reflect the actual burst pattern we are testing for.
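For reference, the retry behaviour I would expect from a well-behaved client looks roughly like the sketch below. It is a minimal Python illustration, not my JMeter configuration and not anything taken from the Genesys documentation. `send` is a hypothetical callable standing in for the real HTTP request, and the `(status, headers)` response shape is an assumption made to keep the example self-contained.

```python
import time
from typing import Callable, Mapping, Tuple

# A response is modelled as (status_code, headers); this shape is hypothetical.
Response = Tuple[int, Mapping[str, str]]


def parse_retry_after(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Return the Retry-After delay in seconds, or `default` if absent or unparseable.

    Only the delta-seconds form is handled; the HTTP-date form falls back to
    `default`. The lookup is case-sensitive, so headers are assumed to use the
    canonical "Retry-After" spelling.
    """
    value = headers.get("Retry-After")
    if value is None:
        return default
    try:
        return max(0.0, float(value))
    except ValueError:
        return default


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, waiting out Retry-After on each 429, up to `max_attempts` calls."""
    response = send()
    for _ in range(max_attempts - 1):
        status, headers = response
        if status != 429:
            break
        sleep(parse_retry_after(headers))
        response = send()
    return response
```

Passing `sleep` as a parameter makes it possible to exercise the logic without actually waiting. With a 60-second Retry-After, though, this kind of retry only delays the burst rather than preserving it, which is the problem I am trying to solve.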
I also noticed that some requests return 400 Bad Request with a message about “invalid conversation state” after the rate limiting kicks in, which suggests the backend might be rejecting messages for conversations that were partially processed or locked.
I am unsure whether this is a hard limit on POST /api/v2/conversations/messages or whether there is a specific batching mechanism I should be using. The documentation mentions rate limits but does not specify the exact threshold for digital messaging endpoints during peak load.
Question:
What is the correct way to structure the JMeter test to respect these rate limits without artificially throttling the concurrency? Are there specific headers or payload optimizations for the Digital Messaging API that help mitigate 429 errors during burst testing? Should I be using a different endpoint for high-volume message ingestion?