Web Messaging SDK 503 errors during high concurrency

Looking for advice on handling intermittent 503 Service Unavailable responses from the /api/v2/conversations/messaging endpoint when scaling Web Messaging widgets.

We are deploying a Premium App that aggregates chat data across multiple orgs using the latest SDK (v1.8.4). In load tests exceeding 50 concurrent sessions per minute, the API returns 503s specifically when we create new message events via the REST interface rather than over WebSocket. The X-RateLimit-Remaining header still shows sufficient capacity, which suggests this is not standard rate limiting. Are there known throttling policies for high-volume message creation in Architect flows that apply independently of the standard rate limits? Or should we implement client-side retry logic with exponential backoff so that retries are spaced out before the traffic reaches the Genesys Cloud edge?
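
For reference, the client-side retry we have in mind would look roughly like the sketch below. It is a minimal illustration, not a Genesys-recommended implementation: `send_with_retry`, `backoff_delay`, and the `send` callable are hypothetical names, the base delay, cap, and attempt count are placeholder values, and we are assuming the 503 response may or may not carry a Retry-After value (we have not confirmed that it does). The `send` callable stands in for whatever HTTP client performs the actual POST.

```python
import random
import time
from typing import Callable, Optional, Tuple

# Hypothetical shape: one call performs one request and returns
# (status_code, retry_after_seconds or None if no Retry-After was sent).
SendFn = Callable[[], Tuple[int, Optional[float]]]


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Upper bound of the wait for a 0-based attempt: base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def send_with_retry(
    send: SendFn,
    max_attempts: int = 5,
    base: float = 0.5,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> int:
    """Call send() until it returns a non-503 status or attempts run out.

    Returns the last status code seen.
    """
    status = 503
    for attempt in range(max_attempts):
        status, retry_after = send()
        if status != 503:
            return status
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        if retry_after is not None:
            # Assumption: honor a server-provided Retry-After when present.
            delay = retry_after
        else:
            # "Full jitter": a random wait in [0, exponential bound), so that
            # many clients retrying at once do not hit the edge in lockstep.
            delay = rng() * backoff_delay(attempt, base, cap)
        sleep(delay)
    return status
```

The `sleep` and `rng` parameters are injectable only so the behavior can be checked without real waiting or network calls. In a load test, `send` would wrap the POST to /api/v2/conversations/messaging and return the response status plus any parsed Retry-After value.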