WFM Schedule Publishing Fails with 503 on Edge 2024.3.1 BYOC Nodes

Hey folks, running into a weird one this morning during our weekly publish window.

We are on Genesys Cloud Edge version 2024.3.1, managing a large cohort of agents in the America/Chicago timezone. Usually, our POST /api/v2/wfm/schedules/{scheduleId}/publish calls complete in under 15 seconds. Today, however, about 40% of our schedule publishes are timing out or returning a 503 Service Unavailable error specifically when the schedule includes agents assigned to our custom BYOC infrastructure nodes.

The error payload looks like this:

{
 "message": "Failed to propagate schedule updates to Edge node 'edge-node-04'.",
 "code": "edge_connection_timeout",
 "contextId": "ctx-9928a1b"
}

Interestingly, agents on standard Genesys Cloud infrastructure are publishing fine. I’ve checked the Edge node health dashboard, and ‘edge-node-04’ shows as healthy with low latency. I’ve also verified that the WFM API keys have the correct wfm:schedules:write scopes.

Has anyone seen this specific edge_connection_timeout during schedule publishing on the 2024.3.1 release? I’m wondering if there’s a new rate-limiting behavior on the Edge API gateway that’s conflicting with our bulk publish logic, or if this is a known issue with how the WFM service communicates with BYOC nodes in this version.

Any workarounds or known issues you can point me to would be a huge help. I’m currently stuck waiting on support ticket #449201, but I figured I’d check here first in case someone else is battling this.