Notification API WebSocket reconnects killing my event stream

How do I keep the Genesys Cloud Notification API WebSocket alive without missing events during the reconnect handshake?

We’re building a real-time dashboard in New Relic that ingests conversation:updated events. I’m using the standard WebSocket endpoint wss://api.mypurecloud.com/api/v2/notification/events with a Bearer token in the query string.

The connection works fine for the first hour or so, then it inevitably drops. I’ve got a basic reconnect loop in Python using the websocket-client library. The problem is that when the socket drops and I immediately reconnect, I either get a burst of duplicate events or miss the ones that happened during the gap. The server seems to require a specific lastEventId to resume, but the docs are vague on how to track it reliably across a drop.

Here’s the core of my reconnect logic:

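import json
import websocket  # websocket-client package

last_id = None  # ID of the most recent event processed; None until the first event

def on_message(ws, message):
    global last_id
    data = json.loads(message)
    last_id = data.get('id', last_id)  # keep the previous ID if this frame has none
    process_event(data)  # process_event and TOKEN are defined elsewhere

def connect():
    uri = f"wss://api.mypurecloud.com/api/v2/notification/events?access_token={TOKEN}"
    if last_id is not None:
        uri += f"&lastEventId={last_id}"  # only ask to resume once an ID has been seen
    # on_close (not shown) calls connect() again
    ws = websocket.WebSocketApp(uri, on_message=on_message, on_close=on_close)
    ws.run_forever()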

When on_close triggers, I call connect() again. But sometimes the lastEventId I pass is rejected with a 400 Bad Request saying the event ID is too old or invalid. Other times, I just get a silent drop and no new events until I force a full re-auth.
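
For the duplicate burst, a client-side dedupe might at least make replays harmless. This is a minimal sketch, assuming every event carries a unique 'id' field (the same assumption my on_message makes, which I haven't confirmed against the docs). RecentIds and its capacity are hypothetical names, not part of any Genesys SDK:

```python
from collections import OrderedDict


class RecentIds:
    """Bounded set of recently seen event IDs (hypothetical helper)."""

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self._seen = OrderedDict()

    def is_new(self, event_id):
        """Return True the first time an ID is seen, False on a replay."""
        if event_id in self._seen:
            self._seen.move_to_end(event_id)  # refresh recency of a repeated ID
            return False
        self._seen[event_id] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest ID to bound memory
        return True


# Small demonstration with a capacity of 3.
recent = RecentIds(capacity=3)
results = [recent.is_new(i) for i in ["a", "b", "a", "c", "d", "b"]]
print(results)
```

In on_message this would wrap the call to process_event, so a replayed event is skipped instead of processed twice. It does nothing for events missed during the gap, which is the part I still can't solve.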

I’ve tried adding a heartbeat frame every 15 seconds, but that doesn’t seem to stop the server from closing the connection during quiet periods. Is there a specific header or query param I’m missing to make the subscription sticky? Or is the lastEventId mechanism fundamentally broken for high-churn environments?
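
To tell a quiet period from a dead socket, one option would be to record when the last frame of any kind arrived and force a reconnect once that gap exceeds a threshold. This is a rough sketch only. StalenessWatchdog and the 45-second timeout (three missed 15-second heartbeats) are hypothetical, and it assumes the server sends some frame back in response to each heartbeat, which I have not verified:

```python
import time


class StalenessWatchdog:
    """Flags a connection as stale when no frame has arrived for too long."""

    def __init__(self, timeout_s=45.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_seen = clock()

    def touch(self):
        """Call on every inbound frame (events, pongs, server heartbeats)."""
        self._last_seen = self._clock()

    def is_stale(self):
        """True when the gap since the last frame exceeds the timeout."""
        return self._clock() - self._last_seen > self.timeout_s


# Demonstration with a fake clock instead of real time.
fake_now = [0.0]
dog = StalenessWatchdog(timeout_s=45.0, clock=lambda: fake_now[0])
fake_now[0] = 30.0
stale_at_30 = dog.is_stale()  # 30 s of silence, still within the timeout
dog.touch()                   # a frame arrives at t=30
fake_now[0] = 80.0
stale_at_80 = dog.is_stale()  # 50 s since the last frame, past the timeout
print(stale_at_30, stale_at_80)
```

The idea would be to call touch() from on_message and check is_stale() from a timer thread, closing the socket when it returns True so the reconnect path runs instead of waiting on a connection that will never deliver anything.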

The latency spikes in New Relic correlate directly with these reconnect events, so I need a stable stream.