Genesys Cloud Webhook returning 500 from Python consumer - handling dead letter queue logic

We have a Python FastAPI service subscribed to the routing:interaction:updated event stream. The goal is to write state changes to our internal database. The issue arises when the DB is temporarily unreachable, causing the webhook endpoint to return a 500 Internal Server Error.

Genesys Cloud retries the delivery, but our current retry logic is naive and just crashes again. We need to implement a proper dead letter queue (DLQ) pattern. If the payload fails to persist after three internal retries, we want to push it to an SQS queue for later processing rather than blocking the Genesys retry loop indefinitely.

Here is the current handler structure:

@app.post("/webhook/genesys")
def handle_interaction(payload: dict):
    try:
        # Parse payload
        interaction_id = payload['interaction']['id']

        # Attempt DB write
        db.save_interaction(interaction_id, payload)
        return {"status": "ok"}

    except DatabaseError as e:
        # This is where it gets messy
        logger.error(f"DB write failed for {interaction_id}: {e}")
        raise HTTPException(status_code=500, detail="DB unavailable")

The problem is that raising the 500 error triggers Genesys’s exponential backoff, and every redelivery fails the same way while the DB is down, which floods our logs. I want to catch the DatabaseError, retry locally up to three times with a short delay, and if the write still fails, send the raw payload to SQS and return a 200 OK to Genesys to stop the retry cycle.
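
To make the intended flow concrete, here is a rough sketch of the retry-then-dead-letter logic, kept separate from FastAPI so it can be tested on its own. The function name persist_or_dead_letter, its parameters, the linear delay, and the stand-in DatabaseError class are all my own assumptions, not anything from Genesys or our current codebase. I am also reading "three internal retries" as three attempts in total.

```python
import json
import logging
import time
from typing import Callable

logger = logging.getLogger("genesys_webhook")


class DatabaseError(Exception):
    """Stand-in for the DB driver's error type (assumption)."""


def persist_or_dead_letter(
    payload: dict,
    save: Callable[[str, dict], None],
    send_to_dlq: Callable[[str], None],
    max_attempts: int = 3,
    delay_seconds: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try to save the payload; once max_attempts writes have failed, dead-letter it.

    Returns "saved" or "dead_lettered". Exceptions raised by send_to_dlq are
    deliberately not caught, so the caller can still answer with a 500.
    """
    interaction_id = payload["interaction"]["id"]
    for attempt in range(1, max_attempts + 1):
        try:
            save(interaction_id, payload)
            return "saved"
        except DatabaseError as exc:
            logger.warning(
                "DB write failed for %s (attempt %d/%d): %s",
                interaction_id, attempt, max_attempts, exc,
            )
            if attempt < max_attempts:
                # Short linear delay between attempts (hypothetical tuning).
                sleep(delay_seconds * attempt)
    # Every attempt failed: hand the raw payload to the DLQ as JSON text.
    send_to_dlq(json.dumps(payload))
    return "dead_lettered"
```

In the handler I would pass db.save_interaction as save and a small wrapper around boto3's sqs.send_message(QueueUrl=..., MessageBody=body) as send_to_dlq, then return {"status": "ok"} for both outcomes. If the SQS call itself raises, I think letting it surface as a 500 is the safer choice, since a Genesys redelivery is then the only remaining copy of the payload.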

How should the exception handling block look to ensure we acknowledge the webhook to Genesys while still preserving the failed payload? I’m concerned about race conditions if Genesys retries while we are writing to the DLQ. Is there a standard pattern for this in the CX-as-Code community?
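
On the race condition, one guard I can imagine is making the DB write idempotent, so that a late Genesys retry or a DLQ replay cannot overwrite newer state. Below is a minimal sketch using an in-memory SQLite table purely for illustration. The table layout, the save_if_newer helper, and the event_time argument are hypothetical, and it assumes each event carries a UTC timestamp in one fixed ISO-8601 format so that string comparison matches time order.

```python
import json
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create an in-memory table keyed by interaction id (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE interactions ("
        " interaction_id TEXT PRIMARY KEY,"
        " event_time TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    return conn


def save_if_newer(
    conn: sqlite3.Connection, interaction_id: str, event_time: str, payload: dict
) -> None:
    """Insert the row, or update it only when the incoming event is newer.

    Replaying an old payload (from a retry or from the DLQ) becomes a no-op.
    Requires SQLite 3.24+ for the ON CONFLICT ... DO UPDATE syntax.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO interactions (interaction_id, event_time, payload) "
            "VALUES (?, ?, ?) "
            "ON CONFLICT(interaction_id) DO UPDATE SET "
            " event_time = excluded.event_time, payload = excluded.payload "
            "WHERE excluded.event_time > interactions.event_time",
            (interaction_id, event_time, json.dumps(payload)),
        )
```

With a write like this, the webhook handler and the DLQ consumer could both call the same function without coordinating, because whichever copy of an event arrives last cannot roll the row back to an older state.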