Genesys Cloud EventBridge to Lambda: Concurrency limits during high-volume interaction bursts

Looking for advice on handling high-volume interaction events from Genesys Cloud EventBridge without hitting AWS Lambda concurrency limits.

Background

I am building a Teams bot that syncs Genesys Cloud presence with Microsoft Teams. We use EventBridge to capture interaction:created and interaction:updated events. These events are routed to a Python Lambda function (using the aws-lambda-powertools library) which processes the payload and updates the user’s status in Teams via the Graph API.

The setup works fine during normal hours. However, during peak call center shifts, we see bursts of 500+ events per second. Our Lambda function is configured with a reserved concurrency of 200. When the burst exceeds this, EventBridge drops events or retries aggressively, causing duplicate processing and eventual timeouts.

Issue

I am trying to implement a backpressure mechanism or a batching strategy to smooth out the spikes. I considered using an SQS queue as a buffer between EventBridge and Lambda, but I want to ensure I am not adding unnecessary latency to the presence sync.
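As a rough sketch of the SQS option, the handler below consumes a batch of SQS messages and reports only the failed ones back to the queue. It assumes the EventBridge rule targets an SQS queue without an input transformer (so each message body is the full EventBridge event as JSON) and that ReportBatchItemFailures is enabled on the event source mapping. The name sqs_batch_handler is hypothetical, and process_presence_sync is a placeholder.

```python
import json


def process_presence_sync(interaction):
    # Placeholder for the Graph API update; the real logic is omitted here.
    pass


def sqs_batch_handler(event, context):
    # Each SQS record body is assumed to hold one full EventBridge event
    # as a JSON string (no input transformer on the rule target).
    failures = []
    for record in event.get("Records", []):
        try:
            envelope = json.loads(record["body"])
            process_presence_sync(envelope.get("detail", {}))
        except Exception:
            # Report only this message so SQS redelivers it, not the whole batch.
            failures.append({"itemIdentifier": record["messageId"]})
    # Partial batch response; an empty list means every message succeeded.
    return {"batchItemFailures": failures}
```

With this shape, the added latency would come mostly from the batch window configured on the event source mapping, and the maximum concurrency setting on that mapping could cap how many invocations the queue drives at once.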

Here is the current Lambda handler structure:

import json
from aws_lambda_powertools.utilities.parser import event_parser

def lambda_handler(event, context):
    # EventBridge delivers one event per invocation; 'detail' is a dict
    # holding the Genesys Cloud payload, not a list of records.
    detail = event.get('detail', {})
    process_presence_sync(detail)
    return {'statusCode': 200}

def process_presence_sync(interaction):
    # Call Graph API to update Teams status
    # Logic omitted for brevity
    pass
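
Since presence is effectively last-write-wins, one batching idea is to collapse a group of events to the newest one per user before calling the Graph API. The sketch below assumes each item is a full EventBridge envelope with its standard "time" field in a uniform UTC ISO 8601 format (so string comparison orders correctly). The "userId" key inside "detail" and the function name latest_event_per_user are hypothetical; the real Genesys Cloud payload may use a different field.

```python
def latest_event_per_user(envelopes, user_key="userId"):
    # Keep only the newest event per user. "time" is the EventBridge envelope
    # timestamp; user_key is a hypothetical field inside "detail".
    latest = {}
    unkeyed = []
    for envelope in envelopes:
        user = envelope.get("detail", {}).get(user_key)
        if user is None:
            # No user to group on: keep the event rather than drop it.
            unkeyed.append(envelope)
            continue
        current = latest.get(user)
        if current is None or envelope["time"] >= current["time"]:
            latest[user] = envelope
    return list(latest.values()) + unkeyed
```

This would only reduce Graph API calls when several events for the same user land in the same batch, which seems likely during bursts but is not guaranteed.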

The error logs show "Task timed out after 15.00 seconds" and "Unreserved concurrency limit reached".

Troubleshooting

  • Increased reserved concurrency to 500, but costs spiked and we still hit limits during extreme bursts.
  • Verified that the Lambda execution time is under 2 seconds per event.
  • Checked EventBridge destination settings; retry policy is set to 18 hours with exponential backoff.
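
A quick back-of-the-envelope check on the numbers above, assuming one event per invocation and using the general rule that steady-state concurrency is roughly arrival rate times average duration. The function name required_concurrency is just for illustration, and 2 seconds is the upper bound from the measurement above.

```python
import math


def required_concurrency(events_per_second, avg_duration_seconds):
    # Steady-state concurrent executions = arrival rate x average duration.
    return math.ceil(events_per_second * avg_duration_seconds)


burst = required_concurrency(500, 2.0)  # burst rate and duration from this post
print(burst)        # 1000
print(burst > 500)  # True
```

By this estimate, a sustained burst at 500 events per second needs more concurrency than the raised limit of 500 provides, which would be consistent with still hitting limits after the increase.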

Is there a best practice for buffering EventBridge events in this specific Genesys Cloud integration pattern? Should I move to SQS batch processing, or is there a way to throttle the event source from the Genesys Cloud side via the API?