We’ve got a Kotlin backend processing Genesys Cloud interaction events via EventBridge, and we’re hitting the concurrency limit on our Lambda function during peak hours. The setup uses an EventBridge rule to route com.genesis.interaction.created and com.genesis.interaction.updated events to a Lambda. We’re seeing roughly 2k events per minute during campaign spikes.
The Lambda is configured with a reserved concurrency of 100, but we’re seeing TooManyRequestsException errors in CloudWatch logs when the number of concurrent invocations exceeds that limit. We need to process these events to update our Android app’s real-time dashboard via WebSockets, so latency is a concern.
Here’s the basic structure of the Lambda handler in Kotlin:
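import com.amazonaws.services.lambda.runtime.Context
import com.amazonaws.services.lambda.runtime.RequestHandler

class InteractionProcessor : RequestHandler<Map<String, Any>, Void?> {
    override fun handleRequest(input: Map<String, Any>, context: Context): Void? {
        // EventBridge delivers the event payload under the "detail" key.
        @Suppress("UNCHECKED_CAST")
        val detail = input["detail"] as? Map<String, Any>
            ?: throw IllegalArgumentException("Missing detail")
        // Skip events that carry no interaction ID.
        val interactionId = detail["interactionId"] as? String ?: return null
        // Process event
        processInteraction(interactionId)
        return null
    }
}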
The processInteraction method calls our internal service to fetch additional customer data and then pushes a WebSocket message. Execution time averages around 2 seconds because of the data fetch, so each invocation holds a concurrency slot for that long.
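For a rough sense of scale, the numbers above can be plugged into Little’s law (average in-flight invocations = arrival rate × average duration). This is only a back-of-the-envelope sketch: requiredConcurrency is a hypothetical helper name, and it assumes events arrive evenly across the minute, which is probably not true during a campaign spike.

```kotlin
import kotlin.math.ceil

// Little's law: average in-flight invocations = arrival rate x average duration.
// Assumes a steady arrival rate; real bursts will need more headroom.
fun requiredConcurrency(eventsPerMinute: Double, avgDurationSeconds: Double): Int {
    val arrivalsPerSecond = eventsPerMinute / 60.0
    return ceil(arrivalsPerSecond * avgDurationSeconds).toInt()
}

fun main() {
    // Figures from this post: roughly 2,000 events per minute, about 2 s per invocation.
    println(requiredConcurrency(2000.0, 2.0)) // prints 67
}
```

An average of about 67 is below the reserved concurrency of 100, which would suggest the throttling comes from short bursts inside the minute rather than from the sustained rate.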
We’ve tried increasing the batch size on the EventBridge target, but that just leads to larger payloads and longer processing times per batch. We’re considering switching to an SQS queue as a buffer, but we want to avoid adding another hop if possible.
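Since the handler spends its time waiting on I/O, one option would be to overlap the fetches within a batch instead of running them one after another, so batch time stops growing linearly with batch size. A minimal sketch using only the JDK’s executor API: fetchCustomerData, processBatch, and maxParallel are hypothetical names, and the 100 ms sleep stands in for the real data fetch.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical stand-in for the internal customer-data call.
fun fetchCustomerData(interactionId: String): String {
    Thread.sleep(100) // simulated I/O latency
    return "customer-data-for-$interactionId"
}

// Runs the I/O-bound fetches for one batch in parallel and returns
// the results in the same order as the input IDs.
fun processBatch(interactionIds: List<String>, maxParallel: Int = 10): List<String> {
    if (interactionIds.isEmpty()) return emptyList()
    val pool = Executors.newFixedThreadPool(minOf(maxParallel, interactionIds.size))
    try {
        val tasks = interactionIds.map { id -> Callable { fetchCustomerData(id) } }
        // invokeAll blocks until every task has completed.
        return pool.invokeAll(tasks).map { it.get() }
    } finally {
        pool.shutdown()
    }
}

fun main() {
    val ids = (1..10).map { "interaction-$it" }
    val startedAt = System.nanoTime()
    val results = processBatch(ids)
    val elapsedMs = (System.nanoTime() - startedAt) / 1_000_000
    println("Processed ${results.size} items in $elapsedMs ms")
}
```

The trade-off is that the internal service then sees up to maxParallel simultaneous requests per invocation, so the cap has to be chosen with its capacity in mind.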
Is there a way to configure the EventBridge rule to throttle the event rate to the Lambda without losing events? Or should we be using a different pattern for high-volume event processing? We’re also seeing some duplicate events in our logs, which complicates the deduplication logic on the consumer side.
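To make the deduplication concern concrete, here is a minimal sketch of the kind of check we mean. It assumes duplicates arrive with the same top-level EventBridge "id"; if they don’t, the key would have to be derived from the payload instead. RecentEventIds and markIfNew are hypothetical names, and an in-memory set like this only covers a single execution environment, so deduplicating across concurrent Lambda instances would need a shared store.

```kotlin
// Bounded, in-memory record of recently seen event IDs.
// The least recently used entry is evicted once capacity is exceeded.
class RecentEventIds(private val capacity: Int) {
    private val seen = object : LinkedHashMap<String, Boolean>(16, 0.75f, true) {
        override fun removeEldestEntry(eldest: MutableMap.MutableEntry<String, Boolean>?): Boolean =
            size > capacity
    }

    // Returns true the first time an ID is seen, false for a duplicate.
    @Synchronized
    fun markIfNew(eventId: String): Boolean = seen.put(eventId, true) == null
}

fun main() {
    val recent = RecentEventIds(capacity = 10_000)
    // Illustrative IDs only; real values would come from the event envelope.
    val incoming = listOf("evt-1", "evt-2", "evt-1")
    for (eventId in incoming) {
        if (recent.markIfNew(eventId)) {
            println("process $eventId")
        } else {
            println("skip duplicate $eventId")
        }
    }
}
```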
The error log looks like this:
com.amazonaws.services.lambda.model.TooManyRequestsException: Function failed on too many concurrent executions.
Any ideas on how to handle this volume without scaling the Lambda to hundreds of concurrent instances, which gets expensive fast?