Auto-Scaling Lambda Architectures for Voice Traffic Spikes

I am a PureConnect admin transitioning to Genesys Cloud and I am currently designing our new inbound voice architecture. We have a very volatile traffic pattern where we can go from ten calls to five hundred calls in a few seconds during a marketing event. We are using AWS Lambda for our data dips and I am worried about the cold start latency and the concurrency limits. How should I architect our Lambda backend to ensure it can handle these sudden spikes without causing timeouts in our Architect flows?

Hey Ais71. Welcome to the cloud! I handle the network engineering for our remote agents and I have seen many people trip over this. Use ‘Provisioned Concurrency’ for your critical data dip Lambdas. This keeps a set number of execution environments initialized and ready to respond, so those requests do not pay the cold start penalty. Also, make sure your Lambda has enough memory allocated. The setting is not just about RAM: AWS allocates CPU in proportion to the configured memory, so a 128 MB function will usually initialize and run more slowly than a 1 GB one.
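If it helps with sizing, here is a rough sketch of how you might estimate the number of warm environments and apply it. It assumes the usual rule of thumb that concurrency is roughly arrival rate times average duration. The function names, the headroom factor, and the alias name are my own placeholders. The client is expected to be a boto3 Lambda client, which is passed in rather than created here.

```python
import math


def estimate_concurrency(peak_calls_per_second, avg_duration_seconds, headroom=1.2):
    """Rule of thumb: concurrent executions ~= arrival rate x average duration.

    headroom is an assumed safety factor, not an AWS recommendation.
    """
    return math.ceil(peak_calls_per_second * avg_duration_seconds * headroom)


def apply_provisioned_concurrency(lambda_client, function_name, alias, executions):
    """Request warm environments for a published alias of the function.

    lambda_client is assumed to be boto3.client("lambda"); provisioned
    concurrency is set on a version or alias, not on $LATEST.
    """
    return lambda_client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias,
        ProvisionedConcurrentExecutions=executions,
    )


if __name__ == "__main__":
    # Example: 500 calls arriving within one second, 0.5 s per data dip,
    # no headroom applied.
    print(estimate_concurrency(500, 0.5, headroom=1.0))
```

Measure your real data dip duration before trusting the number, since a slow downstream database inflates the concurrency you need.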

I am a reporting analyst and I have seen these spikes show up in our executive dashboards as ‘Flow Errors’ when the Lambda fails. To follow up on Che75, you should also implement an ‘Exponential Backoff’ and a ‘Circuit Breaker’ pattern in your Architect flow. If the first Lambda call fails, the flow should wait a few hundred milliseconds before trying again and lengthen that wait on each further attempt. If the service is completely overwhelmed, stop retrying and fall back to a default behavior. It is better to have a slightly degraded experience than a total call failure.
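Architect flows are built graphically, so the following is only a sketch of the logic in Python, not something you paste into a flow. It shows retries with doubling delays plus jitter, and a breaker that skips the call entirely after repeated failures. The class name, thresholds, and delays are illustrative assumptions.

```python
import random
import time


class CircuitBreaker:
    """Opens after failure_threshold consecutive failures, then cools down."""

    def __init__(self, failure_threshold=3, reset_after_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_seconds = reset_after_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        # After the cool-down, let calls through again; a failure re-opens it.
        return self.clock() - self.opened_at >= self.reset_after_seconds

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def call_with_fallback(action, breaker, default, max_attempts=3,
                       base_delay=0.2, sleep=time.sleep):
    """Try action with exponential backoff; return default if it keeps failing."""
    for attempt in range(max_attempts):
        if not breaker.allow():
            return default  # breaker is open: degrade instead of calling
        try:
            result = action()
        except Exception:
            breaker.record_failure()
            if attempt < max_attempts - 1:
                delay = base_delay * (2 ** attempt)  # 0.2 s, 0.4 s, ...
                sleep(delay + random.uniform(0, delay / 2))  # add jitter
        else:
            breaker.record_success()
            return result
    return default
```

The jitter matters during a spike: without it, hundreds of calls that failed together retry together and hit the backend in the same instant again.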

Correct! And do not forget about the Genesys Cloud side. You should check the ‘Data Action’ timeout settings in your org. If your Lambda is slow, the Data Action will time out before it gets a response, and the flow will take its failure path. I also recommend that you use a dedicated AWS account for your contact center integrations. Lambda concurrency limits apply per account and Region, so a separate account means you do not share them with other corporate applications. It is a simple way to avoid ‘Noisy Neighbor’ problems during your peak traffic events.
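One way to stay inside the Data Action timeout is to give the Lambda its own, shorter time budget and return a default answer when the lookup overruns. Here is a minimal sketch of that idea. The 2.0 second budget, the `lookup_customer` function, and the response fields are all placeholders of mine, so check your actual timeout setting and set the budget below it.

```python
import concurrent.futures
import time

# Assumption: keep this below the Data Action timeout configured in
# Genesys Cloud. 2.0 seconds is a placeholder, not a known platform value.
LOOKUP_BUDGET_SECONDS = 2.0

# Created once per execution environment and reused across invocations.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def lookup_customer(ani):
    """Hypothetical data dip; replace with the real database or API call."""
    time.sleep(0.01)
    return {"tier": "gold", "ani": ani}


def handler(event, context, lookup=lookup_customer, budget=LOOKUP_BUDGET_SECONDS):
    """Lambda entry point that answers within the budget, even if degraded."""
    future = _executor.submit(lookup, event.get("ani"))
    try:
        record = future.result(timeout=budget)
    except concurrent.futures.TimeoutError:
        # The lookup thread keeps running in the background; we just stop
        # waiting for it so the flow gets a usable answer in time.
        return {"found": False, "reason": "timeout"}
    except Exception:
        return {"found": False, "reason": "error"}
    return {"found": True, **record}
```

The flow can then branch on `found` and route the caller with default handling instead of landing in the failure path.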