Impact of Audio Quality and Jitter on Voice Bot NLU Accuracy

Hello everyone. I am a network engineer and I am currently investigating a low ‘NLU Confidence’ issue with our voice bots. We are seeing that agents who are working from home with poor internet connections are producing very choppy audio, which causes the bot to misinterpret their utterances. Does Genesys Cloud have a recommended minimum bandwidth or ‘Jitter’ threshold for optimal bot performance, and is there a way to ‘Pre-Process’ the audio to reduce background noise before it hits the NLU engine?

Greetings! I am also a network engineer and I deal with voice quality every day. For voice bots, you should aim for jitter below twenty milliseconds and packet loss as close to zero as you can get. The NLU engine is very sensitive to any ‘Audio Artifacts’ caused by compression or network delay. You cannot really ‘Pre-Process’ the live audio within Genesys Cloud, but you should ensure your agents are using the G.711 codec. G.711 and G.729 are both narrowband codecs, so G.729 does not remove extra high frequencies, but its much lower bitrate (8 kbps versus 64 kbps) introduces coding artifacts that can smear the phonetic detail the speech recognizer relies on, and that can drag down your confidence scores!
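For anyone who wants to check the twenty millisecond figure against their own packet captures, here is a minimal sketch of the interarrival jitter estimator described in RFC 3550. It assumes you already have (arrival time, RTP timestamp) pairs from a capture and an 8 kHz clock rate, which is what G.711 uses. The function names and the input format are my own for illustration, not anything Genesys Cloud provides.

```python
JITTER_THRESHOLD_MS = 20.0  # target quoted in the post above


def interarrival_jitter_ms(packets, clock_rate=8000):
    """Estimate jitter with the RFC 3550 running filter.

    packets: iterable of (arrival_time_seconds, rtp_timestamp) tuples,
             in arrival order. Timestamp wraparound is not handled here.
    Returns the final jitter estimate in milliseconds.
    """
    jitter = 0.0  # running estimate, in RTP timestamp units
    prev = None
    for arrival_s, rtp_ts in packets:
        arrival_units = arrival_s * clock_rate
        if prev is not None:
            prev_arrival, prev_ts = prev
            # Difference between spacing on arrival and spacing at the sender
            d = (arrival_units - prev_arrival) - (rtp_ts - prev_ts)
            # RFC 3550 smoothing: move 1/16 of the way toward |d|
            jitter += (abs(d) - jitter) / 16.0
        prev = (arrival_units, rtp_ts)
    return jitter / clock_rate * 1000.0


def exceeds_threshold(packets, threshold_ms=JITTER_THRESHOLD_MS):
    """True if the estimated jitter is above the chosen threshold."""
    return interarrival_jitter_ms(packets) > threshold_ms
```

Because the estimator is a smoothed running average, a single late packet barely moves it. Sustained choppiness is what pushes it past the threshold.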

I have been building some Kafka-based monitoring for our bot traffic. To follow up on Bia60's reply, you should also look at the ‘Gain Control’ settings on your agents’ headsets. If the audio is too quiet or too loud, the bot will struggle even if the network is perfect. We found that implementing a ‘Hardware Standard’ for our remote agents significantly improved our bot accuracy. Do not let them use their laptop’s built-in microphone!
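If you want a rough way to spot headsets with bad gain settings, here is a small sketch that measures the RMS level of a block of 16-bit PCM samples in dBFS and labels it. The -40 and -6 dBFS cutoffs are illustrative assumptions on my part, not a Genesys or vendor recommendation, so tune them against recordings you know are good.

```python
import math


def rms_dbfs(samples, full_scale=32768):
    """RMS level of signed 16-bit PCM samples, in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(math.sqrt(mean_square) / full_scale)


def classify_level(samples, quiet_dbfs=-40.0, loud_dbfs=-6.0):
    """Label a block of audio; both thresholds are assumed example values."""
    level = rms_dbfs(samples)
    if level < quiet_dbfs:
        return "too quiet"
    if level > loud_dbfs:
        return "too loud"
    return "ok"
```

Run it over a test recording from each agent's setup. A laptop microphone a metre from the speaker will usually land in the "too quiet" bucket.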

I have been reviewing the ‘Bot Transcripts’ for these choppy calls. Che75, you can actually see the ‘Audio Quality’ metrics for each bot interaction in the Analytics API. Look for the mediaStats object. If you see a high packetLossRate, you can automatically flag those calls for a network audit. It is a great way to proactively identify which remote agents need a better internet connection for the bot to work correctly!
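Here is a rough sketch of the flagging step. It does not call the API. It works on a list of dictionaries you have already pulled down. The mediaStats and packetLossRate names are the ones mentioned above, but the rest of the record shape (the conversationId key, the rate being a fraction between 0 and 1) and the one percent threshold are assumptions on my part, so check them against the actual response before relying on this.

```python
def flag_for_audit(interactions, loss_threshold=0.01):
    """Return IDs of interactions whose packet loss exceeds the threshold.

    interactions: list of dicts, each assumed to look like
        {"conversationId": "...", "mediaStats": {"packetLossRate": 0.03}}
    loss_threshold: assumed fraction (0.01 means 1 percent).
    """
    flagged = []
    for item in interactions:
        stats = item.get("mediaStats") or {}
        loss = stats.get("packetLossRate")
        # Skip records with no loss figure instead of guessing
        if loss is not None and loss > loss_threshold:
            flagged.append(item.get("conversationId"))
    return flagged


# Example with made-up records
sample = [
    {"conversationId": "a1", "mediaStats": {"packetLossRate": 0.002}},
    {"conversationId": "b2", "mediaStats": {"packetLossRate": 0.05}},
    {"conversationId": "c3"},  # no media stats recorded
]
audit_list = flag_for_audit(sample)
```

You could feed the resulting list into whatever ticketing or reporting process you use for network audits.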