Hello everyone! I am excited about the new AudioHook API. We are integrating a third-party voice biometrics engine for continuous authentication during live calls. However, when we stream the dual-channel audio over WebSockets using the AudioHook API, there is slight latency that occasionally causes the biometrics engine to reject the sample. Has anyone tuned the SIP trunk settings or the AudioHook configuration to minimize audio packet delay for real-time streaming?
Hello. I have done extensive load testing on the AudioHook API. Make sure the network path to your AudioHook server prioritizes the WebSocket traffic.
Furthermore, check the sample rate. Genesys Cloud streams AudioHook audio as PCMU (G.711 μ-law) at 8 kHz by default.
If your biometrics engine expects 16 kHz, your external server has to decode and upsample the audio before passing it on, and that extra step can add latency. You cannot change the native Genesys transmission rate.
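For reference, here is a rough sketch of the conversion step on the external server, in plain Python with no dependencies. It assumes the binary frames carry interleaved two-channel PCMU (one byte per sample, channels alternating), which you should confirm against the media parameters negotiated in your own session. The function names are made up for illustration, and the 2x linear interpolation is a deliberately cheap stand-in for a proper low-pass resampler.

```python
def ulaw_to_pcm16(byte):
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~byte & 0xFF                  # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def split_channels(frame):
    """Split an interleaved 2-channel frame into two byte sequences.

    Assumption: samples alternate channel 0, channel 1, channel 0, ...
    """
    return frame[0::2], frame[1::2]


def upsample_2x(samples):
    """Double the sample rate (8 kHz -> 16 kHz) by linear interpolation."""
    out = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.append(s)
        out.append((s + nxt) // 2)    # midpoint between neighbouring samples
    return out


def frame_to_16k(frame):
    """Decode one interleaved PCMU frame into two 16 kHz PCM sample lists."""
    first, second = split_channels(frame)
    return (
        upsample_2x([ulaw_to_pcm16(b) for b in first]),
        upsample_2x([ulaw_to_pcm16(b) for b in second]),
    )
```

Timing this function on real frames is a quick way to see how much of the end-to-end delay the conversion actually accounts for before tuning anything else.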
The latency is rarely caused by the SIP trunk itself. The AudioHook component sits deep within the media tier. We encountered similar delays during outbound campaign automations that used continuous audio analysis.
The fix was to disable silence suppression on the carrier SIP trunk. If the carrier suppresses silence, the RTP stream stops, causing the WebRTC gateway inside Genesys to buffer and recalculate timestamps, which creates jitter for the downstream AudioHook WebSocket.
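If you want to confirm that jitter is actually reaching your WebSocket endpoint before and after changing the trunk setting, one option is to log the arrival time and byte length of every binary frame and compare wall-clock time against the amount of audio received. The sketch below is only an illustration: the function names are hypothetical, and it assumes 8 kHz PCMU (one byte per sample) on two channels, so adjust the parameters if your session negotiates something else.

```python
def frame_duration_s(n_bytes, channels=2, sample_rate=8000):
    """Seconds of audio in a PCMU frame (one byte per sample per channel)."""
    return n_bytes / (sample_rate * channels)


def arrival_drift(frames, channels=2, sample_rate=8000):
    """Return, per frame, how late it arrived relative to the audio clock.

    frames: list of (arrival_time_s, n_bytes) tuples in arrival order.
    A frame's drift is the wall-clock time since the first frame minus
    the duration of all audio received before it. Values that jump
    around from frame to frame indicate jitter on the stream.
    """
    if not frames:
        return []
    t0 = frames[0][0]
    audio_elapsed = 0.0
    drift = []
    for arrival, n_bytes in frames:
        drift.append((arrival - t0) - audio_elapsed)
        audio_elapsed += frame_duration_s(n_bytes, channels, sample_rate)
    return drift


# Example: three 200 ms stereo frames, the third one arriving 50 ms late.
log = [(10.0, 3200), (10.2, 3200), (10.45, 3200)]
drifts = arrival_drift(log)
```

Comparing the drift values during stretches of silence on the call against stretches of speech should show whether silence suppression is the trigger.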