I am currently troubleshooting a latency issue with our BYOC Cloud external trunks. I have ‘Options Pinging’ enabled for our primary carrier, but I am seeing a high volume of ‘Trunk Out of Service’ alerts in our dev org, even when the carrier says their network is fine. I suspect that the default ping interval is too aggressive and is causing the Edges to mark the trunk as down due to a single dropped packet. What is the recommended ‘Options Ping’ interval and ‘Failure Threshold’ for a stable BYOC Cloud production environment?
Hello Ele68. I am an integration engineer and I have seen these false alerts break our Workato automation. I would not use a ping interval shorter than thirty seconds for a cloud-to-cloud trunk. The public internet has enough jitter and packet loss that a ten-second ping will eventually go unanswered after something as minor as a routing change. I recommend setting your interval to sixty seconds with a failure threshold of three consecutive missed pings. With those values a real outage is detected after roughly three minutes, while a single dropped packet no longer takes the trunk out of service. For me that is a good balance between fast failover and stability.
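To make the trade-off concrete, here is a rough sketch of the arithmetic in Python. It is not a Genesys Cloud API; the function names are made up for illustration, the 1% ping loss rate is an arbitrary example value, and the false-trip estimate assumes each ping is lost independently, which real networks do not guarantee.

```python
def detection_window_seconds(interval_s: int, failure_threshold: int) -> int:
    """Approximate time to mark the trunk down once the carrier stops answering."""
    return interval_s * failure_threshold


def false_trip_probability(ping_loss_rate: float, failure_threshold: int) -> float:
    """Chance that one specific run of `failure_threshold` pings is lost by bad luck alone.

    Assumes independent losses (an illustrative simplification).
    """
    return ping_loss_rate ** failure_threshold


if __name__ == "__main__":
    assumed_loss_rate = 0.01  # hypothetical 1% ping loss, for illustration only
    for interval_s, threshold in [(10, 1), (30, 3), (60, 3)]:
        window = detection_window_seconds(interval_s, threshold)
        p_false = false_trip_probability(assumed_loss_rate, threshold)
        print(
            f"interval={interval_s}s threshold={threshold} "
            f"detect after ~{window}s, false trip chance per run={p_false:.6f}"
        )
```

The point is that raising the threshold from one to three cuts the chance of a false trip far more than it slows detection, which is why I prefer tuning the threshold before shortening the interval.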
Greetings. I build flows for many clients and I always recommend matching the ping settings to the carrier’s own recommendations. Some carriers will actually rate-limit your Edges if you ping them too often! Also, check whether you have ‘SIP INVITE’ pings enabled. Sometimes a carrier will respond to an OPTIONS ping but fail an INVITE ping because of a configuration mismatch on their side.
I deal with the legal discovery for these outages. Ele68, please make sure you are monitoring ‘Edge Health’ as well. If your Edges are overloaded, they may send the OPTIONS pings late, which makes the trunk appear down even when the carrier is fine. I have seen this happen during peak traffic events where Edge CPU was hitting ninety percent.
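If you can export the alert timestamps and Edge CPU samples, a quick cross-check like the one below shows whether the alerts cluster around load spikes. This is only a sketch: the function name, the two-minute window, and the input shapes (a list of alert times and a list of `(time, cpu_percent)` pairs) are my assumptions, not a Genesys Cloud export format. The ninety percent threshold is the figure I mentioned above.

```python
from datetime import datetime, timedelta

CPU_THRESHOLD = 90.0            # percent, the level seen during peak events
WINDOW = timedelta(minutes=2)   # assumed correlation window, adjust to taste


def alerts_during_high_cpu(alert_times, cpu_samples,
                           threshold=CPU_THRESHOLD, window=WINDOW):
    """Return alerts with at least one CPU sample >= threshold within `window`."""
    flagged = []
    for alert in alert_times:
        for sample_time, cpu in cpu_samples:
            if abs(sample_time - alert) <= window and cpu >= threshold:
                flagged.append(alert)
                break  # one matching sample is enough for this alert
    return flagged


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    t0 = datetime(2024, 1, 1, 12, 0, 0)
    alerts = [t0, t0 + timedelta(minutes=30)]
    cpu = [(t0 - timedelta(minutes=1), 93.0),
           (t0 + timedelta(minutes=29), 40.0)]
    print(alerts_during_high_cpu(alerts, cpu))
```

Alerts that show up in the flagged list point toward Edge load, while alerts that do not are more likely a carrier or network problem.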