Running into a weird bug with BYOC Trunk Failover Reporting in Genesys Cloud

Quick question: has anyone seen this weird error with the GET /api/v2/analytics/details endpoint when querying call quality metrics across our 15 BYOC trunks? The data returned for the Singapore region trunks (using Carrier A) shows a 100% success rate over the last 4 hours, but the actual call logs in the WFO dashboard show a 15% drop rate during the same window. This discrepancy is throwing off our SLA reporting.

The environment details are as follows:

  • Genesys Cloud Version: Latest (as of 2024-05-20)
  • BYOC Trunks: 15 SIP trunks, 10 in APAC, 5 in EMEA
  • Carrier A (APAC): Uses standard SIP 200 OK for success, 480 Temporarily Unavailable for failover
  • Carrier B (EMEA): Uses SIP 404 Not Found for immediate rejection

Steps to reproduce the issue:

  1. Set up outbound routing rules with Carrier A as primary and Carrier B as secondary for APAC numbers.
  2. Simulate a batch of 500 outbound calls during peak hours (09:00-10:00 SGT) where Carrier A is experiencing high latency.
  3. Observe that Carrier A returns SIP 408 Request Timeout for ~15% of calls.
  4. Check the GET /api/v2/analytics/details endpoint with view=call_quality and dateFrom/dateTo covering the test window.
  5. The API returns call_outbound_total as 500, but call_outbound_failed as 0.
  6. Compare this with the WFO dashboard, which correctly shows 75 failed calls.
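
For anyone who wants to sanity-check the numbers, here is a minimal Python sketch of the discrepancy using only the figures from the steps above. The function and variable names are made up for illustration and are not part of any Genesys SDK.

```python
def failure_rate(total: int, failed: int) -> float:
    """Return failed/total as a fraction; 0.0 when there are no calls."""
    return failed / total if total else 0.0


# Figures taken from the reproduction steps (hypothetical variable names).
api_total, api_failed = 500, 0   # call_outbound_total / call_outbound_failed from the API
dashboard_failed = 75            # failed calls shown in the WFO dashboard

api_rate = failure_rate(api_total, api_failed)
dashboard_rate = failure_rate(api_total, dashboard_failed)
missing_failures = dashboard_failed - api_failed

print(f"API failure rate:          {api_rate:.1%}")
print(f"Dashboard failure rate:    {dashboard_rate:.1%}")
print(f"Failures missing from API: {missing_failures}")
```

The 75 failures missing from the API are exactly the ~15% of calls that got a SIP 408 from Carrier A.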

The failover logic in Architect seems to be working correctly, as the calls are rerouted to Carrier B and completing successfully. However, the analytics API is not counting the initial SIP 408 responses as failures. This is critical for our carrier billing reconciliation and performance tracking.
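
To illustrate what I think is happening, here is a rough Python sketch of the two ways a failed-over call could be counted. This is only my guess at the behavior, not how Genesys actually aggregates the data. The `Attempt` record and `count_failures` helper are hypothetical, and the sketch assumes the attempts for each call are listed in chronological order.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One outbound attempt (leg) of a call; hypothetical record shape."""
    call_id: str
    carrier: str
    sip_code: int


def count_failures(attempts):
    """Count failures by final outcome and by initial leg.

    Assumes attempts for each call appear in chronological order.
    """
    legs_by_call = {}
    for attempt in attempts:
        legs_by_call.setdefault(attempt.call_id, []).append(attempt)

    # A call "finally failed" if its last leg did not end in 200 OK.
    final_failed = sum(1 for legs in legs_by_call.values() if legs[-1].sip_code != 200)
    # A call had an "initial leg failure" if its first leg did not end in 200 OK.
    initial_leg_failed = sum(1 for legs in legs_by_call.values() if legs[0].sip_code != 200)

    return {
        "calls": len(legs_by_call),
        "final_failed": final_failed,
        "initial_leg_failed": initial_leg_failed,
    }


sample = [
    Attempt("c1", "Carrier A", 200),
    Attempt("c2", "Carrier A", 408),  # timed out on the primary carrier
    Attempt("c2", "Carrier B", 200),  # completed after failover
    Attempt("c3", "Carrier A", 200),
]
result = count_failures(sample)
print(result)
```

If the API only reports the final outcome, call c2 counts as a success, which would explain call_outbound_failed coming back as 0. For carrier reconciliation we need the initial-leg count.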

Has anyone else seen this behavior with SIP 408 responses in the analytics API? Is there a known delay in data aggregation for BYOC trunk metrics? I have verified the trunk credentials and SIP registration status, and everything looks healthy. The issue seems specific to how the API interprets transient SIP errors versus final failure codes.

Try filtering by routing_status instead of just answer_status. Sometimes the analytics API misses the failover handoff if the initial leg fails. Also check whether your JMeter load test is hitting the same trunks; high throughput can cause temporary reporting gaps in eu-west.