Implementing Custom KPI Calculations using Interaction Detail Records

What This Guide Covers

This guide details the architectural pattern for extracting, normalizing, and computing custom Key Performance Indicators from Genesys Cloud CX Interaction Detail Records. When you complete the implementation, you will have a production-grade data pipeline that reconstructs granular interaction timelines, applies business-specific calculation logic, and outputs aggregated KPI datasets to your analytics layer without triggering platform rate limits or data corruption.

Prerequisites, Roles & Licensing

  • Licensing Tier: Genesys Cloud CX 2 or CX 3. Interaction Detail APIs require an Analytics 1, 2, or 3 tier add-on. Standard CX 1 licenses restrict access to summary-level reporting only.
  • Role Permissions: Analytics > Interaction Detail > Read, Analytics > Reports > Read, Routing > Queues > Read, Routing > Skills > Read (if skill-based routing metrics are included).
  • OAuth Scopes: analytics:interaction:read, analytics:report:read, routing:queue:read.
  • External Dependencies: A compute environment capable of handling asynchronous job queues (AWS Lambda, Azure Functions, or Kubernetes workers), a relational or columnar data store (PostgreSQL, Snowflake, BigQuery), and a scheduled orchestrator (Airflow, Prefect, or native cloud cron).

The Implementation Deep-Dive

1. Architecting the Extraction Pipeline

The Interaction Detail API does not return flat, denormalized rows. It returns discrete state-transition events for every interaction. Your pipeline must request these events in chronological order, reconstruct the timeline, and apply your calculation logic before aggregation. The extraction phase relies on cursor-based pagination combined with time-bound filtering to guarantee exactly-once processing semantics.

Begin by defining your extraction window. The API accepts since and until parameters in ISO 8601 format with UTC timezone designators. You must request data in non-overlapping windows to prevent duplicate processing. A standard production pattern uses 15-minute or 1-hour windows depending on your daily interaction volume. The endpoint requires the view=interactionDetail parameter to return granular event arrays rather than summary metrics.

GET /api/v2/analytics/interactions/detail?view=interactionDetail&since=2024-01-15T00:00:00Z&until=2024-01-15T00:15:00Z&pageSize=1000&cursor=
Host: api.mypurecloud.com
Authorization: Bearer <access_token>
Accept: application/json
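The non-overlapping windows described above can be generated mechanically. The following is a minimal sketch, not part of the original pipeline: the helper name iter_windows is hypothetical, and it assumes the API treats since as inclusive and until as exclusive, which you should confirm against the endpoint's behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple


def _iso(ts: datetime) -> str:
    """Format an aware datetime as ISO 8601 with a UTC 'Z' designator."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def iter_windows(start: datetime, end: datetime, size: timedelta) -> Iterator[Tuple[str, str]]:
    """Yield (since, until) pairs that cover [start, end) with no gaps or overlap.

    Hypothetical helper: each window's until is reused as the next window's
    since, which is only duplicate-free if until is treated as exclusive.
    """
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if size <= timedelta(0):
        raise ValueError("size must be positive")
    lower = start
    while lower < end:
        upper = min(lower + size, end)  # clamp the final window to the end bound
        yield _iso(lower), _iso(upper)
        lower = upper
```

Driving the orchestrator from a generator like this keeps window boundaries in one place, so a change from 15-minute to 1-hour windows is a single argument change.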

The response payload contains an interactions array. Each interaction object includes an id, type (voice, chat, email, callback), and an events array. The events array holds the raw state transitions: wait, talk, hold, transfer, wrapUp, disposition, and system events. You must deserialize this payload and immediately persist the raw JSON to a staging table or object storage bucket before transformation. Never compute against live API responses in memory for production workloads. Memory exhaustion during peak hours will cause pipeline failures and require full window reprocessing.

The Trap: Using since and until as the sole pagination mechanism without tracking the nextPageToken (cursor) causes catastrophic data loss under high volume. The API caps response sizes. If your 15-minute window contains 50,000 interactions, the first request returns a subset and a nextPageToken. If your pipeline ignores the cursor and simply increments since by 15 minutes, you permanently skip all interactions that fall beyond the first page. The downstream effect is silent metric corruption. Your custom KPIs will consistently undercount volume, and handle time averages will skew artificially low because longer, more complex interactions are statistically more likely to be truncated by pagination limits.

Architectural Reasoning: We enforce a cursor-driven extraction loop inside each time window. The orchestrator spawns a worker for the window. The worker fetches page 1, processes the interactions array, extracts the nextPageToken, and loops until nextPageToken is null. Only after the cursor exhausts does the worker commit the window and move to the next time boundary. This guarantees completeness regardless of daily volume spikes.
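The cursor-driven loop can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: fetch_page and persist_page are hypothetical callables you would supply (an HTTP call and a staging-table write, respectively), and the response field names interactions and nextPageToken are taken from this guide's description of the payload.

```python
from typing import Callable, Dict, Optional

# Hypothetical signatures: fetch_page(since, until, cursor) returns one decoded
# response page; persist_page(page) writes the raw page to staging storage.
FetchPage = Callable[[str, str, Optional[str]], Dict]
PersistPage = Callable[[Dict], None]


def extract_window(
    fetch_page: FetchPage,
    persist_page: PersistPage,
    since: str,
    until: str,
    max_pages: int = 10_000,
) -> int:
    """Drain every page of one time window and return the number of pages read.

    The window is only complete when the API stops returning a cursor, so the
    caller should commit the window after this function returns normally.
    """
    cursor: Optional[str] = None
    for pages_read in range(1, max_pages + 1):
        page = fetch_page(since, until, cursor)
        persist_page(page)  # stage raw JSON before any transformation
        cursor = page.get("nextPageToken")
        if not cursor:
            return pages_read
    # Guard against a cursor that never terminates
    raise RuntimeError(f"window {since}..{until} exceeded {max_pages} pages")
```

Passing the fetch and persist steps in as callables keeps the loop testable without network access, and the max_pages guard turns a runaway cursor into an explicit failure.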

2. Normalizing Event Streams and Reconstructing Interaction Timelines

Raw interaction events are not guaranteed to arrive in sequence. The API returns events grouped by interaction ID, but the internal events array may contain out-of-order timestamps caused by system clock drift across media servers or by deferred event generation during transfer bridges. Your transformation layer must sort events by timestamp, filter out noise, and calculate derived durations.

Define a canonical event schema. You will need to extract eventType, timestamp, duration, routing.queue.id, routing.skill.id, and wrapUp.code. For voice interactions, the talk and hold events contain duration fields in seconds. For digital channels, wait and talk events represent agent typing/listening time and customer response windows. You must normalize all durations to a single unit (seconds) and strip non-business time.

Implement a state machine validator. Not all event sequences are valid. A wrapUp event cannot precede a talk event. A transfer event must have a corresponding wait event in the target queue. Your pipeline should flag malformed sequences rather than silently calculating metrics from broken timelines. Use a deterministic sort key: interaction.id + timestamp + eventType priority mapping.

# Tie-break order for events that share a timestamp (illustrative priority mapping)
EVENT_PRIORITY = {"wait": 0, "talk": 1, "hold": 2, "transfer": 3, "wrapUp": 4}

def reconstruct_timeline(events: list) -> dict:
    """Sum per-state durations (in seconds) for one interaction's events."""
    if not events:
        raise ValueError("cannot reconstruct a timeline from an empty event list")

    # Sort by timestamp, then by event priority for deterministic tie-breaking
    sorted_events = sorted(
        events,
        key=lambda e: (e["timestamp"], EVENT_PRIORITY.get(e["eventType"], 99)),
    )

    timeline = {
        "queue_wait": 0,
        "agent_talk": 0,
        "customer_hold": 0,
        "system_hold": 0,
        "wrap_up": 0,
        "transfers": 0,
        "start_time": sorted_events[0]["timestamp"],
        "end_time": sorted_events[-1]["timestamp"]
    }

    for event in sorted_events:
        event_type = event["eventType"]
        # Treat a missing or null duration as zero
        duration = event.get("duration") or 0

        if event_type == "wait":
            timeline["queue_wait"] += duration
        elif event_type == "talk":
            timeline["agent_talk"] += duration
        elif event_type == "hold":
            # Genesys distinguishes hold types via event metadata
            hold_type = event.get("holdType", "customer")
            if hold_type == "system" or event.get("systemInitiated", False):
                timeline["system_hold"] += duration
            else:
                timeline["customer_hold"] += duration
        elif event_type == "wrapUp":
            timeline["wrap_up"] += duration
        elif event_type == "transfer":
            # Transfers are counted, not timed
            timeline["transfers"] += 1

    return timeline
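The state machine validator mentioned earlier in this section can be sketched separately from the duration roll-up. The function below is a hypothetical illustration that checks only the two rules this guide names (no wrapUp before talk, and every transfer followed by a wait); a production validator would likely cover more transitions. It assumes timestamps are uniformly formatted ISO 8601 strings, so that string ordering matches chronological ordering.

```python
from typing import Dict, List


def validate_sequence(events: List[Dict]) -> List[str]:
    """Return a list of rule violations; an empty list means the sequence passed.

    Hypothetical helper covering only two rules:
      1. a wrapUp event must not occur before any talk event
      2. each transfer must be followed by a wait (the target queue)
    """
    problems: List[str] = []
    seen_talk = False
    pending_transfers = 0

    for event in sorted(events, key=lambda e: e["timestamp"]):
        kind = event["eventType"]
        if kind == "talk":
            seen_talk = True
        elif kind == "wrapUp":
            if not seen_talk:
                problems.append("wrapUp before talk")
        elif kind == "transfer":
            pending_transfers += 1
        elif kind == "wait":
            # A wait after a transfer satisfies that transfer
            if pending_transfers:
                pending_transfers -= 1

    if pending_transfers:
        problems.append(f"{pending_transfers} transfer(s) without a following wait")
    return problems
```

Returning the violations instead of raising lets the pipeline route flagged interactions to a quarantine table while the rest of the window continues through aggregation.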

The Trap: Treating hold events as uniform customer holds. Genesys Cloud generates hold events for both customer-initiated holds and system-initiated holds (IVR re-prompting, transfer bridges, conference merging, or media server handoffs). If you include system hold time in your handle time calculation, your custom KPIs will inflate by 15 to 30 percent during peak hours. The downstream effect is unfair agent scoring, incorrect staffing models, and WFM schedule mismatches. Agents receive penalties for time they did not control, and workforce planners overstaff for artificial duration inflation.

Architectural Reasoning: We inspect the holdType field and the systemInitiated boolean flag within the event payload. System holds are excluded from agent-controlled metrics but retained in a separate system_hold column for network/IVR performance tracking. This separation allows business leaders to view pure agent performance while engineering teams monitor platform friction. The calculation logic explicitly subtracts system_hold from total_handle_time before applying custom weighting.

3. Computing Derived Metrics and Aggregating Results

Once timelines are reconstructed, you apply your custom KPI formulas. Standard Genesys reports calculate Handle Time as Talk + Wrap Up. Custom KPIs often require weighted calculations, threshold-based penalties, or multi-step journey conversions. Define your metric schema in your target database before ingestion. A common enterprise pattern is Weighted Effective Handle Time (WEHT), which applies multipliers based on channel complexity, transfer count, and wrap-up disposition.

-- Example aggregation query for custom KPI materialization
CREATE MATERIALIZED VIEW custom_kpi_daily AS
SELECT
    DATE_TRUNC('day', start_time) AS report_date,
    routing_queue_id,
    COUNT(*) AS interaction_volume,
    AVG(
        CASE 
            WHEN transfers >= 2 THEN (agent_talk + wrap_up) * 1.5
            WHEN transfers = 1 THEN (agent_talk + wrap_up) * 1.2
            ELSE (agent_talk + wrap_up)
        END
    ) AS weighted_effective_handle_time,
    SUM(CASE WHEN queue_wait > 120 THEN 1 ELSE 0 END) AS sla_breaches_120s,
    AVG(queue_wait) AS avg_queue_wait,
    -- Multiply by 1.0 to force decimal division; integer division would truncate the rate to 0
    1.0 * COUNT(CASE WHEN wrap_up_code IN ('Issue Resolved', 'Callback Scheduled') THEN 1 END)
        / COUNT(*) AS first_contact_resolution_rate
FROM interaction_timelines
GROUP BY 1, 2;

You must implement idempotent upserts. Data pipelines inevitably retry failed windows. Your aggregation layer must use INSERT ... ON CONFLICT or MERGE statements keyed on interaction.id and processing_window_id. Never use a plain INSERT without a conflict clause in production KPI pipelines. Duplicate inserts will double-count volume and corrupt rolling averages.
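To make the upsert requirement concrete, here is a small sketch using Python's built-in sqlite3 module as a stand-in for the warehouse. The table layout and the function names create_schema and upsert_timeline are assumptions for illustration; PostgreSQL accepts the same ON CONFLICT ... DO UPDATE form, while other warehouses typically use MERGE.

```python
import sqlite3
from typing import Dict


def create_schema(conn: sqlite3.Connection) -> None:
    """Create an illustrative timeline table keyed the way the guide describes."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS interaction_timelines (
            interaction_id       TEXT NOT NULL,
            processing_window_id TEXT NOT NULL,
            agent_talk           INTEGER NOT NULL,
            wrap_up              INTEGER NOT NULL,
            PRIMARY KEY (interaction_id, processing_window_id)
        )
        """
    )


def upsert_timeline(conn: sqlite3.Connection, row: Dict) -> None:
    """Insert a timeline row, or overwrite it if the same key was already loaded."""
    conn.execute(
        """
        INSERT INTO interaction_timelines
            (interaction_id, processing_window_id, agent_talk, wrap_up)
        VALUES (:interaction_id, :processing_window_id, :agent_talk, :wrap_up)
        ON CONFLICT (interaction_id, processing_window_id) DO UPDATE SET
            agent_talk = excluded.agent_talk,
            wrap_up    = excluded.wrap_up
        """,
        row,
    )
```

Replaying a window through upsert_timeline leaves one row per key, so a retried window cannot inflate interaction_volume.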

Implement a metric validation checkpoint. Before publishing to BI tools, run a sanity check against Genesys Cloud’s native summary API. Compare your extracted interaction_volume against the /api/v2/analytics/interactions/summary endpoint for the same window. Allow a 0.5 percent tolerance for event processing lag. If the variance exceeds the threshold, halt the pipeline and trigger a reprocess. This prevents corrupted custom metrics from reaching executive dashboards.
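The tolerance check itself is simple arithmetic. A minimal sketch, assuming you have already obtained both counts (the function name and the zero-reference handling are illustrative choices, not part of the platform):

```python
def volume_within_tolerance(extracted: int, reference: int, tolerance: float = 0.005) -> bool:
    """Return True when the extracted volume is within tolerance of the reference.

    tolerance is a fraction of the reference count (0.005 = 0.5 percent).
    A reference of zero only passes when nothing was extracted either.
    """
    if reference == 0:
        return extracted == 0
    variance = abs(extracted - reference) / reference
    return variance <= tolerance
```

For example, 9,960 extracted interactions against a reference of 10,000 is a 0.4 percent variance and passes, while 9,900 is a 1 percent variance and should halt the pipeline.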

The Trap: Calculating averages before filtering out test or internal interactions. Quality assurance teams, training supervisors, and system health checks generate interactions that route through production queues but lack customer PII or valid dispositions. If your pipeline ingests these records, your custom KPIs will show artificially low handle times and inflated resolution rates. The downstream effect is loss of stakeholder trust. Business leaders will question data accuracy when manual audits reveal discrepancies, and subsequent budget approvals for CX initiatives will be delayed or denied.

Architectural Reasoning: We apply a pre-aggregation filter using wrapUp.code exclusions and routing.queue.id whitelists. Internal queues (e.g., QA_Review, Training_Simulation, System_Health) are excluded at the ingestion layer, not the reporting layer. This reduces compute costs and guarantees that only customer-facing interactions enter the KPI calculation engine. We also validate that agent.id exists and that talk duration exceeds a minimum threshold (e.g., 3 seconds) to discard dropped calls and bot handoffs that lack meaningful agent engagement.
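The ingestion-layer filter can be expressed as a single predicate. This is a sketch under assumptions: the record keys (routing_queue_id, wrap_up_code, agent_id, agent_talk) are hypothetical column names for a flattened timeline row, and the queue whitelist and excluded codes would come from your own configuration.

```python
from typing import Dict, Set


def is_customer_facing(
    record: Dict,
    allowed_queue_ids: Set[str],
    excluded_wrap_up_codes: Set[str],
    min_talk_seconds: float = 3.0,
) -> bool:
    """Return True when a timeline row should enter KPI calculation."""
    # Queue whitelist: anything not explicitly allowed is treated as internal
    if record.get("routing_queue_id") not in allowed_queue_ids:
        return False
    # Wrap-up codes reserved for QA, training, or health checks
    if record.get("wrap_up_code") in excluded_wrap_up_codes:
        return False
    # No agent means no agent engagement to measure
    if not record.get("agent_id"):
        return False
    # Talk time must exceed the minimum threshold
    return (record.get("agent_talk") or 0) > min_talk_seconds
```

Applying the predicate before rows are written to the timeline table keeps excluded interactions out of every downstream aggregate, rather than relying on each report to repeat the filter.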

4. Optimizing for Scale and Avoiding Rate Limiting

The Interaction Detail API enforces strict rate limits. The analytics endpoints share a global quota, typically capping at 100 requests per second across all OAuth applications. High-volume contact centers generating 200,000+ daily interactions will exhaust this quota if extraction windows are too granular or if retry logic is unbounded. You must implement exponential backoff, request coalescing, and predictive window sizing.

Calculate your daily API budget. Divide your total daily interactions by the maximum pageSize (typically 10,000 for detail endpoints, though 1,000 is safer for stability). Multiply by the number of retries you anticipate. If you need 50 requests per hour to cover your volume, space them at roughly 1.2-minute intervals (60 minutes / 50 requests) so the load is spread evenly instead of arriving in bursts, and size the budget above your expected volume to leave a safety margin. Implement a token bucket algorithm in your orchestrator to throttle requests dynamically based on 429 Too Many Requests responses.

import time
import requests

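def fetch_with_backoff(url: str, headers: dict, max_retries: int = 5, timeout: float = 15.0) -> dict:
    """GET one page, retrying on 429 using Retry-After or exponential backoff."""
    for attempt in range(max_retries):
        # A timeout keeps a stalled connection from blocking the worker indefinitely
        response = requests.get(url, headers=headers, timeout=timeout)
        if response.status_code == 200:
            return response.json()
        if response.status_code == 429:
            # Honor Retry-After when it is a whole number of seconds; otherwise back off exponentially
            retry_after = response.headers.get("Retry-After", "")
            delay = int(retry_after) if retry_after.isdigit() else 2 ** attempt
            time.sleep(delay)
        else:
            raise RuntimeError(f"API Error {response.status_code}: {response.text}")
    raise RuntimeError("Max retries exceeded for Interaction Detail extraction")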
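The token bucket mentioned above complements the backoff function: backoff reacts to a 429, while the bucket prevents most of them. Below is a minimal single-threaded sketch; the class and method names are hypothetical, the clock is injectable so the behavior can be tested without sleeping, and a multi-worker deployment would need a lock or a shared store around the same logic.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `capacity` burst requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity      # start full
        self._last = clock()

    def _refill(self) -> None:
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now

    def try_acquire(self, tokens: float = 1.0) -> bool:
        """Take tokens if available; return False when the caller should wait."""
        self._refill()
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False

    def drain(self) -> None:
        """Empty the bucket, e.g. after a 429, so callers pause until it refills."""
        self._refill()
        self._tokens = 0.0
```

A worker would call try_acquire() before each request and sleep briefly when it returns False; calling drain() on a 429 response is one way to make the throttle respond dynamically to platform pushback.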

Implement parallel extraction with queue-aware throttling. Instead of a single worker processing sequential 15-minute windows, deploy a worker pool where each worker handles a distinct routing queue or media type. Voice, chat, and email interactions have different volume profiles. Parallelizing by channel prevents a single high-volume channel from blocking extraction for others. Apply per-worker rate limiters to ensure the aggregate request rate stays below the platform quota.

Cache reference data locally. The routing.queue.id and wrapUp.code fields are foreign keys. Resolving them via the Routing API on every extraction cycle wastes quota and increases latency. Fetch queue metadata, skill assignments, and wrap-up configurations once per hour, cache them in Redis or memory, and join them during transformation. This reduces API calls by 40 to 60 percent in high-configuration environments.
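An in-memory version of the reference-data cache might look like the sketch below. The TTLCache class and its loader callable are assumptions for illustration; in practice the loader would wrap whatever Routing API lookup you use, and Redis would replace the dictionary when several workers need to share entries.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache loader results per key and reload them after ttl_seconds."""

    def __init__(self, loader: Callable[[str], Any], ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]              # still fresh: no API call
        value = self._loader(key)        # missing or expired: reload once
        self._entries[key] = (now, value)
        return value
```

During transformation, a lookup such as queue_cache.get(queue_id) then resolves the foreign key locally, and the loader only runs on the first request for a key or after the hourly expiry.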

The Trap: Using synchronous, blocking API calls inside a monolithic script without connection pooling or timeout controls. Genesys Cloud endpoints will hold open connections during high load, causing thread exhaustion in your compute environment. When the thread pool is exhausted, your pipeline stalls, extraction windows overlap, and duplicate processing is triggered. The downstream effect is cascading infrastructure failure. Your cloud compute costs spike due to idle worker scaling, data freshness degrades by hours, and real-time KPI dashboards display stale or missing data during critical business hours.

Architectural Reasoning: We deploy asynchronous, non-blocking HTTP clients with connection pooling (e.g., aiohttp in Python or axios with a keep-alive http.Agent in Node.js). We set strict timeouts (15 seconds for detail endpoints, 30 seconds for summary validation). We implement dead-letter queues for failed extraction windows. If a window fails after max retries, it is routed to a manual review queue rather than blocking the pipeline. This ensures forward progress and isolates failures to specific time boundaries without corrupting the broader dataset.
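The combination of bounded concurrency, strict timeouts, and a dead-letter path can be sketched with the standard library alone. In the example below, run_windows is a hypothetical orchestration function and extract is a coroutine you would supply (for instance one built on aiohttp); the dead-letter "queue" is just a returned list for illustration.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple

Window = Tuple[str, str]  # (since, until)


async def run_windows(
    windows: List[Window],
    extract: Callable[[Window], Awaitable[None]],
    concurrency: int = 4,
    timeout_s: float = 15.0,
    max_retries: int = 3,
    backoff_s: float = 0.5,
) -> List[Window]:
    """Extract all windows concurrently and return the ones that kept failing."""
    semaphore = asyncio.Semaphore(concurrency)
    dead_letter: List[Window] = []

    async def worker(window: Window) -> None:
        async with semaphore:  # cap the number of in-flight extractions
            for attempt in range(max_retries):
                try:
                    await asyncio.wait_for(extract(window), timeout=timeout_s)
                    return
                except Exception:
                    # Covers timeouts and extraction errors; back off, then retry
                    await asyncio.sleep(backoff_s * 2 ** attempt)
            # Retries exhausted: park the window instead of blocking the others
            dead_letter.append(window)

    await asyncio.gather(*(worker(w) for w in windows))
    return dead_letter
```

Because a failing window only ends up in the returned list, the remaining windows still complete, which is the forward-progress property described above.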

Validation, Edge Cases & Troubleshooting

Edge Case 1: Cursor Drift and Duplicate Records

  • The failure condition: Your pipeline processes the same interaction twice across two consecutive extraction windows, causing double-counted volume and inflated handle time averages.
  • The root cause: The nextPageToken expires or becomes invalid after 24 hours. If your orchestrator pauses between windows and resumes later, the cursor may point to stale data, or the API may return a partial page that overlaps with the next window’s since boundary.
  • The solution: Implement deduplication at the ingestion layer using interaction.id as a primary key with ON CONFLICT DO NOTHING. Track the last successfully processed cursor and until timestamp in a state table. Before starting a new window, query the state table to verify that the previous window’s cursor was fully consumed. If the cursor is missing, trigger a targeted reprocess of only the overlapping boundary rather than the entire day.

Edge Case 2: Timezone Shifts and Cross-Day Interaction Boundaries

  • The failure condition: Interactions that start at 23:55 UTC and wrap up at 00:05 UTC are split across two reporting days, causing incomplete timeline reconstruction and missing wrap-up codes.
  • The root cause: The since and until parameters truncate interactions at the exact timestamp boundary. Genesys Cloud returns interactions that were created within the window, but events that occur after until are excluded from the response. Long wrap-up periods or delayed disposition submissions fall outside the extraction window.
  • The solution: Apply a forward buffer of 30 to 60 minutes to your until parameter during extraction. When reconstructing timelines, filter events strictly by interaction.startTimestamp rather than the extraction window. During aggregation, assign interactions to reporting days using DATE_TRUNC('day', start_time) in your data warehouse, not the extraction window boundary. This guarantees complete event capture regardless of wrap-up duration.
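The buffer and day-assignment rules from Edge Case 2 reduce to three small functions. This is a sketch with hypothetical helper names; it assumes every timestamp has already been parsed into a timezone-aware datetime.

```python
from datetime import date, datetime, timedelta, timezone


def buffered_until(until: datetime, buffer: timedelta = timedelta(minutes=30)) -> datetime:
    """Extend the request's upper bound so late wrap-up events are captured."""
    return until + buffer


def owned_by_window(start_time: datetime, since: datetime, until: datetime) -> bool:
    """An interaction belongs to the window its start falls in, not the buffer."""
    return since <= start_time < until


def reporting_day(start_time: datetime) -> date:
    """Assign the interaction to the UTC day on which it started."""
    return start_time.astimezone(timezone.utc).date()
```

With these rules, an interaction that starts at 23:55 UTC and wraps up at 00:05 UTC is fetched once using the buffered bound, owned by the 23:45 to 00:00 window, and reported on the day it started.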

Edge Case 3: API Throttling Under Peak Volume Spikes

  • The failure condition: Your pipeline hits 429 Too Many Requests repeatedly during morning peak hours, causing extraction delays that push KPI calculations past business reporting deadlines.
  • The root cause: Sudden volume spikes (e.g., campaign launches, outages, holiday shopping) increase interaction creation rates beyond your static window sizing. Fixed 15-minute windows suddenly require 3x the API calls, exhausting your quota.
  • The solution: Implement dynamic window sizing based on real-time interaction velocity. Query the /api/v2/analytics/interactions/summary endpoint with a 5-minute lookback to calculate current interactions-per-minute. Adjust your extraction window size inversely: higher velocity triggers smaller windows with higher parallelism, lower velocity triggers larger windows with reduced parallelism. Maintain a global rate limiter that caps total requests per second across all workers, automatically queuing excess jobs until quota replenishes.
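One way to express the inverse relationship between velocity and window size is to target a fixed number of pages per window. The function below is a rough sketch: its name, the default page size, the pages-per-window target, and the clamping bounds are all assumptions you would tune to your own quota.

```python
def choose_window_minutes(
    interactions_per_minute: float,
    page_size: int = 1000,
    target_pages_per_window: int = 10,
    min_minutes: int = 1,
    max_minutes: int = 60,
) -> int:
    """Pick a window length so each window needs roughly target_pages_per_window pages."""
    if interactions_per_minute <= 0:
        return max_minutes  # idle period: use the largest window
    target_interactions = page_size * target_pages_per_window
    minutes = int(target_interactions / interactions_per_minute)
    # Clamp so spikes never produce a zero-length window and lulls stay bounded
    return max(min_minutes, min(max_minutes, minutes))
```

At 2,000 interactions per minute the defaults yield 5-minute windows; at 100 per minute the result is capped at 60 minutes. The global rate limiter still applies on top of whatever window size is chosen.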
