Designing Generative AI Summarization Guardrails to Prevent Hallucinated After-Call Work (ACW)

StarAdmin · November 28, 2025, 9:00am

What This Guide Covers

You are implementing enterprise-grade guardrails around your Genesys Cloud Copilot or custom LLM-based after-call summarization pipeline to prevent hallucinated summaries from corrupting your CRM records, triggering incorrect case escalations, or creating false compliance audit trails. When complete, every AI-generated ACW summary is validated against the actual conversation transcript before it is written to the CRM. Summaries containing claims the transcript does not support are flagged for agent review rather than auto-saved. The targets are a false ACW rate below 0.5% and agents spending less time correcting AI errors than the AI saves them.

Prerequisites, Roles & Licensing

Genesys Cloud: CX 3 with Agent Copilot; or a custom post-call summarization pipeline using the Recording/Transcript API
LLM: OpenAI GPT-4o, Anthropic Claude 3 Sonnet, or AWS Bedrock (Claude/Titan)
Permissions:
- Recording > Recording > View (to fetch transcripts)
- Conversations > Conversation > View
- Analytics > Conversation Detail > View

The Implementation Deep-Dive

1. The Hallucination Risk Taxonomy for ACW Summaries

Not all hallucinations are equal. In the ACW context, each hallucination type carries a different business impact:

Hallucination Type	Example	Business Impact
Factual fabrication	“Customer confirmed the order was delivered” (order was never mentioned)	HIGH - creates false audit record
Sentiment inversion	“Customer was satisfied” (customer threatened to cancel)	HIGH - masks churn risk
Action item invention	“Agent agreed to issue a $50 credit” (never discussed)	CRITICAL - financial liability
Identity confusion	“John Smith called” (caller was Jane Doe, confirmed by auth)	HIGH - compliance violation
Date/number hallucination	“Case number 12345” (actual case was 54321)	MEDIUM - CRM data quality
Scope exaggeration	“Agent resolved all three issues” (only one was addressed)	MEDIUM - SLA distortion

Your guardrail system must detect and block all HIGH and CRITICAL categories before any summary is auto-saved to the CRM.
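
One way to make the taxonomy actionable is to encode it as data, so routing code looks up a severity instead of hard-coding it. The sketch below is illustrative only: the key names, the `Severity` enum, and the `blocks_auto_save` helper are hypothetical, and the severities are copied from the table above.

```python
from enum import Enum


class Severity(Enum):
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


# Severity per hallucination type, copied from the taxonomy table.
# The key names are illustrative, not a Genesys Cloud or LLM vendor schema.
HALLUCINATION_SEVERITY = {
    "factual_fabrication": Severity.HIGH,
    "sentiment_inversion": Severity.HIGH,
    "action_item_invention": Severity.CRITICAL,
    "identity_confusion": Severity.HIGH,
    "date_number_hallucination": Severity.MEDIUM,
    "scope_exaggeration": Severity.MEDIUM,
}


def blocks_auto_save(detected_types: list[str]) -> bool:
    """Return True if any detected hallucination type is HIGH or CRITICAL."""
    return any(
        HALLUCINATION_SEVERITY[t].value >= Severity.HIGH.value
        for t in detected_types
    )
```

With this mapping, `blocks_auto_save(["scope_exaggeration"])` is False while `blocks_auto_save(["action_item_invention"])` is True, which matches the rule that only HIGH and CRITICAL categories stop an auto-save.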

2. Constrained Summarization Prompt Design

The first guardrail is at the prompt level - structure the LLM prompt to minimize hallucination risk:

CONSTRAINED_SUMMARIZATION_PROMPT = """You are a contact center ACW summarization assistant.

STRICT RULES:
1. ONLY include information explicitly stated in the transcript below.
2. If information is ambiguous or unclear, write "unclear" - do NOT infer.
3. If a field cannot be filled from the transcript, write "not discussed".
4. NEVER add context from outside the transcript.
5. Use exact quotes when capturing commitments (action items).

TRANSCRIPT:
{transcript}

Fill in the following structured summary:
CUSTOMER_ISSUE: [1-2 sentences describing the primary reason for contact]
RESOLUTION_STATUS: [Resolved | Unresolved | Escalated | Transferred]
AGENT_COMMITMENTS: [List any specific commitments made by the agent - exact quotes preferred]
FOLLOW_UP_REQUIRED: [Yes/No - describe if yes]
CUSTOMER_SENTIMENT: [Positive | Neutral | Negative | Escalated]
KEY_REFERENCES: [Case numbers, order IDs, product codes mentioned]

Output JSON only. No explanatory text."""

def generate_structured_summary(transcript: str, model: str = "gpt-4o") -> dict:
    import openai, json
    client = openai.OpenAI()
    
    resp = client.chat.completions.create(
        model=model,
        temperature=0.0,  # Minimizes sampling variation so repeated runs stay consistent
        response_format={"type": "json_object"},
        messages=[
            {"role": "user", "content": CONSTRAINED_SUMMARIZATION_PROMPT.format(transcript=transcript)}
        ]
    )
    
    return json.loads(resp.choices[0].message.content)

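Setting temperature=0.0 minimizes sampling variation, so the same transcript usually produces the same summary, which helps with auditing. It does not guarantee identical output across runs or model versions, so store the generated summary with the audit record rather than relying on regenerating it later.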
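
Because the prompt asks for a fixed set of fields and enumerated values, a cheap shape check can run before any verification pass. The following is a minimal sketch, assuming the model returns the field names exactly as written in the prompt and string values for the enumerated fields; `validate_summary_shape` is a hypothetical helper, not part of any SDK.

```python
# Enumerated values taken from the prompt template above
ALLOWED_VALUES = {
    "RESOLUTION_STATUS": {"Resolved", "Unresolved", "Escalated", "Transferred"},
    "CUSTOMER_SENTIMENT": {"Positive", "Neutral", "Negative", "Escalated"},
}
REQUIRED_FIELDS = [
    "CUSTOMER_ISSUE", "RESOLUTION_STATUS", "AGENT_COMMITMENTS",
    "FOLLOW_UP_REQUIRED", "CUSTOMER_SENTIMENT", "KEY_REFERENCES",
]
# Fallback values the prompt allows for any field
SENTINELS = {"unclear", "not discussed"}


def validate_summary_shape(summary: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape is acceptable."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in summary]
    for field, allowed in ALLOWED_VALUES.items():
        if field not in summary:
            continue  # already reported as missing
        value = summary[field]
        if not isinstance(value, str):
            problems.append(f"non-string value for {field}")
        elif value not in allowed and value not in SENTINELS:
            problems.append(f"unexpected value for {field}: {value!r}")
    return problems
```

A summary that fails this check can be routed straight to agent review without spending a verification call on it.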

3. Post-Generation Fact Verification Pipeline

The core guardrail is a second LLM pass that verifies each claim in the generated summary against the transcript:

FACT_VERIFICATION_PROMPT = """You are a fact-checker for contact center AI summaries.

TRANSCRIPT:
{transcript}

GENERATED SUMMARY CLAIM:
"{claim}"

Does the transcript explicitly support this claim?
Answer with JSON only:
{{
  "supported": true/false,
  "confidence": 0.0-1.0,
  "evidence": "exact quote from transcript that supports/contradicts this claim",
  "verdict": "SUPPORTED | UNSUPPORTED | PARTIALLY_SUPPORTED"
}}"""

def verify_summary_claims(summary: dict, transcript: str) -> dict:
    """
    Verify each field in the summary against the transcript.
    Returns a verification report with per-field verdicts.
    """
    import openai, json
    client = openai.OpenAI()
    
    HIGH_RISK_FIELDS = ["AGENT_COMMITMENTS", "RESOLUTION_STATUS", "KEY_REFERENCES"]
    
    verification_results = {}
    overall_pass = True
    
    for field, value in summary.items():
        if value in ("not discussed", "unclear", "No"):
            verification_results[field] = {"verdict": "SKIPPED", "reason": "Empty or negative field"}
            continue
        
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # Use cheaper model for verification - speed matters
            temperature=0.0,
            response_format={"type": "json_object"},
            messages=[
                {"role": "user", "content": FACT_VERIFICATION_PROMPT.format(
                    transcript=transcript,
                    claim=f"{field}: {value}"
                )}
            ]
        )
        
        result = json.loads(resp.choices[0].message.content)
        verification_results[field] = result
        
        # High-risk fields with UNSUPPORTED verdict trigger a block
        if field in HIGH_RISK_FIELDS and result.get("verdict") == "UNSUPPORTED":
            overall_pass = False
    
    return {
        "passed": overall_pass,
        "fields": verification_results,
        "blockedFields": [
            f for f, r in verification_results.items()
            if r.get("verdict") == "UNSUPPORTED" and f in HIGH_RISK_FIELDS
        ]
    }

The Trap - using the same LLM for both generation and verification: The same model that generates a hallucinated claim often confirms it during verification - the model is self-consistent but wrong. For verification, either use a different model (GPT-4o for generation, Claude Sonnet for verification) or use a deterministic substring search instead of a second LLM call. For critical fields like AGENT_COMMITMENTS, regex-search the transcript for the exact quoted commitment rather than relying on an LLM to verify it.
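
The regex search for quoted commitments could look like the sketch below. It assumes the summarizer wraps commitments in double quotes, as the prompt requests, and treats a field with no quoted span as unverifiable; the function names are hypothetical.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase, straighten curly quotes, and collapse whitespace."""
    text = text.replace("\u201c", '"').replace("\u201d", '"').replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).strip().lower()


def quoted_commitments_present(commitments: str, transcript: str) -> dict:
    """Check that every double-quoted span in the field occurs verbatim in the transcript."""
    normalized_transcript = _normalize(transcript)
    quotes = re.findall(r'"([^"]+)"', _normalize(commitments))
    missing = [q for q in quotes if q.strip() not in normalized_transcript]
    return {
        "quotesFound": len(quotes),
        "missing": missing,
        # No quoted span at all is treated as unverifiable rather than supported
        "verdict": "SUPPORTED" if quotes and not missing else "UNSUPPORTED",
    }
```

Exact matching is strict by design: a paraphrased commitment fails here and falls through to agent review, which is the safer default for a field that can create financial liability.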

4. Deterministic Verification for High-Risk Fields

For AGENT_COMMITMENTS and KEY_REFERENCES, replace LLM verification with deterministic checks:

import re
from difflib import SequenceMatcher

def verify_commitment_in_transcript(commitment: str, transcript: str, fuzzy_threshold: float = 0.8) -> dict:
    """
    Verify an agent commitment claim using fuzzy string matching against the transcript.
    More reliable than a second LLM call for exact factual claims.
    """
    # Normalize both strings the same way so quotes and apostrophes cannot cause mismatches
    def normalize(text: str) -> str:
        return re.sub(r'["\'“”‘’]', '', text).lower().strip()

    words = normalize(commitment).split()
    transcript_words = normalize(transcript).split()

    # An empty claim cannot be verified; without this guard an empty window scores as a match
    if not words:
        return {"verdict": "UNSUPPORTED", "confidence": 0.0,
                "method": "fuzzy_string_match", "threshold": fuzzy_threshold}

    # Sliding window search for the best-matching span with the same word count
    clean_commitment = " ".join(words)
    window_size = len(words)
    best_score = 0.0

    for i in range(len(transcript_words) - window_size + 1):
        window = " ".join(transcript_words[i:i + window_size])
        score = SequenceMatcher(None, clean_commitment, window).ratio()
        if score > best_score:
            best_score = score

    return {
        "verdict": "SUPPORTED" if best_score >= fuzzy_threshold else "UNSUPPORTED",
        "confidence": round(best_score, 3),
        "method": "fuzzy_string_match",
        "threshold": fuzzy_threshold
    }

def extract_and_verify_references(references_field: str, transcript: str) -> dict:
    """
    Extract case numbers/order IDs from the summary and verify they appear in the transcript.
    """
    # Extract numeric patterns from the references field
    claimed_numbers = re.findall(r'\b\d{4,}\b', references_field)
    
    verified = []
    unverified = []
    
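    for num in claimed_numbers:
        # Match the whole number only, so "12345" is not found inside "9123456"
        if re.search(rf'(?<!\d){re.escape(num)}(?!\d)', transcript):
            verified.append(num)
        else:
            unverified.append(num)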
    
    return {
        "verdict": "SUPPORTED" if not unverified else "UNSUPPORTED",
        "verifiedReferences": verified,
        "unverifiedReferences": unverified,
        "method": "exact_substring"
    }

5. Full Pipeline: Generate, Verify, Route

def process_acw_summary(conversation_id: str, access_token: str, base_url: str) -> dict:
    """
    Full ACW pipeline: fetch transcript → generate summary → verify → route.
    """
    # Fetch transcript
    transcript = fetch_conversation_transcript(conversation_id, access_token, base_url)
    
    if not transcript or len(transcript) < 100:
        return {"status": "SKIPPED", "reason": "Transcript too short or unavailable"}
    
    # Generate summary
    summary = generate_structured_summary(transcript)
    
    # Verify high-risk fields deterministically; skip fields the summary left empty
    commitments = summary.get("AGENT_COMMITMENTS") or ""
    if commitments not in ("not discussed", ""):
        commitment_check = verify_commitment_in_transcript(commitments, transcript)
    else:
        commitment_check = {"verdict": "SKIPPED"}
    
    references = summary.get("KEY_REFERENCES") or ""
    if references not in ("not discussed", ""):
        reference_check = extract_and_verify_references(references, transcript)
    else:
        reference_check = {"verdict": "SKIPPED"}
    
    # LLM verification pass over every populated field (a second opinion on the high-risk ones)
    llm_verification = verify_summary_claims(summary, transcript)
    
    # Routing decision
    critical_block = (
        commitment_check.get("verdict") == "UNSUPPORTED" or
        reference_check.get("verdict") == "UNSUPPORTED"
    )
    
    soft_flag = not llm_verification["passed"]
    
    if critical_block:
        status = "BLOCKED"
        action = "Requires agent review - unverified commitment or reference detected"
    elif soft_flag:
        status = "FLAGGED"
        action = "Summary surfaced to agent for confirmation before saving"
    else:
        status = "APPROVED"
        action = "Auto-save to CRM"
    
    result = {
        "conversationId": conversation_id,
        "summary": summary,
        "status": status,
        "action": action,
        "verification": {
            "commitmentCheck": commitment_check,
            "referenceCheck": reference_check,
            "llmVerification": llm_verification
        }
    }
    
    if status == "APPROVED":
        write_summary_to_crm(conversation_id, summary)
    else:
        surface_summary_for_agent_review(conversation_id, summary, result["verification"])
    
    return result
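
The routing rules inside the pipeline can also be pulled out into a pure function, which makes them easy to unit test without calling an LLM or the CRM. This is a sketch of that refactoring under the same three-way outcome; `route_summary` is a hypothetical name.

```python
def route_summary(commitment_verdict: str, reference_verdict: str, llm_passed: bool) -> str:
    """Map verification outcomes to BLOCKED, FLAGGED, or APPROVED."""
    # A failed deterministic check always wins over the LLM verdict
    if "UNSUPPORTED" in (commitment_verdict, reference_verdict):
        return "BLOCKED"
    if not llm_passed:
        return "FLAGGED"
    return "APPROVED"
```

A SKIPPED verdict counts as non-blocking here, mirroring the pipeline above: a summary with no commitments and no references is only held back if the LLM pass fails.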

6. Monitoring: Hallucination Rate Dashboard

Track hallucination rates to measure guardrail effectiveness:

def emit_summary_quality_metrics(result: dict):
    import boto3
    cloudwatch = boto3.client("cloudwatch")
    
    cloudwatch.put_metric_data(
        Namespace="ACWSummarization",
        MetricData=[
            {"MetricName": "SummaryApproved", "Value": 1 if result["status"] == "APPROVED" else 0, "Unit": "Count"},
            {"MetricName": "SummaryBlocked", "Value": 1 if result["status"] == "BLOCKED" else 0, "Unit": "Count"},
            {"MetricName": "SummaryFlagged", "Value": 1 if result["status"] == "FLAGGED" else 0, "Unit": "Count"},
        ]
    )

Target KPIs:

Metric	Target	Alert
Auto-approved rate	> 85%	< 70%
Blocked (critical hallucination)	< 2%	> 5%
Flagged (soft flag)	< 13%	> 20%
Agent override rate (approved then corrected)	< 1%	> 3%
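
To compare daily counts against the alert column, a small evaluator like the one below may help. It is a sketch under stated assumptions: the counts come from whatever store aggregates the metrics, and the override rate is measured against auto-approved summaries only, which the table does not specify.

```python
# Alert thresholds copied from the KPI table above
ALERTS = {
    "autoApproved": ("below", 0.70),
    "blocked": ("above", 0.05),
    "flagged": ("above", 0.20),
    "agentOverride": ("above", 0.03),
}


def evaluate_kpis(approved: int, blocked: int, flagged: int, overridden: int) -> dict:
    """Compute rates from raw counts and list the metrics that breach an alert threshold."""
    total = approved + blocked + flagged
    if total == 0:
        return {"rates": {}, "alerts": []}
    rates = {
        "autoApproved": approved / total,
        "blocked": blocked / total,
        "flagged": flagged / total,
        # Assumption: overrides are measured against auto-approved summaries only
        "agentOverride": overridden / approved if approved else 0.0,
    }
    alerts = [
        name for name, (direction, limit) in ALERTS.items()
        if (rates[name] < limit if direction == "below" else rates[name] > limit)
    ]
    return {"rates": rates, "alerts": alerts}
```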

Validation, Edge Cases & Troubleshooting

Edge Case 1: Transcript Quality Degradation

Low-quality transcripts (heavy accent, noisy environment, crosstalk) produce inaccurate ASR output. The LLM summarizes the inaccurate transcript faithfully - the summary is wrong, but not technically a hallucination. Add a transcript confidence score check before summarization: if the average ASR word confidence is below 0.75, mark the summary as TRANSCRIPT_QUALITY_INSUFFICIENT and route to agent to write manually rather than auto-generating.
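
A confidence gate along these lines could run before summarization. The word-level input shape (a list of dicts with a `confidence` key) is an assumption for illustration; adapt it to whatever your transcript source actually returns.

```python
from statistics import mean

MIN_AVG_WORD_CONFIDENCE = 0.75  # threshold from the paragraph above


def transcript_quality_gate(words: list[dict]) -> dict:
    """words: [{"word": str, "confidence": float}, ...] (hypothetical shape)."""
    confidences = [w["confidence"] for w in words if "confidence" in w]
    if not confidences:
        # No confidence data at all is treated the same as low confidence
        return {"status": "TRANSCRIPT_QUALITY_INSUFFICIENT", "avgConfidence": None}
    avg = mean(confidences)
    status = "OK" if avg >= MIN_AVG_WORD_CONFIDENCE else "TRANSCRIPT_QUALITY_INSUFFICIENT"
    return {"status": status, "avgConfidence": round(avg, 3)}
```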

Edge Case 2: Long Calls Exceeding LLM Context Window

Very long calls can produce transcripts that exceed the LLM context window. GPT-4o's 128K-token window holds roughly 96K words, so most calls fit, but conference calls and complex cases that run for hours may not. For those, implement hierarchical summarization: split the transcript into 15-minute segments, summarize each segment, then summarize the segment summaries into a final ACW. Verify the final summary against the segment summaries rather than the full transcript. The deterministic string checks from section 4 have no context limit and can still run against the full transcript.
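
The segmentation step might be sketched as follows. The utterance shape (`start` in seconds, `speaker`, `text`) is hypothetical, and `summarize` stands in for any callable that takes text and returns a summary, for example a wrapper around the LLM call.

```python
SEGMENT_SECONDS = 15 * 60


def split_into_segments(utterances: list[dict], segment_seconds: int = SEGMENT_SECONDS) -> list[str]:
    """utterances: [{"start": seconds, "speaker": str, "text": str}, ...] (hypothetical shape)."""
    buckets: dict[int, list[str]] = {}
    for u in sorted(utterances, key=lambda item: item["start"]):
        index = int(u["start"] // segment_seconds)
        buckets.setdefault(index, []).append(f'{u["speaker"]}: {u["text"]}')
    return ["\n".join(buckets[i]) for i in sorted(buckets)]


def hierarchical_summary(utterances: list[dict], summarize) -> str:
    """Summarize each segment, then summarize the segment summaries."""
    segment_summaries = [summarize(s) for s in split_into_segments(utterances)]
    if not segment_summaries:
        return ""
    if len(segment_summaries) == 1:
        return segment_summaries[0]  # short call: no second pass needed
    return summarize("\n\n".join(segment_summaries))
```

Splitting on utterance start times keeps each utterance whole, at the cost of segments that vary in length when a call has long silences.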

Edge Case 3: Agent Overriding Blocked Summaries with Incorrect Content

Your guardrail blocks a summary for unverified commitment detection. The agent overrides the block and manually types an incorrect commitment anyway. Implement an audit log of all agent overrides: when an agent saves a summary that was originally BLOCKED, log the override with the agent ID, the blocked reason, and the final saved content. Include overrides in monthly QA sampling to identify agents with high override rates.
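
An override audit record and a per-agent override rate could be sketched as below. The field names and both helpers are hypothetical, and where the records are stored is left to your logging stack.

```python
from collections import Counter
from datetime import datetime, timezone


def build_override_record(agent_id: str, conversation_id: str,
                          blocked_reason: str, final_content: str) -> dict:
    """Capture who overrode a BLOCKED summary, why it was blocked, and what was saved."""
    return {
        "agentId": agent_id,
        "conversationId": conversation_id,
        "originalStatus": "BLOCKED",
        "blockedReason": blocked_reason,
        "finalContent": final_content,
        "loggedAt": datetime.now(timezone.utc).isoformat(),
    }


def override_rates(records: list[dict], handled_per_agent: dict[str, int]) -> dict[str, float]:
    """Overrides divided by conversations handled, per agent, for QA sampling."""
    counts = Counter(r["agentId"] for r in records)
    return {
        agent: counts[agent] / handled
        for agent, handled in handled_per_agent.items()
        if handled  # skip agents with no handled conversations
    }
```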

Edge Case 4: Multi-Party Calls Confusing Speaker Attribution

Conference calls with multiple customer representatives (“I’m going to add my colleague Sarah to the call”) cause the LLM to confuse which person made which statement. If your transcript includes speaker labels (Agent / Customer / External), pass the speaker-labeled transcript to the summarization prompt explicitly and instruct the LLM to attribute commitments to the labeled speaker. For unlabeled multi-party calls, set the AGENT_COMMITMENTS confidence to LOW and always route to agent review.
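
Rendering the speaker labels explicitly, and downgrading confidence when any utterance lacks one, might look like this sketch. The utterance shape and helper names are assumptions; the three labels are the ones named above.

```python
KNOWN_LABELS = {"Agent", "Customer", "External"}


def render_labeled_transcript(utterances: list[dict]) -> tuple[str, bool]:
    """Return (transcript_text, fully_labeled) with one '[Speaker] text' line per utterance."""
    lines = []
    fully_labeled = True
    for u in utterances:
        speaker = u.get("speaker")
        if speaker not in KNOWN_LABELS:
            speaker = "Unknown"
            fully_labeled = False
        lines.append(f"[{speaker}] {u['text']}")
    return "\n".join(lines), fully_labeled


def commitment_confidence(utterances: list[dict]) -> str:
    """LOW when any utterance is unlabeled, so the summary always goes to agent review."""
    _, fully_labeled = render_labeled_transcript(utterances)
    return "NORMAL" if fully_labeled else "LOW"
```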
