Designing Coaching Whisper Bots that Provide Real-Time Guidance to Junior Agents
What This Guide Covers
This guide details the architectural pattern for building real-time whisper bots within Genesys Cloud CX that deliver contextual coaching prompts to junior agents during live interactions. You will configure a Speech Analytics workflow to detect specific trigger phrases, route the context to an Architect flow, and inject a whisper message into the agent’s client without interrupting the customer audio stream.
Prerequisites, Roles & Licensing
- Licensing:
- Genesys Cloud CX 1 or 2: Required for the Speech Analytics workflow engine.
- Genesys Cloud CX 3: Recommended for advanced sentiment analysis and custom NLP models.
- WEM Add-on (Optional but Recommended): Required if you wish to capture the whisper interaction as a coaching event for post-call review and scoring.
- Permissions:
- analytics:speechanalytics:edit (to create and publish Speech Analytics workflows).
- architect:flow:edit (to build the whisper injection logic).
- routing:queue:edit (to assign the bot to agent queues).
- telephony:agent:edit (to manage agent profiles and skills).
- External Dependencies:
- A configured Speech Analytics instance with an active language model (English, Spanish, etc.).
- A Knowledge Base or external API endpoint if the whisper bot needs to fetch dynamic content (e.g., product availability, policy updates).
- Architect flows with permissions to send Agent Whisper events.
The Implementation Deep-Dive
1. Architecting the Speech Analytics Workflow
The core of the whisper bot is not in the UI, but in the asynchronous processing pipeline of Speech Analytics. You are building a listener that runs in parallel with the live call. The goal is low-latency detection. If detection takes longer than 5-8 seconds, the coaching moment has passed, and the whisper becomes noise rather than guidance.
Step 1.1: Define the Trigger Pattern
Do not rely on simple keyword matching. Keyword matching produces a high rate of false positives. Instead, use Natural Language Understanding (NLU) intents or Semantic Phrase Detection.
- Navigate to Admin > Analytics > Speech Analytics.
- Create a new Workflow. Name it Whisper_Bot_Junior_Agent_Coaching.
- Add a Condition block. Select Speech Analytics as the data source.
- Configure the condition to detect a specific Intent or Phrase.
- Example: Detect the intent Customer_Complaint_Pricing.
- Alternative: Detect the phrase "I want to cancel" combined with sentiment Negative > 0.7.
The Trap: Configuring the workflow to trigger on generic words like “problem” or “issue.”
The Downstream Effect: Your junior agents will receive a whisper notification on most calls. This causes “alert fatigue.” Agents will begin to ignore the whisper interface entirely, rendering the coaching bot useless. You must tune the precision of your NLU model to trigger only on high-stakes scenarios (e.g., escalation requests, policy violations, or specific product inquiries).
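To make the difference between a generic keyword and a precise trigger concrete, here is a minimal sketch of the "phrase plus negative sentiment" condition from Step 1.1. The Utterance type and its field names are assumptions for illustration only; they are not the Speech Analytics data model.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical shape of one analyzed customer utterance."""
    text: str
    negative_sentiment: float  # assumed scale: 0.0 (neutral) to 1.0 (strongly negative)


TRIGGER_PHRASE = "i want to cancel"
SENTIMENT_THRESHOLD = 0.7


def should_trigger(utterance: Utterance) -> bool:
    """Fire only when the phrase and strong negative sentiment occur together."""
    has_phrase = TRIGGER_PHRASE in utterance.text.lower()
    return has_phrase and utterance.negative_sentiment > SENTIMENT_THRESHOLD
```

A generic utterance such as "I have a problem" never fires here, however negative the sentiment, which is the behavior you want to reproduce when tuning the real workflow condition.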
Step 1.2: Configure the Real-Time Action
Speech Analytics workflows can trigger real-time actions. You will use this to push context to an Architect flow.
- In the workflow, add an Action block.
- Select Send Webhook or Call Architect Flow.
- Recommendation: Use Call Architect Flow if you need complex logic (e.g., checking if the agent is actually on a call, if they are junior, etc.). Use Send Webhook if you have an external middleware handling the logic.
- Map the payload. You must pass the interactionId, agentId, and the detectedIntent.
Payload Structure for Architect Invocation:
{
"interactionId": "{{interaction.id}}",
"agentId": "{{agent.id}}",
"detectedIntent": "{{speechAnalytics.intent.name}}",
"confidenceScore": "{{speechAnalytics.intent.confidence}}",
"transcriptSnippet": "{{speechAnalytics.transcript.snippet}}"
}
The Trap: Forgetting to include the agentId in the payload.
The Downstream Effect: The Architect flow cannot identify which agent to whisper to. The whisper fails silently, or worse, broadcasts to the entire queue if you default to a broadcast action. Always pass the specific agent identifier.
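If you route through external middleware, a small guard can reject incomplete payloads before any whisper logic runs. The sketch below is an assumption about how such a check might look. The function names are hypothetical, and treating an unresolved "{{...}}" placeholder as missing is a design choice of this example, not documented platform behavior.

```python
REQUIRED_FIELDS = ("interactionId", "agentId", "detectedIntent")


def is_blank(value: object) -> bool:
    """True for missing values and for template placeholders that were never resolved."""
    text = str(value or "").strip()
    return not text or text.startswith("{{")


def missing_fields(payload: dict) -> list:
    """Return the required field names that are absent or blank in the payload."""
    return [name for name in REQUIRED_FIELDS if is_blank(payload.get(name))]
```

When missing_fields returns a non-empty list, log the payload and stop, rather than falling back to any broader delivery action.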
2. Building the Whisper Injection Logic in Architect
Now that the trigger is fired, you need the logic that decides if and how to whisper. This is where you enforce business rules and prevent harassment of the agent.
Step 2.1: The Decision Gate
Create a new Architect Flow named Flow_Whisper_Coaching_Engine.
- Start Trigger: Set to Webhook or Speech Analytics (depending on your Step 1.2 choice).
- Set Variable: Store the incoming payload in a local variable whisperPayload.
- Decision Block: Check if the agent is eligible for coaching.
- Query the Agent Profile using the agentId from the payload.
- Check a custom attribute or skill. For example, ensure the agent has the skill Junior_Staff or the custom attribute Coaching_Level equals 1.
- Reasoning: You do not want to whisper to senior agents or supervisors. They may find it intrusive or redundant.
The Trap: Not checking the agent’s current call state.
The Downstream Effect: If the agent has just hung up or is in a wrap-up, the whisper will fail or appear in the wrong context. Always verify the interaction is ACTIVE and the agent is ON_CALL.
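The decision gate combines two independent checks: is the agent junior, and is the call still live? The sketch below expresses that logic outside Architect. AgentSnapshot is a hypothetical structure and its fields are assumptions; only the Junior_Staff skill, the Coaching_Level value, and the ACTIVE / ON_CALL states come from this guide.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSnapshot:
    """Hypothetical view of the agent at the moment the trigger arrives."""
    skills: set = field(default_factory=set)
    coaching_level: int = 0
    interaction_state: str = "ENDED"
    agent_state: str = "OFFLINE"


def eligible_for_whisper(agent: AgentSnapshot) -> bool:
    """Whisper only to junior agents who are still on an active call."""
    is_junior = "Junior_Staff" in agent.skills or agent.coaching_level == 1
    is_live = agent.interaction_state == "ACTIVE" and agent.agent_state == "ON_CALL"
    return is_junior and is_live
```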
Step 2.2: Content Generation and Throttling
You must prevent the bot from whispering multiple times for the same issue within a single interaction.
- Set Variable: Create a list variable whisperHistory for the current interactionId.
- Check List: Search whisperHistory for the current detectedIntent.
- Decision:
- If the intent is already in the list, End Flow.
- If not, proceed to generate the whisper message.
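The de-duplication steps above can be sketched as a per-interaction set of intents. WhisperHistory is a hypothetical helper, not an Architect construct. It records the intent at check time, which is a simplification of appending to the history only after the whisper is sent.

```python
from collections import defaultdict


class WhisperHistory:
    """Tracks which intents have already produced a whisper, per interaction."""

    def __init__(self) -> None:
        self._seen = defaultdict(set)

    def first_time(self, interaction_id: str, intent: str) -> bool:
        """Return True and record the intent if it has not been whispered yet."""
        if intent in self._seen[interaction_id]:
            return False
        self._seen[interaction_id].add(intent)
        return True
```

The same intent on a different interaction is treated as new, so the history never suppresses coaching across calls.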
The Trap: Sending verbose, paragraph-long whispers.
The Downstream Effect: Agents cannot read a paragraph while listening to a customer. The cognitive load breaks their concentration. The whisper must be a single sentence or a bullet point. Limit the whisper text to <50 characters.
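One way to enforce the length limit is to trim the message before it reaches the whisper step. The following is a minimal sketch under stated assumptions: fit_whisper is a hypothetical helper, and cutting at a word boundary with a trailing "..." is a choice made for this example.

```python
MAX_WHISPER_CHARS = 50  # the limit above is "<50", so results stay strictly below this


def fit_whisper(message: str, limit: int = MAX_WHISPER_CHARS) -> str:
    """Collapse whitespace and trim at a word boundary so the text stays under the limit."""
    text = " ".join(message.split())
    if len(text) < limit:
        return text
    # Reserve room for the ellipsis, then drop the last (possibly partial) word.
    cut = text[: limit - 4].rsplit(" ", 1)[0]
    return cut.rstrip(".,; ") + "..."
```

Trimming is a safety net, not a substitute for writing short messages in the first place; a truncated instruction can be more confusing than none.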
Step 2.3: Sending the Whisper
Use the Agent Whisper block in Architect.
- Drag the Agent Whisper block into the flow.
- Configure the Recipient: Use the agentId from the payload.
- Configure the Message:
- Static: "Verify refund policy for items > 30 days."
- Dynamic: Use a Set Variable block to fetch content from a Knowledge Base article based on the detectedIntent.
- Append to History: Add the detectedIntent to the whisperHistory list.
Code Snippet: Architect Expression for Dynamic Whisper
If you are fetching from a Knowledge Base, you might use an API Call block first. However, for speed, pre-map intents to messages in a Set Variable block.
{
"intent": "Pricing_Complaint",
"whisper": "Offer 10% off if retention fails; escalate >$500."
}
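Pre-mapping intents to messages amounts to a lookup table. Here is a minimal sketch of that idea outside Architect. The Refund_Request intent name is hypothetical, and returning None for an unmapped intent (so the flow skips the whisper) is an assumption of this example.

```python
from typing import Optional

# Static intent-to-message table; every message is kept under 50 characters.
WHISPER_BY_INTENT = {
    "Pricing_Complaint": "Offer 10% discount if retention fails.",
    "Refund_Request": "Verify refund policy for items > 30 days.",  # hypothetical intent name
}


def whisper_for(intent: str) -> Optional[str]:
    """Return the pre-mapped whisper, or None when the intent has no message."""
    return WHISPER_BY_INTENT.get(intent)
```

A table like this avoids a network round trip at whisper time, which matters given the latency budget discussed in section 1.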
The Trap: Using the Broadcast action instead of Agent Whisper.
The Downstream Effect: Broadcasting sends the message to all agents in the queue. This exposes sensitive customer context (if included in the whisper) to other agents, violating privacy policies (HIPAA/GDPR). Always use Agent Whisper for individual coaching.
3. Integrating with WEM for Post-Call Accountability
A whisper bot is only as good as the feedback loop. If you whisper guidance, you must verify if the agent followed it. This requires integration with Workforce Engagement Management (WEM).
Step 3.1: Tagging the Interaction
In the same Architect flow, after sending the whisper, add a Set Interaction Data block.
- Set a custom data field: coaching_whisper_triggered = true.
- Set another field: coaching_topic = {{detectedIntent}}.
This metadata persists on the interaction record.
Step 3.2: Configuring the WEM Scoring Card
- Navigate to WFM > WEM > Scoring Cards.
- Create a new category: Real-Time Coaching Compliance.
- Add a question: "Did the agent follow the whisper guidance?"
- Configure the question to only appear if coaching_whisper_triggered = true.
- Assign this scoring card to the quality profile used for junior agents.
The Trap: Failing to train supervisors on the new scoring criteria.
The Downstream Effect: Supervisors will ignore the whisper context during QA. The data becomes dead weight. You must update your QA rubrics to explicitly reward adherence to real-time guidance.
The Trap: Punishing agents for non-compliance without context.
The Downstream Effect: If the whisper was incorrect or the customer situation was unique, the agent may have rightly ignored it. Ensure the WEM scoring allows for “Agent Override” notes. The whisper is a suggestion, not a command.
Validation, Edge Cases & Troubleshooting
Edge Case 1: Latency-Induced Irrelevance
The Failure Condition: The agent detects the customer’s complaint, resolves it, and the customer is satisfied. Two seconds later, the whisper pops up: "Offer 10% discount." The agent is confused and the moment is lost.
The Root Cause: Speech Analytics processing latency. The audio chunk is sent to the cloud, processed by the NLU model, the workflow evaluates, and the webhook fires. This chain can take 5-10 seconds.
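Because this chain has variable latency, one guard worth considering is to measure how old a detection is when it reaches your logic and drop it if it is stale. The sketch below is an assumption, not a platform feature: the 8-second ceiling is taken from the 5-8 second window in section 1, the 120-second minimum mirrors the short-call rule in this guide, and all timestamps are plain seconds supplied by the caller.

```python
MAX_DETECTION_AGE_S = 8.0    # beyond this, the coaching moment has likely passed
MIN_CALL_DURATION_S = 120.0  # skip whispers on very short calls


def still_relevant(detected_at: float, call_started_at: float, now: float) -> bool:
    """Decide whether a detection is fresh enough, and the call long enough, to whisper."""
    detection_age = now - detected_at
    call_duration = now - call_started_at
    return detection_age <= MAX_DETECTION_AGE_S and call_duration > MIN_CALL_DURATION_S
```

For example, a detection that is 4 seconds old on a five-minute call passes, while the same detection arriving 12 seconds late is suppressed.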
The Solution:
- Reduce Audio Chunk Size: In Speech Analytics settings, reduce the buffer size for real-time analysis. This increases API calls but reduces latency.
- Client-Side Pre-Filtering: Use Genesys Cloud’s Real-Time Analytics API to listen for keywords client-side (if available in your version) or use a lightweight regex filter before sending to the heavy NLU model.
- Graceful Degradation: In the Architect flow, add a decision: "Is the interaction duration > 2 minutes?" If the call is very short, skip the whisper. Short calls are likely simple queries where coaching is less critical.
Edge Case 2: The “Silent” Agent
The Failure Condition: The agent is on mute or speaking very quietly. The Speech Analytics engine detects no speech, so no triggers fire. However, the customer is complaining. The agent receives no coaching.
The Root Cause: Speech Analytics relies on audio input. If the audio quality is poor or the agent is muted, the transcription fails.
The Solution:
- Dual-Stream Analysis: Ensure your Speech Analytics configuration is set to analyze both agent and customer streams. Most importantly, trigger coaching based on Customer speech, not Agent speech. The customer says “I want to cancel,” not the agent.
- Fallback to Screen Pop: If audio analysis fails, use Screen Pop events from your CRM. If the CRM detects a “Cancellation Request” form submission, trigger the whisper via the CRM integration API, bypassing Speech Analytics entirely.
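Triggering on customer speech only can be illustrated with a simple filter over transcript events. TranscriptEvent and its speaker labels are assumptions made for this sketch; check how your own transcript feed identifies the channel.

```python
from dataclasses import dataclass


@dataclass
class TranscriptEvent:
    """Hypothetical transcript fragment tagged with the channel it came from."""
    speaker: str  # assumed labels: "customer" or "agent"
    text: str


def customer_triggers(events: list, phrase: str = "i want to cancel") -> list:
    """Keep only customer-side events that contain the trigger phrase."""
    return [e for e in events if e.speaker == "customer" and phrase in e.text.lower()]
```

With this filter, an agent repeating the phrase back ("So you want to cancel?") does not fire a second trigger, and a muted agent has no effect on detection.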
Edge Case 3: Over-Whispering in Complex Conversations
The Failure Condition: A customer has a complex issue involving billing, shipping, and product defects. The whisper bot triggers for billing, then shipping, then defects. The agent is bombarded with three whispers in 30 seconds.
The Root Cause: Lack of state management in the Architect flow. Each trigger is treated as an independent event.
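The missing state can be small. The sketch below shows two pieces under stated assumptions: a 60-second cooldown keyed on the last whisper time, and a helper that merges several triggers into one message ordered by priority. WhisperThrottle and consolidate are hypothetical names, the priority values are illustrative, and the sketch does not cover timed suppression of lower-priority whispers.

```python
from typing import Optional

PRIORITY = {"Escalation": 3, "Billing": 2, "Chit-Chat": 1}  # higher wins; unknown intents rank 0
COOLDOWN_S = 60.0


class WhisperThrottle:
    """Remembers when the last whisper was sent for one interaction."""

    def __init__(self) -> None:
        self.last_whisper_time: Optional[float] = None

    def allow(self, now: float) -> bool:
        """Permit a whisper only if the cooldown has elapsed, and record the send time."""
        if self.last_whisper_time is not None and now - self.last_whisper_time < COOLDOWN_S:
            return False
        self.last_whisper_time = now
        return True


def consolidate(intents: list) -> Optional[str]:
    """Merge batched triggers into a single whisper, highest priority first."""
    unique = sorted(set(intents), key=lambda i: (-PRIORITY.get(i, 0), i))
    if not unique:
        return None
    return "Issues detected: " + ", ".join(unique) + "."
```

In use, you would collect intents during the batching window, call consolidate once, and send the result only if allow returns True.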
The Solution:
- Cooldown Timer: Implement a global cooldown variable in the Architect flow. Set a variable lastWhisperTime. If currentTime - lastWhisperTime < 60 seconds, suppress the whisper.
- Priority Queue: Assign priorities to intents. Escalation = High, Billing = Medium, Chit-Chat = Low. If a High priority whisper is active, suppress Medium/Low whispers for 30 seconds.
- Batching: Instead of whispering immediately, accumulate triggers for 10 seconds. If multiple triggers occur, send a single consolidated whisper: "Issues detected: Billing, Shipping. Review policy doc #123."