Implementing Keyword-Triggered Screen Recording Bookmark Generation for Rapid Review
What This Guide Covers
This guide details the architectural implementation of real-time keyword detection within Genesys Cloud CX to automatically generate timestamped bookmarks in desktop screen recordings. The end result is a searchable index of agent interactions in which specific terms create navigation points in the recording, so reviewers can jump straight to the relevant moment instead of scrubbing through the full recording.
Prerequisites, Roles & Licensing
- Licensing Tier: Genesys Cloud CX 1 (for basic Speech Analytics) or CX 2/CX 3 (for advanced custom keywords and real-time streaming). You must have the Workforce Engagement Management (WEM) add-on enabled.
- Permissions:
  - Analytics > Export > View
  - Analytics > Speech Analytics > Manage
  - Administration > Users > Edit
  - Administration > Settings > Edit
- OAuth Scopes: analytics:speechanalytics:read, analytics:speechanalytics:write, analytics:export:read.
- External Dependencies: None strictly required for the core feature, but integration with external case management systems via API requires a valid middleware layer (e.g., MuleSoft, Boomi) if you intend to push bookmarks out of Genesys.
The Implementation Deep-Dive
1. Configuring the Speech Analytics Keyword Engine
The foundation of this architecture is the Speech Analytics (SA) engine. You are not simply searching for text; you are configuring a probabilistic model to identify intent and context. If you configure keywords as simple string matches, you will encounter high false-positive rates, leading to bookmark fatigue.
Step 1.1: Define the Keyword Hierarchy
Navigate to Admin > Speech Analytics > Keywords. You must create a structured hierarchy. Do not dump all keywords into the root.
- Create a new Keyword Group named Operational_Escalations.
- Inside this group, create specific Keywords:
  - Supervisor_Request
  - Refund_Policy_Inquiry
  - System_Error_Report
The Trap: Configuring broad, single-word keywords like “Problem” or “Error.”
The Downstream Effect: The SA engine will flag every instance of “No problem” or “It worked without error,” creating hundreds of irrelevant bookmarks per hour. This causes agents to ignore the bookmarking feature entirely because they perceive it as noise.
The Architectural Fix: Use Phrase Keywords with mandatory exclusions. For Supervisor_Request, configure the phrase “transfer me to a manager” AND exclude phrases containing “just kidding” or “no need.”
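The phrase-plus-exclusion rule can be illustrated outside the product. The following is a minimal Python sketch of the matching logic only, not the Speech Analytics engine itself; the function names `normalize` and `phrase_matches` and the text normalization are assumptions made for illustration.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and drop punctuation so phrases compare cleanly."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def phrase_matches(utterance: str, phrase: str, exclusions: list[str]) -> bool:
    """Return True when the phrase occurs and no exclusion phrase does."""
    # Pad with spaces so only whole-word sequences match.
    spoken = f" {normalize(utterance)} "
    if f" {normalize(phrase)} " not in spoken:
        return False
    return not any(f" {normalize(ex)} " in spoken for ex in exclusions)


exclusions = ["just kidding", "no need"]
print(phrase_matches("Please transfer me to a manager.",
                     "transfer me to a manager", exclusions))
print(phrase_matches("Transfer me to a manager... just kidding!",
                     "transfer me to a manager", exclusions))
```

The first call matches; the second is suppressed by the "just kidding" exclusion, which is the behavior the fix above aims for.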
Step 1.2: Configure Real-Time Streaming
Standard Speech Analytics processes recordings asynchronously. For immediate bookmark generation, you must enable Real-Time Speech Analytics.
- Navigate to Admin > Speech Analytics > Settings.
- Enable Real-Time Speech Analytics.
- Select the Keyword Group (Operational_Escalations) created in Step 1.1.
- Set the Confidence Threshold to 75%.
The Trap: Setting the Confidence Threshold too low (e.g., 40%).
The Downstream Effect: Under high-noise conditions (background office chatter, poor mic quality), the engine will report matches for phrases that were never spoken. Agents will see bookmarks that correspond to nothing in the conversation, destroying trust in the system.
The Architectural Fix: Treat the 75% in the step above as a lower bound; for production environments, start at 80%. Use the Speech Analytics Dashboard to review false positives over a 48-hour period, and lower the threshold only after validating the noise floor of your specific recording environment.
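One way to carry out the 48-hour review is to label each exported detection as a true or false match and compare candidate thresholds offline. The sketch below assumes a hypothetical list of `(confidence, is_true_match)` pairs produced by that manual review; the function name and the sample data are illustrative and not part of any Genesys API.

```python
def false_positive_rate(reviewed, threshold):
    """Share of detections at or above `threshold` that reviewers marked false.

    `reviewed` is a list of (confidence, is_true_match) tuples with
    confidence on a 0-100 scale. Returns None when nothing passes the threshold.
    """
    kept = [is_true for confidence, is_true in reviewed if confidence >= threshold]
    if not kept:
        return None
    return kept.count(False) / len(kept)


# Hypothetical review sample, for illustration only.
sample = [(92, True), (85, True), (81, False), (78, True), (76, False), (60, False)]
for threshold in (80, 75, 70):
    print(threshold, false_positive_rate(sample, threshold))
```

If lowering the threshold raises the false-positive rate without surfacing additional true matches, that is a signal to stay at the higher setting.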
2. Architecting the Bookmark Injection via Architect Flow
While Real-Time SA provides the detection, it does not natively inject visual bookmarks into the desktop recording player without configuration. You must bridge the SA event to the recording metadata. This requires a hybrid approach using Architect for logic and API for metadata injection, or leveraging the native Quality Management integration if your use case is purely compliance.
For dynamic, custom bookmarking, we use the Speech Analytics API to push events to the recording index.
Step 2.1: Create the Architect Flow for Event Handling
You cannot simply “turn on” bookmarking. You must define what happens when a keyword is detected.
- Open Architect.
- Create a new flow: SA_Keyword_Bookmark_Handler.
- Add a Start node. Set the trigger to Speech Analytics Event.
- In the Speech Analytics Event configuration:
  - Select Keyword Match.
  - Choose the Keyword Group: Operational_Escalations.
  - Select Real-Time.
- Add a Set Variables node immediately after.
  - Variable: bookmark_timestamp = {{event.timestamp}}
  - Variable: bookmark_label = {{event.keyword.name}}
  - Variable: recording_id = {{event.recording.id}}
The Trap: Using event.timestamp directly without offset adjustment.
The Downstream Effect: The bookmark will point to the exact second the audio stream was processed by the SA engine, not the moment the agent spoke. Due to network latency and processing buffer, this can be 2–5 seconds off.
The Architectural Fix: Genesys Cloud SA events include a confidence and a duration. Assuming event.timestamp marks the end of the detected phrase, subtract half the duration to land on the phrase's midpoint:
actual_start = event.timestamp - (event.duration / 2)
Use this calculated value as the bookmark anchor.
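A minimal Python sketch of that calculation follows. It assumes `event.timestamp` is expressed in seconds from the start of the recording and `event.duration` is the phrase length in seconds; both units are assumptions, so check the event payload in your own environment. The function name `bookmark_anchor` is hypothetical.

```python
def bookmark_anchor(event_timestamp: float, event_duration: float) -> float:
    """Shift the event timestamp back by half the phrase duration.

    Clamps at 0 so a phrase detected in the first moments of a recording
    never produces a negative offset.
    """
    if event_duration < 0:
        raise ValueError("event_duration must be non-negative")
    return max(0.0, event_timestamp - event_duration / 2)


# A 3-second phrase reported at 12.0 s is anchored 1.5 s earlier.
print(bookmark_anchor(12.0, 3.0))
```

The clamp is a defensive choice: without it, a detection near the start of the recording could yield an anchor the player cannot seek to.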
Step 2.2: Injecting the Bookmark via API
Native Architect does not have an “Add Bookmark” action. You must call the Speech Analytics API to tag the recording segment.
- Add a Make HTTP Request node.
- Configuration:
  - Method: POST
  - URL: https://{{env}}.mypurecloud.com/api/v2/analytics/speechanalytics/recordings/{{recording_id}}/annotations
  - Headers:
    - Authorization: Bearer {{access_token}}
    - Content-Type: application/json
  - Body:
    {
      "type": "keyword",
      "keywordId": "{{event.keyword.id}}",
      "startTime": "{{actual_start}}",
      "endTime": "{{event.timestamp}}",
      "confidence": "{{event.confidence}}",
      "label": "{{bookmark_label}}"
    }
The Trap: Using a Service Account with insufficient scopes for the annotations endpoint.
The Downstream Effect: The API call returns 403 Forbidden, and the bookmark is silently dropped. The flow completes successfully, but no bookmark appears in the UI.
The Architectural Fix: Ensure the Service Account used in the Make HTTP Request node has the scope analytics:speechanalytics:write. Test this endpoint in Postman first using the same credentials.
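Before wiring the call into the flow, it can help to reproduce it in a script so that a 403 surfaces as an error rather than a silent drop. The sketch below uses only the Python standard library. The endpoint path and body fields are taken from this guide as written and have not been verified against the platform API reference; the function names and the `opener` parameter are hypothetical.

```python
import json
import urllib.error
import urllib.request


def build_annotation_request(base_url, recording_id, access_token, annotation):
    """Build the POST request described in Step 2.2 (path taken from this guide)."""
    url = (f"{base_url}/api/v2/analytics/speechanalytics/recordings/"
           f"{recording_id}/annotations")
    return urllib.request.Request(
        url,
        data=json.dumps(annotation).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def post_annotation(request, opener=urllib.request.urlopen):
    """Send the request and raise on 403 instead of dropping the bookmark."""
    try:
        with opener(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        if err.code == 403:
            raise PermissionError(
                "403 Forbidden: check that the token carries "
                "analytics:speechanalytics:write"
            ) from err
        raise
```

Passing the opener as a parameter keeps the function testable without network access; in normal use the default `urllib.request.urlopen` is sufficient.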
3. Configuring the Desktop Player Experience
The bookmark exists in the database, but it must be visible to the agent and supervisor.
Step 3.1: Enable Keyword Overlays in the Desktop App
- Navigate to Admin > Settings > Desktop.
- Locate Speech Analytics.
- Enable Show Keyword Highlights in Recordings.
- Enable Real-Time Keyword Alerts.
The Trap: Enabling all keyword groups simultaneously.
The Downstream Effect: The recording timeline becomes a solid block of color. Agents cannot distinguish between critical escalation keywords and minor courtesy phrases.
The Architectural Fix: Create two distinct Keyword Groups: Critical_Escalations and General_Courtesy. Only enable Critical_Escalations for real-time overlay. Use General_Courtesy for post-call analytics only. This reduces cognitive load during live interactions.
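The split between the two groups amounts to a small routing rule. The following Python sketch expresses it as a lookup table; the table, the route names, and the function are hypothetical illustrations, not Genesys configuration objects.

```python
# Hypothetical routing table: which keyword groups surface during the live
# interaction and which are kept for post-call analytics only.
GROUP_ROUTING = {
    "Critical_Escalations": "realtime_overlay",
    "General_Courtesy": "post_call_only",
}


def shows_in_realtime(keyword_group: str) -> bool:
    """Unknown groups default to post-call only, keeping the timeline quiet."""
    return GROUP_ROUTING.get(keyword_group, "post_call_only") == "realtime_overlay"


print(shows_in_realtime("Critical_Escalations"))
print(shows_in_realtime("General_Courtesy"))
```

Defaulting unknown groups to post-call only means a newly created group cannot flood the live timeline until someone deliberately promotes it.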
Step 3.2: Customizing the Bookmark UI
You can customize how the bookmark appears in the recording player.
- Navigate to Admin > Speech Analytics > Settings.
- Under UI Customization, map the Operational_Escalations group to a distinct color (e.g., Red).
- Set the Bookmark Label Format to {{keyword.name}} - {{confidence}}%.
This ensures that when a supervisor clicks the bookmark, they see not just the term, but the confidence level, allowing them to quickly assess if the match is reliable.
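To make the label template concrete, here is a small Python sketch of what it would render. It assumes confidence arrives as a number on a 0-100 scale and is rounded to a whole percent; both are assumptions, and the function name is hypothetical.

```python
def format_bookmark_label(keyword_name: str, confidence: float) -> str:
    """Render '<keyword> - <confidence>%' with confidence on a 0-100 scale."""
    return f"{keyword_name} - {round(confidence)}%"


print(format_bookmark_label("Supervisor_Request", 82.4))
```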
Validation, Edge Cases & Troubleshooting
Edge Case 1: The “Silent Match” Phenomenon
The Failure Condition: The Architect flow logs a successful API call, but the bookmark does not appear in the desktop recording player.
The Root Cause: The recording has not yet been finalized or indexed. Speech Analytics annotations are applied to the media file, not the live stream. If the recording is still “In Progress,” the annotation queue holds the event until the media is closed.
The Solution: Implement a retry mechanism in the Architect flow. If the API returns 404 Not Found for the recording ID, add a Wait node for 10 seconds and retry up to 3 times. For real-time visibility, rely on the Real-Time Alert toast notification instead of the timeline bookmark for immediate agent feedback.
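The retry logic described above can be sketched in Python as follows. The `send` callable stands in for whatever performs the annotation request and returns an HTTP status code; it and the function name are hypothetical, while the 10-second wait and the limit of 3 retries come from the solution above.

```python
import time


def post_with_retry(send, max_retries=3, wait_seconds=10, sleep=time.sleep):
    """Call `send()` until it stops returning 404 or retries are exhausted.

    `send` is any zero-argument callable returning an HTTP status code.
    Returns the last status code observed.
    """
    status = send()
    retries = 0
    while status == 404 and retries < max_retries:
        sleep(wait_seconds)  # give the recording time to finalize
        retries += 1
        status = send()
    return status
```

Only 404 triggers a retry here. Other failures, such as 403, are returned immediately because waiting will not fix a permissions problem.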
Edge Case 2: Multi-Channel Discrepancy
The Failure Condition: Keywords are detected in voice calls but not in screen-share audio channels.
The Root Cause: Screen share audio is often mixed with system audio at a lower bitrate or different sample rate. The SA engine may fail to transcribe these channels accurately if the Audio Quality settings are not tuned.
The Solution: Navigate to Admin > Telephony > Audio Settings. Ensure that Screen Share Audio is enabled for transcription. Additionally, increase the Microphone Gain threshold for screen share sources in the Desktop App settings to ensure the audio volume meets the SA engine’s minimum decibel threshold for processing.
Edge Case 3: Cross-Session Context Loss
The Failure Condition: An agent says “I will need to transfer you to billing” in one interaction, but the keyword “Transfer” is not triggered because the SA engine resets context between distinct media segments.
The Root Cause: Speech Analytics processes each recording segment independently. If a call is held and resumed, it may be treated as two separate segments.
The Solution: Use Architect Expressions to maintain state across segments. Store the last_keyword_detected in a User Data attribute. If a new segment starts, check this attribute. If a high-confidence keyword was detected in the previous segment, pre-load a contextual bookmark label. This requires a more complex Architect flow that bridges media segment boundaries.
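The state-bridging idea can be sketched in Python independently of Architect. The class below keeps the last high-confidence keyword per conversation and offers a label when a new segment begins. The class name, the 80% bar, and the "Continued:" label prefix are all assumptions for illustration; in the flow itself this state would live in the User Data attribute described above.

```python
from dataclasses import dataclass, field


@dataclass
class SegmentContext:
    """Carries the last high-confidence keyword across media segments."""
    min_confidence: float = 80.0
    last_keyword_detected: dict = field(default_factory=dict)

    def record(self, conversation_id: str, keyword: str, confidence: float) -> None:
        """Remember the keyword only when it clears the confidence bar."""
        if confidence >= self.min_confidence:
            self.last_keyword_detected[conversation_id] = keyword

    def label_for_new_segment(self, conversation_id: str):
        """Return a contextual label for a resumed segment, or None."""
        keyword = self.last_keyword_detected.get(conversation_id)
        return f"Continued: {keyword}" if keyword else None


context = SegmentContext()
context.record("conv-1", "Supervisor_Request", 91.0)
print(context.label_for_new_segment("conv-1"))
```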