Retrieving the Full Voice Conversation Transcript via Genesys Cloud Speech Analytics
What You Will Build
- This tutorial demonstrates how to programmatically retrieve the full text transcript of a voice conversation using the Genesys Cloud Speech and Text Analytics API.
- The solution uses the POST /api/v2/analytics/conversations/details/query endpoint, combined with the Speech Analytics SDK, to extract transcribed segments.
- The implementation is provided in Python using the official genesyscloud SDK.
Prerequisites
- OAuth Client Type: A Genesys Cloud OAuth client (Client ID and Secret) that uses the Client Credentials grant type.
- Required Scopes: The token must include the analytics:conversation:view scope to access conversation details and transcripts.
- SDK Version: genesyscloud Python SDK version 140.0.0 or later.
- Runtime: Python 3.8 or higher.
- Dependencies: Install the SDK via pip:
pip install genesyscloud
Authentication Setup
Genesys Cloud APIs require OAuth 2.0 authentication. The most robust method for server-side integrations is the Client Credentials flow. This flow exchanges your Client ID and Secret for an access token.
The following code initializes the PureCloud platform client. This handles the token acquisition, caching, and automatic refresh logic internally.
from genesyscloud.platform_client import PlatformClient

def get_platform_client(client_id: str, client_secret: str, base_url: str = "https://api.mypurecloud.com"):
    """
    Initializes and returns a configured Genesys Cloud Platform Client.
    """
    # Create the platform client instance
    platform_client = PlatformClient(
        client_id=client_id,
        client_secret=client_secret,
        base_url=base_url
    )
    # Configure the client to use the default OAuth flow
    # The SDK automatically manages token caching and refreshing
    platform_client.init_oauth()
    return platform_client
Critical Note: If you receive a 401 Unauthorized error, verify that your OAuth Client has the analytics:conversation:view scope assigned in the Genesys Cloud Admin Console under Admin > Integration > OAuth Clients.
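To make the Client Credentials exchange described above more concrete, here is a minimal sketch of the HTTP request that such a flow typically sends, built with the standard library only. The function name build_token_request is hypothetical, and the login host https://login.mypurecloud.com is an assumption matching the api.mypurecloud.com region used in this tutorial; other regions use different hosts. The request is constructed but not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(
    client_id: str,
    client_secret: str,
    login_host: str = "https://login.mypurecloud.com",  # assumed region host
) -> urllib.request.Request:
    """Build (but do not send) a Client Credentials token request."""
    # HTTP Basic credentials are base64("client_id:client_secret")
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    # The form-encoded body names the grant type
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        url=f"{login_host}/oauth/token",
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would perform the exchange; in practice the SDK client shown above is expected to do this for you.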
Implementation
Retrieving a transcript is not a single API call. The Genesys Cloud architecture separates the conversation index (metadata) from the conversation details (transcripts, recordings, interactions). To get the transcript, you must:
- Query the Conversation Details API to find the specific conversation ID or retrieve a batch of conversations.
- Extract the transcripts array from the response.
- Process the transcript segments to reconstruct the full text.
Step 1: Constructing the Conversation Details Query
The POST /api/v2/analytics/conversations/details/query endpoint accepts a complex JSON body. This body defines the time range, the conversation type (voice), and the specific data you want returned.
To retrieve transcripts, you must ensure the transcripts field is included in the fields parameter. If you omit this, the API will return metadata but no text.
def build_conversation_query(start_time: str, end_time: str, limit: int = 10):
    """
    Builds the JSON payload for the Conversation Details Query.

    Args:
        start_time: ISO 8601 start time (e.g., "2023-10-01T00:00:00Z")
        end_time: ISO 8601 end time (e.g., "2023-10-02T00:00:00Z")
        limit: Maximum number of conversations to return per page

    Returns:
        dict: The query payload
    """
    query_payload = {
        "interval": f"{start_time}/{end_time}",
        "groupBy": ["conversation"],
        "filter": [
            {
                "type": "equals",
                "path": "type",
                "value": "voice"
            }
        ],
        "aggregations": [],
        "fields": [
            "id",
            "type",
            "startTime",
            "endTime",
            "transcripts",  # Critical: Must include this to get text
            "participants"
        ],
        "size": limit
    }
    return query_payload
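The interval value in the payload above is a start/end pair of ISO 8601 timestamps separated by a slash. As a small illustration, the following sketch derives such a string from a timezone-aware datetime. The helper name make_interval and its hours parameter are hypothetical and not part of any SDK.

```python
from datetime import datetime, timedelta, timezone


def make_interval(end: datetime, hours: int = 24) -> str:
    """Return 'start/end' in ISO 8601 UTC covering the `hours` before `end`."""
    if end.tzinfo is None:
        # Naive datetimes are ambiguous, so refuse them rather than guess
        raise ValueError("end must be timezone-aware")
    end_utc = end.astimezone(timezone.utc)
    start_utc = end_utc - timedelta(hours=hours)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{start_utc.strftime(fmt)}/{end_utc.strftime(fmt)}"


# Example: the 24 hours ending at midnight UTC on 2 October 2023
print(make_interval(datetime(2023, 10, 2, tzinfo=timezone.utc)))
# prints 2023-10-01T00:00:00Z/2023-10-02T00:00:00Z
```

Converting to UTC before formatting keeps the trailing Z accurate even when the caller passes a datetime in another timezone.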
Step 2: Executing the Query and Handling Pagination
The Conversation Details API supports pagination via the nextPageToken. If you are querying a large volume of conversations, you must loop until no nextPageToken is returned.
For this tutorial, we will implement a simple fetch that retrieves only the first page. In production, wrap the conversations_api.post_analytics_conversations_details_query call in a while loop that checks response.next_page_token.
from genesyscloud.platform_client import PlatformClient
from genesyscloud.analytics.api.conversations_api import ConversationsApi

def fetch_conversation_transcripts(platform_client: PlatformClient, start_time: str, end_time: str):
    """
    Fetches conversation details including transcripts.

    Args:
        platform_client: Authenticated PlatformClient instance
        start_time: ISO 8601 start time
        end_time: ISO 8601 end time

    Returns:
        list: A list of conversation objects containing transcripts
    """
    conversations_api = ConversationsApi(platform_client)
    # Build the query body
    query_body = build_conversation_query(start_time, end_time, limit=5)
    try:
        # Execute the query; it is sent as a POST with the JSON body built above
        response = conversations_api.post_analytics_conversations_details_query(
            body=query_body
        )
        # Check if the response contains entities
        if response.entities:
            return response.entities
        else:
            print("No conversations found in the specified time range.")
            return []
    except Exception as e:
        # Handle API errors (400, 401, 429, 500)
        print(f"Error fetching conversations: {e}")
        # Re-raise with the original traceback intact
        raise
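The pagination loop mentioned above can be sketched independently of the SDK. The generator below takes a hypothetical fetch_page callable (an assumption, standing in for whatever wraps the query call) that accepts a page token and returns an object exposing entities and next_page_token, the attribute names used in this tutorial. A max_pages guard is included so a misbehaving token cannot loop forever.

```python
from typing import Any, Callable, Iterator, Optional


def iter_all_entities(
    fetch_page: Callable[[Optional[str]], Any],
    max_pages: int = 100,
) -> Iterator[Any]:
    """Yield entities from every page until no next-page token is returned."""
    token: Optional[str] = None  # None requests the first page
    for _ in range(max_pages):
        response = fetch_page(token)
        # Treat a missing or None entities attribute as an empty page
        yield from (getattr(response, "entities", None) or [])
        token = getattr(response, "next_page_token", None)
        if not token:
            return  # last page reached
```

Because it is a generator, callers can stop early (for example after the first conversation with a transcript) without fetching the remaining pages.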
Step 3: Processing the Transcript Data Structure
The transcripts field in the response is an array of objects. Each object represents a segment of the conversation. Voice transcripts are typically split into segments based on speaker turns or silence gaps.
Each transcript segment contains:
- text: The actual transcribed text.
- speaker: The role of the speaker (e.g., agent, customer, system).
- start: The timestamp of the segment start.
- duration: The length of the segment in milliseconds.
You must iterate through these segments to reconstruct the full narrative.
def extract_full_transcript(conversation_entity: dict) -> str:
    """
    Extracts and concatenates the transcript text from a conversation entity.

    Args:
        conversation_entity: A single conversation object from the API response

    Returns:
        str: The full transcript with speaker labels
    """
    transcripts = conversation_entity.get("transcripts", [])
    if not transcripts:
        return "No transcript available for this conversation."
    full_text_lines = []
    for segment in transcripts:
        speaker = segment.get("speaker", "Unknown")
        text = segment.get("text", "").strip()
        # Skip empty segments
        if not text:
            continue
        # Format: [Speaker]: Text
        full_text_lines.append(f"[{speaker.upper()}]: {text}")
    # Join lines with newlines for readability
    return "\n".join(full_text_lines)
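Because segments are split on speaker turns or silence gaps, one speaker's turn may span several consecutive segments. The sketch below orders segments by start and merges adjacent segments from the same speaker into one turn. It uses the text, speaker, and start fields described above, and it assumes that start values are mutually comparable (for example, numeric millisecond offsets). The function name merge_speaker_turns is hypothetical.

```python
from typing import Dict, List


def merge_speaker_turns(segments: List[Dict]) -> List[Dict]:
    """Order segments by start and merge consecutive same-speaker segments."""
    # Assumption: every "start" value is comparable with the others
    ordered = sorted(segments, key=lambda s: s.get("start", 0))
    turns: List[Dict] = []
    for seg in ordered:
        text = (seg.get("text") or "").strip()
        if not text:
            continue  # skip empty or whitespace-only segments
        speaker = seg.get("speaker") or "unknown"
        if turns and turns[-1]["speaker"] == speaker:
            # Same speaker as the previous turn: extend it
            turns[-1]["text"] += " " + text
        else:
            turns.append({"speaker": speaker, "text": text})
    return turns
```

Each returned turn can then be formatted with the same [SPEAKER]: text convention used by extract_full_transcript.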
Complete Working Example
The following script combines authentication, querying, and processing into a single runnable module. Set the GENESYS_CLIENT_ID and GENESYS_CLIENT_SECRET environment variables to your actual credentials before running it.
import os
import sys
from datetime import datetime, timezone, timedelta

from genesyscloud.platform_client import PlatformClient
from genesyscloud.analytics.api.conversations_api import ConversationsApi

def get_platform_client(client_id: str, client_secret: str):
    """Initializes the Genesys Cloud Platform Client."""
    platform_client = PlatformClient(
        client_id=client_id,
        client_secret=client_secret,
        base_url="https://api.mypurecloud.com"
    )
    platform_client.init_oauth()
    return platform_client

def build_conversation_query(start_time: str, end_time: str):
    """Builds the JSON payload for the Conversation Details Query."""
    return {
        "interval": f"{start_time}/{end_time}",
        "groupBy": ["conversation"],
        "filter": [
            {
                "type": "equals",
                "path": "type",
                "value": "voice"
            }
        ],
        "aggregations": [],
        "fields": [
            "id",
            "type",
            "startTime",
            "endTime",
            "transcripts",
            "participants"
        ],
        "size": 5
    }

def extract_full_transcript(conversation_entity: dict) -> str:
    """Extracts and concatenates the transcript text from a conversation entity."""
    transcripts = conversation_entity.get("transcripts", [])
    if not transcripts:
        return "No transcript available for this conversation."
    full_text_lines = []
    for segment in transcripts:
        speaker = segment.get("speaker", "Unknown")
        text = segment.get("text", "").strip()
        if not text:
            continue
        full_text_lines.append(f"[{speaker.upper()}]: {text}")
    return "\n".join(full_text_lines)
def main():
    # 1. Setup Authentication
    client_id = os.getenv("GENESYS_CLIENT_ID", "YOUR_CLIENT_ID")
    client_secret = os.getenv("GENESYS_CLIENT_SECRET", "YOUR_CLIENT_SECRET")
    # Fail fast if either credential is still the placeholder value
    if client_id == "YOUR_CLIENT_ID" or client_secret == "YOUR_CLIENT_SECRET":
        print("Error: Please set GENESYS_CLIENT_ID and GENESYS_CLIENT_SECRET environment variables.")
        sys.exit(1)
    platform_client = get_platform_client(client_id, client_secret)

    # 2. Define Time Range (Last 24 Hours)
    end_time = datetime.now(timezone.utc)
    start_time = end_time - timedelta(days=1)
    start_iso = start_time.strftime("%Y-%m-%dT%H:%M:%SZ")
    end_iso = end_time.strftime("%Y-%m-%dT%H:%M:%SZ")
    print(f"Querying conversations from {start_iso} to {end_iso}...")

    # 3. Execute Query
    conversations_api = ConversationsApi(platform_client)
    query_body = build_conversation_query(start_iso, end_iso)
    try:
        response = conversations_api.post_analytics_conversations_details_query(body=query_body)
        if not response.entities:
            print("No voice conversations found in the last 24 hours.")
            return
        print(f"Found {len(response.entities)} conversation(s).\n")

        # 4. Process and Print Transcripts
        for conv in response.entities:
            conv_id = conv.get("id", "Unknown ID")
            start = conv.get("startTime", "Unknown Time")
            print(f"--- Conversation ID: {conv_id} (Started: {start}) ---")
            transcript_text = extract_full_transcript(conv)
            print(transcript_text)
            print("-" * 50)
    except Exception as e:
        print(f"Failed to fetch conversations: {e}")
        sys.exit(1)

if __name__ == "__main__":
    main()
Common Errors & Debugging
Error: 401 Unauthorized or 403 Forbidden
Cause:
The OAuth token does not have the required analytics:conversation:view scope, or the Client ID/Secret is incorrect.
Fix:
- Log in to the Genesys Cloud Admin Console.
- Navigate to Admin > Integration > OAuth Clients.
- Select your client.
- Under the Scopes tab, ensure analytics:conversation:view is checked.
- Save the changes. The SDK will automatically use the new scope on the next token refresh.
Error: Transcripts field is missing or empty
Cause:
- The conversation type is not voice. The query filter restricts results to voice, but if you remove the filter, you may get chat or email conversations, which do not have speech transcripts.
- Speech Analytics is not enabled for the organization or the specific queue.
- The conversation is too recent. Speech Analytics processing is asynchronous. It can take several minutes to an hour for transcripts to appear after the conversation ends.
Fix:
- Verify the conversation ended at least 15 minutes ago.
- Check the Speech Analytics settings in Admin to ensure transcription is enabled.
- Confirm the fields array in your query payload explicitly includes "transcripts".
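The first check in the list above (the conversation ended at least 15 minutes ago) can be automated before attempting to read a transcript. The following sketch assumes the endTime value is an ISO 8601 string carrying a Z suffix or an explicit UTC offset. The helper name is_transcript_likely_ready is hypothetical, and the 15-minute default simply mirrors the guidance above rather than any documented guarantee.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_transcript_likely_ready(
    end_time_iso: Optional[str],
    now: Optional[datetime] = None,
    min_age: timedelta = timedelta(minutes=15),
) -> bool:
    """Return True if the conversation ended at least `min_age` ago."""
    if not end_time_iso:
        return False  # no end time: the conversation may still be in progress
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11
    ended = datetime.fromisoformat(end_time_iso.replace("Z", "+00:00"))
    current = now or datetime.now(timezone.utc)
    return current - ended >= min_age
```

Conversations that fail this check can be queued and retried later instead of being reported as missing a transcript.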
Error: 429 Too Many Requests
Cause:
You are hitting the API rate limit. The Conversation Details API has a strict rate limit.
Fix:
Implement exponential backoff. The Genesys Cloud Python SDK includes built-in retry logic, but you can also manually handle it.
import time
import random

def api_call_with_retry(api_func, *args, max_retries=3, **kwargs):
    """Calls api_func, retrying with exponential backoff on 429 responses."""
    for attempt in range(max_retries):
        try:
            return api_func(*args, **kwargs)
        except Exception as e:
            if "429" in str(e):
                # Backoff of 1s, 2s, 4s, ... plus up to 1s of random jitter
                wait_time = (2 ** attempt) + random.uniform(0, 1)
                print(f"Rate limited. Retrying in {wait_time:.2f} seconds...")
                time.sleep(wait_time)
            else:
                # Not a rate-limit error: re-raise with the original traceback
                raise
    raise RuntimeError("Max retries exceeded")
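A common refinement of the backoff above is to honor the standard HTTP Retry-After header when the server supplies one, and to cap the computed delay. The sketch below only computes the delay. How (or whether) the SDK exposes response headers on its exceptions is not covered here, so obtaining the retry_after value is left as an assumption, and the backoff_delay name and its parameters are hypothetical.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Return seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # A numeric Retry-After is a count of seconds; clamp it to [0, cap]
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # HTTP-date or malformed value: fall back to computed backoff
    # Exponential backoff capped at `cap`, plus up to 1 second of jitter
    return min(cap, base * (2 ** attempt)) + random.uniform(0, 1)
```

The jitter spreads out retries from clients that were rate limited at the same moment, which reduces the chance that they all retry in lockstep.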