Retrieving the Full Conversation Transcript (Voice-to-Text) via the Speech and Text Analytics API
What You Will Build
This tutorial demonstrates how to retrieve the complete text transcript of a voice conversation using the Genesys Cloud Speech and Text Analytics API. You will build a Python script that queries for specific conversations, retrieves the associated speech analytics data, and extracts the speaker-separated transcript lines. The solution uses the Genesys Cloud Python SDK (genesys-cloud-purecloud-platform-client) to handle authentication and API calls.
Prerequisites
Before executing the code, ensure you have the following components configured:
- OAuth Client: An OAuth client registered in Genesys Cloud with the following scopes:
  - analytics:conversation:read (to query for conversations)
  - speechanalytics:transcript:read (to retrieve speech analytics transcripts)
- SDK Version: genesys-cloud-purecloud-platform-client version 118.0.0 or higher.
- Runtime: Python 3.8 or higher.
- Dependencies: Install the SDK via pip:
pip install genesys-cloud-purecloud-platform-client
Authentication Setup
The Genesys Cloud Python SDK uses a token manager to handle the OAuth 2.0 client credentials flow automatically. Configure PureCloudPlatformClientV2 with your client ID, client secret, and environment.
The following code initializes the client. In a production environment, store credentials in environment variables or a secure vault rather than hardcoding them.
import os

from purecloud_platform_client import Configuration, PureCloudPlatformClientV2


def initialize_genesys_client() -> PureCloudPlatformClientV2:
    """
    Initializes the Genesys Cloud Platform Client with OAuth credentials.
    """
    # Retrieve credentials from environment variables
    client_id = os.getenv("GENESYS_CLIENT_ID")
    client_secret = os.getenv("GENESYS_CLIENT_SECRET")
    environment = os.getenv("GENESYS_ENVIRONMENT", "us-east-1")  # e.g., us-east-1, eu-west-1, ap-southeast-2

    if not client_id or not client_secret:
        raise ValueError("GENESYS_CLIENT_ID and GENESYS_CLIENT_SECRET environment variables are required.")

    # Create configuration object
    config = Configuration(
        client_id=client_id,
        client_secret=client_secret,
        base_path=f"https://{environment}.mygenesys.com/api/v2"
    )

    # Initialize the platform client
    client = PureCloudPlatformClientV2(config)

    # The SDK automatically handles token acquisition and refresh
    return client
Implementation
Step 1: Query for Conversations
To retrieve a transcript, you first need the conversationId. The Speech Analytics API does not provide a direct “list all transcripts” endpoint without filtering. You must query the Analytics API for conversations that match your criteria (e.g., date range, specific user, or routing queue).
We will use the AnalyticsApi to search for voice conversations. The endpoint post_analytics_conversations_details_query allows for complex filtering.
from typing import Optional

from purecloud_platform_client import AnalyticsApi, PureCloudPlatformClientV2
from purecloud_platform_client.rest import ApiException
from purecloud_platform_client.models import ConversationDetailsQuery


def find_recent_voice_conversation(client: PureCloudPlatformClientV2) -> Optional[str]:
    """
    Queries for the most recent voice conversation.
    Returns the conversation ID or None if no conversation is found.
    """
    analytics_api = AnalyticsApi(client)

    # Define the query body
    query_body = ConversationDetailsQuery(
        view="conversation",
        interval="2023-10-01T00:00:00Z/2023-10-31T23:59:59Z",  # Adjust date range as needed
        entity=[
            {
                "type": "conversation",
                "id": None,  # We want all conversations in this range
                "filter": {
                    "type": "equals",
                    "path": "mediaType",
                    "value": "voice"
                }
            }
        ],
        select=[
            "conversationId",
            "mediaType",
            "startTime",
            "endTime"
        ],
        size=1  # Limit to 1 result for this example
    )

    try:
        # Execute the query
        response = analytics_api.post_analytics_conversations_details_query(body=query_body)
        # An empty or missing entity list means nothing matched the filter
        if response.entities:
            return response.entities[0].conversation_id
        print("No voice conversations found in the specified interval.")
        return None
    except ApiException as e:
        print(f"Exception when calling AnalyticsApi->post_analytics_conversations_details_query: {e}")
        raise
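The interval in the query above is hard-coded. A minimal sketch of building the same start/end ISO 8601 string for a rolling window follows; build_interval is a hypothetical helper written for illustration, not part of the SDK, and it assumes the API accepts UTC timestamps with a trailing Z as in the example.

```python
from datetime import datetime, timedelta, timezone


def build_interval(end: datetime, days: int) -> str:
    """Return an ISO 8601 'start/end' interval covering the `days` before `end`."""
    if days <= 0:
        raise ValueError("days must be positive")
    if end.tzinfo is None:
        raise ValueError("end must be timezone-aware")
    # Normalise to UTC and drop sub-second precision to match the example format
    end_utc = end.astimezone(timezone.utc).replace(microsecond=0)
    start_utc = end_utc - timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{start_utc.strftime(fmt)}/{end_utc.strftime(fmt)}"


if __name__ == "__main__":
    # Interval covering the last 7 days, ending now
    print(build_interval(datetime.now(timezone.utc), days=7))
```

The returned string can be passed as the interval argument in place of the fixed October 2023 range.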
Key Parameters:
- view: Set to "conversation" to retrieve conversation-level data.
- interval: An ISO 8601 time interval. Ensure the start and end times are valid.
- entity.filter: Filters for mediaType equal to "voice" to exclude chat or email.
- size: Limits the number of results returned. For pagination, increase this number or use the nextPage token from the response.
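The token-based pagination mentioned for size can be sketched generically. The code below is an illustration under assumptions, not SDK code: fetch_page is a hypothetical callable that you would write to wrap the query call, taking the previous page token (None for the first page) and returning the page's items together with the next token.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical signature: takes a page token (None for the first page) and
# returns (items, next_token); next_token is None when no pages remain.
FetchPage = Callable[[Optional[str]], Tuple[List[dict], Optional[str]]]


def iter_all_items(fetch_page: FetchPage, max_pages: int = 100) -> Iterator[dict]:
    """Yield items from every page, following tokens until they run out."""
    token: Optional[str] = None
    for _ in range(max_pages):
        items, token = fetch_page(token)
        yield from items
        if token is None:
            return
    # Guard against an endless loop if the server keeps returning tokens
    raise RuntimeError(f"Stopped after {max_pages} pages; more results remain.")


if __name__ == "__main__":
    # Two fake pages standing in for real API responses
    pages = {None: ([{"id": "a"}, {"id": "b"}], "p2"), "p2": ([{"id": "c"}], None)}
    print([item["id"] for item in iter_all_items(lambda t: pages[t])])
```

The max_pages cap is a design choice to keep a misbehaving token from looping forever; raise it for large date ranges.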
Step 2: Retrieve Speech Analytics Transcript
Once you have the conversationId, you can query the Speech Analytics API. The endpoint post_speechanalytics_search is used to retrieve transcripts. You must specify the conversationId in the query body.
The response contains a results array, where each result represents a segment of the transcript. The transcript field within each result contains the text.
from purecloud_platform_client import PureCloudPlatformClientV2, SpeechAnalyticsApi
from purecloud_platform_client.rest import ApiException
from purecloud_platform_client.models import SpeechAnalyticsSearch


def get_transcript_for_conversation(client: PureCloudPlatformClientV2, conversation_id: str) -> list:
    """
    Retrieves the speech analytics transcript for a given conversation ID.
    Returns a list of transcript segments.
    """
    speech_api = SpeechAnalyticsApi(client)

    # Define the search body
    search_body = SpeechAnalyticsSearch(
        entity=[
            {
                "type": "conversation",
                "id": conversation_id
            }
        ],
        view="transcript",  # Critical: Must be 'transcript' to get text
        select=[
            "conversationId",
            "transcript",
            "speaker",
            "startTime",
            "endTime"
        ],
        size=1000  # Maximize results per page
    )

    try:
        # Execute the search
        response = speech_api.post_speechanalytics_search(body=search_body)

        # Extract transcript data, skipping segments that carry no text
        transcript_segments = []
        if response.results:
            for result in response.results:
                if result.transcript:
                    transcript_segments.append({
                        "speaker": result.speaker,
                        "text": result.transcript,
                        "start_time": result.start_time,
                        "end_time": result.end_time
                    })
        return transcript_segments
    except ApiException as e:
        # Handle specific errors
        if e.status == 404:
            print(f"No speech analytics data found for conversation {conversation_id}.")
            return []
        elif e.status == 403:
            print("Access denied. Ensure the OAuth client has 'speechanalytics:transcript:read' scope.")
            return []
        else:
            print(f"Exception when calling SpeechAnalyticsApi->post_speechanalytics_search: {e}")
            raise
Key Parameters:
- view: Must be set to "transcript". Other views (e.g., "summary") do not return full text.
- select: Include "transcript" to get the text. Include "speaker" to identify who spoke.
- entity: Must specify "type": "conversation" and the actual "id".
OAuth Scope Requirement:
The calling OAuth client must have the speechanalytics:transcript:read scope. Without this, the API returns a 403 Forbidden error.
Step 3: Processing and Formatting Results
The raw API response returns transcript segments in chronological order. Each segment includes the speaker identifier (e.g., "agent", "customer", or a specific name if available) and the text.
The following function formats these segments into a readable transcript string.
def format_transcript(transcript_segments: list) -> str:
    """
    Formats a list of transcript segments into a human-readable string.
    """
    if not transcript_segments:
        return "No transcript available."

    lines = []
    for segment in transcript_segments:
        # The key can be present with a None value, so fall back explicitly
        speaker = segment.get("speaker") or "Unknown"
        text = segment.get("text", "")

        # Clean up speaker name for display
        if speaker == "agent":
            speaker_display = "Agent"
        elif speaker == "customer":
            speaker_display = "Customer"
        else:
            # Uppercase only the first letter; str.capitalize() would
            # lowercase the rest of a name such as "John Smith"
            speaker_display = speaker[:1].upper() + speaker[1:]

        lines.append(f"[{speaker_display}]: {text}")
    return "\n".join(lines)
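Long calls often yield several consecutive segments from the same speaker, which makes the formatted output choppy. The sketch below sorts segments defensively and joins adjacent ones from the same speaker before formatting. merge_consecutive_segments is a hypothetical helper, and it assumes every segment dictionary has the speaker, text, start_time, and end_time keys used in this tutorial, with start_time values that can be compared to one another.

```python
from typing import Dict, List


def merge_consecutive_segments(segments: List[Dict]) -> List[Dict]:
    """Sort segments by start_time and join adjacent segments from the same speaker."""
    ordered = sorted(segments, key=lambda s: s["start_time"])
    merged: List[Dict] = []
    for seg in ordered:
        if merged and merged[-1]["speaker"] == seg["speaker"]:
            # Same speaker as the previous entry: extend it
            last = merged[-1]
            last["text"] = f'{last["text"]} {seg["text"]}'
            last["end_time"] = seg["end_time"]
        else:
            # Copy so the caller's dictionaries are not mutated
            merged.append(dict(seg))
    return merged


if __name__ == "__main__":
    demo = [
        {"speaker": "agent", "text": "Hello.", "start_time": 0, "end_time": 1},
        {"speaker": "agent", "text": "How can I help?", "start_time": 1, "end_time": 3},
        {"speaker": "customer", "text": "My order is late.", "start_time": 3, "end_time": 5},
    ]
    for seg in merge_consecutive_segments(demo):
        print(seg["speaker"], "->", seg["text"])
```

The merged list has the same shape as the input, so it can be passed straight to format_transcript.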
Complete Working Example
The following script combines all steps into a single executable module. It initializes the client, finds a recent voice conversation, retrieves its transcript, and prints the formatted output.
import os
import sys
from typing import Optional

from purecloud_platform_client import Configuration, PureCloudPlatformClientV2, AnalyticsApi, SpeechAnalyticsApi
from purecloud_platform_client.rest import ApiException
from purecloud_platform_client.models import ConversationDetailsQuery, SpeechAnalyticsSearch


def initialize_genesys_client() -> PureCloudPlatformClientV2:
    """
    Initializes the Genesys Cloud Platform Client with OAuth credentials.
    """
    client_id = os.getenv("GENESYS_CLIENT_ID")
    client_secret = os.getenv("GENESYS_CLIENT_SECRET")
    environment = os.getenv("GENESYS_ENVIRONMENT", "us-east-1")

    if not client_id or not client_secret:
        raise ValueError("GENESYS_CLIENT_ID and GENESYS_CLIENT_SECRET environment variables are required.")

    config = Configuration(
        client_id=client_id,
        client_secret=client_secret,
        base_path=f"https://{environment}.mygenesys.com/api/v2"
    )
    return PureCloudPlatformClientV2(config)


def find_recent_voice_conversation(client: PureCloudPlatformClientV2) -> Optional[str]:
    """
    Queries for the most recent voice conversation.
    Returns the conversation ID or None if no conversation is found.
    """
    analytics_api = AnalyticsApi(client)
    query_body = ConversationDetailsQuery(
        view="conversation",
        interval="2023-10-01T00:00:00Z/2023-10-31T23:59:59Z",
        entity=[
            {
                "type": "conversation",
                "id": None,
                "filter": {
                    "type": "equals",
                    "path": "mediaType",
                    "value": "voice"
                }
            }
        ],
        select=["conversationId"],
        size=1
    )

    try:
        response = analytics_api.post_analytics_conversations_details_query(body=query_body)
        if response.entities:
            return response.entities[0].conversation_id
        return None
    except ApiException as e:
        print(f"Error querying conversations: {e}")
        raise


def get_transcript_for_conversation(client: PureCloudPlatformClientV2, conversation_id: str) -> list:
    """
    Retrieves the speech analytics transcript for a given conversation ID.
    """
    speech_api = SpeechAnalyticsApi(client)
    search_body = SpeechAnalyticsSearch(
        entity=[
            {
                "type": "conversation",
                "id": conversation_id
            }
        ],
        view="transcript",
        select=["transcript", "speaker", "startTime", "endTime"],
        size=1000
    )

    try:
        response = speech_api.post_speechanalytics_search(body=search_body)
        transcript_segments = []
        if response.results:
            for result in response.results:
                if result.transcript:
                    transcript_segments.append({
                        "speaker": result.speaker,
                        "text": result.transcript,
                        "start_time": result.start_time,
                        "end_time": result.end_time
                    })
        return transcript_segments
    except ApiException as e:
        if e.status == 404:
            print(f"No speech analytics data found for conversation {conversation_id}.")
            return []
        if e.status == 403:
            print("Access denied. Check OAuth scopes.")
            return []
        # Unexpected errors propagate to main() instead of looking like an empty transcript
        print(f"Error retrieving transcript: {e}")
        raise


def format_transcript(transcript_segments: list) -> str:
    """
    Formats transcript segments into a readable string.
    """
    if not transcript_segments:
        return "No transcript available."

    lines = []
    for segment in transcript_segments:
        # The key can be present with a None value, so fall back explicitly
        speaker = segment.get("speaker") or "Unknown"
        text = segment.get("text", "")
        if speaker == "agent":
            speaker_display = "Agent"
        elif speaker == "customer":
            speaker_display = "Customer"
        else:
            # Uppercase only the first letter so names keep their casing
            speaker_display = speaker[:1].upper() + speaker[1:]
        lines.append(f"[{speaker_display}]: {text}")
    return "\n".join(lines)


def main():
    try:
        # 1. Initialize Client
        client = initialize_genesys_client()
        print("Client initialized successfully.")

        # 2. Find a Conversation
        conversation_id = find_recent_voice_conversation(client)
        if not conversation_id:
            print("No recent voice conversations found.")
            return
        print(f"Found conversation ID: {conversation_id}")

        # 3. Retrieve Transcript
        transcript_segments = get_transcript_for_conversation(client, conversation_id)
        if not transcript_segments:
            print("No transcript segments retrieved.")
            return

        # 4. Format and Print
        formatted_transcript = format_transcript(transcript_segments)
        print("\n--- Transcript ---")
        print(formatted_transcript)
        print("--- End Transcript ---")
    except Exception as e:
        print(f"Fatal error: {e}")
        sys.exit(1)


if __name__ == "__main__":
    main()
Common Errors & Debugging
Error: 403 Forbidden on Speech Analytics Search
Cause: The OAuth client used for authentication lacks the speechanalytics:transcript:read scope.
Fix:
- Log in to the Genesys Cloud Admin console.
- Navigate to Developers > OAuth 2.0 Clients.
- Select your client.
- Edit the scopes and add speechanalytics:transcript:read.
- Save the changes. A token issued before the change may not carry the new scope, so request a new token (for example, by restarting the script) before retrying.
Error: 404 Not Found
Cause: The conversationId provided does not exist, or the conversation does not have associated speech analytics data. This often happens if:
- The conversation is too recent (speech analytics processing is asynchronous and may take a few minutes).
- The conversation was excluded from speech analytics by your organization’s policies.
- The media type was not voice (e.g., chat or email).
Fix:
- Verify the conversationId exists via the GET /api/v2/conversations/{conversationId} endpoint.
- Check the Speech Analytics settings in Admin to ensure voice conversations are being processed.
- Add a small delay (e.g., time.sleep(60)) after the conversation ends before querying for the transcript.
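Because processing time varies, a single fixed delay can be either too short or longer than needed. A minimal polling sketch follows; poll_for_transcript is a hypothetical helper, and the attempt count and delay are placeholder values to tune for your organization. The sleep function is injectable so the helper can be tested without waiting.

```python
import time
from typing import Callable, List


def poll_for_transcript(
    fetch: Callable[[], List[dict]],
    attempts: int = 5,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Call `fetch` until it returns a non-empty list or the attempts run out."""
    for attempt in range(1, attempts + 1):
        segments = fetch()
        if segments:
            return segments
        # Wait between attempts, but not after the final one
        if attempt < attempts:
            sleep(delay_seconds)
    return []


if __name__ == "__main__":
    # Fake fetcher: the transcript becomes available on the third call
    responses = iter([[], [], [{"speaker": "agent", "text": "Hello."}]])
    print(poll_for_transcript(lambda: next(responses), delay_seconds=0))
```

With the functions from this tutorial, the fetch argument could be lambda: get_transcript_for_conversation(client, conversation_id), since that function returns an empty list on a 404.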
Error: Empty Transcript Results
Cause: The view parameter in SpeechAnalyticsSearch was not set to "transcript", or the select array did not include "transcript".
Fix:
Ensure the SpeechAnalyticsSearch body explicitly sets:
view="transcript"
select=["transcript", "speaker"]
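A quick pre-flight check can catch both causes before any request is sent. The sketch below is illustrative only: check_transcript_request is a hypothetical helper that inspects the plain view and select values you intend to pass, and it does not touch the SDK.

```python
from typing import Iterable, List


def check_transcript_request(view: str, select: Iterable[str]) -> List[str]:
    """Return a list of problems that would lead to an empty transcript."""
    problems = []
    if view != "transcript":
        problems.append(f'view is "{view}", expected "transcript"')
    if "transcript" not in set(select):
        problems.append('select does not include "transcript"')
    return problems


if __name__ == "__main__":
    # A misconfigured request: wrong view and no transcript field selected
    for problem in check_transcript_request("summary", ["speaker"]):
        print(problem)
```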
Error: Rate Limiting (429 Too Many Requests)
Cause: The API enforces rate limits per OAuth client. Exceeding these limits results in a 429 response.
Fix:
Implement exponential backoff in your code. The Python SDK does not automatically retry 429 errors for all endpoints, so manual handling is required.
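import random
import time

from purecloud_platform_client.rest import ApiException


def retry_on_429(func, *args, **kwargs):
    """Calls func, retrying with exponential backoff and jitter on HTTP 429."""
    max_retries = 3
    for attempt in range(max_retries):
        try:
            return func(*args, **kwargs)
        except ApiException as e:
            # Re-raise anything that is not a rate limit, and give up
            # after the last attempt instead of sleeping again
            if e.status != 429 or attempt == max_retries - 1:
                raise
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            print(f"Rate limited. Waiting {wait_time:.2f} seconds...")
            time.sleep(wait_time)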
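A 429 response may also carry the standard HTTP Retry-After header, which states how long the server wants the client to wait. The sketch below prefers that value when it is present as a number of seconds and otherwise falls back to jittered exponential backoff. compute_wait_seconds is a hypothetical helper, and whether the SDK's exception object exposes response headers as a mapping is an assumption to verify against your SDK version.

```python
import random
from typing import Mapping, Optional


def compute_wait_seconds(
    attempt: int,
    headers: Optional[Mapping[str, str]] = None,
    cap: float = 60.0,
) -> float:
    """Prefer the server's Retry-After value; otherwise use jittered exponential backoff."""
    if headers:
        # Header names are case-insensitive, so normalise before the lookup
        lowered = {str(k).lower(): str(v) for k, v in headers.items()}
        retry_after = lowered.get("retry-after", "").strip()
        # Only the delay-in-seconds form is handled; the HTTP-date form
        # falls through to the backoff calculation below
        if retry_after.isdigit():
            return min(float(retry_after), cap)
    return min((2 ** attempt) + random.uniform(0, 1), cap)


if __name__ == "__main__":
    print(compute_wait_seconds(0, {"Retry-After": "7"}))
    print(compute_wait_seconds(2))
```

The cap keeps a large header value or a high attempt number from stalling the script for longer than intended.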