Retrieving the Full Conversation Transcript (Voice-to-Text) via the Speech and Text Analytics API

StarAdmin · April 3, 2026, 9:00am

What You Will Build

This tutorial shows how to retrieve the complete text transcript of a voice conversation using the Genesys Cloud Speech and Text Analytics API. You will build a Python script that queries for specific conversations, retrieves the associated speech analytics data, and extracts the speaker-separated transcript lines. The solution uses the Genesys Cloud Python SDK (published on PyPI as PureCloudPlatformClientV2) to handle authentication and API calls.

Prerequisites

Before running the code, make sure the following are in place:

OAuth Client: A Client Credentials OAuth client registered in Genesys Cloud. Client Credentials clients are authorized through the roles assigned to them rather than through scopes, so the assigned role needs these permissions:
- analytics:conversationDetail:view (to query for conversations)
- speechAndTextAnalytics:data:view and recording:recording:view (to retrieve transcripts)
SDK Version: PureCloudPlatformClientV2 version 118.0.0 or higher.
Runtime: Python 3.8 or higher.

Dependencies: Install the SDK via pip:

pip install PureCloudPlatformClientV2

Authentication Setup

The Genesys Cloud Python SDK implements the OAuth 2.0 client credentials flow for you: ApiClient.get_client_credentials_token exchanges the client ID and secret for an access token and stores it on the client. Before requesting the token, point the SDK at the API host for your region.

The following code initializes the client. In a production environment, store credentials in environment variables or a secure vault rather than hardcoding them.

import os

import PureCloudPlatformClientV2
from PureCloudPlatformClientV2.api_client import ApiClient

def initialize_genesys_client() -> ApiClient:
    """
    Returns an SDK ApiClient authenticated with OAuth client credentials.
    """
    # Retrieve credentials from environment variables
    client_id = os.getenv("GENESYS_CLIENT_ID")
    client_secret = os.getenv("GENESYS_CLIENT_SECRET")
    # Region domain, e.g. mypurecloud.com (us-east-1), mypurecloud.ie (eu-west-1),
    # mypurecloud.com.au (ap-southeast-2)
    environment = os.getenv("GENESYS_ENVIRONMENT", "mypurecloud.com")

    if not client_id or not client_secret:
        raise ValueError("GENESYS_CLIENT_ID and GENESYS_CLIENT_SECRET environment variables are required.")

    # The API host must be set before the token is requested
    PureCloudPlatformClientV2.configuration.host = f"https://api.{environment}"

    # Request an access token; the call returns the authenticated client
    return ApiClient().get_client_credentials_token(client_id, client_secret)
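
The API host differs by region and is easy to mistype. As a rough illustration, the lookup below resolves an AWS region name to its API host. Note that api_host_for_region and API_HOSTS are hypothetical names for this sketch (they are not part of the SDK), and the table lists only a few regions, so check the Genesys Cloud region list for yours.

```python
# Hypothetical helper, not part of the SDK. The table is a partial list of regions.
API_HOSTS = {
    "us-east-1": "https://api.mypurecloud.com",
    "us-west-2": "https://api.usw2.pure.cloud",
    "eu-west-1": "https://api.mypurecloud.ie",
    "eu-central-1": "https://api.mypurecloud.de",
    "ap-southeast-2": "https://api.mypurecloud.com.au",
    "ap-northeast-1": "https://api.mypurecloud.jp",
}


def api_host_for_region(region: str) -> str:
    """Returns the API host for an AWS region name, or raises ValueError."""
    key = region.strip().lower()
    try:
        return API_HOSTS[key]
    except KeyError:
        raise ValueError(
            f"Unknown region {region!r}; add its API host to API_HOSTS"
        ) from None
```

Failing loudly on an unknown region is deliberate: a mistyped region would otherwise surface later as a confusing connection or authentication error.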

Implementation

Step 1: Query for Conversations

To retrieve a transcript, you first need the conversationId. The Speech and Text Analytics API has no "list all transcripts" endpoint, so you must first query the Analytics API for conversations that match your criteria (e.g., date range, specific user, or routing queue).

We will use the AnalyticsApi to search for voice conversations. The post_analytics_conversations_details_query method (POST /api/v2/analytics/conversations/details/query) accepts an interval plus filters, ordering, and paging.

from typing import Optional

import PureCloudPlatformClientV2
from PureCloudPlatformClientV2.api_client import ApiClient
from PureCloudPlatformClientV2.rest import ApiException

def find_recent_voice_conversation(client: ApiClient) -> Optional[str]:
    """
    Queries for the most recent voice conversation.
    Returns the conversation ID or None if no conversation is found.
    """
    analytics_api = PureCloudPlatformClientV2.AnalyticsApi(client)

    # Request body in the API's JSON shape; the SDK serializes plain dicts as-is
    query_body = {
        "interval": "2023-10-01T00:00:00Z/2023-10-31T23:59:59Z",  # Adjust date range as needed
        "order": "desc",
        "orderBy": "conversationStart",
        "paging": {"pageSize": 1, "pageNumber": 1},  # Limit to 1 result for this example
        "segmentFilters": [
            {
                "type": "and",
                "predicates": [
                    {
                        "type": "dimension",
                        "dimension": "mediaType",
                        "operator": "matches",
                        "value": "voice"
                    }
                ]
            }
        ]
    }

    try:
        # Execute the query
        response = analytics_api.post_analytics_conversations_details_query(query_body)

        if response.conversations:
            return response.conversations[0].conversation_id
        print("No voice conversations found in the specified interval.")
        return None
    except ApiException as e:
        print(f"Exception when calling AnalyticsApi->post_analytics_conversations_details_query: {e}")
        raise

Key Parameters:

interval: An ISO 8601 interval in start/end form. Ensure the start and end times are valid and the start precedes the end.
segmentFilters: Restricts results to conversations with a segment whose mediaType matches "voice", which excludes chat and email.
order and orderBy: Sorting descending by conversationStart puts the most recent conversation first.
paging: pageSize limits the number of results per page. To read further results, increment pageNumber.
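
A hardcoded interval goes stale quickly. The sketch below builds the request body for a rolling window instead, assuming the JSON body shape of POST /api/v2/analytics/conversations/details/query (interval, order, orderBy, paging, segmentFilters). The function name build_voice_query is hypothetical, chosen only for this illustration.

```python
from datetime import datetime, timedelta, timezone


def build_voice_query(end: datetime, days: int,
                      page_number: int = 1, page_size: int = 25) -> dict:
    """Builds a details-query body for voice conversations in the
    `days` days leading up to `end` (hypothetical helper)."""
    if end.tzinfo is None:
        raise ValueError("end must be timezone-aware")
    end_utc = end.astimezone(timezone.utc)
    start_utc = end_utc - timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # ISO 8601 in UTC
    return {
        "interval": f"{start_utc.strftime(fmt)}/{end_utc.strftime(fmt)}",
        "order": "desc",
        "orderBy": "conversationStart",
        "paging": {"pageSize": page_size, "pageNumber": page_number},
        "segmentFilters": [
            {
                "type": "and",
                "predicates": [
                    {
                        "type": "dimension",
                        "dimension": "mediaType",
                        "operator": "matches",
                        "value": "voice",
                    }
                ],
            }
        ],
    }
```

Calling build_voice_query(datetime.now(timezone.utc), days=7) covers the last week; passing page_number=2, 3, and so on walks through further pages of the same window.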

Step 2: Retrieve Speech Analytics Transcript

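Once you have the conversationId, you can request its transcript. Transcripts are stored per communication, so the request is made for a conversationId and communicationId pair using get_speechandtextanalytics_conversation_communication_transcripturl on the SpeechTextAnalyticsApi. The communication IDs used here are the sessionId values of the conversation's voice sessions.

The response does not contain the text itself. It contains a temporary pre-signed URL; downloading it yields a JSON document whose transcripts array holds the phrases, each with its text and the purpose of the participant who spoke.

import json
import urllib.request

import PureCloudPlatformClientV2
from PureCloudPlatformClientV2.api_client import ApiClient
from PureCloudPlatformClientV2.rest import ApiException

# The transcript JSON labels each phrase with the speaker's participant purpose
SPEAKER_BY_PURPOSE = {"internal": "agent", "external": "customer"}

def get_transcript_for_conversation(client: ApiClient, conversation_id: str) -> list:
    """
    Retrieves the transcript for a given conversation ID.
    Returns a list of transcript segments sorted by start time.
    """
    conversations_api = PureCloudPlatformClientV2.ConversationsApi(client)
    speech_api = PureCloudPlatformClientV2.SpeechTextAnalyticsApi(client)
    transcript_segments = []

    try:
        # Each voice session is a communication that may have its own transcript
        details = conversations_api.get_analytics_conversation_details(conversation_id)
        communication_ids = [
            session.session_id
            for participant in (details.participants or [])
            for session in (participant.sessions or [])
            if session.media_type == "voice"
        ]

        for communication_id in communication_ids:
            try:
                transcript_url = speech_api.get_speechandtextanalytics_conversation_communication_transcripturl(
                    conversation_id, communication_id
                )
            except ApiException as e:
                if e.status == 404:
                    continue  # No transcript for this session
                raise

            # The URL is pre-signed; download the transcript JSON from it
            with urllib.request.urlopen(transcript_url.url) as resp:
                payload = json.load(resp)

            # Field names follow the transcript JSON; confirm them against a downloaded payload
            for transcript in payload.get("transcripts") or []:
                for phrase in transcript.get("phrases") or []:
                    if not phrase.get("text"):
                        continue
                    start_ms = phrase.get("startTimeMs")
                    duration_ms = (phrase.get("duration") or {}).get("milliseconds")
                    has_end = start_ms is not None and duration_ms is not None
                    purpose = phrase.get("participantPurpose")
                    transcript_segments.append({
                        "speaker": SPEAKER_BY_PURPOSE.get(purpose, purpose or "unknown"),
                        "text": phrase["text"],
                        "start_time": start_ms,
                        "end_time": start_ms + duration_ms if has_end else None
                    })
    except ApiException as e:
        # Handle specific errors
        if e.status == 404:
            print(f"Conversation {conversation_id} was not found.")
            return []
        elif e.status == 403:
            print("Access denied. Ensure the OAuth client's role includes 'speechAndTextAnalytics:data:view'.")
            return []
        else:
            print(f"Exception when retrieving the transcript: {e}")
            raise

    if not transcript_segments:
        print(f"No speech analytics data found for conversation {conversation_id}.")
    transcript_segments.sort(key=lambda segment: segment["start_time"] or 0)
    return transcript_segments

Key Points:

communicationId: The sessionId of a voice session. Not every session has a transcript, so a 404 for an individual session is skipped.
Transcript URL: The URL is pre-signed and temporary, so download it promptly rather than storing it.
participantPurpose: "internal" marks the agent side and "external" the customer side; the code maps these to "agent" and "customer".

Permission Requirement:
The role assigned to the OAuth client must include the speechAndTextAnalytics:data:view permission. Without it, the API returns a 403 Forbidden error.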
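
Raw start times are hard to read in a printed transcript. The sketch below converts them to MM:SS offsets from the first segment. It assumes each segment is a dict whose start_time is a number of milliseconds; that unit is an assumption of this example, so if your segments carry ISO 8601 strings instead, parse them first. The name add_relative_offsets is hypothetical.

```python
def add_relative_offsets(segments: list) -> list:
    """Returns copies of the timed segments, sorted by start time, each with
    an 'offset' field (MM:SS from the first segment). Segments without a
    start_time are left out because they cannot be placed on the timeline."""
    timed = [s for s in segments if s.get("start_time") is not None]
    timed.sort(key=lambda s: s["start_time"])
    if not timed:
        return []

    first = timed[0]["start_time"]
    result = []
    for seg in timed:
        total_seconds = int(seg["start_time"] - first) // 1000
        minutes, seconds = divmod(total_seconds, 60)
        # Copy the segment so the caller's data is not modified
        result.append({**seg, "offset": f"{minutes:02d}:{seconds:02d}"})
    return result
```

Because the offsets are relative to the first segment, the result reads the same whether the start times are epoch milliseconds or offsets into the recording.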

Step 3: Processing and Formatting Results

The segments returned by Step 2 are in chronological order. Each segment includes the speaker identifier (e.g., "agent", "customer", or a specific name if available) and the text.

The following function formats these segments into a readable transcript string.

def format_transcript(transcript_segments: list) -> str:
    """
    Formats a list of transcript segments into a human-readable string.
    """
    if not transcript_segments:
        return "No transcript available."
    
    lines = []
    for segment in transcript_segments:
        # "or" also covers a speaker key that is present but None
        speaker = segment.get("speaker") or "Unknown"
        text = segment.get("text", "")
        
        # Clean up speaker name for display
        if speaker == "agent":
            speaker_display = "Agent"
        elif speaker == "customer":
            speaker_display = "Customer"
        else:
            speaker_display = speaker.capitalize()
            
        lines.append(f"[{speaker_display}]: {text}")
    
    return "\n".join(lines)
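
Speech recognition often splits one spoken turn into several short segments, which makes the printed transcript choppy. As an optional step before formatting, consecutive segments from the same speaker can be merged. This is a minimal sketch that assumes the segment dicts built in Step 2 (speaker, text, start_time, end_time); merge_consecutive_turns is a hypothetical name.

```python
def merge_consecutive_turns(segments: list) -> list:
    """Merges adjacent segments spoken by the same speaker into one turn.
    Segments with empty text are dropped."""
    merged = []
    for seg in segments:
        text = (seg.get("text") or "").strip()
        if not text:
            continue
        speaker = seg.get("speaker") or "Unknown"
        if merged and merged[-1]["speaker"] == speaker:
            # Same speaker as the previous turn: extend it
            merged[-1]["text"] += " " + text
            merged[-1]["end_time"] = seg.get("end_time")
        else:
            merged.append({
                "speaker": speaker,
                "text": text,
                "start_time": seg.get("start_time"),
                "end_time": seg.get("end_time"),
            })
    return merged
```

The output has the same keys as the input, so it can be passed straight to format_transcript.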

Complete Working Example

The following script combines all steps into a single executable module. It initializes the client, finds a recent voice conversation, retrieves its transcript, and prints the formatted output.

import json
import os
import sys
import urllib.request
from typing import Optional

import PureCloudPlatformClientV2
from PureCloudPlatformClientV2.api_client import ApiClient
from PureCloudPlatformClientV2.rest import ApiException

# The transcript JSON labels each phrase with the speaker's participant purpose
SPEAKER_BY_PURPOSE = {"internal": "agent", "external": "customer"}

def initialize_genesys_client() -> ApiClient:
    """
    Returns an SDK ApiClient authenticated with OAuth client credentials.
    """
    client_id = os.getenv("GENESYS_CLIENT_ID")
    client_secret = os.getenv("GENESYS_CLIENT_SECRET")
    # Region domain, e.g. mypurecloud.com, mypurecloud.ie, mypurecloud.com.au
    environment = os.getenv("GENESYS_ENVIRONMENT", "mypurecloud.com")

    if not client_id or not client_secret:
        raise ValueError("GENESYS_CLIENT_ID and GENESYS_CLIENT_SECRET environment variables are required.")

    # The API host must be set before the token is requested
    PureCloudPlatformClientV2.configuration.host = f"https://api.{environment}"
    return ApiClient().get_client_credentials_token(client_id, client_secret)

def find_recent_voice_conversation(client: ApiClient) -> Optional[str]:
    """
    Queries for the most recent voice conversation.
    """
    analytics_api = PureCloudPlatformClientV2.AnalyticsApi(client)

    # Request body in the API's JSON shape; the SDK serializes plain dicts as-is
    query_body = {
        "interval": "2023-10-01T00:00:00Z/2023-10-31T23:59:59Z",
        "order": "desc",
        "orderBy": "conversationStart",
        "paging": {"pageSize": 1, "pageNumber": 1},
        "segmentFilters": [
            {
                "type": "and",
                "predicates": [
                    {
                        "type": "dimension",
                        "dimension": "mediaType",
                        "operator": "matches",
                        "value": "voice"
                    }
                ]
            }
        ]
    }

    try:
        response = analytics_api.post_analytics_conversations_details_query(query_body)
        if response.conversations:
            return response.conversations[0].conversation_id
        return None
    except ApiException as e:
        print(f"Error querying conversations: {e}")
        raise

def get_transcript_for_conversation(client: ApiClient, conversation_id: str) -> list:
    """
    Retrieves the transcript for a given conversation ID.
    """
    conversations_api = PureCloudPlatformClientV2.ConversationsApi(client)
    speech_api = PureCloudPlatformClientV2.SpeechTextAnalyticsApi(client)
    transcript_segments = []

    try:
        # Each voice session is a communication that may have its own transcript
        details = conversations_api.get_analytics_conversation_details(conversation_id)
        communication_ids = [
            session.session_id
            for participant in (details.participants or [])
            for session in (participant.sessions or [])
            if session.media_type == "voice"
        ]

        for communication_id in communication_ids:
            try:
                transcript_url = speech_api.get_speechandtextanalytics_conversation_communication_transcripturl(
                    conversation_id, communication_id
                )
            except ApiException as e:
                if e.status == 404:
                    continue  # No transcript for this session
                raise

            # The URL is pre-signed; download the transcript JSON from it
            with urllib.request.urlopen(transcript_url.url) as resp:
                payload = json.load(resp)

            # Field names follow the transcript JSON; confirm them against a downloaded payload
            for transcript in payload.get("transcripts") or []:
                for phrase in transcript.get("phrases") or []:
                    if not phrase.get("text"):
                        continue
                    start_ms = phrase.get("startTimeMs")
                    duration_ms = (phrase.get("duration") or {}).get("milliseconds")
                    has_end = start_ms is not None and duration_ms is not None
                    purpose = phrase.get("participantPurpose")
                    transcript_segments.append({
                        "speaker": SPEAKER_BY_PURPOSE.get(purpose, purpose or "unknown"),
                        "text": phrase["text"],
                        "start_time": start_ms,
                        "end_time": start_ms + duration_ms if has_end else None
                    })
    except ApiException as e:
        if e.status == 404:
            print(f"Conversation {conversation_id} was not found.")
            return []
        elif e.status == 403:
            print("Access denied. Check the permissions of the OAuth client's role.")
            return []
        else:
            print(f"Error retrieving transcript: {e}")
            raise

    transcript_segments.sort(key=lambda segment: segment["start_time"] or 0)
    return transcript_segments

def format_transcript(transcript_segments: list) -> str:
    """
    Formats transcript segments into a readable string.
    """
    if not transcript_segments:
        return "No transcript available."

    lines = []
    for segment in transcript_segments:
        # "or" also covers a speaker key that is present but None
        speaker = segment.get("speaker") or "Unknown"
        text = segment.get("text", "")
        lines.append(f"[{speaker.capitalize()}]: {text}")

    return "\n".join(lines)

def main():
    try:
        # 1. Initialize Client
        client = initialize_genesys_client()
        print("Client initialized successfully.")

        # 2. Find a Conversation
        conversation_id = find_recent_voice_conversation(client)
        if not conversation_id:
            print("No recent voice conversations found.")
            return

        print(f"Found conversation ID: {conversation_id}")

        # 3. Retrieve Transcript
        transcript_segments = get_transcript_for_conversation(client, conversation_id)

        if not transcript_segments:
            print("No transcript segments retrieved.")
            return

        # 4. Format and Print
        formatted_transcript = format_transcript(transcript_segments)
        print("\n--- Transcript ---")
        print(formatted_transcript)
        print("--- End Transcript ---")

    except Exception as e:
        print(f"Fatal error: {e}")
        sys.exit(1)

if __name__ == "__main__":
    main()

Common Errors & Debugging

Error: 403 Forbidden on Speech Analytics Search

Cause: The role assigned to the OAuth client used for authentication lacks the speechAndTextAnalytics:data:view permission.

Fix:

1. Log in to Genesys Cloud and open Admin.
2. Navigate to Integrations > OAuth.
3. Select your client.
4. On the Roles tab, assign a role that includes speechAndTextAnalytics:data:view.
5. Save the changes, then request a new access token. A token issued before the change may not carry the new permission.

Error: 404 Not Found

Cause: The conversationId provided does not exist, or the conversation does not have associated speech analytics data. This often happens if:

- The conversation is too recent (speech analytics processing is asynchronous and may take a few minutes).
- The conversation was excluded from speech analytics by your organization's policies.
- The media type was not voice (e.g., chat or email).

Fix:

- Verify the conversationId exists via the GET /api/v2/conversations/{conversationId} endpoint.
- Check the Speech and Text Analytics settings in Admin to ensure voice conversations are being processed.
- Add a small delay (e.g., time.sleep(60)) after the conversation ends before querying for the transcript.
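
A single fixed delay can be either too short or wastefully long. A small polling loop is one alternative: try, wait, and try again a limited number of times. The sketch below is generic and does not call the API itself; wait_for_transcript is a hypothetical helper, and fetch stands for any zero-argument callable that returns the transcript segments (or an empty list when none are available yet).

```python
import time


def wait_for_transcript(fetch, attempts: int = 5,
                        delay_seconds: float = 60.0, sleep=time.sleep):
    """Calls fetch() until it returns a non-empty result or the attempts
    run out. Returns the result, or an empty list if none arrived."""
    for attempt in range(1, attempts + 1):
        result = fetch()
        if result:
            return result
        if attempt < attempts:
            # Wait only between attempts, not after the last one
            sleep(delay_seconds)
    return []
```

With the functions from this tutorial, a call could look like wait_for_transcript(lambda: get_transcript_for_conversation(client, conversation_id)). The sleep parameter exists so the loop can be tested without actually waiting.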

Error: Empty Transcript Results

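Cause: The request succeeds but yields no usable text. This typically means voice transcription was not enabled for the queue or flow that handled the call, the call contained no recognizable speech, or the code reads the text or speaker from the wrong field of the response.

Fix:
Confirm that voice transcription is enabled for the conversation's queue, then print the raw response once and check which fields hold the text and the speaker before filtering on them.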
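
When results come back empty, it helps to look at the raw data before any filtering. The sketch below summarizes a transcript payload that has been loaded into a dict. It assumes a JSON shape with a transcripts array whose entries hold phrases carrying text and participantPurpose; treat those field names as assumptions and adjust them to what your payload actually contains. The name summarize_transcript_payload is hypothetical.

```python
def summarize_transcript_payload(payload: dict) -> dict:
    """Counts transcripts and phrases in a transcript payload so an empty
    result can be traced to missing data or to a wrong field name."""
    transcripts = payload.get("transcripts") or []
    phrases = [p for t in transcripts for p in (t.get("phrases") or [])]
    with_text = [p for p in phrases if (p.get("text") or "").strip()]
    purposes = sorted(
        {p.get("participantPurpose") or "unknown" for p in with_text}
    )
    return {
        "transcripts": len(transcripts),
        "phrases": len(phrases),
        "phrases_with_text": len(with_text),
        "purposes": purposes,
    }
```

A summary with zero transcripts points to missing transcription, while phrases present but phrases_with_text at zero suggests the text lives under a different key than the code expects.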

Error: Rate Limiting (429 Too Many Requests)

Cause: The API enforces rate limits per OAuth client. Exceeding these limits results in a 429 response.

Fix:
Implement exponential backoff in your code. The Python SDK does not automatically retry 429 errors for all endpoints, so manual handling is required.

import random
import time

from PureCloudPlatformClientV2.rest import ApiException

def retry_on_429(func, *args, **kwargs):
    """Calls func, retrying with exponential backoff and jitter on HTTP 429."""
    max_retries = 3
    for attempt in range(max_retries):
        try:
            return func(*args, **kwargs)
        except ApiException as e:
            # Re-raise anything that is not a rate limit, and the last 429
            if e.status != 429 or attempt == max_retries - 1:
                raise
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            print(f"Rate limited. Waiting {wait_time:.2f} seconds...")
            time.sleep(wait_time)
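
Backoff delays are a guess at how long to wait; when the server states a wait time, that value is the better choice. The sketch below prefers a Retry-After header and falls back to capped exponential backoff with jitter. It assumes the 429 response carries Retry-After as a number of seconds, which you should confirm for your organization; wait_seconds is a hypothetical helper that takes the response headers as a plain mapping.

```python
import random


def wait_seconds(attempt: int, headers=None, cap: float = 60.0) -> float:
    """Returns how long to wait before the next retry. Uses the Retry-After
    header (in seconds) when present and numeric, otherwise exponential
    backoff with jitter. Both are limited to `cap` seconds."""
    for name, value in (headers or {}).items():
        if name.lower() == "retry-after":
            try:
                return min(float(value), cap)
            except (TypeError, ValueError):
                break  # Not a number (e.g. an HTTP date): fall back to backoff
    return min((2 ** attempt) + random.uniform(0, 1), cap)
```

Inside an except ApiException handler, the headers of the failed response are available on the exception, so the call could be wait_seconds(attempt, dict(e.headers or {})).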
