Paginating the /api/v2/analytics/conversations/details/query endpoint — cursor vs page-based
What You Will Build
- A Python script that retrieves historical conversation details from Genesys Cloud CX using the Analytics API.
- The script demonstrates how to handle pagination using the pageSize and pageNumber parameters, which are the standard mechanism for this specific endpoint.
- The tutorial covers authentication, request construction, error handling, and robust pagination logic in Python.
Prerequisites
- OAuth Client Type: Client Credentials Grant.
- Required Scopes: analytics:conversation:view is mandatory for reading conversation details.
- SDK Version: Genesys Cloud Python SDK (genesys-cloud-purecloud-platform-client) version 160.0.0 or later.
- Language/Runtime: Python 3.8+.
- External Dependencies:
  - genesys-cloud-purecloud-platform-client
  - python-dotenv (for secure credential management)
  - requests (used in the raw HTTP example for clarity)
Authentication Setup
Genesys Cloud APIs use OAuth 2.0. For server-side integrations like data extraction, the Client Credentials flow is the standard. You must store your Client ID, Client Secret, and Environment (e.g., us-east-1, eu-west-1) securely.
The following code uses the official SDK to handle token acquisition and caching. The SDK automatically manages token refresh if the token expires during a long-running pagination loop.
import os

from purecloud_platform_client import (
    ApiClient,
    Configuration,
    PureCloudAuthFlow,
    AnalyticsApi
)
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()


def get_analytics_api_instance():
    """
    Initializes the Analytics API client with OAuth authentication.
    """
    configuration = Configuration()
    configuration.host = f"https://api.{os.getenv('GENESYS_ENV', 'us-east-1')}.mygenesys.com"

    # Set up client credentials
    configuration.client_id = os.getenv('GENESYS_CLIENT_ID')
    configuration.client_secret = os.getenv('GENESYS_CLIENT_SECRET')

    # Configure OAuth flow
    configuration.oauth_config = {
        'flow': PureCloudAuthFlow.CLIENT_CREDENTIALS,
        'scopes': ['analytics:conversation:view']
    }

    # Create the API client
    client = ApiClient(configuration=configuration)

    # Initialize the Analytics API
    analytics_api = AnalyticsApi(client)
    return analytics_api, client
Note on Scopes: If you receive a 403 Forbidden error, verify that your OAuth client in the Genesys Cloud Admin Console has the analytics:conversation:view scope assigned.
Implementation
Step 1: Constructing the Query Body
The /api/v2/analytics/conversations/details/query endpoint uses a POST request with a JSON body. This allows for complex filtering by date range, queue, user, or disposition. Unlike simple GET endpoints, you cannot pass filters as query parameters.
The interval field defines the time range. This tutorial passes it as an ISO 8601 interval: a start timestamp and an end timestamp joined by a slash (e.g., 2023-10-01T00:00:00Z/2023-10-02T00:00:00Z). For historical data, explicit UTC start and end timestamps are safer than a relative duration such as P7D, because they avoid ambiguity across time zones.
def build_query_body(start_time, end_time, queue_ids=None, view="default"):
    """
    Constructs the JSON body for the analytics query.

    Args:
        start_time (str): ISO 8601 start time (e.g., '2023-10-01T00:00:00Z')
        end_time (str): ISO 8601 end time (e.g., '2023-10-02T00:00:00Z')
        queue_ids (list): Optional list of queue IDs to filter.
        view (str): The analytics view. 'default' is standard. 'extended' provides more fields but may have lower limits.

    Returns:
        dict: The query body dictionary.
    """
    query_body = {
        "interval": f"{start_time}/{end_time}",
        "view": view,
        "groupBy": ["conversationId"],  # Grouping by ID ensures one row per conversation
        "metrics": [
            "conversationId",
            "channel",
            "startTime",
            "endTime",
            "duration",
            "queueId",
            "queueName",
            "agentId",
            "agentName",
            "disposition"
        ]
    }

    # Optional: Filter by specific queues
    if queue_ids:
        query_body["filters"] = [
            {
                "type": "queueId",
                "values": queue_ids
            }
        ]

    return query_body
Critical Parameter Explanation:
- groupBy: Setting this to ["conversationId"] is essential. Without it, the API might return aggregated data or duplicate rows for each agent involved in a transfer.
- view: The default view is faster and has higher throughput. The extended view includes more metadata (like IVR nodes) but is more expensive and may trigger rate limits more quickly. Start with default.
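The start/end interval string used in Step 1 can be derived from datetime objects rather than typed by hand. The following is a minimal sketch, not part of the SDK: build_interval is a hypothetical helper, and the timestamp format is assumed to match the examples in this tutorial.

```python
from datetime import datetime, timedelta, timezone

# Assumed timestamp format, matching the examples in this tutorial.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def build_interval(end: datetime, days: int) -> str:
    """Return 'start/end' in UTC, covering the `days` days before `end`."""
    if end.tzinfo is None:
        # A naive datetime has no defined offset, so refuse to guess one.
        raise ValueError("end must be timezone-aware")
    end_utc = end.astimezone(timezone.utc)
    start_utc = end_utc - timedelta(days=days)
    return f"{start_utc.strftime(TIMESTAMP_FORMAT)}/{end_utc.strftime(TIMESTAMP_FORMAT)}"


interval = build_interval(datetime(2023, 10, 8, tzinfo=timezone.utc), days=7)
print(interval)
```

Requiring a timezone-aware datetime and converting to UTC before formatting keeps the trailing Z accurate regardless of where the script runs.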
Step 2: Executing the Query and Handling Pagination
The Analytics Details Query endpoint supports pagination via pageSize and pageNumber. It does not use cursor-based pagination (like some other Genesys endpoints). Instead, it returns a pageCount in the response headers or body, allowing you to loop through pages until all data is retrieved.
Important Constraint: The maximum pageSize for this endpoint is typically 1000. Attempting to request more will result in a 400 Bad Request.
import time

from purecloud_platform_client.rest import ApiException


def fetch_conversation_details(analytics_api, query_body, max_pages=100, max_retries=5):
    """
    Fetches all conversation details using page-based pagination.

    Args:
        analytics_api: The initialized AnalyticsApi instance.
        query_body: The dictionary containing the query parameters.
        max_pages: Safety limit to prevent infinite loops.
        max_retries: Maximum consecutive retries after a 429 response.

    Returns:
        list: A list of conversation detail objects.
    """
    all_conversations = []
    page_number = 1
    page_size = 1000  # Maximum allowed
    retries = 0

    print(f"Starting pagination loop with page size {page_size}...")

    while page_number <= max_pages:
        try:
            # Execute the query
            # The SDK maps query_body to the request body
            response = analytics_api.post_analytics_conversations_details_query(
                body=query_body,
                page_size=page_size,
                page_number=page_number
            )
            retries = 0  # The request succeeded, so reset the retry counter

            # Check if the response contains data
            if response.entity and response.entity.conversations:
                conversations = response.entity.conversations
                all_conversations.extend(conversations)
                print(f"Page {page_number}: Retrieved {len(conversations)} conversations.")

                # Check if we have more pages
                # The response object contains pagination metadata
                if response.page_count and page_number >= response.page_count:
                    print("Last page reached.")
                    break
            else:
                print("No more data found.")
                break

            page_number += 1

            # Respect rate limits: Add a small delay between requests
            # Genesys Cloud has strict rate limits on analytics endpoints
            time.sleep(0.5)

        except ApiException as e:
            if e.status == 429 and retries < max_retries:
                # Retry the same page; the wait grows with each consecutive 429.
                # Exponential backoff could be implemented here.
                retries += 1
                wait_seconds = 10 * retries
                print(f"Rate limit hit. Waiting {wait_seconds} seconds before retry...")
                time.sleep(wait_seconds)
                continue
            elif e.status == 400:
                print(f"Bad Request: {e.body}")
                break
            else:
                # Includes a 429 that persisted after max_retries attempts
                print(f"API Error {e.status}: {e.body}")
                raise
        except Exception as e:
            print(f"Unexpected error: {e}")
            raise

    return all_conversations
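The stopping rules of the loop above can be isolated from the SDK and checked without network access. The sketch below is illustrative only: iter_pages and fake_fetch are hypothetical names, and the fetch callable is assumed to return a page of items together with the total page count.

```python
from typing import Callable, Iterator, List, Tuple

# Hypothetical callable: takes a 1-based page number, returns (items, page_count).
FetchPage = Callable[[int], Tuple[List[dict], int]]


def iter_pages(fetch_page: FetchPage, max_pages: int = 100) -> Iterator[List[dict]]:
    """Yield one page of items at a time until the reported last page."""
    page_number = 1
    while page_number <= max_pages:
        items, page_count = fetch_page(page_number)
        if not items:
            return  # An empty page means there is nothing left to read
        yield items
        if page_number >= page_count:
            return  # The server reported this as the final page
        page_number += 1


# Stand-in for the API: 5 records served 2 per page, so 3 pages in total.
RECORDS = [{"conversationId": f"c{i}"} for i in range(5)]


def fake_fetch(page_number: int, page_size: int = 2) -> Tuple[List[dict], int]:
    start = (page_number - 1) * page_size
    page_count = -(-len(RECORDS) // page_size)  # Ceiling division
    return RECORDS[start:start + page_size], page_count


pages = list(iter_pages(fake_fetch))
print([len(page) for page in pages])
```

Because the generator yields one page at a time, the caller can process or persist each page before the next request is made.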
Why Page-Based and Not Cursor?
Analytics data is static once generated. Unlike real-time entity lists (like active users) which change frequently, historical conversation data is immutable. Page-based pagination is efficient for bulk extraction because it allows the client to predict the total volume (pageCount) and manage memory allocation. Cursor-based pagination is reserved for endpoints where the data set changes during the iteration, requiring a snapshot handle.
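The difference between the two styles can be seen on a small in-memory list. This is a toy sketch, not Genesys code: read_page and read_after are hypothetical helpers, and the data is assumed to be sorted in ascending order.

```python
from typing import List, Optional


def read_page(data: List[int], page_number: int, page_size: int) -> List[int]:
    """Page-based read: the position is an offset into the current data."""
    start = (page_number - 1) * page_size
    return data[start:start + page_size]


def read_after(data: List[int], cursor: Optional[int], page_size: int) -> List[int]:
    """Cursor-based read: the position is the last value already seen."""
    remaining = [x for x in data if cursor is None or x > cursor]
    return remaining[:page_size]


# Page-based: a record inserted ahead of the read position shifts every offset.
data = [10, 20, 30, 40]
first = read_page(data, 1, 2)
data.insert(0, 5)  # The data set changes between two requests
second = read_page(data, 2, 2)  # 20 is returned a second time

# Cursor-based: the same change does not disturb the read position.
data2 = [10, 20, 30, 40]
c_first = read_after(data2, None, 2)
data2.insert(0, 5)
c_second = read_after(data2, c_first[-1], 2)

print(first, second)
print(c_first, c_second)
```

When the underlying data does not change between requests, the duplicate in the page-based run cannot occur, which is why offsets are sufficient for a fixed historical interval.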
Step 3: Processing and Saving Results
Retrieving 10,000+ conversation objects can consume significant memory. It is best practice to process or save the data in chunks rather than accumulating everything in a single list.
import csv


def save_conversations_to_csv(conversations, filename="conversations.csv"):
    """
    Saves the list of conversation objects to a CSV file.
    """
    if not conversations:
        print("No conversations to save.")
        return

    # Fixed column order for the CSV output
    headers = [
        "conversationId", "channel", "startTime", "endTime",
        "duration", "queueName", "agentName", "disposition"
    ]

    with open(filename, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=headers)
        writer.writeheader()

        for conv in conversations:
            # Extract relevant fields; getattr falls back to None when an
            # attribute is missing on the object
            row = {
                "conversationId": getattr(conv, 'conversation_id', None),
                "channel": getattr(conv, 'channel', None),
                "startTime": getattr(conv, 'start_time', None),
                "endTime": getattr(conv, 'end_time', None),
                "duration": getattr(conv, 'duration', None),
                "queueName": getattr(conv, 'queue_name', None),
                "agentName": getattr(conv, 'agent_name', None),
                "disposition": getattr(conv, 'disposition', None)
            }
            writer.writerow(row)

    print(f"Saved {len(conversations)} conversations to {filename}")
Complete Working Example
This script combines authentication, query construction, pagination, and data persistence into a single runnable module.
import csv
import datetime
import os
import time

from purecloud_platform_client import (
    ApiClient,
    Configuration,
    PureCloudAuthFlow,
    AnalyticsApi
)
from purecloud_platform_client.rest import ApiException
from dotenv import load_dotenv

load_dotenv()


def get_analytics_api_instance():
    configuration = Configuration()
    configuration.host = f"https://api.{os.getenv('GENESYS_ENV', 'us-east-1')}.mygenesys.com"
    configuration.client_id = os.getenv('GENESYS_CLIENT_ID')
    configuration.client_secret = os.getenv('GENESYS_CLIENT_SECRET')
    configuration.oauth_config = {
        'flow': PureCloudAuthFlow.CLIENT_CREDENTIALS,
        'scopes': ['analytics:conversation:view']
    }
    client = ApiClient(configuration=configuration)
    return AnalyticsApi(client), client


def build_query_body(start_time, end_time):
    return {
        "interval": f"{start_time}/{end_time}",
        "view": "default",
        "groupBy": ["conversationId"],
        "metrics": [
            "conversationId", "channel", "startTime", "endTime",
            "duration", "queueName", "agentName", "disposition"
        ]
    }


def extract_and_save_data():
    analytics_api, client = get_analytics_api_instance()

    # Define date range (last 7 days), taking "now" once so both ends agree
    now = datetime.datetime.now(datetime.timezone.utc)
    end_time = now.strftime('%Y-%m-%dT%H:%M:%SZ')
    start_time = (now - datetime.timedelta(days=7)).strftime('%Y-%m-%dT%H:%M:%SZ')

    query_body = build_query_body(start_time, end_time)

    all_conversations = []
    page_number = 1
    page_size = 1000
    max_pages = 50  # Safety break
    max_retries = 5  # Cap on consecutive 429 retries
    retries = 0

    print(f"Querying conversations from {start_time} to {end_time}")

    while page_number <= max_pages:
        try:
            response = analytics_api.post_analytics_conversations_details_query(
                body=query_body,
                page_size=page_size,
                page_number=page_number
            )
            retries = 0

            if response.entity and response.entity.conversations:
                all_conversations.extend(response.entity.conversations)
                print(f"Page {page_number}: Retrieved {len(response.entity.conversations)} records.")

                # Check if we are on the last page
                if response.page_count and page_number >= response.page_count:
                    break
            else:
                break

            page_number += 1
            time.sleep(0.5)  # Rate limit protection

        except ApiException as e:
            if e.status == 429 and retries < max_retries:
                retries += 1
                print("Rate limited. Waiting 10 seconds...")
                time.sleep(10)
                continue
            else:
                print(f"Error {e.status}: {e.body}")
                break
        except Exception as e:
            print(f"Error: {e}")
            break

    # Save to CSV (any pages retrieved before an error are still written)
    if all_conversations:
        headers = ["conversationId", "channel", "startTime", "endTime", "duration", "queueName", "agentName", "disposition"]
        with open('conversations_export.csv', 'w', newline='', encoding='utf-8') as f:
            writer = csv.DictWriter(f, fieldnames=headers)
            writer.writeheader()
            for conv in all_conversations:
                writer.writerow({
                    "conversationId": conv.conversation_id,
                    "channel": conv.channel,
                    "startTime": conv.start_time,
                    "endTime": conv.end_time,
                    "duration": conv.duration,
                    "queueName": conv.queue_name,
                    "agentName": conv.agent_name,
                    "disposition": conv.disposition
                })
        print(f"Export complete. Total records: {len(all_conversations)}")
    else:
        print("No conversations found.")


if __name__ == "__main__":
    extract_and_save_data()
Common Errors & Debugging
Error: 429 Too Many Requests
Cause: The Analytics API has strict rate limits, especially for large queries. Requesting pages too quickly (e.g., less than 100ms apart) will trigger a 429.
Fix: Implement exponential backoff or a fixed sleep interval between page requests. The example above uses time.sleep(0.5). If you are hitting limits frequently, reduce the pageSize to 500 to increase the number of requests but lower the load per request, or increase the sleep time.
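One way to implement the exponential backoff mentioned in this fix is sketched below. It is an assumption-laden example, not SDK code: RateLimited is a hypothetical stand-in for an exception carrying HTTP status 429, and the delay values are placeholders to tune against your own limits.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for an SDK exception carrying HTTP status 429."""


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on RateLimited with exponentially growing waits."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimited:
            if attempt == max_retries:
                raise  # Give up after the final allowed retry
            # The ceiling doubles on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random amount up to the computed ceiling.
            sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

In the pagination loop, the page request would be wrapped in a function that raises RateLimited when the caught exception has status 429. The sleep parameter is injectable so the retry logic can be tested without real waiting.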
Error: 400 Bad Request - “The query is invalid”
Cause: This usually stems from an incorrect interval format or an unsupported metric in the view.
Fix:
- Ensure interval is a valid ISO 8601 duration or start/end time pair.
- Check that all metrics listed in the metrics array are valid for the specified view. For example, agentName may not be available in all views. Refer to the Analytics API Documentation for the list of valid metrics.
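A malformed interval can be caught locally before the request is sent. The sketch below checks only the start/end pair form used in this tutorial; validate_interval is a hypothetical helper, and the timestamp format is assumed to match the tutorial's examples.

```python
from datetime import datetime

# Assumed timestamp format, matching the examples in this tutorial.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def validate_interval(interval: str) -> None:
    """Raise ValueError unless interval is 'start/end' with start before end."""
    parts = interval.split("/")
    if len(parts) != 2:
        raise ValueError(f"expected 'start/end', got {interval!r}")
    # strptime raises ValueError itself when a timestamp does not match the format.
    start, end = (datetime.strptime(part, TIMESTAMP_FORMAT) for part in parts)
    if start >= end:
        raise ValueError("interval start must be earlier than its end")


validate_interval("2023-10-01T00:00:00Z/2023-10-02T00:00:00Z")  # Passes silently
```

Running this check before the first page request turns a remote 400 into an immediate local error with a more specific message.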
Error: 403 Forbidden
Cause: The OAuth token lacks the required scope.
Fix: Verify that the OAuth client used to generate the token has the analytics:conversation:view scope. If you are using the SDK, ensure the scopes list in the oauth_config includes this value.
Error: MemoryError
Cause: Accumulating tens of thousands of conversation objects in a Python list can exhaust RAM.
Fix: Modify the loop to write each page to the CSV file immediately instead of appending to all_conversations. Open the file in append mode ('a') and write the header only on the first page.
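The append-mode approach described in this fix might look like the following sketch. append_page is a hypothetical helper, the rows are assumed to be plain dictionaries already extracted from the conversation objects, and the header list is shortened for illustration.

```python
import csv
import os
from typing import Dict, Iterable, List

HEADERS = ["conversationId", "channel", "startTime", "endTime"]


def append_page(filename: str, rows: Iterable[Dict[str, object]], headers: List[str]) -> int:
    """Append one page of rows; write the header only if the file is new or empty."""
    need_header = not os.path.exists(filename) or os.path.getsize(filename) == 0
    count = 0
    with open(filename, "a", newline="", encoding="utf-8") as f:
        # extrasaction="ignore" drops keys that are not listed in headers;
        # keys missing from a row are written as empty cells.
        writer = csv.DictWriter(f, fieldnames=headers, extrasaction="ignore")
        if need_header:
            writer.writeheader()
        for row in rows:
            writer.writerow(row)
            count += 1
    return count
```

Calling append_page once per retrieved page keeps only a single page in memory. Because the file is opened in append mode, remove or rename any export left over from a previous run before starting, otherwise the new rows are added to the old ones.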