Mastering Genesys Cloud Analytics Paging: pageSize, pageNumber, and Async Queries
What You Will Build
- You will build a Python script that queries the Genesys Cloud Analytics API for conversation details, correctly handling the asynchronous nature of the request and the specific paging parameters required for data retrieval.
- This tutorial uses the Genesys Cloud Platform Client SDK for Python (genesys-cloud-py-client) and the underlying REST API concepts.
- The programming language covered is Python 3.10+.
Prerequisites
- OAuth Client: A Genesys Cloud OAuth application with the analytics:conversation:read scope.
- SDK Version: genesys-cloud-py-client >= 160.0.0.
- Language/Runtime: Python 3.10 or higher.
- External Dependencies: genesys-cloud-py-client, requests (for raw API comparison if needed, though the SDK is preferred)
Authentication Setup
Genesys Cloud uses OAuth 2.0 for authentication. The Analytics API requires valid access tokens with specific scopes. For this tutorial, we assume you have already registered an OAuth client in the Genesys Cloud Admin console.
The SDK handles token caching and refresh logic automatically when configured correctly. You must provide the environment, client_id, and client_secret.
import os
import sys

from purecloudplatformclientv2 import PlatformClient
from purecloudplatformclientv2.rest import ApiException

# Initialize the Platform Client
# This object manages the OAuth token lifecycle
platform_client = PlatformClient()

# Configure OAuth
platform_client.set_environment("mypurecloud.com")
platform_client.set_client_id(os.environ.get("GENESYS_CLIENT_ID"))
platform_client.set_client_secret(os.environ.get("GENESYS_CLIENT_SECRET"))
platform_client.set_scopes(["analytics:conversation:read"])

# Verify authentication by fetching the user profile (optional sanity check)
try:
    auth_api = platform_client.AuthApi()
    user = auth_api.get_user_me()
    print(f"Authenticated as: {user.name}")
except ApiException as e:
    print(f"Authentication failed: {e.status} - {e.reason}")
    sys.exit(1)
Implementation
Step 1: Understanding the Analytics Query Lifecycle
The Genesys Cloud Analytics API does not return data immediately upon request. This is a critical distinction from other APIs such as Users or Routing. The lifecycle consists of three phases:
- POST /api/v2/analytics/conversations/details/query: You submit the query body. The response contains a queryId and a status of created.
- GET /api/v2/analytics/conversations/details/queries/{queryId}: You poll this endpoint. The status changes from created to running, and then to completed or failed.
- GET /api/v2/analytics/conversations/details/queries/{queryId}/results: Once the status is completed, you retrieve the actual data. This is where paging parameters apply.
Many developers attempt to pass pageSize and pageNumber in the initial POST request. This is incorrect. The initial POST defines the data scope (date range, filters). The GET results call defines the delivery mechanism (paging).
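To make the split between scope and delivery concrete, here is a minimal sketch that builds the results URL with the paging values carried in the query string. The helper name build_results_url is hypothetical, and the api.mypurecloud.com host is an assumption for the mypurecloud.com environment; the path is the one listed above.

```python
from urllib.parse import urlencode

# Assumed API host for the mypurecloud.com environment; adjust for your region.
API_HOST = "https://api.mypurecloud.com"


def build_results_url(query_id: str, page_size: int, page_number: int) -> str:
    """Return the results URL with paging in the query string (hypothetical helper)."""
    path = f"/api/v2/analytics/conversations/details/queries/{query_id}/results"
    params = urlencode({"pageSize": page_size, "pageNumber": page_number})
    return f"{API_HOST}{path}?{params}"


print(build_results_url("abc123", 100, 1))
```

Nothing about the date range or filters appears in this URL; those belong only to the body of the initial POST.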
Step 2: Submitting the Query
First, we define the query body. Notice that there are no paging parameters here. We only define what data we want.
from purecloudplatformclientv2.models import ConversationDetailsQuery

def submit_analytics_query(platform_client: PlatformClient, start_date: str, end_date: str) -> str:
    """
    Submits an analytics query and returns the queryId.

    Args:
        platform_client: The initialized PlatformClient
        start_date: ISO 8601 start date (e.g., '2023-10-01T00:00:00Z')
        end_date: ISO 8601 end date (e.g., '2023-10-01T23:59:59Z')

    Returns:
        queryId: The unique identifier for the query
    """
    analytics_api = platform_client.AnalyticsApi()

    # Define the query body
    query_body = ConversationDetailsQuery(
        date_range=f"{start_date}/{end_date}",  # ISO 8601 interval: start/end
        group_by=["conversation.mediaType"],
        select=["conversation.id", "conversation.mediaType", "wrapUpCode"],
        where="conversation.mediaType IN ['voice']",
        size=100  # Not a paging parameter; paging is set on the results call
    )

    try:
        # Submit the query
        response = analytics_api.post_analytics_conversations_details_query(body=query_body)
        print(f"Query submitted. Query ID: {response.query_id}")
        return response.query_id
    except ApiException as e:
        print(f"Failed to submit query: {e.status} - {e.reason}")
        if e.body:
            print(f"Error body: {e.body}")
        raise
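An ISO 8601 interval can be written either as start/end or as start/duration, where P1D means one day. The sketch below expands the one-day duration form into an explicit start/end pair so you can see exactly which window a query covers. The helper names parse_utc and one_day_interval are hypothetical, and the code only handles UTC timestamps at whole-second precision.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting the trailing 'Z' that Python 3.10 rejects."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00")).astimezone(timezone.utc)


def one_day_interval(start: str) -> str:
    """Expand 'start/P1D' into an explicit 'start/end' interval (hypothetical helper)."""
    begin = parse_utc(start)
    end = begin + timedelta(days=1)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{begin.strftime(fmt)}/{end.strftime(fmt)}"


print(one_day_interval("2023-10-01T00:00:00Z"))
# 2023-10-01T00:00:00Z/2023-10-02T00:00:00Z
```

Note that a one-day duration ends at midnight of the following day, one second later than an end timestamp of 23:59:59.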
Step 3: Polling for Completion
Before we can discuss paging, we must wait for the query to complete. The Analytics API processes data asynchronously. Polling too frequently can trigger rate limits (429 errors). A backoff strategy is recommended.
import time

def wait_for_query_completion(platform_client: PlatformClient, query_id: str, max_retries: int = 60) -> bool:
    """
    Polls the query status until it is completed or failed.

    Args:
        platform_client: The initialized PlatformClient
        query_id: The ID returned from submission
        max_retries: Maximum number of polling attempts

    Returns:
        True if completed, False if the query failed or did not finish in time
    """
    analytics_api = platform_client.AnalyticsApi()

    for attempt in range(max_retries):
        try:
            response = analytics_api.get_analytics_conversations_details_query(query_id=query_id)

            if response.status == "completed":
                print("Query completed successfully.")
                return True
            elif response.status == "failed":
                print(f"Query failed. Reason: {response.failure_code}")
                return False
            else:
                # Status is 'created' or 'running'
                print(f"Query status: {response.status}. Waiting... (Attempt {attempt + 1})")
                # Exponential backoff: 1s, 2s, 4s, ... capped at 30s
                wait_time = min(2 ** attempt, 30)
                time.sleep(wait_time)
        except ApiException as e:
            if e.status == 429:
                print("Rate limited. Waiting 5 seconds before retrying poll.")
                time.sleep(5)
            else:
                print(f"Polling error: {e.status} - {e.reason}")
                raise

    print("Max retries exceeded. Query may still be running.")
    return False
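It helps to know how long the polling loop can block in the worst case. The following sketch reproduces the same wait formula, min(2 ** attempt, 30), outside the loop and sums it; backoff_schedule is a hypothetical helper name, and the defaults mirror the function above.

```python
def backoff_schedule(max_retries: int = 60, cap_seconds: int = 30) -> list[int]:
    """Return the wait before each retry: doubling from 1s, capped at cap_seconds."""
    return [min(2 ** attempt, cap_seconds) for attempt in range(max_retries)]


schedule = backoff_schedule()
print(schedule[:7])   # [1, 2, 4, 8, 16, 30, 30]
print(sum(schedule))  # 1681
```

With the defaults, the loop sleeps for up to 1681 seconds (about 28 minutes) before giving up, not counting request time. Lower max_retries if your caller cannot wait that long.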
Step 4: Retrieving Results with Correct Paging
This is the core of the tutorial. When calling get_analytics_conversations_details_query_results, you must use the SDK parameters that map to the query string arguments pageSize and pageNumber.
Critical Concept: Genesys Cloud Analytics paging is offset-based, not cursor-based.
- pageSize: The number of records to return per page (max 1000 for most endpoints).
- pageNumber: The page number to retrieve (1-based index).
- total: The total number of records available (returned in the response).
- pageCount: The total number of pages available, calculated as ceil(total / pageSize).
The SDK method get_analytics_conversations_details_query_results accepts these as keyword arguments.
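Before the full fetch loop, the paging arithmetic can be checked on its own. This is a small sketch of the ceil(total / pageSize) relationship and of which records a 1-based page covers; page_count and page_bounds are hypothetical helper names, not SDK functions.

```python
import math


def page_count(total: int, page_size: int) -> int:
    """Number of pages needed to hold `total` records at `page_size` per page."""
    return math.ceil(total / page_size)


def page_bounds(page_number: int, page_size: int, total: int) -> tuple[int, int]:
    """First and last record positions (1-based, inclusive) covered by a page."""
    first = (page_number - 1) * page_size + 1
    last = min(page_number * page_size, total)
    return first, last


print(page_count(250, 100))      # 3
print(page_bounds(3, 100, 250))  # (201, 250)
```

The last page is usually partial: with 250 records and a page size of 100, page 3 holds only records 201 through 250.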
import time

from purecloudplatformclientv2.models import ConversationDetailList

def fetch_all_pages(platform_client: PlatformClient, query_id: str, page_size: int = 100) -> list:
    """
    Fetches all pages of results for a completed query.

    Args:
        platform_client: The initialized PlatformClient
        query_id: The ID of the completed query
        page_size: Number of records per page (default 100)

    Returns:
        A list of all ConversationDetail objects
    """
    analytics_api = platform_client.AnalyticsApi()
    all_conversations = []
    page_number = 1
    rate_limit_retries = 0
    max_rate_limit_retries = 5  # Give up rather than retrying a 429 forever

    while True:
        try:
            # The SDK maps 'page_size' to 'pageSize' and 'page_number' to 'pageNumber' in the query string
            response: ConversationDetailList = analytics_api.get_analytics_conversations_details_query_results(
                query_id=query_id,
                page_size=page_size,
                page_number=page_number
            )
            rate_limit_retries = 0  # The request succeeded, so reset the counter

            # Check if there are any entities in this response
            if not response.entities:
                print(f"No more entities found on page {page_number}.")
                break

            # Append the current page's entities to the master list
            all_conversations.extend(response.entities)

            # Debugging: Print paging info
            total_records = response.total
            total_pages = response.page_count
            print(f"Fetched page {page_number}/{total_pages}. Total records so far: {len(all_conversations)}/{total_records}")

            # Check if we have fetched all records
            # Note: response.page_count is the total number of pages available
            if total_pages is not None and page_number >= total_pages:
                print("All pages fetched.")
                break

            # Move to the next page
            page_number += 1
        except ApiException as e:
            print(f"Error fetching page {page_number}: {e.status} - {e.reason}")
            if e.status == 429 and rate_limit_retries < max_rate_limit_retries:
                rate_limit_retries += 1
                print("Rate limited during fetch. Waiting 5 seconds...")
                time.sleep(5)
                continue  # Retry the same page
            raise

    return all_conversations
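The termination logic of an offset-based loop is easy to get wrong, so it is worth exercising without the network. The sketch below reimplements the same stop conditions (empty page, or page number reaching the page count) against an in-memory stand-in for the results endpoint. Everything here is hypothetical scaffolding: iter_pages, fake_fetch, and the dictionary shape are illustrations, not SDK types.

```python
from typing import Callable, Iterator

Page = dict  # stand-in for one response page: {"entities": [...], "page_count": int}


def iter_pages(fetch_page: Callable[[int], Page], start: int = 1) -> Iterator[list]:
    """Yield each page's entities until an empty page or the last page is reached."""
    page_number = start
    while True:
        page = fetch_page(page_number)
        entities = page.get("entities") or []
        if not entities:
            return  # an empty page always ends the loop
        yield entities
        total_pages = page.get("page_count")
        if total_pages is not None and page_number >= total_pages:
            return  # the reported last page has been consumed
        page_number += 1


# In-memory stand-in for the results endpoint: 250 records, 100 per page.
RECORDS = list(range(250))


def fake_fetch(page_number: int, page_size: int = 100) -> Page:
    begin = (page_number - 1) * page_size
    return {"entities": RECORDS[begin:begin + page_size], "page_count": 3}


pages = list(iter_pages(fake_fetch))
print([len(p) for p in pages])  # [100, 100, 50]
```

Keeping both stop conditions means the loop still terminates if the page count is missing from a response, because it falls back to stopping on the first empty page.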
Complete Working Example
Below is the full, copy-pasteable script. It combines authentication, query submission, polling, and paging logic.
import os
import time

from purecloudplatformclientv2 import PlatformClient
from purecloudplatformclientv2.rest import ApiException
from purecloudplatformclientv2.models import ConversationDetailsQuery, ConversationDetailList


def main():
    # 1. Authentication
    platform_client = PlatformClient()
    platform_client.set_environment("mypurecloud.com")
    platform_client.set_client_id(os.environ.get("GENESYS_CLIENT_ID"))
    platform_client.set_client_secret(os.environ.get("GENESYS_CLIENT_SECRET"))
    platform_client.set_scopes(["analytics:conversation:read"])

    try:
        # Sanity check auth
        auth_api = platform_client.AuthApi()
        user = auth_api.get_user_me()
        print(f"Authenticated as: {user.name}")
    except ApiException as e:
        print(f"Authentication failed: {e.status} - {e.reason}")
        return

    # 2. Define Query Parameters
    # Using a fixed date range for predictability.
    # In production, you might calculate this dynamically.
    start_date = "2023-10-01T00:00:00Z"
    end_date = "2023-10-01T23:59:59Z"
    print(f"Submitting query for date range: {start_date} to {end_date}")

    # 3. Submit Query
    analytics_api = platform_client.AnalyticsApi()
    try:
        query_body = ConversationDetailsQuery(
            date_range=f"{start_date}/{end_date}",  # ISO 8601 interval: start/end
            group_by=["conversation.mediaType"],
            select=["conversation.id", "conversation.mediaType", "wrapUpCode"],
            where="conversation.mediaType IN ['voice']",
            size=100  # Not a paging parameter; paging is set on the results call
        )
        submit_response = analytics_api.post_analytics_conversations_details_query(body=query_body)
        query_id = submit_response.query_id
        print(f"Query submitted. Query ID: {query_id}")
    except ApiException as e:
        print(f"Failed to submit query: {e.status} - {e.reason}")
        return

    # 4. Poll for Completion
    print("Waiting for query to complete...")
    max_retries = 60
    completed = False
    for attempt in range(max_retries):
        try:
            status_response = analytics_api.get_analytics_conversations_details_query(query_id=query_id)
            if status_response.status == "completed":
                completed = True
                print("Query completed.")
                break
            elif status_response.status == "failed":
                print(f"Query failed: {status_response.failure_code}")
                return
            else:
                wait_time = min(2 ** attempt, 30)
                print(f"Status: {status_response.status}. Retrying in {wait_time}s...")
                time.sleep(wait_time)
        except ApiException as e:
            if e.status == 429:
                print("Rate limited while polling. Waiting 5s...")
                time.sleep(5)
            else:
                print(f"Polling error: {e.status} - {e.reason}")
                return

    if not completed:
        print("Query did not complete within timeout.")
        return

    # 5. Fetch Results with Paging
    print("Fetching results with paging...")
    all_conversations = []
    page_number = 1
    page_size = 100  # Genesys max is 1000, but 100 is safer for debugging

    while True:
        try:
            # CRITICAL: pageSize and pageNumber are passed as kwargs to the SDK method
            # They map to the query string parameters in the REST call
            result: ConversationDetailList = analytics_api.get_analytics_conversations_details_query_results(
                query_id=query_id,
                page_size=page_size,
                page_number=page_number
            )

            if not result.entities:
                print(f"No entities on page {page_number}. Done.")
                break

            all_conversations.extend(result.entities)

            # Paging Logic Explanation:
            # result.total: Total number of records matching the query
            # result.page_count: Total number of pages available given the pageSize
            # result.page_size: The actual page size returned (may be less than requested on last page)
            print(f"Page {page_number}: Fetched {len(result.entities)} records. "
                  f"Total available: {result.total}. "
                  f"Total pages: {result.page_count}.")

            # Stop if we have reached the last page
            if page_number >= result.page_count:
                print("All pages retrieved.")
                break

            page_number += 1
        except ApiException as e:
            print(f"Error fetching page {page_number}: {e.status} - {e.reason}")
            if e.status == 429:
                print("Rate limited. Waiting 5s...")
                time.sleep(5)
                continue
            raise

    # 6. Process Results
    print(f"\nTotal conversations fetched: {len(all_conversations)}")
    if all_conversations:
        print("First conversation ID:", all_conversations[0].conversation.id)
        print("First conversation Media Type:", all_conversations[0].conversation.media_type)


if __name__ == "__main__":
    main()
Common Errors & Debugging
Error: 400 Bad Request - “pageSize must be between 1 and 1000”
Cause: You passed a pageSize larger than 1000 or less than 1. Genesys Cloud Analytics enforces a hard limit on page sizes to prevent memory exhaustion on the server side.
Fix: Ensure your page_size variable is set to 1000 or lower.
# Correct
page_size = 1000
# Incorrect
page_size = 2000 # Will return 400
Error: 400 Bad Request - “pageNumber must be greater than 0”
Cause: You passed a pageNumber of 0 or negative. Genesys Cloud uses 1-based indexing for pagination.
Fix: Initialize your page counter at 1.
# Correct
page_number = 1
# Incorrect
page_number = 0 # Will return 400
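Both 400 errors above can be caught locally before any request is sent. A minimal sketch of such a guard follows; validate_paging is a hypothetical helper, and the limit of 1000 is the one stated in this tutorial rather than a value read from the API.

```python
MAX_PAGE_SIZE = 1000  # limit stated in this tutorial; confirm it for your endpoint


def validate_paging(page_size: int, page_number: int) -> None:
    """Raise ValueError for paging values that the API would reject with a 400."""
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}, got {page_size}")
    if page_number < 1:
        raise ValueError(f"page_number must be greater than 0, got {page_number}")


validate_paging(1000, 1)  # passes silently

try:
    validate_paging(2000, 1)
except ValueError as err:
    print(err)  # page_size must be between 1 and 1000, got 2000
```

Calling the guard once before the fetch loop turns a remote 400 into an immediate, descriptive local error.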
Error: 429 Too Many Requests
Cause: You are polling the query status or fetching results too frequently. The Analytics API has strict rate limits, especially for large queries.
Fix: Implement exponential backoff. Start with a short wait (the polling loop in this tutorial starts at 1 second), double it on each attempt up to a cap, and wait longer whenever you receive a 429.
# Implement backoff
wait_time = min(2 ** attempt, 30)
time.sleep(wait_time)
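If the 429 response carries a Retry-After header, it is usually better to honor that hint than to guess. The sketch below assumes the header value is a number of seconds and that you have the response headers available as a dictionary; how to obtain them from the SDK's exception object is not covered here. retry_delay is a hypothetical helper name.

```python
def retry_delay(headers: dict, attempt: int, cap_seconds: float = 30.0) -> float:
    """Prefer the server's Retry-After hint; otherwise use capped exponential backoff."""
    fallback = float(min(2 ** attempt, cap_seconds))
    for name, value in headers.items():
        if name.lower() == "retry-after":  # header names are case-insensitive
            try:
                return max(0.0, float(value))
            except (TypeError, ValueError):
                return fallback  # e.g. an HTTP-date value, which this sketch does not parse
    return fallback


print(retry_delay({"Retry-After": "7"}, attempt=0))  # 7.0
print(retry_delay({}, attempt=3))                    # 8.0
```

The fallback uses the same min(2 ** attempt, 30) formula as the polling loop, so behavior is unchanged when no hint is present.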
Error: Empty Entities List but Page Count > 0
Cause: This is rare but can happen if the query returns grouped data with no individual details, or if there is a mismatch between the select fields and the actual data availability.
Fix: Check response.total. If total is 0, the query returned no data. If total is greater than 0 but entities is empty, verify your where clause and select fields, and confirm that you are querying the correct date range.
Error: “Query Not Found”
Cause: The query ID you passed does not match a query the API knows about (for example, it was mistyped or was never returned by the POST request), or the query has expired. Analytics queries expire after a certain period (usually 24-48 hours depending on the data volume).
Fix: Ensure you are using the query_id returned from the POST request. If the query is old, re-submit it.