Mastering Analytics API Pagination: Handling pageSize, pageNumber, and Cursor-Based Results
What You Will Build
- A Python script that retrieves historical conversation analytics data from Genesys Cloud CX using the PureCloudPlatformClientV2 SDK.
- A robust pagination loop that respects pageSize, tracks the page count, and handles cursor-based navigation for large datasets.
- A working example that aggregates total conversation volume across multiple pages without hitting rate limits or truncating data.
Prerequisites
- OAuth Client Type: Client Credentials grant (for server-to-server use).
- Required Permissions: analytics:conversationDetail:view, granted through a role assigned to the OAuth client (for historical conversation detail queries). Real-time observation queries need separate permissions and are outside the scope of this tutorial.
- SDK: PureCloudPlatformClientV2, the official Python SDK (use a current release).
- Language/Runtime: Python 3.8+.
- External Dependencies:
  - PureCloudPlatformClientV2 (official SDK)
  - python-dotenv (for secure credential management)
Authentication Setup
Genesys Cloud CX uses OAuth 2.0 for all API access. For server-to-server integrations, the recommended flow is the Client Credentials Grant. This flow requires an OAuth client created with the Client Credentials grant type and assigned roles that carry the appropriate permissions.
First, install the required packages:
pip install PureCloudPlatformClientV2 python-dotenv
Create a .env file in your project root with your credentials:
GENESYS_CLIENT_ID=your_client_id
GENESYS_CLIENT_SECRET=your_client_secret
GENESYS_REGION=us-east-1
The following code initializes the SDK client. The SDK performs the OAuth token exchange and stores the access token on the client, but you must provide the region and credentials.
import os

import PureCloudPlatformClientV2
from dotenv import load_dotenv
from PureCloudPlatformClientV2 import AnalyticsApi
from PureCloudPlatformClientV2.api_client import ApiClient

# Load environment variables from .env
load_dotenv()


def get_authenticated_client() -> ApiClient:
    """
    Initializes and returns an authenticated Genesys Cloud API client
    using the OAuth 2.0 client credentials grant.
    """
    client_id = os.getenv("GENESYS_CLIENT_ID")
    client_secret = os.getenv("GENESYS_CLIENT_SECRET")
    region = os.getenv("GENESYS_REGION", "us-east-1")

    if not client_id or not client_secret:
        raise ValueError("GENESYS_CLIENT_ID and GENESYS_CLIENT_SECRET must be set in environment.")

    # Map the AWS-style region name (us-east-1) to the SDK's region enum
    # (us_east_1) and point the SDK at that region's API host,
    # e.g. https://api.mypurecloud.com for us-east-1.
    try:
        region_host = PureCloudPlatformClientV2.PureCloudRegionHosts[region.replace("-", "_")]
    except KeyError:
        raise ValueError(f"Unknown GENESYS_REGION: {region}")
    # The host must be set before the ApiClient is created.
    PureCloudPlatformClientV2.configuration.host = region_host.get_api_host()

    # Request the token now so that invalid credentials fail fast.
    try:
        client = ApiClient().get_client_credentials_token(client_id, client_secret)
    except Exception as e:
        raise ConnectionError(f"Failed to authenticate with Genesys Cloud: {e}") from e
    return client


client = get_authenticated_client()
analytics_api = AnalyticsApi(client)
Implementation
Step 1: Constructing the Analytics Query
The Genesys Cloud Analytics API does not return raw data in a simple list. It takes a structured request body that defines what data you want, and for the conversation details query that body also carries the paging settings (pageSize and pageNumber). The response contains the matching conversations plus totalHits, the total number of matches.
To query conversation details, we use the SDK method post_analytics_conversations_details_query. This endpoint is powerful but requires a specific payload structure.
from PureCloudPlatformClientV2 import ConversationQuery


def build_query_request(start_date: str, end_date: str) -> ConversationQuery:
    """
    Builds the request body for the conversation details query.

    Args:
        start_date: ISO 8601 start date (e.g., '2023-10-01T00:00:00Z')
        end_date: ISO 8601 end date (e.g., '2023-10-02T00:00:00Z')

    Returns:
        ConversationQuery object
    """
    # SDK models are created empty and populated attribute by attribute.
    query = ConversationQuery()

    # The date range is a single ISO 8601 interval string: "start/end".
    query.interval = f"{start_date}/{end_date}"

    # Sort chronologically so that page boundaries stay stable between calls.
    query.order = "asc"
    query.order_by = "conversationStart"

    # The details query returns raw conversation records. Metrics and
    # group-by clauses belong to the aggregates query, not this one, so
    # per-channel counts are computed client-side from the returned records.
    return query
Critical Note on pageSize:
The Analytics API has a hard limit on the number of records returned per page. For the conversation details query, the maximum pageSize is 100; larger values are rejected. If you request fewer, you increase the number of API calls required, which consumes your rate limit budget. Use the largest pageSize that fits your memory constraints.
Step 2: Executing the First Page and Inspecting Pagination Metadata
For the first API call, set pageNumber to 1 and pageSize to your chosen value in the paging object of the request body. The response reports totalHits, the number of conversations that match the query, so the number of pages is ceil(totalHits / pageSize).
If totalHits is no larger than pageSize, you have all your data. Otherwise, you must iterate.
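To see how page size drives the number of requests, the arithmetic can be sketched as below. The helper name pages_needed is hypothetical and not part of the SDK; it only applies ceiling division to a total record count.

```python
import math


def pages_needed(total_records: int, page_size: int) -> int:
    """Return how many pages (API calls) are needed to read total_records."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_records / page_size)


if __name__ == "__main__":
    # Compare request counts for the same 2,450 records at different page sizes.
    for size in (25, 50, 100):
        print(f"pageSize={size}: {pages_needed(2450, size)} requests")
```

Quadrupling the page size from 25 to 100 cuts the request count roughly by four, which is why the largest permitted page size is usually the right default.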
import logging
import math

from PureCloudPlatformClientV2 import PagingSpec

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def fetch_first_page(analytics_api, query_request, page_size=100):
    """
    Fetches the first page of analytics data.

    Args:
        analytics_api: The initialized AnalyticsApi client.
        query_request: The ConversationQuery object.
        page_size: Number of records per page (max 100 for details).

    Returns:
        The first page response object.
    """
    # Paging is part of the request body, not a separate method argument.
    paging = PagingSpec()
    paging.page_size = page_size
    paging.page_number = 1  # Pages are 1-based
    query_request.paging = paging

    try:
        # The SDK method maps to POST /api/v2/analytics/conversations/details/query
        response = analytics_api.post_analytics_conversations_details_query(query_request)
        total_hits = response.total_hits or 0
        total_pages = math.ceil(total_hits / page_size)
        logger.info(f"First page retrieved. Total hits: {total_hits}, total pages: {total_pages}")
        logger.info(f"Records in this page: {len(response.conversations) if response.conversations else 0}")
        return response
    except Exception as e:
        logger.error(f"Error fetching first page: {e}")
        raise
Understanding the Response Object:
The response object returned by the SDK is an AnalyticsConversationQueryResponse. Key attributes include:
- conversations: A list of conversation records (the actual analytics data).
- total_hits: The total number of conversations matching the query across all pages.
- aggregations: Aggregation results, present only when the query requests them.
There is no page_count attribute; compute it as ceil(total_hits / page_size).
Step 3: Implementing the Pagination Loop
This is where most developers encounter errors. Some analytics endpoints, such as the asynchronous conversation details jobs, use cursor-based pagination, where each response hands you a token for the next page. The synchronous conversation details query instead uses page-number pagination: you increment pageNumber in the request body until the results run out.
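For contrast, cursor-based pagination follows a token returned with each page instead of counting pages. The sketch below is generic: fetch_page is a hypothetical callable standing in for whichever endpoint you use, and the "results" and "cursor" keys are assumptions about the response shape, so adapt them to the real payload.

```python
from typing import Callable, Iterator, Optional


def iter_cursor_results(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every record by following the cursor until none is returned."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("results", [])
        cursor = page.get("cursor")
        if not cursor:  # No cursor means this was the last page
            break


# In-memory stand-in for a real endpoint: three pages chained by cursor tokens.
_FAKE_PAGES = {
    None: {"results": [{"id": 1}, {"id": 2}], "cursor": "abc"},
    "abc": {"results": [{"id": 3}], "cursor": "def"},
    "def": {"results": [{"id": 4}]},
}


def fake_fetch(cursor: Optional[str]) -> dict:
    """Return the fake page that the given cursor points to."""
    return _FAKE_PAGES[cursor]


if __name__ == "__main__":
    print([record["id"] for record in iter_cursor_results(fake_fetch)])
```

The loop never needs a page count or a total: the absence of a cursor is the only stop signal.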
With page-number pagination, you must handle the following edge cases:
- Rate Limiting (429): If you fetch pages too quickly, Genesys will reject your requests. You must implement exponential backoff.
- Empty Pages: If a page comes back with no records, stop iterating even if the reported total suggests more.
- Max Pages: Ensure you do not request pages beyond the page count.
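The backoff rule from the first bullet can be isolated and tested on its own. This is a minimal sketch, not SDK code: RateLimited is a hypothetical stand-in for whatever exception your client raises on HTTP 429, and call_with_backoff is an illustrative helper name. The sleep function is injectable so the logic can be checked without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for an SDK error carrying HTTP status 429."""


def call_with_backoff(
    call: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on RateLimited with delays of 1s, 2s, 4s, ..."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # Out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


if __name__ == "__main__":
    attempts = {"count": 0}
    recorded_delays = []

    def flaky() -> str:
        # Fails twice with a simulated 429, then succeeds.
        attempts["count"] += 1
        if attempts["count"] <= 2:
            raise RateLimited()
        return "ok"

    # Record the delays instead of sleeping so the demo finishes instantly.
    print(call_with_backoff(flaky, sleep=recorded_delays.append), recorded_delays)
```

Note that the delay grows with each consecutive failure and the retry budget applies per call, so one slow page does not use up the retries of later pages.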
Here is the robust pagination logic:
import math
import time

from PureCloudPlatformClientV2 import PagingSpec
from PureCloudPlatformClientV2.rest import ApiException


def fetch_all_pages(analytics_api, query_request, page_size=100, max_retries=5):
    """
    Iterates through all pages of analytics data.

    Args:
        analytics_api: The initialized AnalyticsApi client.
        query_request: The ConversationQuery object.
        page_size: Number of records per page.
        max_retries: Maximum consecutive retries per page when rate limited.

    Returns:
        A list of all conversations from all pages.
    """
    all_conversations = []
    current_page = 1
    total_pages = None
    retries = 0  # Consecutive 429 responses for the current page

    while True:
        # Paging travels in the request body
        paging = PagingSpec()
        paging.page_size = page_size
        paging.page_number = current_page
        query_request.paging = paging

        try:
            logger.info(f"Fetching page {current_page}...")
            response = analytics_api.post_analytics_conversations_details_query(query_request)
        except ApiException as e:
            # Handle rate limiting (429) with exponential backoff: 1s, 2s, 4s, ...
            if e.status == 429 and retries < max_retries:
                wait_time = 2 ** retries
                logger.warning(f"Rate limited (429). Waiting {wait_time} seconds before retrying...")
                time.sleep(wait_time)
                retries += 1
                continue  # Retry the same page
            if e.status == 429:
                logger.error("Max retries exceeded for rate limiting.")
            else:
                # Other errors (4xx, 5xx) are not retried
                logger.error(f"Unexpected error on page {current_page}: {e}")
            raise
        retries = 0  # The call succeeded, so reset the backoff

        # Derive the page count from totalHits on the first response
        if total_pages is None:
            total_pages = math.ceil((response.total_hits or 0) / page_size)
            logger.info(f"Total pages to fetch: {total_pages}")

        # Append data from this page; an empty page ends the loop
        if not response.conversations:
            logger.warning(f"Page {current_page} returned no conversations. Stopping.")
            break
        all_conversations.extend(response.conversations)
        logger.info(f"Collected {len(all_conversations)} records so far.")

        # Check if we have fetched all pages
        if current_page >= total_pages:
            logger.info("All pages fetched successfully.")
            break

        # Move to the next page
        current_page += 1
        # Small delay to be polite to the API and avoid burst rate limits.
        # If fetching many pages, this delay is crucial.
        time.sleep(0.5)

    return all_conversations
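The stop conditions in this loop can be exercised without network access. Below is a minimal sketch, assuming a hypothetical fetch_page(page_number, page_size) callable that returns a (records, total) tuple; neither it nor iter_records is an SDK function, and the in-memory fake stands in for the API.

```python
import math
from typing import Callable, Iterator, List, Tuple

FetchPage = Callable[[int, int], Tuple[List[dict], int]]


def iter_records(fetch_page: FetchPage, page_size: int = 100) -> Iterator[dict]:
    """Yield records page by page until the last page or an empty page."""
    page_number = 1
    while True:
        records, total = fetch_page(page_number, page_size)
        if not records:  # Empty page: stop even if the total suggests more
            return
        yield from records
        if page_number >= math.ceil(total / page_size):
            return  # That was the last page
        page_number += 1


def make_fake_fetch(dataset: List[dict]) -> FetchPage:
    """Build an in-memory stand-in for the API that slices a list into pages."""
    def fetch(page_number: int, page_size: int) -> Tuple[List[dict], int]:
        start = (page_number - 1) * page_size
        return dataset[start:start + page_size], len(dataset)
    return fetch


if __name__ == "__main__":
    data = [{"id": i} for i in range(250)]
    fetched = list(iter_records(make_fake_fetch(data), page_size=100))
    print(len(fetched))
```

Writing the paginator as a generator lets the caller process records as they arrive instead of holding every page in memory at once.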
Complete Working Example
The following script combines authentication, query building, and pagination into a single runnable module. It counts conversations per media type (voice, chat, email, and so on) for a given date range.
import logging
import math
import os
import sys
import time
from collections import Counter
from datetime import datetime, timedelta, timezone

import PureCloudPlatformClientV2
from dotenv import load_dotenv
from PureCloudPlatformClientV2 import AnalyticsApi, ConversationQuery, PagingSpec
from PureCloudPlatformClientV2.api_client import ApiClient

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

PAGE_SIZE = 100  # Max allowed for this endpoint


def load_credentials():
    load_dotenv()
    return {
        "client_id": os.getenv("GENESYS_CLIENT_ID"),
        "client_secret": os.getenv("GENESYS_CLIENT_SECRET"),
        "region": os.getenv("GENESYS_REGION", "us-east-1")
    }


def create_client(credentials):
    # Map "us-east-1" to the SDK enum member us_east_1 and set the API host.
    # An unknown region name raises KeyError here.
    region = PureCloudPlatformClientV2.PureCloudRegionHosts[credentials["region"].replace("-", "_")]
    PureCloudPlatformClientV2.configuration.host = region.get_api_host()
    # Request the token up front so that bad credentials fail fast
    try:
        return ApiClient().get_client_credentials_token(
            credentials["client_id"], credentials["client_secret"]
        )
    except Exception as e:
        logger.error(f"Authentication failed: {e}")
        sys.exit(1)


def get_analytics_data(start_date_iso, end_date_iso):
    credentials = load_credentials()
    if not credentials["client_id"] or not credentials["client_secret"]:
        logger.error("Missing GENESYS_CLIENT_ID or GENESYS_CLIENT_SECRET")
        sys.exit(1)
    client = create_client(credentials)
    analytics_api = AnalyticsApi(client)

    # Define the query: an ISO 8601 interval plus a stable sort order
    query = ConversationQuery()
    query.interval = f"{start_date_iso}/{end_date_iso}"
    query.order = "asc"
    query.order_by = "conversationStart"

    try:
        logger.info("Starting pagination fetch...")
        all_data = []
        current_page = 1
        total_pages = None
        while True:
            # Paging travels in the request body
            paging = PagingSpec()
            paging.page_size = PAGE_SIZE
            paging.page_number = current_page
            query.paging = paging
            response = analytics_api.post_analytics_conversations_details_query(query)
            if total_pages is None:
                total_hits = response.total_hits or 0
                total_pages = math.ceil(total_hits / PAGE_SIZE)
                logger.info(f"Pagination metadata: Total Pages={total_pages}, Total Records={total_hits}")
            if not response.conversations:
                break  # Empty page: nothing more to read
            all_data.extend(response.conversations)
            logger.info(f"Processed Page {current_page}/{total_pages}")
            if current_page >= total_pages:
                break
            current_page += 1
            time.sleep(0.5)  # Rate limit protection
        return all_data
    except Exception as e:
        logger.error(f"Failed to fetch analytics data: {e}")
        raise


def main():
    # Set date range: Last 7 days
    end_date = datetime.now(timezone.utc)
    start_date = end_date - timedelta(days=7)
    start_iso = start_date.strftime("%Y-%m-%dT%H:%M:%SZ")
    end_iso = end_date.strftime("%Y-%m-%dT%H:%M:%SZ")
    logger.info(f"Fetching data from {start_iso} to {end_iso}")
    try:
        data = get_analytics_data(start_iso, end_iso)
        if not data:
            logger.info("No data found for the specified period.")
            return
        # Count each conversation once per media type (voice, chat, email, ...).
        # The media type lives on the sessions of each participant.
        channel_counts = Counter()
        for conversation in data:
            media_types = {
                session.media_type
                for participant in (conversation.participants or [])
                for session in (participant.sessions or [])
                if session.media_type
            }
            channel_counts.update(media_types)
        logger.info("=== Final Aggregated Results ===")
        for channel, count in channel_counts.items():
            logger.info(f"Channel: {channel}, Total Conversations: {count}")
    except Exception as e:
        logger.error(f"Application error: {e}")
        sys.exit(1)


if __name__ == "__main__":
    main()
Common Errors & Debugging
Error: 429 Too Many Requests
What causes it:
The Genesys Cloud API enforces rate limits per OAuth client. The default is 300 requests per minute per client credentials token, though limits can differ between organizations, and a 429 response includes a Retry-After header that tells you how long to wait. If your pagination loop runs too fast (e.g., fetching hundreds of pages in a few seconds), you will hit this limit.
How to fix it:
- Implement time.sleep() between API calls. A delay of 0.5 to 1.0 seconds is usually sufficient for pagination.
- Increase pageSize to the maximum allowed (100) to reduce the total number of API calls.
- If you are still hitting limits, implement exponential backoff in your exception handler, as shown in Step 3.
Code Fix:
# Inside the pagination loop
if current_page >= total_pages:
    break
current_page += 1
time.sleep(1.0)  # Explicit delay
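A fixed sleep works, but the delay can also be derived from the rate limit itself. The class below is a sketch under stated assumptions: Throttle is a hypothetical helper, not an SDK feature, and the 120-per-minute figure in the demo is purely illustrative. The clock and sleep functions are injectable so the pacing logic can be tested without waiting.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Space out calls so they never exceed max_per_minute."""

    def __init__(
        self,
        max_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_per_minute <= 0:
            raise ValueError("max_per_minute must be positive")
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


if __name__ == "__main__":
    throttle = Throttle(max_per_minute=120)  # Illustrative limit: one call per 0.5 s
    for page in range(1, 4):
        throttle.wait()  # Call this right before each API request
        print(f"would fetch page {page}")
```

Unlike an unconditional sleep, this only waits for the time still owed, so slow responses do not add extra delay on top of their own latency.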
Error: 400 Bad Request - Invalid Page Number
What causes it:
The paging values in the request are invalid: for example, a pageSize above the endpoint maximum, a pageNumber below 1, or a hardcoded page number far beyond the available results. Because new conversations can arrive while you paginate, a page count computed once can also drift out of date.
How to fix it:
Always derive the page count from totalHits in the first response and use it as your loop boundary, and stop as soon as a page comes back empty. Do not assume the number of pages.
Code Fix:
# Ensure you check the boundary
total_pages = math.ceil((response.total_hits or 0) / page_size)
if current_page > total_pages:
    logger.warning(f"Page {current_page} exceeds total pages {total_pages}. Stopping.")
    break
Error: 403 Forbidden - Insufficient Scopes
What causes it:
The OAuth client or user behind the token lacks the analytics:conversationDetail:view permission. With the client credentials grant, permissions come from the roles assigned to the OAuth client in the Genesys Cloud Admin console; with user-based grants, they come from the user's roles and the scopes granted to the client.
How to fix it:
- Verify that the OAuth client or user has a role that includes the required “Analytics” permission.
- For user-based grants, check the scopes configured on the OAuth client.
- Request a new token after changing roles or scopes.
Error: 504 Gateway Timeout
What causes it:
Analytics queries are computationally expensive. If your date range is too large (e.g., 1 year) or your filters match too many conversations, the backend may take longer than the API gateway timeout (usually 30-60 seconds) to produce the results.
How to fix it:
- Reduce the date range. Query in smaller chunks (e.g., 1 week at a time).
- Narrow the query with filters (e.g., a single queue or media type) so that fewer conversations match.
- For large exports, use the asynchronous conversation details jobs endpoints (POST /api/v2/analytics/conversations/details/jobs), which run the query in the background and return results page by page with a cursor.
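Splitting a long date range into week-sized chunks can be sketched as follows. The helper names split_range and to_interval are hypothetical; to_interval formats a chunk as an ISO 8601 start/end interval string, so adapt it if your request body takes separate start and end fields.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple


def split_range(
    start: datetime, end: datetime, days: int = 7
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (chunk_start, chunk_end) pairs covering [start, end)."""
    if days <= 0:
        raise ValueError("days must be positive")
    step = timedelta(days=days)
    chunk_start = start
    while chunk_start < end:
        chunk_end = min(chunk_start + step, end)  # The last chunk may be shorter
        yield chunk_start, chunk_end
        chunk_start = chunk_end


def to_interval(start: datetime, end: datetime) -> str:
    """Format a chunk as an ISO 8601 interval string: 'start/end'."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{start.strftime(fmt)}/{end.strftime(fmt)}"


if __name__ == "__main__":
    range_start = datetime(2023, 10, 1, tzinfo=timezone.utc)
    range_end = datetime(2023, 10, 20, tzinfo=timezone.utc)
    for chunk_start, chunk_end in split_range(range_start, range_end):
        print(to_interval(chunk_start, chunk_end))
```

Each chunk ends exactly where the next begins, so running one paginated query per chunk covers the full range without gaps.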