Bulk-Create Users from CSV Using the Genesys Cloud Platform SDK for Python

What You Will Build

  • A Python script that parses a CSV file containing user details and creates corresponding users in Genesys Cloud via the Platform SDK.
  • This tutorial uses the Genesys Cloud Platform API (REST) wrapped by the genesyscloud Python SDK.
  • The implementation is written in Python 3.8+ using pandas for CSV handling and genesyscloud for API interactions.

Prerequisites

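  • OAuth Client Type: Client Credentials grant (recommended for unattended bulk operations), or a user-based grant such as Authorization Code if the script must act on behalf of a specific user.
  • Required Permissions: the ability to add and view users. With a Client Credentials client, access comes from the roles assigned to the OAuth client (for example, a role that includes directory:user:add) rather than from scopes.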
  • SDK Version: genesyscloud >= 1.0.0 (specifically genesyscloud.users module).
  • Runtime: Python 3.8 or higher.
  • Dependencies:
    • genesyscloud (Install via pip install genesyscloud)
    • pandas (Install via pip install pandas)
    • python-dotenv (Install via pip install python-dotenv for secure credential management)

Authentication Setup

The Genesys Cloud Python SDK handles OAuth token acquisition and refresh automatically when configured correctly. You must initialize the PlatformClient with your environment and credentials.

For production scripts, store GENESYS_CLIENT_ID, GENESYS_CLIENT_SECRET, and GENESYS_ENVIRONMENT (e.g., us-east-1, eu-west-1) in a .env file that is excluded from version control. Do not hardcode these values.

import os
from dotenv import load_dotenv
from genesyscloud.platform.client import PlatformClient

# Load environment variables from .env file
load_dotenv()

def get_platform_client():
    """
    Initializes and returns a configured PlatformClient instance.
    """
    client_id = os.getenv("GENESYS_CLIENT_ID")
    client_secret = os.getenv("GENESYS_CLIENT_SECRET")
    environment = os.getenv("GENESYS_ENVIRONMENT", "us-east-1")

    if not client_id or not client_secret:
        raise ValueError("GENESYS_CLIENT_ID and GENESYS_CLIENT_SECRET must be set in environment variables.")

    # Create the client instance
    client = PlatformClient(
        client_id=client_id,
        client_secret=client_secret,
        environment=environment
    )

    # No connectivity check is performed here. The SDK fetches tokens lazily
    # on the first API call, so invalid credentials surface on the first request.

    return client

Implementation

Step 1: Define the CSV Structure and Validation

Before interacting with the API, define the columns you expect in the CSV. This tutorial treats first_name, last_name, and email as required, and phone_number and division_id as optional. It assumes standard email/password users rather than users provisioned through an identity provider.

Expected CSV Format (users.csv):

first_name,last_name,email,phone_number,division_id
John,Doe,john.doe@example.com,+15550100100,global
Jane,Smith,jane.smith@example.com,+15550100200,global

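Note: division_id is optional. If it is omitted, the user is created in the organization's default division. The value global in the sample rows is a placeholder: if you supply a value, make sure it identifies a division that exists in your organization.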

Create a helper function to validate rows before sending them to the API. Rows with missing data are rejected locally, which avoids wasted API calls and gives clear, row-level error reporting.

import pandas as pd
from typing import List, Dict, Any

def validate_csv_row(row: pd.Series) -> Dict[str, Any]:
    """
    Validates a single row from the DataFrame and returns a dictionary 
    suitable for the Genesys Cloud User creation API.
    
    Raises ValueError if required fields are missing or malformed.
    """
    required_fields = ['first_name', 'last_name', 'email']
    
    # Check for missing required fields
    for field in required_fields:
        if pd.isna(row[field]) or str(row[field]).strip() == "":
            raise ValueError(f"Missing required field: {field} in row {row.name}")

    # Construct the user data dictionary
    user_data = {
        "first_name": str(row['first_name']).strip(),
        "last_name": str(row['last_name']).strip(),
        "email": str(row['email']).strip().lower(),
        "phone_numbers": []
    }

    # Handle optional phone number
    if not pd.isna(row.get('phone_number')) and str(row['phone_number']).strip() != "":
        user_data["phone_numbers"].append({
            "phone_number": str(row['phone_number']).strip(),
            "phone_type": "work"
        })

    # Handle optional division ID
    if not pd.isna(row.get('division_id')) and str(row['division_id']).strip() != "":
        user_data["division_id"] = str(row['division_id']).strip()

    return user_data
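
The row validator assumes every expected column is present, and it receives whatever types pandas inferred when reading the file. As a minimal sketch (the helper name load_user_rows and the REQUIRED_COLUMNS constant are illustrative and not part of the tutorial's script), the file can be read with every column as text, so that a leading + in a phone number is preserved, and the header can be checked once before any row is processed:

```python
import io
import pandas as pd

# Columns the row validator reads unconditionally
# (assumption: mirrors required_fields in validate_csv_row).
REQUIRED_COLUMNS = ("first_name", "last_name", "email")

def load_user_rows(source) -> pd.DataFrame:
    """Read the CSV as text and fail early if a required column is absent."""
    # dtype=str stops pandas from inferring numeric types, which could
    # otherwise turn a value such as "+15550100100" into a number.
    df = pd.read_csv(source, dtype=str)
    # Tolerate stray whitespace around header names.
    df.columns = [str(c).strip() for c in df.columns]
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"CSV is missing required column(s): {', '.join(missing)}")
    return df

if __name__ == "__main__":
    sample = io.StringIO(
        "first_name,last_name,email,phone_number\n"
        "John,Doe,john.doe@example.com,+15550100100\n"
    )
    print(load_user_rows(sample).loc[0, "phone_number"])
```

Checking the header once keeps a missing column from surfacing as a separate failure on every row.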

Step 2: Core Logic - Batch Processing and API Calls

Genesys Cloud APIs are rate-limited. Because the SDK may not retry 429 responses automatically in every version, the script must handle them itself. Exponential backoff (or the SDK's built-in retry configuration, if available) is the more robust approach. To keep this tutorial simple, we wait a fixed delay before retrying a rate-limited request and log every other failure so that the batch can continue.

We will iterate through the CSV, validate each row, and call the SDK's post_users method for each valid row.

import time
import logging
from genesyscloud.users import UserApi

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

def create_user_from_data(client: PlatformClient, user_data: Dict[str, Any], row_index: int,
                          max_retries: int = 3) -> "str | None":
    """
    Attempts to create a user in Genesys Cloud.
    
    Args:
        client: The initialized PlatformClient.
        user_data: Dictionary containing user details.
        row_index: Index of the row in the CSV for logging purposes.
        max_retries: Maximum number of retries after a 429 response.
        
    Returns:
        The ID of the created user, or None if creation failed.
    """
    # The return annotation is quoted so that it also parses on Python 3.8 and 3.9.
    user_api = UserApi(client)
    
    for attempt in range(max_retries + 1):
        try:
            # The SDK accepts a dictionary or a User object.
            # A dict is convenient for dynamic data.
            created_user = user_api.post_users(body=user_data)
            
            logger.info(f"Successfully created user: {user_data['email']} (ID: {created_user.id})")
            return created_user.id
            
        except Exception as e:
            # HTTP errors carry a status code; anything else is treated as unknown
            status_code = getattr(e, 'status_code', None)
            
            if status_code == 429 and attempt < max_retries:
                # Too Many Requests: wait, then try again (bounded by max_retries)
                logger.warning(f"Rate limited at row {row_index}. Retrying in 5 seconds (retry {attempt + 1} of {max_retries})...")
                time.sleep(5)
                continue
            
            if status_code == 409:
                # Conflict: User with this email likely already exists
                logger.warning(f"User already exists or conflict: {user_data['email']} (Row {row_index}). Skipping.")
            elif status_code == 400:
                # Bad Request: Validation error in the payload
                logger.error(f"Validation error for {user_data['email']} (Row {row_index}): {getattr(e, 'body', e)}")
            elif status_code == 429:
                logger.error(f"Still rate limited after {max_retries} retries for {user_data['email']} (Row {row_index}). Giving up.")
            elif status_code is not None:
                logger.error(f"Unexpected error for {user_data['email']} (Row {row_index}): {e}")
            else:
                logger.error(f"Unknown error for {user_data['email']} (Row {row_index}): {e}")
            return None
    
    return None

def process_csv_bulk_create(csv_path: str, client: PlatformClient):
    """
    Reads the CSV file, validates rows, and creates users in Genesys Cloud.
    """
    try:
        # Read every column as text so values such as "+15550100100" keep their leading "+"
        df = pd.read_csv(csv_path, dtype=str)
    except FileNotFoundError:
        logger.error(f"CSV file not found: {csv_path}")
        return
    except Exception as e:
        logger.error(f"Error reading CSV: {e}")
        return

    logger.info(f"Loaded {len(df)} rows from {csv_path}")
    
    success_count = 0
    fail_count = 0
    
    for index, row in df.iterrows():
        try:
            # Step 1: Validate and transform data
            user_data = validate_csv_row(row)
            
            # Step 2: Create user via API
            user_id = create_user_from_data(client, user_data, index)
            
            if user_id:
                success_count += 1
            else:
                fail_count += 1
                
            # Optional: pause after every 10 rows to be polite to the API
            # This helps avoid hitting the 429 limit aggressively
            if (index + 1) % 10 == 0:
                logger.info(f"Processed {index + 1} rows. Pausing briefly...")
                time.sleep(2)

        except ValueError as ve:
            logger.error(f"Validation error at row {index}: {ve}")
            fail_count += 1
        except Exception as e:
            logger.error(f"Unexpected error processing row {index}: {e}")
            fail_count += 1

    logger.info(f"Batch complete. Success: {success_count}, Failed: {fail_count}")
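
Pausing after a fixed number of rows still sends requests in bursts. An alternative is to enforce a minimum interval between consecutive calls. The following is a minimal sketch of that idea: the class name RequestPacer is hypothetical, and any interval you choose (such as 0.25 seconds) is an illustrative value, not a figure taken from Genesys Cloud's published limits.

```python
import time

class RequestPacer:
    """Blocks so that consecutive wait() calls are at least min_interval seconds apart."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._last = None        # time of the previous call; None before the first

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        # Record the moment this call was allowed to proceed.
        self._last = now + delay
        return delay
```

To use it, create one pacer before the loop, for example pacer = RequestPacer(0.25), and call pacer.wait() immediately before each create_user_from_data call.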

Step 3: Processing Results and Error Handling

The previous step includes basic error handling. However, for a robust solution, you should capture failed rows and write them to a separate “errors.csv” file. This allows for reprocessing without re-uploading successful users.

Modify the process_csv_bulk_create function to accumulate errors.

def process_csv_bulk_create_with_error_log(csv_path: str, client: PlatformClient, error_log_path: str = "errors.csv"):
    """
    Enhanced version that logs failed rows to a separate CSV file.
    """
    try:
        # Read every column as text so values such as "+15550100100" keep their leading "+"
        df = pd.read_csv(csv_path, dtype=str)
    except Exception as e:
        logger.error(f"Error reading CSV: {e}")
        return

    errors_list = []
    success_count = 0
    fail_count = 0
    
    for index, row in df.iterrows():
        row_dict = row.to_dict()
        try:
            user_data = validate_csv_row(row)
            user_id = create_user_from_data(client, user_data, index)
            
            if user_id:
                success_count += 1
            else:
                fail_count += 1
                # create_user_from_data returns None for every API failure,
                # so the specific cause (409, 400, 429, ...) is only in the log output
                row_dict['error_reason'] = "API request failed (see log for details)"
                errors_list.append(row_dict)
                
        except ValueError as ve:
            fail_count += 1
            row_dict['error_reason'] = f"Validation: {str(ve)}"
            errors_list.append(row_dict)
        except Exception as e:
            fail_count += 1
            row_dict['error_reason'] = f"Unexpected: {str(e)}"
            errors_list.append(row_dict)

    # Write errors to CSV if any
    if errors_list:
        error_df = pd.DataFrame(errors_list)
        error_df.to_csv(error_log_path, index=False)
        logger.info(f"Errors logged to {error_log_path}")
    else:
        logger.info("No errors encountered.")

    logger.info(f"Final Results - Success: {success_count}, Failed: {fail_count}")
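
Because errors.csv keeps the original columns and adds error_reason, it can be turned back into an input file for a second run. The following is a minimal sketch, assuming the column layout written by the function above. The helper name build_retry_csv, and the decision to leave out rows that failed local validation (they need a manual fix first), are illustrative choices.

```python
import pandas as pd

def build_retry_csv(error_log_path: str, retry_path: str) -> int:
    """Write the rows worth retrying to retry_path and return how many were written."""
    errors = pd.read_csv(error_log_path, dtype=str)
    # Rows rejected by local validation will fail again unchanged, so skip them.
    # "Validation:" is the prefix the processing function writes for those rows.
    is_validation = errors["error_reason"].fillna("").str.startswith("Validation:")
    retry = errors.loc[~is_validation].drop(columns=["error_reason"])
    retry.to_csv(retry_path, index=False)
    return len(retry)
```

The resulting file has the same columns as the original input, so it can be passed to the processing function as csv_path.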

Complete Working Example

Below is the full, copy-pasteable script. Save this as bulk_create_users.py.

Prerequisites:

  1. Create a .env file in the same directory with:
    GENESYS_CLIENT_ID=your_client_id
    GENESYS_CLIENT_SECRET=your_client_secret
    GENESYS_ENVIRONMENT=us-east-1
    
  2. Create a users.csv file with the structure defined in Step 1.
  3. Install dependencies: pip install genesyscloud pandas python-dotenv

import os
import time
import logging
import pandas as pd
from typing import Dict, Any, List
from dotenv import load_dotenv
from genesyscloud.platform.client import PlatformClient
from genesyscloud.users import UserApi

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler("bulk_create_users.log"),
        logging.StreamHandler()
    ]
)
logger = logging.getLogger(__name__)

def get_platform_client() -> PlatformClient:
    """Initializes and returns a configured PlatformClient instance."""
    load_dotenv()
    
    client_id = os.getenv("GENESYS_CLIENT_ID")
    client_secret = os.getenv("GENESYS_CLIENT_SECRET")
    environment = os.getenv("GENESYS_ENVIRONMENT", "us-east-1")

    if not client_id or not client_secret:
        raise ValueError("GENESYS_CLIENT_ID and GENESYS_CLIENT_SECRET must be set in environment variables.")

    try:
        client = PlatformClient(
            client_id=client_id,
            client_secret=client_secret,
            environment=environment
        )
        return client
    except Exception as e:
        logger.error(f"Failed to initialize PlatformClient: {e}")
        raise

def validate_csv_row(row: pd.Series) -> Dict[str, Any]:
    """
    Validates a single row from the DataFrame and returns a dictionary 
    suitable for the Genesys Cloud User creation API.
    """
    required_fields = ['first_name', 'last_name', 'email']
    
    for field in required_fields:
        if pd.isna(row[field]) or str(row[field]).strip() == "":
            raise ValueError(f"Missing required field: {field}")

    user_data = {
        "first_name": str(row['first_name']).strip(),
        "last_name": str(row['last_name']).strip(),
        "email": str(row['email']).strip().lower(),
        "phone_numbers": []
    }

    if not pd.isna(row.get('phone_number')) and str(row['phone_number']).strip() != "":
        user_data["phone_numbers"].append({
            "phone_number": str(row['phone_number']).strip(),
            "phone_type": "work"
        })

    if not pd.isna(row.get('division_id')) and str(row['division_id']).strip() != "":
        user_data["division_id"] = str(row['division_id']).strip()

    return user_data

def create_user_from_data(client: PlatformClient, user_data: Dict[str, Any], row_index: int,
                          max_retries: int = 3) -> "str | None":
    """
    Attempts to create a user in Genesys Cloud, retrying up to max_retries times on 429s.
    """
    # The return annotation is quoted so that it also parses on Python 3.8 and 3.9.
    user_api = UserApi(client)
    
    for attempt in range(max_retries + 1):
        try:
            # Post the user creation request
            created_user = user_api.post_users(body=user_data)
            
            logger.info(f"Created user: {user_data['email']} (ID: {created_user.id})")
            return created_user.id
            
        except Exception as e:
            # HTTP errors carry a status code; anything else is treated as unknown
            status_code = getattr(e, 'status_code', None)
            
            if status_code == 429 and attempt < max_retries:
                logger.warning(f"Rate Limited (429) at row {row_index}. Waiting 5 seconds before retry {attempt + 1} of {max_retries}...")
                time.sleep(5)
                continue
            
            if status_code == 409:
                logger.warning(f"Conflict (409): User {user_data['email']} likely exists. Skipping.")
            elif status_code == 400:
                logger.error(f"Bad Request (400) for {user_data['email']}: {getattr(e, 'body', e)}")
            elif status_code == 429:
                logger.error(f"Rate Limited (429) for {user_data['email']} after {max_retries} retries. Giving up.")
            elif status_code is not None:
                logger.error(f"HTTP Error {status_code} for {user_data['email']}: {e}")
            else:
                logger.error(f"Unknown error for {user_data['email']}: {e}")
            return None
    
    return None

def process_bulk_user_creation(csv_path: str, error_log_path: str = "errors.csv"):
    """
    Main function to read CSV and create users.
    """
    logger.info(f"Starting bulk user creation from {csv_path}")
    
    try:
        client = get_platform_client()
    except Exception:
        logger.error("Could not initialize client. Exiting.")
        return

    try:
        # Read every column as text so values such as "+15550100100" keep their leading "+"
        df = pd.read_csv(csv_path, dtype=str)
    except FileNotFoundError:
        logger.error(f"File not found: {csv_path}")
        return
    except Exception as e:
        logger.error(f"Error reading CSV: {e}")
        return

    logger.info(f"Loaded {len(df)} rows.")
    
    errors_list = []
    success_count = 0
    fail_count = 0
    
    for index, row in df.iterrows():
        row_dict = row.to_dict()
        try:
            user_data = validate_csv_row(row)
            user_id = create_user_from_data(client, user_data, index)
            
            if user_id:
                success_count += 1
            else:
                fail_count += 1
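                # Every API failure returns None, so the specific cause is only in the log
                error_reason = "API request failed (see bulk_create_users.log for details)"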
                row_dict['error_reason'] = error_reason
                errors_list.append(row_dict)
                
        except ValueError as ve:
            fail_count += 1
            row_dict['error_reason'] = f"Validation: {str(ve)}"
            errors_list.append(row_dict)
        except Exception as e:
            fail_count += 1
            row_dict['error_reason'] = f"Unexpected: {str(e)}"
            errors_list.append(row_dict)

        # Throttle requests to avoid aggressive rate limiting
        if (index + 1) % 5 == 0:
            time.sleep(1)

    if errors_list:
        error_df = pd.DataFrame(errors_list)
        error_df.to_csv(error_log_path, index=False)
        logger.info(f"Errors saved to {error_log_path}")
    else:
        logger.info("No errors occurred.")

    logger.info(f"Completed. Success: {success_count}, Failed: {fail_count}")

if __name__ == "__main__":
    # Default file paths
    INPUT_CSV = "users.csv"
    ERROR_LOG = "errors.csv"
    
    # Allow overriding via command line args if needed
    import sys
    if len(sys.argv) > 1:
        INPUT_CSV = sys.argv[1]
    if len(sys.argv) > 2:
        ERROR_LOG = sys.argv[2]
        
    process_bulk_user_creation(INPUT_CSV, ERROR_LOG)

Common Errors & Debugging

Error: 409 Conflict

  • What causes it: The email address provided in the CSV already exists in the Genesys Cloud organization. Email addresses must be unique.
  • How to fix it: The script logs a warning for each 409, records the row in errors.csv, and moves on to the next row. If you need to update existing users instead of skipping them, first look up the user by email to retrieve their ID, then update that user through the Users API (PATCH /api/v2/users/{userId}).
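
A conflict can also originate in the file itself when the same address appears on two rows: the first row creates the user and the second one conflicts. The following is a minimal sketch for catching this before any API call, assuming the email column from Step 1. The helper name split_duplicate_emails is illustrative.

```python
import pandas as pd

def split_duplicate_emails(df: pd.DataFrame):
    """Return (unique_rows, duplicate_rows), comparing emails case-insensitively."""
    # Normalize the same way the row validator does: trim and lowercase.
    normalized = df["email"].astype(str).str.strip().str.lower()
    # keep="first" flags every repeat after the first occurrence.
    is_repeat = normalized.duplicated(keep="first")
    return df.loc[~is_repeat], df.loc[is_repeat]
```

The duplicate rows can be written to errors.csv with their own reason instead of being sent to the API.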

Error: 400 Bad Request

  • What causes it: The payload sent to the API does not meet the schema requirements. Common causes include:
    • Missing first_name or last_name.
    • Invalid email format.
    • Invalid phone number format (must include country code, e.g., +1...).
    • Invalid division_id.
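  • How to fix it: Check the log output. For a 400 response the script logs the API response body, which usually contains a detailed message indicating which field failed validation. The errors.csv file shows which rows failed but does not include that message. Correct the affected rows in the CSV and run them again.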
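
Some of these causes can be caught locally before a request is sent. The sketch below uses deliberately loose patterns: the email check only requires a single @ with a dot in the domain part, and the phone check assumes E.164-style numbers (a + followed by up to 15 digits, not starting with 0). Both patterns and both function names are assumptions made for illustration. The API's own validation remains authoritative.

```python
import re

# Loose structural checks only; they do not prove an address or number is real.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")

def looks_like_email(value: str) -> bool:
    """True if value has the shape local@domain.tld with no spaces."""
    return bool(EMAIL_PATTERN.match(value.strip()))

def looks_like_e164(value: str) -> bool:
    """True if value is '+' followed by 2 to 15 digits with no separators."""
    return bool(E164_PATTERN.match(value.strip()))
```

These checks could be called from validate_csv_row, raising ValueError so that the row is recorded as a validation failure.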

Error: 403 Forbidden

  • What causes it: The OAuth token does not have the required scopes.
  • How to fix it: Ensure the OAuth client configured in Genesys Cloud Admin is authorized to create users. For a Client Credentials client, assign it a role that includes permission to add users. For a user-based grant, ensure both the client's scopes and the signed-in user's own permissions allow user creation.

Error: 429 Too Many Requests

  • What causes it: You are sending requests faster than the API allows.
  • How to fix it: The provided script pauses briefly after every few rows and waits 5 seconds before retrying a request that returns 429. If you still hit the limit, lengthen the pauses or pause more often. If the 429 response includes a Retry-After header, waiting for that duration is preferable to a fixed delay. For very large batches (1000+ users), consider a queue with exponential backoff.
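
As a minimal sketch of the exponential backoff mentioned above (not the tutorial script's own retry logic): the helper name call_with_backoff, its default delays, and the use of random jitter are all illustrative assumptions. The caller decides which exceptions are retryable, for example by checking the status_code attribute the script already inspects.

```python
import random
import time

def call_with_backoff(func, is_retryable, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call func(); on a retryable exception, wait and try again.

    The wait before retry n is a random value between 0 and
    min(max_delay, base_delay * 2**n), often called "full jitter".
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(exc):
                raise  # give up: out of attempts, or not worth retrying
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, delay))
```

A call site might look like call_with_backoff(lambda: user_api.post_users(body=user_data), lambda e: getattr(e, "status_code", None) == 429), with the other status codes handled by the caller as before.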

Official References