How to Handle File Attachments in Genesys Cloud Web Messaging

What You Will Build

  • You will build a backend service that receives file attachment metadata from the Genesys Cloud Web Messaging channel and validates the file against strict MIME type and size constraints before processing.
  • This tutorial uses the Genesys Cloud CX REST API (/api/v2/conversations/messaging) and the Web Messaging Widget SDK configuration options.
  • The implementation covers Python (Flask) for the backend receiver and JavaScript for the frontend widget configuration and validation logic.

Prerequisites

  • OAuth Client Type: A Public or Confidential Client with the following scopes:
    • conversation:messaging:view (to inspect message metadata)
    • file:attachment:read (if using the Files API to download the actual blob)
    • user:view (for debugging user context)
  • SDK Version: Genesys Cloud Python SDK v2.100+ or direct REST API calls.
  • Language/Runtime: Python 3.9+ with flask, requests, and python-magic.
  • External Dependencies:
    • python-magic: For robust MIME type detection (libmagic wrapper).
    • Flask: For the HTTP server receiving webhook events.

Authentication Setup

Genesys Cloud uses OAuth 2.0. For a backend service processing web messages, you typically use the Client Credentials flow if the service acts on behalf of the platform, or Authorization Code flow if it acts on behalf of a specific user. For file validation logic, Client Credentials is usually sufficient as you are inspecting platform events.

import requests
import os
import time

class GenesysAuth:
    def __init__(self, client_id: str, client_secret: str, org_id: str):
        self.client_id = client_id
        self.client_secret = client_secret
        self.org_id = org_id
        self.access_token = None
        self.token_expiry = 0
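        # Genesys Cloud API hosts are per region (api.<region domain>), not per org.
        # Adjust the domain if your org is not in the mypurecloud.com region.
        self.base_url = "https://api.mypurecloud.com/api/v2"
        self.oauth_url = "https://login.mypurecloud.com/oauth/token"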

    def get_token(self) -> str:
        if time.time() < self.token_expiry:
            return self.access_token

        payload = {
            "grant_type": "client_credentials",
            "client_id": self.client_id,
            "client_secret": self.client_secret,
            "scope": "conversation:messaging:view file:attachment:read"
        }

        try:
            # A timeout keeps a stalled token request from hanging the service
            response = requests.post(self.oauth_url, data=payload, timeout=10)
            response.raise_for_status()
            data = response.json()
            self.access_token = data["access_token"]
            self.token_expiry = time.time() + data["expires_in"] - 60 # Buffer
            return self.access_token
        except requests.exceptions.RequestException as e:
            raise RuntimeError(f"Failed to obtain OAuth token: {e}") from e

    def get_headers(self) -> dict:
        return {
            "Authorization": f"Bearer {self.get_token()}",
            "Content-Type": "application/json"
        }
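GenesysAuth takes its credentials as constructor arguments, and later sections of this tutorial read them with os.getenv, which silently returns None when a variable is unset. The following is a minimal sketch of a fail-fast loader, assuming the same three variable names used later in this tutorial; the function name load_genesys_config is hypothetical.

```python
import os
from typing import Mapping, Optional

# Variable names match the os.getenv calls used later in this tutorial.
REQUIRED_VARS = ("GENESYS_CLIENT_ID", "GENESYS_CLIENT_SECRET", "GENESYS_ORG_ID")


def load_genesys_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the required settings, raising if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling this once at startup surfaces a missing secret immediately, rather than as a failed token request on the first webhook.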

Implementation

Step 1: Configure Web Messaging Widget for File Limits

Before handling the backend, you must understand how files enter the system. The Genesys Cloud Web Messaging Widget allows you to restrict file uploads directly in the widget configuration. This stops most invalid files before they are uploaded.

Note: The widget enforces these limits client-side, where a user can bypass them. Never trust client-side validation alone; repeat the same checks on the backend.

// frontend/widget-config.js

// Initialize the Genesys Cloud Web Messaging Widget
window.purecloudWebMessaging = {
    settings: {
        orgId: "your-org-id",
        deploymentId: "your-deployment-id",
        deploymentVersionId: "your-version-id",
        environment: "mypurecloud.com",
        
        // File Upload Configuration
        fileUpload: {
            enabled: true,
            
            // Maximum file size in bytes (e.g., 10MB)
            maxSize: 10 * 1024 * 1024, 
            
            // Allowed MIME types. 
            // Be specific. Avoid wildcards like 'image/*' if possible for security.
            allowedMimeTypes: [
                "image/png",
                "image/jpeg",
                "application/pdf"
            ],
            
            // Maximum number of files per message
            maxFilesPerMessage: 3
        }
    }
};

// Load the widget script
const script = document.createElement('script');
script.src = 'https://webchat.mypurecloud.com/widgets/webchat/v1/embed.js';
script.async = true;
document.head.appendChild(script);

Step 2: Receive and Validate Webhook Events

Genesys Cloud sends real-time events via Webhooks when a message is received. You must subscribe to the Conversation:Message event. The event payload contains metadata about the attachment, including the attachmentId, fileName, and contentType.

Important: The webhook payload does not contain the file content. It contains a reference to the file stored in Genesys Cloud’s secure storage.
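To make the payload shape concrete, here is an illustrative sketch. The keys mirror the ones the handler in this step reads (eventType, data.conversationId, data.messageId, and attachments with id, fileName, contentType, size); the values are made up, and the helper name extract_attachments is hypothetical. Treat the exact schema as an assumption and confirm it against a real event from your org.

```python
from typing import Any, Dict, List

# Illustrative payload only: keys follow this tutorial's handler, values are invented.
SAMPLE_EVENT: Dict[str, Any] = {
    "eventType": "Conversation:Message",
    "data": {
        "conversationId": "conv-123",
        "messageId": "msg-456",
        "attachments": [
            {"id": "att-1", "fileName": "invoice.pdf",
             "contentType": "application/pdf", "size": 48213},
        ],
    },
}


def extract_attachments(event: Any) -> List[Dict[str, Any]]:
    """Return attachment metadata dicts, or [] for other events or malformed input."""
    if not isinstance(event, dict) or event.get("eventType") != "Conversation:Message":
        return []
    data = event.get("data")
    if not isinstance(data, dict):
        return []
    attachments = data.get("attachments")
    if not isinstance(attachments, list):
        return []
    # Keep only entries that carry an id, since the id is what the download needs
    return [a for a in attachments if isinstance(a, dict) and a.get("id")]
```

Parsing defensively like this means a malformed event yields an empty list rather than a KeyError and a 500 response.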

# backend/app.py

import os
import logging

import requests
from flask import Flask, request, jsonify
from genesys_auth import GenesysAuth

app = Flask(__name__)
logger = logging.getLogger(__name__)

# Initialize auth from environment variables (never hard-code credentials)
auth_client = GenesysAuth(
    client_id=os.getenv("GENESYS_CLIENT_ID"),
    client_secret=os.getenv("GENESYS_CLIENT_SECRET"),
    org_id=os.getenv("GENESYS_ORG_ID")
)

# Allowed MIME types for your business logic
ALLOWED_MIME_TYPES = {
    "image/png",
    "image/jpeg",
    "application/pdf"
}

MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024 # 10MB

@app.route('/webhook/genesys', methods=['POST'])
def handle_genesys_webhook():
    # 1. Verify Webhook Signature (Critical for Security)
    # Genesys Cloud signs requests. Verify the signature header.
    if not verify_signature(request):
        return jsonify({"error": "Invalid signature"}), 401

    event_data = request.json
    event_type = event_data.get("eventType")

    if event_type != "Conversation:Message":
        return jsonify({"status": "ignored"}), 200

    # 2. Extract Message Details
    conversation_id = event_data["data"]["conversationId"]
    message_id = event_data["data"]["messageId"]
    
    # Check if the message has attachments
    attachments = event_data["data"].get("attachments", [])
    
    if not attachments:
        return jsonify({"status": "no attachments"}), 200

    # 3. Process Each Attachment
    for attachment in attachments:
        try:
            validate_attachment(attachment, conversation_id)
        except ValidationError as e:
            logger.error(f"Validation failed for attachment {attachment['id']}: {e}")
            # Optionally, send a reply to the user indicating the error
            send_error_reply(conversation_id, str(e))
            return jsonify({"error": str(e)}), 400

    return jsonify({"status": "accepted"}), 200

def verify_signature(req):
    # Implement HMAC verification here using your webhook secret
    # This is omitted for brevity but is mandatory in production
    return True

class ValidationError(Exception):
    pass

def send_error_reply(conversation_id: str, message: str):
    """
    Sends a system message back to the user explaining the rejection.
    Uses the Messaging API to send a message.
    """
    url = f"{auth_client.base_url}/conversations/messaging/{conversation_id}/messages"
    
    payload = {
        "type": "text",
        "text": f"File rejected: {message}"
    }
    
    try:
        res = requests.post(url, json=payload, headers=auth_client.get_headers())
        res.raise_for_status()
    except Exception as e:
        logger.error(f"Failed to send error reply: {e}")
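The verify_signature stub above always returns True. The following is a minimal sketch of what an HMAC-SHA256 check could look like, assuming the sender puts a hex-encoded digest of the raw request body in a header. The header name (X-Signature below), the hex encoding, and the function name is_valid_signature are all assumptions for illustration; check your webhook configuration for the actual scheme.

```python
import hashlib
import hmac


def is_valid_signature(body: bytes, received_signature: str, secret: str) -> bool:
    """Compare a hex HMAC-SHA256 of the raw body against the received value."""
    if not received_signature or not secret:
        return False
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    received = received_signature.strip().lower()
    # compare_digest runs in constant time, so it does not leak how many
    # leading characters matched
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

Inside the Flask handler this could be called as is_valid_signature(request.get_data(), request.headers.get("X-Signature", ""), secret). The digest must be computed over the raw bytes from request.get_data(), not over re-serialized JSON, because re-serialization can change whitespace and key order.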

Step 3: Deep Validation via Files API

The webhook payload provides basic metadata (contentType, size). However, users can spoof MIME types. To ensure security, you must download the file using the Files API and verify the actual binary content using a library like python-magic.

The endpoint to download a file is:
GET /api/v2/files/attachments/{attachmentId}

OAuth Scope Required: file:attachment:read

import io
import magic # python-magic

def validate_attachment(attachment_meta: dict, conversation_id: str):
    """
    Downloads the file from Genesys Cloud and validates it against business rules.
    """
    attachment_id = attachment_meta["id"]
    claimed_mime = attachment_meta.get("contentType", "application/octet-stream")
    claimed_size = attachment_meta.get("size", 0)
    file_name = attachment_meta.get("fileName", "unknown")

    # 1. Check Size from Metadata (Fast Fail)
    if claimed_size > MAX_FILE_SIZE_BYTES:
        raise ValidationError(f"File '{file_name}' exceeds maximum size of {MAX_FILE_SIZE_BYTES} bytes.")

    # 2. Check Claimed MIME Type (Fast Fail)
    if claimed_mime not in ALLOWED_MIME_TYPES:
        raise ValidationError(f"File type '{claimed_mime}' is not allowed. Allowed: {ALLOWED_MIME_TYPES}")

    # 3. Download File Content for Deep Inspection
    download_url = f"{auth_client.base_url}/files/attachments/{attachment_id}"
    
    try:
        response = requests.get(download_url, headers=auth_client.get_headers(), timeout=30)
        response.raise_for_status()
        
        # Read file content into memory (or stream to disk if large)
        # For security scanning, we need the bytes.
        file_content = response.content
        
        # 4. Verify Actual MIME Type using magic bytes
        detected_mime = magic.from_buffer(file_content, mime=True)
        
        if detected_mime not in ALLOWED_MIME_TYPES:
            raise ValidationError(
                f"Spoofed MIME type detected. Claimed: {claimed_mime}, Actual: {detected_mime}"
            )
        
        # 5. Verify Actual Size
        if len(file_content) > MAX_FILE_SIZE_BYTES:
            raise ValidationError("Actual file size exceeds limit.")

        # 6. Process Valid File (e.g., save to S3, run OCR, etc.)
        process_valid_file(file_content, file_name, detected_mime)

    except requests.exceptions.HTTPError as e:
        if e.response.status_code == 404:
            raise ValidationError("Attachment not found in Genesys Cloud storage.")
        raise

def process_valid_file(content: bytes, filename: str, mime_type: str):
    """
    Placeholder for your business logic.
    Example: Upload to AWS S3, save to database, or pass to an AI model.
    """
    logger.info(f"Validated file: {filename} ({mime_type}, {len(content)} bytes)")
    # s3_client.put_object(Bucket='my-bucket', Key=filename, Body=content)
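To show what "verifying the actual binary content" means, here is a dependency-free sketch that checks the leading bytes of a file against the well-known signatures for the three allowed types. It is an illustration of the idea, not how python-magic works internally (libmagic uses a much larger rule database), and the names sniff_mime and matches_claim are hypothetical.

```python
from typing import Optional

# Leading byte signatures for the three types allowed in this tutorial.
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
)


def sniff_mime(content: bytes) -> Optional[str]:
    """Return the MIME type implied by the file's first bytes, or None."""
    for prefix, mime in SIGNATURES:
        if content.startswith(prefix):
            return mime
    return None


def matches_claim(content: bytes, claimed_mime: str) -> bool:
    """True only when the sniffed type is known and equals the claimed type."""
    detected = sniff_mime(content)
    return detected is not None and detected == claimed_mime
```

Note that matches_claim is stricter than the check in validate_attachment, which only requires the detected type to be in the allow-list. Requiring the detected type to equal the claimed type also rejects, for example, a PDF uploaded with a claimed type of image/png.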

Step 4: Handling Rate Limits and Retries

The Files API can be rate-limited, especially if you process many attachments concurrently. Implement exponential backoff for 429 Too Many Requests responses.

import time
import random

def get_file_with_retry(attachment_id: str, max_retries=3):
    """
    Fetches the file attachment with exponential backoff for 429 errors.
    """
    url = f"{auth_client.base_url}/files/attachments/{attachment_id}"
    headers = auth_client.get_headers()
    
    for attempt in range(max_retries):
        try:
            response = requests.get(url, headers=headers, timeout=30)
        except requests.exceptions.RequestException as e:
            # Network-level failure: retry with backoff unless this was the last attempt
            logger.error(f"Request failed on attempt {attempt + 1}: {e}")
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt + random.uniform(0, 1))
            continue

        if response.status_code == 200:
            return response.content

        if response.status_code == 429:
            # Honor Retry-After when it is a whole number of seconds;
            # otherwise fall back to exponential backoff with jitter
            retry_after = response.headers.get('Retry-After', '')
            delay = float(retry_after) if retry_after.isdigit() else 2 ** attempt + random.uniform(0, 1)
            logger.warning(f"Rate limited (429). Retrying in {delay:.1f}s...")
            time.sleep(delay)
            continue

        # Other error statuses (403, 404, 5xx) are not retried; surface them to the caller
        response.raise_for_status()

    raise RuntimeError("Max retries exceeded for file download.")
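The delay calculation is the part of a retry loop that is easiest to get wrong, so it can help to isolate it in a pure function that is simple to test. This is a sketch under stated assumptions: the function name backoff_delay, the 1-second base, and the 30-second cap are illustrative choices, not values documented by Genesys Cloud.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retrying; `attempt` is 0-based."""
    if retry_after is not None:
        try:
            # Retry-After given as a number of seconds
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # e.g. an HTTP-date value; fall through to computed backoff
    # Exponential growth plus up to 1s of jitter to avoid synchronized retries
    return min(cap, base * (2 ** attempt) + random.uniform(0, 1))
```

The cap matters because a webhook handler that sleeps for minutes will likely time out on the sender's side; for long waits, consider handing the download to a background job instead.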

Complete Working Example

Below is the consolidated Python script. It requires flask, requests, and python-magic.

import os
import time
import requests
import magic
from flask import Flask, request, jsonify

# --- Configuration ---
GENESYS_CLIENT_ID = os.getenv("GENESYS_CLIENT_ID")
GENESYS_CLIENT_SECRET = os.getenv("GENESYS_CLIENT_SECRET")
GENESYS_ORG_ID = os.getenv("GENESYS_ORG_ID")
WEBHOOK_SECRET = os.getenv("WEBHOOK_SECRET")

ALLOWED_MIME_TYPES = {"image/png", "image/jpeg", "application/pdf"}
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024 # 10MB

# --- OAuth Helper ---
class GenesysAuth:
    def __init__(self, client_id, client_secret, org_id):
        self.client_id = client_id
        self.client_secret = client_secret
        self.org_id = org_id
        self.access_token = None
        self.token_expiry = 0
        # API hosts are per region (api.<region domain>), not per org
        self.base_url = "https://api.mypurecloud.com/api/v2"
        self.oauth_url = "https://login.mypurecloud.com/oauth/token"

    def get_token(self) -> str:
        if time.time() < self.token_expiry:
            return self.access_token

        payload = {
            "grant_type": "client_credentials",
            "client_id": self.client_id,
            "client_secret": self.client_secret,
            "scope": "conversation:messaging:view file:attachment:read"
        }

        response = requests.post(self.oauth_url, data=payload, timeout=10)
        response.raise_for_status()
        data = response.json()
        self.access_token = data["access_token"]
        self.token_expiry = time.time() + data["expires_in"] - 60
        return self.access_token

    def get_headers(self) -> dict:
        return {
            "Authorization": f"Bearer {self.get_token()}",
            "Content-Type": "application/json"
        }

auth = GenesysAuth(GENESYS_CLIENT_ID, GENESYS_CLIENT_SECRET, GENESYS_ORG_ID)

# --- Flask App ---
app = Flask(__name__)

@app.route('/webhook/genesys', methods=['POST'])
def handle_webhook():
    # 1. Validate Signature (Simplified for example)
    # In production, verify HMAC-SHA256 using WEBHOOK_SECRET
    
    data = request.json
    event_type = data.get("eventType")
    
    if event_type != "Conversation:Message":
        return jsonify({"status": "ok"}), 200

    conversation_id = data["data"]["conversationId"]
    attachments = data["data"].get("attachments", [])

    if not attachments:
        return jsonify({"status": "no_attachments"}), 200

    for att in attachments:
        try:
            validate_and_process(att, conversation_id)
        except Exception as e:
            # Log error and optionally notify user
            print(f"Validation Error: {e}")
            # send_error_reply(conversation_id, str(e))
            return jsonify({"error": str(e)}), 400

    return jsonify({"status": "processed"}), 200

def validate_and_process(att_meta: dict, conversation_id: str):
    file_id = att_meta["id"]
    file_name = att_meta.get("fileName", "unknown")
    claimed_mime = att_meta.get("contentType", "application/octet-stream")
    claimed_size = att_meta.get("size", 0)

    # 1. Metadata Checks
    if claimed_size > MAX_FILE_SIZE_BYTES:
        raise ValueError(f"File {file_name} too large.")
    
    if claimed_mime not in ALLOWED_MIME_TYPES:
        raise ValueError(f"MIME type {claimed_mime} not allowed.")

    # 2. Download File
    url = f"{auth.base_url}/files/attachments/{file_id}"
    resp = requests.get(url, headers=auth.get_headers(), timeout=30)
    
    if resp.status_code != 200:
        raise ValueError(f"Failed to download file {file_id}: {resp.status_code}")

    content = resp.content

    # 3. Deep MIME Check
    actual_mime = magic.from_buffer(content, mime=True)
    
    if actual_mime not in ALLOWED_MIME_TYPES:
        raise ValueError(f"Spoofed MIME: Expected {claimed_mime}, got {actual_mime}")

    # 4. Success
    print(f"Successfully validated {file_name} as {actual_mime}")

if __name__ == "__main__":
    app.run(port=5000)
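The download code in this tutorial reads the whole response body into memory before checking its size, so the claimed size in the metadata is the only guard against a very large body. A minimal sketch of a capped read follows, assuming the body arrives as an iterable of byte chunks; the names read_capped and FileTooLargeError are hypothetical.

```python
from typing import Iterable


class FileTooLargeError(Exception):
    """Raised when the streamed body exceeds the configured limit."""


def read_capped(chunks: Iterable[bytes], limit: int) -> bytes:
    """Accumulate chunks, aborting as soon as the total exceeds `limit` bytes."""
    buf = bytearray()
    for chunk in chunks:
        buf.extend(chunk)
        if len(buf) > limit:
            raise FileTooLargeError(f"Body exceeds {limit} bytes")
    return bytes(buf)
```

With requests, this could be used as read_capped(resp.iter_content(chunk_size=8192), MAX_FILE_SIZE_BYTES) on a response opened with stream=True, so the download stops shortly after the limit is crossed instead of buffering the full body.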

Common Errors & Debugging

Error: 401 Unauthorized

  • Cause: The OAuth token is expired, invalid, or missing the file:attachment:read scope.
  • Fix: Confirm that your GenesysAuth class requests a new token once the cached one expires (get_token does this by comparing the current time against token_expiry). Check that the Client ID/Secret are correct. Verify the scope string includes file:attachment:read.

Error: 403 Forbidden

  • Cause: The OAuth client does not have permission to read attachments for this conversation, or the file has been deleted/archived.
  • Fix: Check the OAuth client’s permissions in the Genesys Cloud Admin console. Ensure the client is assigned to the correct organization.

Error: 404 Not Found

  • Cause: The attachmentId is invalid or the file was deleted by the user or system retention policies before your backend processed it.
  • Fix: Implement a retry mechanism. If the file is genuinely gone, log the error and notify the user that their attachment could not be processed.

Error: python-magic not found

  • Cause: The python-magic library depends on the system library libmagic.
  • Fix: Install libmagic via your OS package manager:
    • Ubuntu/Debian: sudo apt-get install libmagic1
    • macOS: brew install libmagic
    • Then install the Python package: pip install python-magic
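A startup check can turn this failure into a clear message instead of a stack trace on the first upload. The sketch below assumes the python-magic package, which is importable as magic and, to the best of our knowledge, raises ImportError when it cannot locate libmagic; OSError is caught as well in case the shared library fails to load. The function name check_magic_backend is hypothetical.

```python
import importlib
from typing import Tuple


def check_magic_backend() -> Tuple[bool, str]:
    """Return (ok, message) describing whether the magic module can be imported."""
    try:
        importlib.import_module("magic")
    except (ImportError, OSError) as exc:
        # Covers both a missing Python package and a missing libmagic library
        return False, f"python-magic unavailable: {exc}"
    return True, "python-magic and libmagic loaded"
```

Calling check_magic_backend() before app.run and exiting when ok is False makes a missing system library visible at deploy time.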
