Handling File Attachments in Genesys Cloud Web Messaging
What You Will Build
- You will build a backend service that validates and processes file attachments sent by customers via the Genesys Cloud Web Messaging API.
- This tutorial uses the Genesys Cloud Conversations API and the Web Messaging specific endpoints to retrieve attachment metadata and download binary content.
- The implementation is written in Python using the official genesyscloud SDK, with httpx for direct HTTP requests where the SDK lacks specific attachment download methods.
Prerequisites
- OAuth Client: A Genesys Cloud OAuth client with the following scopes:
  - conversation:read
  - webchat:read
  - file:read (required for downloading attachment content)
- SDK Version: genesyscloud Python SDK v13.0.0 or higher.
- Runtime: Python 3.9+
- Dependencies:
  - genesyscloud
  - httpx
  - python-magic (for MIME type validation on the server side; requires the libmagic system library)
pip install genesyscloud httpx python-magic
Authentication Setup
Genesys Cloud APIs require OAuth 2.0. For server-to-server integrations, use the Client Credentials Grant flow. You must cache the access token and refresh it before expiration to avoid 401 Unauthorized errors.
import os
from datetime import datetime, timedelta, timezone

import httpx


class GenesysAuth:
    def __init__(self, env: str = "mypurecloud.com"):
        self.env = env
        self.client_id = os.getenv("GENESYS_CLIENT_ID")
        self.client_secret = os.getenv("GENESYS_CLIENT_SECRET")
        # Tokens are issued by the login host, not the API host
        self.token_url = f"https://login.{env}/oauth/token"
        self.access_token = None
        self.expires_at = None

    def get_access_token(self) -> str:
        """
        Returns a valid access token. Generates a new one if expired or missing.
        """
        if self.access_token and self.expires_at and datetime.now(timezone.utc) < self.expires_at:
            return self.access_token

        headers = {
            "Content-Type": "application/x-www-form-urlencoded",
        }
        data = {
            "grant_type": "client_credentials",
        }

        with httpx.Client() as client:
            # Client ID and secret are sent as HTTP Basic credentials
            response = client.post(
                self.token_url,
                headers=headers,
                data=data,
                auth=(self.client_id, self.client_secret),
            )
            response.raise_for_status()
            token_data = response.json()

        self.access_token = token_data["access_token"]
        # expires_in is in seconds; convert it to an absolute UTC deadline
        expires_in = token_data.get("expires_in", 3600)
        self.expires_at = datetime.now(timezone.utc) + timedelta(seconds=expires_in)
        return self.access_token


# Initialize Auth
auth = GenesysAuth()
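The class above only requests a new token once the old one has already expired, so a request that starts just before the deadline can still fail with a 401. A minimal sketch of refreshing early, assuming a hypothetical 60-second safety margin and a caller-supplied fetch function (neither comes from the Genesys SDK), could look like this:

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CachedToken:
    value: str
    expires_at: float  # deadline on the injected clock, in seconds


class TokenCache:
    """Caches a token and refreshes it slightly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        margin_seconds: float = 60.0,  # hypothetical safety margin
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # must return (access_token, expires_in_seconds)
        self._margin = margin_seconds
        self._clock = clock
        self._token: Optional[CachedToken] = None

    def get(self) -> str:
        now = self._clock()
        # Refresh when there is no token yet or the deadline is within the margin
        if self._token is None or now >= self._token.expires_at - self._margin:
            value, expires_in = self._fetch()
            self._token = CachedToken(value, now + expires_in)
        return self._token.value
```

Injecting the clock keeps the cache testable without waiting for a real token to expire. In this tutorial, the fetch callable would wrap the POST request made in get_access_token.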
Implementation
Step 1: Retrieve Conversation and Attachment Metadata
When a customer uploads a file in Web Messaging, Genesys Cloud stores the file in its secure storage and returns metadata in the attachments array of the message object. You do not download the file immediately upon receipt of the webhook or event; instead, you process the metadata to validate compliance before fetching the binary.
The attachments object contains:
- id: The unique identifier for the attachment.
- name: The original filename provided by the user.
- size: The file size in bytes.
- contentType: The MIME type reported by the client browser.
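To make the shape of this metadata concrete, here is a small sketch that normalizes one attachment entry into a typed record. The AttachmentMeta class, the from_payload helper, and the fallback values are illustrative assumptions; only the four field names come from the list above.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class AttachmentMeta:
    id: str
    name: str
    size: int
    content_type: str  # reported by the client browser, so untrusted

    @classmethod
    def from_payload(cls, payload: Mapping[str, Any]) -> "AttachmentMeta":
        # Missing optional fields fall back to conservative defaults (an assumption)
        return cls(
            id=str(payload["id"]),
            name=str(payload.get("name") or ""),
            size=int(payload.get("size") or 0),
            content_type=str(payload.get("contentType") or "application/octet-stream"),
        )


# Hypothetical sample entry, not real API output
meta = AttachmentMeta.from_payload(
    {"id": "att-1", "name": "invoice.pdf", "size": 2048, "contentType": "application/pdf"}
)
```

Working with a frozen record rather than a raw dictionary makes it harder to overwrite the untrusted content type with a verified one by accident.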
We will use the PureCloudPlatformClientV2 SDK to fetch conversation details.
from genesyscloud.rest import PureCloudPlatformClientV2
from genesyscloud.api.conversations_api import ConversationsApi


def get_conversation_attachments(conversation_id: str) -> list:
    """
    Fetches a conversation and extracts attachment metadata.
    """
    client = PureCloudPlatformClientV2()
    client.set_access_token(auth.get_access_token())
    conversations_api = ConversationsApi(client)

    try:
        # Fetch the specific conversation
        response = conversations_api.get_conversations_conversation(
            conversation_id=conversation_id,
            expand=["attachments"],  # Critical: Must expand attachments to see metadata
        )

        # Iterate through messages to find attachments
        attachments = []
        if response.messages:
            for message in response.messages:
                if message.attachments:
                    for attachment in message.attachments:
                        attachments.append({
                            "id": attachment.id,
                            "name": attachment.name,
                            "size": attachment.size,
                            "content_type": attachment.content_type,
                            "url": attachment.url,  # Presigned URL for download
                        })
        return attachments
    except Exception as e:
        print(f"Error fetching conversation: {e}")
        return []


# Example usage
# conv_id = "your-conversation-id"
# atts = get_conversation_attachments(conv_id)
Important Note on expand: The expand=["attachments"] parameter is mandatory. Without it, the API returns message objects with attachments set to null or an empty list, regardless of whether files were uploaded.
Step 2: Validate MIME Types and Size Limits
Genesys Cloud Web Messaging has default constraints, but you should enforce your own business logic. The platform supports common web formats. However, the contentType field in the metadata is provided by the client browser and must not be trusted for security purposes. You must validate the actual file content after download.
Standard Web Messaging limits (subject to change, verify in admin console):
- Max File Size: Typically 10 MB per file.
- Allowed MIME Types: image/png, image/jpeg, image/gif, application/pdf, text/plain, application/msword, application/vnd.openxmlformats-officedocument.wordprocessingml.document.
We will create a validator that checks the reported size and prepares a list of allowed MIME types.
import magic

ALLOWED_MIME_TYPES = {
    "image/png",
    "image/jpeg",
    "image/gif",
    "application/pdf",
    "text/plain",
    "application/msword",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024  # 10 MB


def validate_attachment_metadata(attachment: dict) -> tuple[bool, str]:
    """
    Validates the metadata of an attachment before downloading.
    Returns (is_valid, error_message).
    """
    # Check Size
    if attachment["size"] > MAX_FILE_SIZE_BYTES:
        return False, f"File size {attachment['size']} exceeds limit of {MAX_FILE_SIZE_BYTES} bytes."

    # Check Reported Content Type (Pre-check)
    # Note: This is a soft check. The hard check happens after download.
    if attachment["content_type"] not in ALLOWED_MIME_TYPES:
        # Some browsers send generic types like 'application/octet-stream'
        # We will allow download for binary inspection but flag it.
        if attachment["content_type"] != "application/octet-stream":
            return False, f"Content type {attachment['content_type']} is not allowed."

    return True, "Metadata valid"
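The validator above lets application/octet-stream through, but its boolean result cannot tell the caller that the file still needs closer inspection. One possible refinement, sketched here with a hypothetical PreCheck enum and a shortened allowlist, returns three outcomes instead of two. Rejecting a non-positive size is an extra assumption, not a documented platform rule.

```python
from enum import Enum


class PreCheck(Enum):
    ACCEPT = "accept"    # reported type is on the allowlist
    INSPECT = "inspect"  # generic type: download, then rely on content sniffing
    REJECT = "reject"    # too large, empty, or a disallowed reported type


# Shortened allowlist for illustration only
ALLOWED = {"image/png", "image/jpeg", "image/gif", "application/pdf", "text/plain"}
MAX_BYTES = 10 * 1024 * 1024  # 10 MB


def precheck(size: int, content_type: str) -> PreCheck:
    if size <= 0 or size > MAX_BYTES:
        return PreCheck.REJECT
    if content_type in ALLOWED:
        return PreCheck.ACCEPT
    if content_type == "application/octet-stream":
        return PreCheck.INSPECT
    return PreCheck.REJECT
```

A caller could log or rate-limit INSPECT results separately, since they are the cases where the browser gave no useful hint about the content.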
Step 3: Download and Verify File Content
The attachments metadata includes a url field. This is a presigned URL that expires after a short period. You must use this URL to download the file. The SDK does not provide a direct method to stream content from this presigned URL, so we use httpx.
After downloading, use python-magic to inspect the binary header and determine the true MIME type. This prevents users from renaming a .exe to .pdf to bypass filters.
from typing import Optional

import httpx
import magic


def download_and_verify_attachment(attachment: dict) -> tuple[bool, Optional[bytes], str]:
    """
    Downloads the file using the presigned URL and verifies the actual MIME type.
    Returns (is_valid, file_content, detail), where detail is the detected
    MIME type on success and an error message on failure.
    """
    download_url = attachment["url"]

    # Download the file
    with httpx.Client() as client:
        try:
            response = client.get(download_url)
            response.raise_for_status()
            file_content = response.content
        except httpx.HTTPStatusError as e:
            if e.response.status_code == 403:
                return False, None, "Presigned URL expired. Request a new conversation fetch."
            elif e.response.status_code == 404:
                return False, None, "File not found in storage."
            raise

    # Detect actual MIME type using binary inspection
    detected_mime = magic.from_buffer(file_content, mime=True)

    # Validate against allowed types
    if detected_mime not in ALLOWED_MIME_TYPES:
        return False, file_content, f"Blocked: Actual MIME type {detected_mime} is not allowed."

    return True, file_content, detected_mime
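Because the size in the metadata is only a reported value, it can help to enforce the limit again while the bytes arrive instead of reading the whole response first. The sketch below is transport-agnostic: read_capped and AttachmentTooLarge are hypothetical names, and the function accepts any iterable of byte chunks.

```python
from typing import Iterable


class AttachmentTooLarge(Exception):
    """Raised when a download exceeds the configured byte limit."""


def read_capped(chunks: Iterable[bytes], limit: int) -> bytes:
    """Accumulates chunks and aborts as soon as the total passes the limit."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        if len(buffer) > limit:
            raise AttachmentTooLarge(f"download exceeded {limit} bytes")
    return bytes(buffer)
```

With httpx, the chunks could come from response.iter_bytes() inside a "with client.stream("GET", download_url) as response:" block, so an oversized body is abandoned early rather than buffered in full.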
Step 4: Processing the Result
Once the file is validated, you can save it to your local storage, pass it to a virus scanning service, or store it in an external blob store (S3, Azure Blob).
import os
import uuid


def save_validated_file(file_content: bytes, original_name: str, detected_mime: str) -> str:
    """
    Saves the file to a local directory with a sanitized name.
    """
    # Generate a unique filename to prevent collisions and directory traversal
    file_ext = os.path.splitext(original_name)[1]
    safe_filename = f"{uuid.uuid4().hex}{file_ext}"

    save_dir = "./uploaded_attachments"
    os.makedirs(save_dir, exist_ok=True)
    file_path = os.path.join(save_dir, safe_filename)

    with open(file_path, "wb") as f:
        f.write(file_content)

    return file_path
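The function above keeps the extension from the user-supplied filename, so a verified PDF uploaded as report.exe would still be stored with an .exe suffix. A possible alternative, sketched with a hypothetical EXTENSION_BY_MIME table and build_safe_path helper, derives the extension from the detected MIME type instead:

```python
import uuid
from pathlib import Path

# Illustrative mapping; extend it to match your own allowlist
EXTENSION_BY_MIME = {
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "image/gif": ".gif",
    "application/pdf": ".pdf",
    "text/plain": ".txt",
}


def build_safe_path(save_dir: str, detected_mime: str) -> Path:
    """Builds a random filename whose extension follows the verified MIME type."""
    try:
        ext = EXTENSION_BY_MIME[detected_mime]
    except KeyError:
        raise ValueError(f"no extension mapped for {detected_mime!r}") from None
    return Path(save_dir) / f"{uuid.uuid4().hex}{ext}"
```

An explicit table is used rather than mimetypes.guess_extension because the guessed extension for some types has varied between Python versions.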
Complete Working Example
This script simulates the end-to-end flow: authenticating, fetching conversation attachments, validating metadata, downloading the file, verifying the binary content, and saving it.
import os
import uuid
from datetime import datetime, timedelta, timezone

import httpx
import magic
from genesyscloud.rest import PureCloudPlatformClientV2
from genesyscloud.api.conversations_api import ConversationsApi

# --- Configuration ---
ALLOWED_MIME_TYPES = {
    "image/png", "image/jpeg", "image/gif", "application/pdf", "text/plain"
}
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024  # 10 MB
SAVE_DIR = "./secure_uploads"


# --- Authentication ---
class GenesysAuth:
    def __init__(self, env: str = "mypurecloud.com"):
        self.env = env
        self.client_id = os.getenv("GENESYS_CLIENT_ID")
        self.client_secret = os.getenv("GENESYS_CLIENT_SECRET")
        # Tokens are issued by the login host, not the API host
        self.token_url = f"https://login.{env}/oauth/token"
        self.access_token = None
        self.expires_at = None

    def get_access_token(self) -> str:
        if self.access_token and self.expires_at and datetime.now(timezone.utc) < self.expires_at:
            return self.access_token

        headers = {"Content-Type": "application/x-www-form-urlencoded"}
        data = {"grant_type": "client_credentials"}

        with httpx.Client() as client:
            # Client ID and secret are sent as HTTP Basic credentials
            response = client.post(
                self.token_url,
                headers=headers,
                data=data,
                auth=(self.client_id, self.client_secret),
            )
            response.raise_for_status()
            token_data = response.json()

        self.access_token = token_data["access_token"]
        expires_in = token_data.get("expires_in", 3600)
        self.expires_at = datetime.now(timezone.utc) + timedelta(seconds=expires_in)
        return self.access_token


# --- Core Logic ---
def process_webchat_attachments(conversation_id: str):
    auth = GenesysAuth()
    token = auth.get_access_token()

    # 1. Initialize SDK
    client = PureCloudPlatformClientV2()
    client.set_access_token(token)
    conversations_api = ConversationsApi(client)

    # 2. Fetch Conversation with Attachments Expanded
    try:
        response = conversations_api.get_conversations_conversation(
            conversation_id=conversation_id,
            expand=["attachments"],
        )
    except Exception as e:
        print(f"Failed to fetch conversation: {e}")
        return

    if not response.messages:
        print("No messages found.")
        return

    # 3. Iterate and Process Attachments
    for message in response.messages:
        if not message.attachments:
            continue

        for attachment in message.attachments:
            print(f"Processing attachment: {attachment.name}")

            # A. Validate Metadata
            if attachment.size > MAX_FILE_SIZE_BYTES:
                print(f" -> Skipped: Size {attachment.size} exceeds limit.")
                continue
            if attachment.content_type not in ALLOWED_MIME_TYPES and attachment.content_type != "application/octet-stream":
                print(f" -> Skipped: Type {attachment.content_type} not allowed.")
                continue

            # B. Download File
            try:
                with httpx.Client() as http_client:
                    dl_response = http_client.get(attachment.url)
                    dl_response.raise_for_status()
                    file_content = dl_response.content
            except httpx.HTTPStatusError as e:
                print(f" -> Download Error: {e}")
                continue
            except httpx.RequestError as e:
                print(f" -> Network Error: {e}")
                continue

            # C. Verify Binary Content (MIME Magic)
            detected_mime = magic.from_buffer(file_content, mime=True)
            if detected_mime not in ALLOWED_MIME_TYPES:
                print(f" -> Blocked: Detected MIME {detected_mime} is not allowed.")
                continue

            # D. Save File
            os.makedirs(SAVE_DIR, exist_ok=True)
            file_ext = os.path.splitext(attachment.name)[1]
            safe_name = f"{uuid.uuid4().hex}{file_ext}"
            file_path = os.path.join(SAVE_DIR, safe_name)
            with open(file_path, "wb") as f:
                f.write(file_content)
            print(f" -> Saved successfully to {file_path}")


# --- Execution ---
if __name__ == "__main__":
    # Replace with a real conversation ID that has an attachment
    CONV_ID = os.getenv("TEST_CONVERSATION_ID")
    if CONV_ID:
        process_webchat_attachments(CONV_ID)
    else:
        print("Set TEST_CONVERSATION_ID environment variable.")
Common Errors & Debugging
Error: 403 Forbidden on Attachment URL
Cause: The presigned URL in the attachment.url field has expired. These URLs are typically valid for only a few minutes.
Fix: Re-fetch the conversation using get_conversations_conversation with expand=["attachments"] to get a fresh presigned URL.
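The re-fetch-and-retry fix can be wrapped in a small helper. In this sketch, get_fresh_url and download are caller-supplied callables and UrlExpired is a hypothetical exception that your download function would raise when it sees a 403; none of these names come from the SDK.

```python
from typing import Callable, Optional


class UrlExpired(Exception):
    """Signals that storage rejected an expired presigned URL."""


def download_with_refresh(
    get_fresh_url: Callable[[], str],
    download: Callable[[str], bytes],
    max_attempts: int = 2,
) -> bytes:
    """Requests a new presigned URL before each attempt and retries on expiry."""
    last_error: Optional[UrlExpired] = None
    for _ in range(max_attempts):
        url = get_fresh_url()  # e.g. re-fetch the conversation and read the url field
        try:
            return download(url)
        except UrlExpired as exc:
            last_error = exc
    raise RuntimeError("presigned URL kept expiring") from last_error
```

Keeping the attempt count low avoids hammering the conversation endpoint when the real problem is a permissions issue rather than expiry.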
Error: 401 Unauthorized
Cause: The OAuth token used for the initial conversation fetch is expired or invalid.
Fix: Ensure your GenesysAuth class checks expires_at and refreshes the token using the Client Credentials grant before making SDK calls.
Error: attachment.url is null
Cause: The expand=["attachments"] parameter was missing from the API call, or the attachment is still being processed by Genesys Cloud storage.
Fix: Always include expand=["attachments"]. If the URL is null after a successful fetch with expansion, wait a few seconds and retry. This indicates the file is still being written to the backend storage.
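The wait-and-retry advice can be sketched as a bounded polling loop with exponential backoff. The wait_for_url name, the attempt count, and the delays are illustrative assumptions rather than documented platform behavior.

```python
import time
from typing import Callable, Optional


def wait_for_url(
    get_url: Callable[[], Optional[str]],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Polls get_url until it returns a non-empty URL or the attempts run out."""
    for attempt in range(attempts):
        url = get_url()
        if url:
            return url
        # Back off 1s, 2s, 4s, ... but do not sleep after the final attempt
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return None
```

Here get_url would re-fetch the conversation and return the attachment's url field. Returning None after the last attempt lets the caller decide whether to skip the attachment or raise an alert.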
Error: Detected MIME type mismatch
Cause: The user uploaded a file with a misleading extension (e.g., malware.exe renamed to document.pdf). The python-magic library detects the true header.
Fix: This is a security feature. Reject the file and notify the user that the file format is invalid. Do not rely on the contentType field from the API metadata for security decisions.
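To show what inspecting the header means in practice, here is a simplified, standard-library-only sketch that matches a handful of well-known file signatures. It is not a replacement for python-magic, which relies on libmagic's much larger signature database, and the MIME label used for Windows executables is illustrative.

```python
from typing import Optional

# (leading bytes, MIME label) pairs for a few common formats
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"MZ", "application/x-dosexec"),  # DOS/Windows executable header
)


def sniff_mime(data: bytes) -> Optional[str]:
    """Returns a MIME label based on the leading bytes, or None if unknown."""
    for signature, mime in SIGNATURES:
        if data.startswith(signature):
            return mime
    return None


# A file named document.pdf that actually starts with an executable header
renamed_exe = b"MZ\x90\x00\x03\x00\x00\x00"
```

Because the check reads the bytes and ignores the filename, renaming malware.exe to document.pdf does not change the result.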