Architecting Email Attachment Virus Scanning Integration with ClamAV and Cloud Services

What This Guide Covers

This guide details the architecture and implementation of a serverless virus scanning pipeline for Genesys Cloud CX email attachments. You will build an integration that intercepts incoming emails via Webhooks, extracts attachments, submits them to a ClamAV instance for analysis, and routes the email based on the security verdict. The end result is a secure email ingestion flow that blocks malicious payloads before they reach agent inboxes or downstream CRM systems.

Prerequisites, Roles & Licensing

Licensing and Roles

  • Genesys Cloud CX: Requires at least CX 2 licensing to access Webhooks and Routing capabilities.
  • Roles:
    • Admin > Integration > Edit to create and manage Webhooks.
    • Admin > Routing > Edit to configure Email Routing Rules.
    • Admin > Architect > Edit to design the flow logic.

Infrastructure Dependencies

  • ClamAV Instance: A running ClamAV daemon (clamd). This can be hosted on a dedicated VM, a container in Kubernetes, or a serverless function wrapper. The instance must expose a TCP socket or a REST API endpoint (e.g., via ClamD or a custom wrapper like ClamAV-REST).
  • Object Storage (Optional but Recommended): AWS S3, Azure Blob Storage, or GCP Cloud Storage for staging attachments during scanning.
  • TLS Certificate: For securing the webhook endpoint if you are hosting the ClamAV wrapper on a public-facing server.

Technical Stack

  • Language: Python 3.9+ or Node.js 18+ for the intermediary service.
  • Libraries: boto3 (AWS), requests (Python), axios (Node.js).
  • Protocols: HTTP/HTTPS (Webhooks), TCP (ClamAV daemon communication).

The Implementation Deep-Dive

1. Designing the Intermediary Scanning Service

Genesys Cloud Webhooks provide the payload data, but they do not execute arbitrary code. You must deploy an external service to receive the webhook, process the attachment, and communicate with ClamAV. This service acts as the bridge between the CCaaS platform and your security infrastructure.

Architectural Reasoning

We do not connect Genesys directly to ClamAV for two reasons. First, clamd speaks its own command protocol over a TCP or Unix socket, not HTTP, so a webhook cannot call it directly. Second, clamd is a resource-intensive daemon: exposing its socket to the public internet would open your internal security infrastructure to abuse, and unthrottled traffic would degrade it under concurrent load. Instead, we place a lightweight intermediary service in front of it. This service handles connection pooling, retry logic, and payload transformation, and it lets you rate-limit requests so that a sudden spike in email volume does not overwhelm ClamAV.
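
The retry logic that the intermediary service is responsible for could look roughly like the sketch below. It is an illustration only: call_with_retry, its parameters, and the decision to retry solely on ConnectionError are assumptions made for this example, not part of any Genesys or ClamAV library.

```python
import time


def call_with_retry(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying on ConnectionError with exponential backoff.

    The sleep function is injectable so the backoff can be tested
    without actually waiting.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError as exc:
            last_error = exc
            # Back off before the next attempt, but not after the last one.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

Wrapping the call to the scanner in a helper like this keeps transient clamd restarts from turning into scan errors, while still surfacing a persistent outage to the caller.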

The Trap: Blocking the Webhook Thread

The most common misconfiguration is performing the virus scan synchronously within the webhook handler and waiting for the ClamAV response before returning a 200 OK to Genesys. Genesys expects a webhook response within 5 seconds. If your ClamAV scan takes 10 seconds, Genesys will mark the webhook as failed and retry. This causes duplicate processing and potential data corruption.

Solution: Implement an asynchronous pattern.

  1. Genesys sends the webhook.
  2. Your service receives the payload and returns 200 OK immediately.
  3. Your service spawns a background thread or pushes the task to a message queue (e.g., AWS SQS, RabbitMQ).
  4. The worker process retrieves the attachment, scans it, and updates the Genesys ticket/email via the REST API.
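
The four steps above can be sketched with the standard library alone. This is a minimal in-process illustration, assuming a single worker thread; accept_webhook, worker, and fake_scan are hypothetical names, and a production deployment would more likely use a durable broker such as SQS or RabbitMQ so that tasks survive a restart.

```python
import queue
import threading

# In-process work queue standing in for a real message broker.
scan_tasks = queue.Queue()


def accept_webhook(payload):
    """Steps 2 and 3: enqueue the work and acknowledge immediately."""
    scan_tasks.put(payload)
    return {"status": "accepted"}, 200


def worker(scan_fn, results):
    """Step 4: drain the queue, scanning one payload at a time."""
    while True:
        payload = scan_tasks.get()
        if payload is None:  # sentinel value: shut the worker down
            scan_tasks.task_done()
            break
        try:
            results.append(scan_fn(payload))
        finally:
            scan_tasks.task_done()


def fake_scan(payload):
    # Hypothetical stand-in for "download the attachment and call ClamAV".
    return (payload["emailId"], "CLEAN")


results = []
threading.Thread(target=worker, args=(fake_scan, results), daemon=True).start()

body, status = accept_webhook({"emailId": "email-1"})  # returns at once
scan_tasks.put(None)
scan_tasks.join()  # wait until the worker has processed everything
```

The handler's response time no longer depends on how long the scan takes, which is the point of the pattern.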

Implementation Steps

Step 1: Expose the ClamAV Interface
Ensure your ClamAV daemon is accessible. If you are using the standard clamd, it listens on a TCP socket (default port 3310). For easier integration, wrap it in a lightweight REST API.

Example Python wrapper using clamd library:

from flask import Flask, request, jsonify
import clamd

app = Flask(__name__)
cl = clamd.ClamdNetworkSocket('127.0.0.1', 3310)

@app.route('/scan', methods=['POST'])
def scan_file():
    if 'file' not in request.files:
        return jsonify({'error': 'No file part'}), 400

    file = request.files['file']
    if file.filename == '':
        return jsonify({'error': 'No selected file'}), 400

    try:
        # Stream the upload straight to clamd. Nothing is written to disk,
        # which also avoids building a path from the untrusted filename.
        result = cl.instream(file.stream)

        # The clamd library returns {'stream': (status, signature_or_None)}
        status, details = result['stream']

        if status == 'OK':
            return jsonify({'verdict': 'CLEAN', 'details': details})
        elif status == 'FOUND':
            return jsonify({'verdict': 'MALICIOUS', 'details': details})
        else:
            return jsonify({'verdict': 'ERROR', 'details': str(details)}), 500

    except Exception as e:
        return jsonify({'verdict': 'ERROR', 'details': str(e)}), 500

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Step 2: Build the Webhook Handler
Create an endpoint that accepts the Genesys webhook payload. The payload contains the email metadata and attachment URLs.

from flask import Flask, request, jsonify
import requests
import threading
import base64

app = Flask(__name__)

GENESYS_SUBDOMAIN = 'your-subdomain'
GENESYS_TOKEN = 'your-bearer-token' # Use OAuth2 service account token
CLAMAV_URL = 'http://internal-clamav-service:5000/scan'

@app.route('/webhook/email-ingest', methods=['POST'])
def handle_webhook():
    data = request.json
    
    # Immediate acknowledgment to Genesys
    # Do not process here. Return 200 OK.
    
    # Extract attachment info
    attachments = data.get('attachments', [])
    email_id = data.get('emailId')
    
    if not attachments:
        return jsonify({'status': 'accepted'}), 200
    
    # Spawn background thread for processing
    thread = threading.Thread(target=process_attachments, args=(attachments, email_id))
    thread.start()
    
    return jsonify({'status': 'accepted'}), 200

def process_attachments(attachments, email_id):
    headers = {'Authorization': f'Bearer {GENESYS_TOKEN}'}
    for att in attachments:
        attachment_id = att['id']
        try:
            # Download attachment
            response = requests.get(att['contentUrl'], headers=headers, timeout=30)
            response.raise_for_status()

            # Send to ClamAV
            files = {'file': (att['fileName'], response.content, att['contentType'])}
            scan_response = requests.post(CLAMAV_URL, files=files, timeout=120)

            # The wrapper returns HTTP 500 together with verdict 'ERROR', so
            # read the body regardless of status instead of requiring a 200.
            result = scan_response.json()
        except (requests.RequestException, ValueError) as e:
            # Download failure, scanner unreachable, or non-JSON response
            result = {'verdict': 'ERROR', 'details': str(e)}

        verdict = result.get('verdict')
        if verdict == 'MALICIOUS':
            quarantine_email(email_id, attachment_id, result.get('details'))
        elif verdict != 'CLEAN':
            # Optional: Quarantine on error to be safe
            quarantine_email(email_id, attachment_id, 'Scan Error')

def quarantine_email(email_id, attachment_id, reason):
    # Call Genesys API to add note or route to quarantine queue
    url = f'https://{GENESYS_SUBDOMAIN}.mypurecloud.com/api/v2/routing/email/addresses/{email_id}/notes'
    headers = {
        'Authorization': f'Bearer {GENESYS_TOKEN}',
        'Content-Type': 'application/json'
    }
    payload = {
        "body": f"ATTACHMENT QUARANTINED: {attachment_id} - Reason: {reason}",
        "isPrivate": True
    }
    requests.post(url, json=payload, headers=headers)

2. Configuring Genesys Cloud Webhooks

You must configure Genesys to send the email data to your service when a new email arrives.

Architectural Reasoning

We use the routing.emails.created event. This event fires when an email is ingested into Genesys. We do not use routing.emails.updated because that fires too frequently (on every status change) and would cause excessive scanning overhead. By using created, we ensure each email is scanned exactly once upon ingestion.

The Trap: Missing Attachment Payloads

By default, the routing.emails.created webhook payload does not include the full attachment binary data. It only includes metadata (ID, Name, Size, Content-Type) and a contentUrl. If you attempt to process the attachment directly from the webhook payload without making a subsequent API call to download it, your scan will fail. Always verify that your service has the necessary permissions to access the contentUrl.
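
A small guard in the service can separate attachments that are actually downloadable from entries that lack the required fields. The sketch below assumes the payload shape used by the handler in this guide (an attachments list whose entries carry id and contentUrl); downloadable_attachments is a hypothetical helper name.

```python
def downloadable_attachments(payload):
    """Split attachment metadata into entries that can be fetched and
    entries that cannot (missing id or contentUrl)."""
    usable, skipped = [], []
    for att in payload.get("attachments") or []:
        if att.get("id") and att.get("contentUrl"):
            usable.append(att)
        else:
            skipped.append(att)
    return usable, skipped


usable, skipped = downloadable_attachments({
    "emailId": "email-1",
    "attachments": [
        {"id": "a1", "fileName": "invoice.pdf", "contentUrl": "https://example.invalid/a1"},
        {"id": "a2", "fileName": "broken.bin"},  # no contentUrl in the payload
    ],
})
```

Entries that land in skipped were never scanned, so they should be logged and treated as scan errors rather than as clean.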

Implementation Steps

  1. Navigate to Admin > Integrations > Webhooks.
  2. Click Add Webhook.
  3. Name: Email-Virus-Scan-Ingest.
  4. Event Type: Select routing.emails.created.
  5. Endpoint URL: Enter the public URL of your Flask/Node.js service (e.g., https://your-service.com/webhook/email-ingest).
  6. Method: POST.
  7. Headers: Add Content-Type: application/json.
  8. Payload Configuration:
    • Ensure the payload includes attachments.
    • You may need to enable specific fields in the payload builder if using a custom payload. The default payload usually includes the attachment list.
  9. Click Save.

3. Implementing Routing Logic for Quarantine

Once the attachment is scanned, you need to route the email appropriately. If the scan is clean, the email proceeds normally. If malicious, it must be diverted.

Architectural Reasoning

We do not block the email at the webhook level because the webhook is asynchronous. The email is already in the Genesys system. Instead, we use Routing Rules to redirect emails based on attributes. Our webhook service updates the email with a private note or a custom attribute (if using Custom Attributes via API) indicating the scan status. The Routing Rule then checks this attribute.

Alternatively, a more robust approach is to use the Architect flow. We can pause the email in a “Holding” queue, perform the scan, and then use an API Step in Architect to query the scan result and route accordingly. However, the Webhook + Routing Rule approach is simpler and less prone to timeout issues if the scan takes longer than the Architect step limit.

For this guide, we will use the Routing Rule approach with a Custom Attribute.

The Trap: Race Conditions in Routing

If the Routing Rule evaluates the custom attribute before the webhook service has updated it, the email will be routed incorrectly. To mitigate this, you can add a small delay in the routing rule or use a “Staging” queue.

Solution: Use a Staging Queue.

  1. Route all incoming emails to a “Scan-Pending” queue.
  2. The webhook service scans the attachment and updates the custom attribute.
  3. A second Routing Rule triggers when the custom attribute changes or after a short delay, moving the email to “Inbox” or “Quarantine”.

However, Genesys Routing Rules do not easily trigger on custom attribute changes for emails in a queue. A better approach is:

  1. Webhook: Scans attachment.
  2. Webhook: If malicious, calls Genesys API to move the email to a “Quarantine” queue directly.
  3. Default Routing: If no webhook action is taken within X seconds (handled by a timeout in your service), the email moves to the normal queue.

Let us refine the webhook service to handle routing directly.

Update the quarantine_email function in your service:

def quarantine_email(email_id, attachment_id, reason):
    # Move email to Quarantine Queue
    quarantine_queue_id = 'your-quarantine-queue-id'
    
    url = f'https://{GENESYS_SUBDOMAIN}.mypurecloud.com/api/v2/routing/email/addresses/{email_id}'
    headers = {
        'Authorization': f'Bearer {GENESYS_TOKEN}',
        'Content-Type': 'application/json'
    }
    
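    # Update the email to move it to the quarantine queue
    # Note: This requires the email to be in a state that allows movement.
    # Verify the endpoint path and payload shape against the Genesys Cloud
    # API reference for your region before relying on them.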
    payload = {
        "routing": {
            "queueId": quarantine_queue_id,
            "type": "queue"
        }
    }
    
    try:
        # Always set a timeout so a stalled API call cannot hang the worker
        response = requests.put(url, json=payload, headers=headers, timeout=10)
        if not response.ok:
            print(f"Failed to move email {email_id} to quarantine: {response.text}")
    except requests.RequestException as e:
        print(f"Error moving email: {e}")
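
How the per-attachment verdicts collapse into a single routing action for the email could be sketched like this. It is an assumption-laden illustration: routing_decision and the action names "quarantine" and "release" are hypothetical, and the function deliberately fails closed.

```python
def routing_decision(verdicts):
    """Collapse per-attachment verdicts into one action for the email.

    Fails closed: anything other than an explicit CLEAN on every
    attachment keeps the email away from agents.
    """
    if any(v == "MALICIOUS" for v in verdicts):
        return "quarantine"
    if all(v == "CLEAN" for v in verdicts):
        return "release"  # also covers an email with no attachments
    return "quarantine"  # ERROR, timeout, or an unknown verdict
```

Calling the quarantine API once per email, based on this combined decision, avoids issuing several move requests for a message that carries more than one flagged attachment.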

4. Securing the ClamAV Instance

Your ClamAV instance must be secured to prevent unauthorized access and resource exhaustion.

Architectural Reasoning

ClamAV is a CPU-intensive application. If an attacker discovers your ClamAV endpoint, they can send thousands of large files to trigger a Denial of Service (DoS). Therefore, the ClamAV service must never be exposed to the public internet. It should only be accessible from your intermediary scanning service.

The Trap: No Rate Limiting

If you do not implement rate limiting on your intermediary service, a burst of emails can overwhelm ClamAV, causing scans to time out and fail. Failed scans should default to “Quarantine” to ensure security, but this means a saturated scanner will quarantine legitimate emails as false positives.

Solution: Implement rate limiting and queueing.

  • Use a message queue (e.g., Redis, RabbitMQ) to buffer scan requests.
  • Set a maximum concurrency limit for ClamAV scans (e.g., 10 concurrent scans).
  • If the queue is full, do not silently drop the scan request. Either hold the email and log it for manual review, or queue it for later processing, so that an unscanned attachment never reaches an agent.
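
The bounded queue and the concurrency cap can be combined in one small helper. The following is a sketch under stated assumptions: ScanThrottle and its methods are hypothetical names, the queue lives in process memory rather than in Redis or RabbitMQ, and overflowed tasks are simply collected in a list for later review.

```python
import queue
import threading


class ScanThrottle:
    """Bounded backlog plus a cap on simultaneous scans."""

    def __init__(self, max_pending=100, max_concurrent=10):
        self._pending = queue.Queue(maxsize=max_pending)
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self.overflow = []  # tasks rejected because the backlog was full

    def submit(self, task):
        """Queue a task; return False (and record it) when the queue is full."""
        try:
            self._pending.put_nowait(task)
            return True
        except queue.Full:
            self.overflow.append(task)
            return False

    def run_next(self, scan_fn):
        """Take the oldest task and scan it while holding a concurrency slot.

        Raises queue.Empty when there is nothing to do.
        """
        task = self._pending.get_nowait()
        with self._slots:
            return scan_fn(task)
```

Worker threads would call run_next in a loop; because each call holds a semaphore slot for the duration of the scan, no more than max_concurrent scans reach ClamAV at once regardless of how many workers exist.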

Validation, Edge Cases & Troubleshooting

Edge Case 1: Large Attachments

The Failure Condition: An email with a 50MB attachment causes the webhook handler to time out or the ClamAV scan to exceed memory limits.
The Root Cause: Genesys Cloud has limits on attachment sizes (typically 25MB for email, but this can vary by mail provider). However, if you are integrating with external mail systems that allow larger attachments, your service must handle them. ClamAV may run out of memory if scanning very large files.
The Solution:

  1. Check the attachment size in the webhook payload.
  2. If the size exceeds a threshold (e.g., 20MB), skip the scan and flag for manual review.
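  3. Configure the ClamAV size limits (MaxScanSize, MaxFileSize) in clamd.conf to match your infrastructure capabilities. Because the wrapper submits files over INSTREAM, StreamMaxLength must also be at least as large as the largest attachment you intend to scan.
# clamd.conf
MaxScanSize 25M
MaxFileSize 25M
StreamMaxLength 25M
MaxFiles 1500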
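
The size check in steps 1 and 2 might be implemented along these lines. The sketch assumes the attachment metadata exposes its size in bytes under a size key, which should be confirmed against the actual payload; scan_decision and the 20MB default are illustrative.

```python
MAX_SCAN_BYTES = 20 * 1024 * 1024  # 20MB threshold from step 2


def scan_decision(attachment, max_bytes=MAX_SCAN_BYTES):
    """Return 'scan' or 'manual_review' based on the reported size.

    A missing size is treated like an oversized file: the service cannot
    prove the attachment is small enough, so a human looks at it.
    """
    size = attachment.get("size")
    if size is None or size > max_bytes:
        return "manual_review"
    return "scan"
```

Keeping the threshold below the clamd limits leaves headroom, so a file that passes this check is not later rejected by the scanner itself.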

Edge Case 2: Encrypted Archives

The Failure Condition: ClamAV returns “OK” for a password-protected ZIP file containing malware.
The Root Cause: ClamAV cannot scan encrypted archives. It will skip the contents and report the archive itself as clean.
The Solution:

  1. Configure ClamAV to flag encrypted archives instead of passing them.
  2. In clamd.conf, set AlertEncryptedArchive yes (ClamAV 0.101 and later; earlier releases used ArchiveBlockEncrypted).
# clamd.conf
AlertEncryptedArchive yes

With this option set, clamd reports an encrypted archive as a heuristic detection (FOUND) rather than OK, so the pipeline treats it as malicious and the sender must resend the files unencrypted.
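
As a complement to the clamd setting, the intermediary service can pre-check ZIP attachments itself. The sketch below uses only the standard library and relies on bit 0 of each entry's general-purpose flag, which the ZIP format sets for encrypted entries. zip_has_encrypted_entries is a hypothetical helper, it covers ZIP only (not 7z or RAR), and it does not replace the scanner-side option.

```python
import io
import zipfile


def zip_has_encrypted_entries(data: bytes) -> bool:
    """Return True if any entry in the ZIP central directory is encrypted.

    Non-ZIP input returns False; other formats are left to ClamAV.
    """
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            # Bit 0 of the general-purpose flag marks an encrypted entry.
            return any(info.flag_bits & 0x1 for info in zf.infolist())
    except zipfile.BadZipFile:
        return False
```

A positive result can be routed to quarantine immediately, without spending a scan slot on a file ClamAV cannot inspect anyway.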

Edge Case 3: Webhook Delivery Failures

The Failure Condition: Your scanning service is down, and Genesys fails to deliver the webhook.
The Root Cause: Network outage, service crash, or SSL certificate expiration.
The Solution:

  1. Implement a health check endpoint in your service (/health) that your monitoring system or load balancer can poll, so that outages are detected before webhooks start failing.
  2. Configure Genesys Webhook retries. By default, Genesys retries failed webhooks. Ensure your service is idempotent (handling duplicate webhook calls correctly).
  3. Use a dead-letter queue (DLQ) in your message broker to capture failed scan requests for manual review.
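
Idempotent handling of retried webhooks can be as simple as remembering which events have already been accepted. The following sketch assumes the email ID is a suitable deduplication key and keeps state in process memory; SeenEvents is a hypothetical name, and a multi-instance deployment would need a shared store with expiry instead.

```python
import threading


class SeenEvents:
    """Thread-safe record of event keys that were already accepted."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()

    def first_time(self, event_id):
        """Return True exactly once per event_id, False on every repeat."""
        with self._lock:
            if event_id in self._seen:
                return False
            self._seen.add(event_id)
            return True


seen = SeenEvents()
deliveries = ["email-1", "email-2", "email-1"]  # third item is a retry
to_scan = [e for e in deliveries if seen.first_time(e)]
```

The webhook handler would still return 200 for a duplicate; it just skips enqueueing a second scan for the same email.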

Official References