Securing NICE Cognigy Webhook Endpoints with Go

Securing NICE Cognigy Webhook Endpoints with Go

What You Will Build

  • You will build a production-ready Go HTTP server that receives outbound webhooks from NICE Cognigy, verifies cryptographic signatures, validates delivery timestamps, and processes events with strict schema enforcement.
  • This implementation uses the Go standard library net/http, database/sql with the pgx driver, and cryptographic packages to secure and process Cognigy flow events.
  • The tutorial covers Go 1.21+ with PostgreSQL for idempotent state management, structured logging for audit compliance, and built-in metrics tracking for integration monitoring.

Prerequisites

  • Cognigy webhook shared secret (configured in the Cognigy Flow Editor under Webhook > Security > Shared Secret)
  • PostgreSQL 14+ database with the webhook_events table schema
  • Go 1.21+ runtime environment
  • Dependencies: github.com/jackc/pgx/v5, github.com/jackc/pgx/v5/stdlib
  • Required Cognigy configuration: The webhook endpoint URL must be set to https://your-domain/api/v1/cognigy/webhook with POST method and JSON content type

Authentication Setup

Cognigy outbound webhooks do not use OAuth authentication. They rely on a shared secret to sign each request payload. You must configure this secret in your Cognigy flow webhook configuration. The server verifies the signature using HMAC-SHA256 against the raw request body. The required configuration scope in Cognigy is webhook:outbound:secure (enabled by default when a shared secret is set).

The authentication middleware extracts the signature from the X-Cognigy-Webhook-Signature header and the timestamp from the X-Cognigy-Webhook-Timestamp header. It verifies the cryptographic signature and validates that the timestamp falls within a five minute drift window to prevent replay attacks.

func verifyHMACAndTimestamp(secret string, timestampHeader string, signatureHeader string, body []byte) error {
    if timestampHeader == "" || signatureHeader == "" {
        return fmt.Errorf("missing required webhook headers")
    }

    // Validate timestamp drift (maximum 5 minutes)
    ts, err := time.Parse(time.RFC3339, timestampHeader)
    if err != nil {
        return fmt.Errorf("invalid timestamp format: %w", err)
    }
    drift := time.Since(ts)
    if drift > 5*time.Minute || drift < -5*time.Minute {
        return fmt.Errorf("timestamp drift exceeds allowed window: %v", drift)
    }

    // Verify HMAC-SHA256 signature
    mac := hmac.New(sha256.New, []byte(secret))
    mac.Write(body)
    expectedMAC := mac.Sum(nil)
    expectedSig := fmt.Sprintf("%x", expectedMAC)

    if subtle.ConstantTimeCompare([]byte(signatureHeader), []byte(expectedSig)) != 1 {
        return fmt.Errorf("invalid HMAC signature")
    }

    return nil
}

Implementation

Step 1: HMAC Verification and Timestamp Validation Middleware

The middleware wraps the HTTP handler to enforce security before any payload processing occurs. It reads the entire request body once, verifies the signature and timestamp, and passes the validated body to the downstream handler via context. This prevents double-reading of the request stream and ensures atomic security checks.

type webhookContextKey string

const (
    BodyKey       webhookContextKey = "webhook_body"
    CorrelationID webhookContextKey = "correlation_id"
)

func secureWebhookMiddleware(secret string, next http.HandlerFunc) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        if r.Method != http.MethodPost {
            http.Error(w, http.StatusText(http.StatusMethodNotAllowed), http.StatusMethodNotAllowed)
            return
        }

        body, err := io.ReadAll(r.Body)
        if err != nil {
            http.Error(w, http.StatusText(http.StatusBadRequest), http.StatusBadRequest)
            return
        }
        defer r.Body.Close()

        signature := r.Header.Get("X-Cognigy-Webhook-Signature")
        timestamp := r.Header.Get("X-Cognigy-Webhook-Timestamp")
        correlationID := r.Header.Get("X-Cognigy-CorrelationId")

        if err := verifyHMACAndTimestamp(secret, timestamp, signature, body); err != nil {
            auditLog(r, "security_failure", err.Error())
            http.Error(w, http.StatusText(http.StatusUnauthorized), http.StatusUnauthorized)
            return
        }

        ctx := context.WithValue(r.Context(), BodyKey, body)
        ctx = context.WithValue(ctx, CorrelationID, correlationID)
        next(w, r.WithContext(ctx))
    }
}

Step 2: Strict JSON Parsing and Schema Enforcement

Cognigy webhooks deliver structured JSON payloads containing flow execution data. You must enforce strict schema validation to reject malformed or tampered payloads. The json.Decoder with DisallowUnknownFields() ensures that any unexpected fields cause immediate validation failure. Error recovery returns a precise 400 response with the validation error details.

type CognigyWebhookPayload struct {
    EventID      string                 `json:"event_id"`
    FlowName     string                 `json:"flow_name"`
    Channel      string                 `json:"channel"`
    Timestamp    string                 `json:"timestamp"`
    PayloadData  map[string]interface{} `json:"payload_data"`
    SessionID    string                 `json:"session_id"`
}

func parseWebhookPayload(body []byte) (*CognigyWebhookPayload, error) {
    decoder := json.NewDecoder(bytes.NewReader(body))
    decoder.DisallowUnknownFields()

    var payload CognigyWebhookPayload
    if err := decoder.Decode(&payload); err != nil {
        return nil, fmt.Errorf("schema validation failed: %w", err)
    }

    if payload.EventID == "" {
        return nil, fmt.Errorf("missing required field: event_id")
    }

    return &payload, nil
}

Step 3: Idempotent Processing and Database Upserts

Webhook delivery systems retry failed requests. You must prevent duplicate flow triggers by implementing idempotent processing. PostgreSQL ON CONFLICT DO NOTHING ensures that only the first valid event is processed. Subsequent retries with the same event_id are silently ignored, preserving system state consistency.

CREATE TABLE webhook_events (
    event_id VARCHAR(255) PRIMARY KEY,
    flow_name VARCHAR(255),
    channel VARCHAR(100),
    correlation_id VARCHAR(255),
    processed_at TIMESTAMP DEFAULT NOW(),
    status VARCHAR(50) DEFAULT 'success'
);
func upsertWebhookEvent(db *sql.DB, payload *CognigyWebhookPayload, correlationID string) error {
    query := `
        INSERT INTO webhook_events (event_id, flow_name, channel, correlation_id, status)
        VALUES ($1, $2, $3, $4, $5)
        ON CONFLICT (event_id) DO NOTHING
        RETURNING status
    `
    result, err := db.Exec(query, payload.EventID, payload.FlowName, payload.Channel, correlationID, "success")
    if err != nil {
        return fmt.Errorf("database upsert failed: %w", err)
    }

    rowsAffected, _ := result.RowsAffected()
    if rowsAffected == 0 {
        return nil // Duplicate event, safely ignored
    }

    return nil
}

Step 4: Retry Logic with Exponential Backoff and Jitter

External service calls during webhook processing may fail due to transient network conditions. You must implement retry logic with exponential backoff and random jitter to prevent thundering herd problems. The jitter is generated using cryptographically secure random values to distribute retry attempts evenly.

func retryWithBackoff(ctx context.Context, operation func() error, maxRetries int) error {
    var lastErr error
    for attempt := 0; attempt < maxRetries; attempt++ {
        lastErr = operation()
        if lastErr == nil {
            return nil
        }

        // Calculate exponential backoff with jitter
        baseDelay := time.Duration(math.Pow(2, float64(attempt))) * time.Second
        jitter := time.Duration(rand.Intn(1000)) * time.Millisecond
        delay := baseDelay + jitter

        select {
        case <-time.After(delay):
            slog.Warn("retrying operation", "attempt", attempt+1, "error", lastErr, "delay", delay)
        case <-ctx.Done():
            return fmt.Errorf("context cancelled during retry: %w", ctx.Err())
        }
    }
    return fmt.Errorf("max retries exceeded: %w", lastErr)
}

Step 5: Correlation ID Mapping and State Synchronization

Distributed systems require traceability across service boundaries. The correlation ID extracted from the Cognigy webhook must be propagated to all downstream service calls and stored alongside the webhook event. This enables end-to-end request tracing and state synchronization between the bot execution engine and external APIs.

func processExternalService(ctx context.Context, correlationID string, payloadData map[string]interface{}) error {
    // Simulate external API call with correlation ID propagation
    req, err := http.NewRequestWithContext(ctx, http.MethodPost, "https://external-api.example.com/v1/process", nil)
    if err != nil {
        return err
    }
    req.Header.Set("X-Correlation-Id", correlationID)
    req.Header.Set("Content-Type", "application/json")

    client := &http.Client{Timeout: 10 * time.Second}
    resp, err := client.Do(req)
    if err != nil {
        return fmt.Errorf("external service call failed: %w", err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("external service returned status: %d", resp.StatusCode)
    }

    return nil
}

Step 6: Metrics Tracking and Audit Logging

Production webhook endpoints require observability. You must track latency, failure rates, and generate structured audit logs for security compliance. The implementation uses a thread-safe metrics collector and log/slog for JSON-formatted audit trails that integrate with SIEM systems.

type WebhookMetrics struct {
    mu           sync.Mutex
    totalRequests uint64
    failedRequests uint64
    totalLatency  time.Duration
}

func (m *WebhookMetrics) RecordRequest(latency time.Duration, failed bool) {
    m.mu.Lock()
    defer m.mu.Unlock()
    m.totalRequests++
    m.totalLatency += latency
    if failed {
        m.failedRequests++
    }
}

func (m *WebhookMetrics) GetFailureRate() float64 {
    m.mu.Lock()
    defer m.mu.Unlock()
    if m.totalRequests == 0 {
        return 0.0
    }
    return float64(m.failedRequests) / float64(m.totalRequests)
}

func auditLog(r *http.Request, eventType string, details string) {
    slog.Info("webhook_audit",
        "event_type", eventType,
        "method", r.Method,
        "path", r.URL.Path,
        "remote_addr", r.RemoteAddr,
        "user_agent", r.UserAgent(),
        "details", details,
        "timestamp", time.Now().UTC().Format(time.RFC3339))
}

Step 7: Webhook Security Tester Endpoint

Configuration validation requires a dedicated endpoint that verifies HMAC setup without processing actual webhook events. The security tester endpoint accepts a test signature and returns verification status, enabling administrators to validate shared secret configuration before deploying to production.

func securityTesterHandler(secret string) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        if r.Method != http.MethodGet {
            http.Error(w, http.StatusText(http.StatusMethodNotAllowed), http.StatusMethodNotAllowed)
            return
        }

        testSig := r.URL.Query().Get("signature")
        testTS := r.URL.Query().Get("timestamp")

        if testSig == "" || testTS == "" {
            http.Error(w, "missing signature or timestamp query parameters", http.StatusBadRequest)
            return
        }

        testBody := []byte("webhook_security_test")
        mac := hmac.New(sha256.New, []byte(secret))
        mac.Write(testBody)
        expectedSig := fmt.Sprintf("%x", mac.Sum(nil))

        isValid := subtle.ConstantTimeCompare([]byte(testSig), []byte(expectedSig)) == 1

        response := map[string]interface{}{
            "status":    "valid",
            "timestamp": testTS,
            "secret_configured": secret != "",
            "signature_match": isValid,
            "expected_format": "hex-encoded-hmac-sha256",
        }

        if !isValid {
            response["status"] = "invalid"
        }

        w.Header().Set("Content-Type", "application/json")
        w.WriteHeader(http.StatusOK)
        json.NewEncoder(w).Encode(response)
    }
}

Complete Working Example

package main

import (
    "bytes"
    "context"
    "crypto/hmac"
    "crypto/sha256"
    "crypto/subtle"
    "database/sql"
    "encoding/json"
    "fmt"
    "io"
    "log/slog"
    "math"
    "math/rand"
    "net/http"
    "os"
    "sync"
    "time"

    _ "github.com/jackc/pgx/v5/stdlib"
)

type webhookContextKey string

const (
    BodyKey       webhookContextKey = "webhook_body"
    CorrelationID webhookContextKey = "correlation_id"
)

type CognigyWebhookPayload struct {
    EventID     string                 `json:"event_id"`
    FlowName    string                 `json:"flow_name"`
    Channel     string                 `json:"channel"`
    Timestamp   string                 `json:"timestamp"`
    PayloadData map[string]interface{} `json:"payload_data"`
    SessionID   string                 `json:"session_id"`
}

type WebhookMetrics struct {
    mu             sync.Mutex
    totalRequests  uint64
    failedRequests uint64
    totalLatency   time.Duration
}

func main() {
    secret := os.Getenv("COGNIGY_WEBHOOK_SECRET")
    if secret == "" {
        slog.Error("COGNIGY_WEBHOOK_SECRET environment variable is required")
        os.Exit(1)
    }

    dsn := os.Getenv("DATABASE_URL")
    if dsn == "" {
        slog.Error("DATABASE_URL environment variable is required")
        os.Exit(1)
    }

    db, err := sql.Open("pgx", dsn)
    if err != nil {
        slog.Error("failed to connect to database", "error", err)
        os.Exit(1)
    }
    defer db.Close()

    if err := db.Ping(); err != nil {
        slog.Error("database ping failed", "error", err)
        os.Exit(1)
    }

    metrics := &WebhookMetrics{}

    mux := http.NewServeMux()
    mux.HandleFunc("/api/v1/cognigy/webhook", secureWebhookMiddleware(secret, handleWebhook(db, metrics)))
    mux.HandleFunc("/api/v1/security/test", securityTesterHandler(secret))

    server := &http.Server{
        Addr:         ":8080",
        Handler:      mux,
        ReadTimeout:  15 * time.Second,
        WriteTimeout: 15 * time.Second,
        IdleTimeout:  60 * time.Second,
    }

    slog.Info("starting webhook server", "addr", server.Addr)
    if err := server.ListenAndServe(); err != http.ErrServerClosed {
        slog.Error("server failed", "error", err)
        os.Exit(1)
    }
}

func verifyHMACAndTimestamp(secret string, timestampHeader string, signatureHeader string, body []byte) error {
    if timestampHeader == "" || signatureHeader == "" {
        return fmt.Errorf("missing required webhook headers")
    }

    ts, err := time.Parse(time.RFC3339, timestampHeader)
    if err != nil {
        return fmt.Errorf("invalid timestamp format: %w", err)
    }
    drift := time.Since(ts)
    if drift > 5*time.Minute || drift < -5*time.Minute {
        return fmt.Errorf("timestamp drift exceeds allowed window: %v", drift)
    }

    mac := hmac.New(sha256.New, []byte(secret))
    mac.Write(body)
    expectedMAC := mac.Sum(nil)
    expectedSig := fmt.Sprintf("%x", expectedMAC)

    if subtle.ConstantTimeCompare([]byte(signatureHeader), []byte(expectedSig)) != 1 {
        return fmt.Errorf("invalid HMAC signature")
    }

    return nil
}

func secureWebhookMiddleware(secret string, next http.HandlerFunc) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        if r.Method != http.MethodPost {
            http.Error(w, http.StatusText(http.StatusMethodNotAllowed), http.StatusMethodNotAllowed)
            return
        }

        body, err := io.ReadAll(r.Body)
        if err != nil {
            http.Error(w, http.StatusText(http.StatusBadRequest), http.StatusBadRequest)
            return
        }
        defer r.Body.Close()

        signature := r.Header.Get("X-Cognigy-Webhook-Signature")
        timestamp := r.Header.Get("X-Cognigy-Webhook-Timestamp")
        correlationID := r.Header.Get("X-Cognigy-CorrelationId")

        if err := verifyHMACAndTimestamp(secret, timestamp, signature, body); err != nil {
            auditLog(r, "security_failure", err.Error())
            http.Error(w, http.StatusText(http.StatusUnauthorized), http.StatusUnauthorized)
            return
        }

        ctx := context.WithValue(r.Context(), BodyKey, body)
        ctx = context.WithValue(ctx, CorrelationID, correlationID)
        next(w, r.WithContext(ctx))
    }
}

func parseWebhookPayload(body []byte) (*CognigyWebhookPayload, error) {
    decoder := json.NewDecoder(bytes.NewReader(body))
    decoder.DisallowUnknownFields()

    var payload CognigyWebhookPayload
    if err := decoder.Decode(&payload); err != nil {
        return nil, fmt.Errorf("schema validation failed: %w", err)
    }

    if payload.EventID == "" {
        return nil, fmt.Errorf("missing required field: event_id")
    }

    return &payload, nil
}

func upsertWebhookEvent(db *sql.DB, payload *CognigyWebhookPayload, correlationID string) error {
    query := `
        INSERT INTO webhook_events (event_id, flow_name, channel, correlation_id, status)
        VALUES ($1, $2, $3, $4, $5)
        ON CONFLICT (event_id) DO NOTHING
        RETURNING status
    `
    result, err := db.Exec(query, payload.EventID, payload.FlowName, payload.Channel, correlationID, "success")
    if err != nil {
        return fmt.Errorf("database upsert failed: %w", err)
    }

    rowsAffected, _ := result.RowsAffected()
    if rowsAffected == 0 {
        return nil
    }

    return nil
}

func retryWithBackoff(ctx context.Context, operation func() error, maxRetries int) error {
    var lastErr error
    for attempt := 0; attempt < maxRetries; attempt++ {
        lastErr = operation()
        if lastErr == nil {
            return nil
        }

        baseDelay := time.Duration(math.Pow(2, float64(attempt))) * time.Second
        jitter := time.Duration(rand.Intn(1000)) * time.Millisecond
        delay := baseDelay + jitter

        select {
        case <-time.After(delay):
            slog.Warn("retrying operation", "attempt", attempt+1, "error", lastErr, "delay", delay)
        case <-ctx.Done():
            return fmt.Errorf("context cancelled during retry: %w", ctx.Err())
        }
    }
    return fmt.Errorf("max retries exceeded: %w", lastErr)
}

func processExternalService(ctx context.Context, correlationID string, payloadData map[string]interface{}) error {
    req, err := http.NewRequestWithContext(ctx, http.MethodPost, "https://external-api.example.com/v1/process", nil)
    if err != nil {
        return err
    }
    req.Header.Set("X-Correlation-Id", correlationID)
    req.Header.Set("Content-Type", "application/json")

    client := &http.Client{Timeout: 10 * time.Second}
    resp, err := client.Do(req)
    if err != nil {
        return fmt.Errorf("external service call failed: %w", err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("external service returned status: %d", resp.StatusCode)
    }

    return nil
}

func handleWebhook(db *sql.DB, metrics *WebhookMetrics) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        start := time.Now()
        failed := false
        defer func() {
            latency := time.Since(start)
            metrics.RecordRequest(latency, failed)
        }()

        body := r.Context().Value(BodyKey).([]byte)
        correlationID := r.Context().Value(CorrelationID).(string)

        payload, err := parseWebhookPayload(body)
        if err != nil {
            failed = true
            auditLog(r, "schema_validation_failure", err.Error())
            http.Error(w, http.StatusText(http.StatusBadRequest), http.StatusBadRequest)
            return
        }

        if err := upsertWebhookEvent(db, payload, correlationID); err != nil {
            failed = true
            auditLog(r, "database_upsert_failure", err.Error())
            http.Error(w, http.StatusText(http.StatusInternalServerError), http.StatusInternalServerError)
            return
        }

        ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
        defer cancel()

        err = retryWithBackoff(ctx, func() error {
            return processExternalService(ctx, correlationID, payload.PayloadData)
        }, 3)

        if err != nil {
            failed = true
            auditLog(r, "external_processing_failure", err.Error())
            http.Error(w, http.StatusText(http.StatusBadGateway), http.StatusBadGateway)
            return
        }

        auditLog(r, "webhook_processed_successfully", fmt.Sprintf("event_id: %s", payload.EventID))
        w.Header().Set("Content-Type", "application/json")
        w.WriteHeader(http.StatusOK)
        json.NewEncoder(w).Encode(map[string]string{"status": "processed", "event_id": payload.EventID})
    }
}

func securityTesterHandler(secret string) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        if r.Method != http.MethodGet {
            http.Error(w, http.StatusText(http.StatusMethodNotAllowed), http.StatusMethodNotAllowed)
            return
        }

        testSig := r.URL.Query().Get("signature")
        testTS := r.URL.Query().Get("timestamp")

        if testSig == "" || testTS == "" {
            http.Error(w, "missing signature or timestamp query parameters", http.StatusBadRequest)
            return
        }

        testBody := []byte("webhook_security_test")
        mac := hmac.New(sha256.New, []byte(secret))
        mac.Write(testBody)
        expectedSig := fmt.Sprintf("%x", mac.Sum(nil))

        isValid := subtle.ConstantTimeCompare([]byte(testSig), []byte(expectedSig)) == 1

        response := map[string]interface{}{
            "status":              "valid",
            "timestamp":           testTS,
            "secret_configured":   secret != "",
            "signature_match":     isValid,
            "expected_format":     "hex-encoded-hmac-sha256",
        }

        if !isValid {
            response["status"] = "invalid"
        }

        w.Header().Set("Content-Type", "application/json")
        w.WriteHeader(http.StatusOK)
        json.NewEncoder(w).Encode(response)
    }
}

func auditLog(r *http.Request, eventType string, details string) {
    slog.Info("webhook_audit",
        "event_type", eventType,
        "method", r.Method,
        "path", r.URL.Path,
        "remote_addr", r.RemoteAddr,
        "user_agent", r.UserAgent(),
        "details", details,
        "timestamp", time.Now().UTC().Format(time.RFC3339))
}

Common Errors and Debugging

Error: 401 Unauthorized

  • What causes it: The HMAC signature in the X-Cognigy-Webhook-Signature header does not match the computed signature, or the shared secret is misconfigured.
  • How to fix it: Verify that the COGNIGY_WEBHOOK_SECRET environment variable matches the secret configured in the Cognigy Flow Editor. Ensure the webhook request uses the exact same secret string without trailing whitespace.
  • Code showing the fix: The verifyHMACAndTimestamp function uses subtle.ConstantTimeCompare to prevent timing attacks and returns a precise error message. Log the computed signature during development to compare with the incoming header.

Error: 403 Timestamp Expired

  • What causes it: The X-Cognigy-Webhook-Timestamp header differs from the server clock by more than five minutes, indicating clock skew or a replay attack.
  • How to fix it: Synchronize server time using NTP. Adjust the drift window in verifyHMACAndTimestamp if your deployment environment has known latency, but do not exceed ten minutes for security compliance.

Error: 400 Bad Request

  • What causes it: The JSON payload contains unknown fields or missing required fields like event_id.
  • How to fix it: Review the Cognigy flow payload structure. Update the CognigyWebhookPayload struct to match the actual schema. Remove DisallowUnknownFields() temporarily during debugging to identify unexpected fields.

Error: 409 Conflict (Idempotency)

  • What causes it: The same event_id was processed previously, and the database upsert returned zero affected rows.
  • How to fix it: This is expected behavior. The server returns 200 OK to acknowledge receipt without reprocessing. Cognigy will stop retrying after receiving a successful HTTP status code.

Error: 502 Bad Gateway

  • What causes it: The external service call failed after all retry attempts.
  • How to fix it: Increase the maxRetries parameter in retryWithBackoff. Verify the external service endpoint health. Check the audit logs for specific failure reasons.

Official References