Designing SDK Mock Server Implementations for Offline Development and Unit Testing

What This Guide Covers

You will build a deterministic mock server that intercepts CCaaS SDK network calls, simulates authentication, media negotiation, and routing events, and enables fully offline unit testing with controlled failure injection. The end result is a local development environment that mirrors production transport behavior without consuming platform licenses, external carrier capacity, or network bandwidth.

Prerequisites, Roles & Licensing

  • Licensing Tiers: Genesys Cloud CX 2 or CX 3, NICE CXone Professional or Enterprise. SDK mocking requires no production seat consumption but assumes your development environment holds valid SDK distribution access.
  • Permission Strings: Developer > Application > Edit, Telephony > Trunk > View, Reporting > Analytics > Read. These permissions allow you to inspect production traffic patterns and extract exact request/response schemas for mocking.
  • OAuth Scopes: login:read, phone:call:read, phone:call:write, webchat:read, user:read, presence:read. These scopes dictate the boundary of your mock server capabilities.
  • External Dependencies: Node.js 18 LTS, Express 4.x, ws 8.x, http-proxy-middleware 2.x, Jest or Mocha testing framework, nock or msw (optional for HTTP-only fallbacks).
  • External Dependencies (Platform): Access to production or sandbox environment traffic logs for schema extraction. You need exact JSON structures for Genesys Cloud Web SDK or NICE CXone CCaaS SDK before mocking begins.

The Implementation Deep-Dive

1. Transport Layer Interception and Reverse Proxy Architecture

SDK unit tests fail when they rely on class-level stubbing. Platform SDKs update internal method signatures, rename private properties, and refactor WebSocket handlers without notice. Class-level mocks break immediately. The correct approach is transport-layer interception. You build a local reverse proxy that sits between your application code and the platform endpoints. The SDK believes it is communicating with api.mypurecloud.com or api.nice-incontact.com. Your mock server intercepts the request, evaluates the path, and returns a controlled response.

You configure your development environment to route SDK traffic through a local host alias. Update your local hosts file or use a proxy configuration in your test runner. The mock server listens on port 9443 for HTTPS traffic and port 8080 for HTTP fallback. You disable certificate validation in the test environment only. Production code never ships with disabled validation.

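// mock-server/index.js
// ES module entry point (assumes "type": "module" in package.json).
import express from 'express';
import { createServer } from 'http';
import { WebSocketServer } from 'ws';
// Handlers are imported statically because require() is not available
// inside ES modules. Each module default-exports its handler function.
import authRoute from './routes/auth.js';
import telephonyRoute from './routes/telephony.js';
import handleWebSocket from './handlers/websocket.js';

const app = express();
// Plain HTTP keeps this listing short. For HTTPS on :9443, swap in
// https.createServer({ key, cert }, app) with a test-only certificate.
const server = createServer(app);
const wss = new WebSocketServer({ noServer: true });

// HTTP Interception Router
// Inside the handler, req.path is relative to the '/api/v2' mount point.
app.use('/api/v2', express.json(), (req, res) => {
  if (req.path.startsWith('/authorization/oauth/token')) {
    return authRoute(req, res);
  }
  if (req.path.startsWith('/telephony/v2/outbound/calls')) {
    return telephonyRoute(req, res);
  }
  // Unmocked paths fail loudly so missing coverage is visible in tests.
  res.status(404).json({ error: 'MOCK_NOT_IMPLEMENTED', path: req.path });
});

// WebSocket Upgrade Handler
server.on('upgrade', (request, socket, head) => {
  const url = new URL(request.url, `http://${request.headers.host}`);
  if (url.pathname === '/ws/v1') {
    wss.handleUpgrade(request, socket, head, (ws) => {
      wss.emit('connection', ws, request);
    });
  } else {
    // Reject upgrade attempts on any other path.
    socket.destroy();
  }
});

wss.on('connection', (ws) => {
  handleWebSocket(ws);
});

server.listen(9443, () => console.log('Mock transport layer active on :9443'));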

The Trap: Developers configure the mock server to return static JSON files for every endpoint. Static responses ignore request body validation, skip state tracking, and fail to simulate platform rate limiting. When your application sends a malformed payload or attempts a rapid sequence of calls, the mock server returns success anyway. Your unit tests pass. Production fails immediately due to platform validation rules.

Architectural Reasoning: You must implement request validation and stateful routing. The mock server inspects Content-Type, validates required fields, and tracks session state. You mirror platform behavior by returning 400 Bad Request for missing fields, 429 Too Many Requests once a client exceeds five calls per second, and 401 Unauthorized when tokens expire. This forces your application code to handle validation errors gracefully. You build resilience into your client, not your test suite.
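The validation and rate-limiting rules above can be sketched as two framework-independent functions. This is a minimal sketch, not the platform's actual rule set: the names createRateLimiter and evaluateTokenRequest, the error strings, and the use of 415 for a wrong Content-Type are assumptions made for illustration. The clock is injected so tests can control time.

```javascript
// Sliding-window rate limiter. `now` is injectable so tests stay deterministic.
function createRateLimiter(limit = 5, windowMs = 1000, now = () => Date.now()) {
  const hits = new Map(); // clientKey -> timestamps inside the current window
  return function allow(clientKey) {
    const t = now();
    const recent = (hits.get(clientKey) || []).filter((ts) => t - ts < windowMs);
    if (recent.length >= limit) {
      hits.set(clientKey, recent);
      return false; // rejected calls are not counted against the window
    }
    recent.push(t);
    hits.set(clientKey, recent);
    return true;
  };
}

// Decides which status the mock should return for a token request.
// `headers` uses lowercase keys, as Express exposes them on req.headers.
function evaluateTokenRequest(headers, body, allow) {
  const clientKey = body && body.client_id ? body.client_id : 'anonymous';
  if (!allow(clientKey)) {
    return { status: 429, body: { error: 'RATE_LIMIT_EXCEEDED' } };
  }
  const contentType = headers['content-type'] || '';
  if (!contentType.startsWith('application/json')) {
    // Assumption: 415 for a wrong Content-Type; the guide only specifies 400 for missing fields.
    return { status: 415, body: { error: 'UNSUPPORTED_MEDIA_TYPE' } };
  }
  const required = ['grant_type', 'client_id', 'client_secret'];
  const missing = required.filter((field) => !body || !body[field]);
  if (missing.length > 0) {
    return { status: 400, body: { error: 'MISSING_FIELDS', fields: missing } };
  }
  return { status: 200, body: { ok: true } };
}
```

Inside an Express handler you would call evaluateTokenRequest(req.headers, req.body, allow) and forward the returned status and body to res.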

2. OAuth2 Lifecycle Simulation and Token Expiry Enforcement

CCaaS platforms use OAuth 2.0 for SDK authentication. Genesys Cloud uses client credentials flow for service accounts and authorization code flow for user sessions. NICE CXone uses similar patterns with platform-specific grant types. Your mock server must simulate the complete token lifecycle. You generate JWT-like tokens with controlled exp claims. You enforce refresh logic by returning 401 responses after the TTL expires. You simulate refresh token rotation by invalidating the old refresh token and issuing a new one.

// POST /api/v2/authorization/oauth/token
// Request Body
{
  "grant_type": "client_credentials",
  "client_id": "dev-sdk-mock-client",
  "client_secret": "mock-secret-unsafe-for-prod"
}

// Response Body (200 OK)
{
  "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwiaWF0IjoxNTE2MjM5MDIyLCJleHAiOjE1MTYyNDI2MjJ9.mock_signature",
  "token_type": "Bearer",
  "expires_in": 3600,
  "refresh_token": "dGhpcyBpcyBhIG1vY2sgcmVmcmVzaCB0b2tlbg.mock",
  "scope": "login:read phone:call:read phone:call:write"
}

You store active tokens in an in-memory map keyed by client_id. You attach a decay timer to each entry. When the timer fires, you mark the token as expired. Subsequent requests with that token receive 401. You force your application to execute the refresh path. You verify that your application handles the race condition where a request fails with 401 while a refresh is already in progress.
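One way to sketch the token store described above is shown below. It swaps the per-entry decay timer for a timestamp comparison against an injected clock, which gives the same observable behavior (401 after the TTL) without real timers. The class name MockTokenStore, its method names, and the token string format are hypothetical.

```javascript
class MockTokenStore {
  constructor(now = () => Date.now()) {
    this.now = now;          // injectable clock for deterministic tests
    this.tokens = new Map(); // client_id -> { accessToken, refreshToken, expiresAt }
    this.counter = 0;
  }

  // Issues a fresh token pair and replaces any previous entry for the client.
  issue(clientId, ttlSeconds = 3600) {
    this.counter += 1;
    const entry = {
      accessToken: `mock-access-${clientId}-${this.counter}`,
      refreshToken: `mock-refresh-${clientId}-${this.counter}`,
      expiresAt: this.now() + ttlSeconds * 1000,
    };
    this.tokens.set(clientId, entry);
    return entry;
  }

  // Returns the HTTP status the mock should answer with: 200 or 401.
  check(clientId, accessToken) {
    const entry = this.tokens.get(clientId);
    if (!entry || entry.accessToken !== accessToken) return 401;
    return this.now() < entry.expiresAt ? 200 : 401;
  }

  // Test harness hook: expire the current token on demand.
  expireNow(clientId) {
    const entry = this.tokens.get(clientId);
    if (entry) entry.expiresAt = this.now();
  }

  // Refresh token rotation: the old refresh token stops working after one use.
  refresh(clientId, refreshToken, ttlSeconds = 3600) {
    const entry = this.tokens.get(clientId);
    if (!entry || entry.refreshToken !== refreshToken) return null;
    return this.issue(clientId, ttlSeconds);
  }
}
```

A null result from refresh would map to an error response in the route handler, which lets a test confirm that the application discards a rotated refresh token.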

The Trap: Developers mock authentication by returning a static token that never expires. The application never triggers the refresh logic. Token rotation bugs, concurrent request locking during refresh, and stale token cache evictions remain undetected. When deployed to production, your application experiences cascading 401 failures during peak load because the refresh path was never exercised.

Architectural Reasoning: You simulate token expiry deterministically. You expose a test harness endpoint that allows you to trigger expiry on demand. You verify that your application implements a refresh queue, so that concurrent requests wait for the in-flight refresh to complete rather than spawning parallel refresh calls. This guide assumes the platform either rejects duplicate refresh requests with 409 Conflict or queues them internally; confirm the exact behavior against your platform's documentation. Your mock server mimics the stricter case by rejecting concurrent refresh attempts for the same client_id until the first completes.
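The mock-side rejection of concurrent refresh attempts could look like the following sketch. RefreshGate, its method names, and the REFRESH_IN_PROGRESS error string are hypothetical; only the 409 status comes from the text above.

```javascript
// Tracks which client_ids currently have a refresh in flight.
class RefreshGate {
  constructor() {
    this.inFlight = new Set();
  }

  // Call when a refresh request arrives. Returns the status to answer with
  // if the request must be rejected, or null if the refresh may proceed.
  begin(clientId) {
    if (this.inFlight.has(clientId)) {
      return { status: 409, error: 'REFRESH_IN_PROGRESS' };
    }
    this.inFlight.add(clientId);
    return null;
  }

  // Call when the refresh response has been sent (success or failure).
  complete(clientId) {
    this.inFlight.delete(clientId);
  }
}
```

Holding the first refresh open for a controlled duration before calling complete gives a test a window in which a second refresh from the same client must be observed and rejected.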

3. WebSocket Handshake and Realtime Event Throttling

Realtime communication in CCaaS platforms relies on WebSockets. Genesys Cloud uses /ws/v1 for presence, routing, and media control. NICE CXone uses /ws/v1/stream for similar functionality. Your mock server must handle the WebSocket handshake, maintain session state, and simulate event streams. The handshake requires the server to read the Sec-WebSocket-Key header and return the matching Sec-WebSocket-Accept value. The ws library performs this step inside handleUpgrade, so you write a validator by hand only if you replace ws with a raw socket implementation. You then route incoming messages to a state machine.

You simulate platform events by pushing JSON frames to the client. You throttle event delivery to match SDK processing limits. The SDK processes events asynchronously. If you push events faster than the SDK can consume them, the internal buffer overflows. The SDK drops frames. Your tests pass locally but fail in CI where resource contention exists.

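// mock-server/handlers/websocket.js
import { EventEmitter } from 'events';

// Exported so the test harness can look up a session and emit PUSH_EVENT on it.
export const sessionStore = new Map();
let nextSessionNumber = 0;

export default function handleWebSocket(ws) {
  // A counter avoids the ID collisions that Date.now() produces when two
  // sockets connect within the same millisecond.
  const sessionId = `mock-session-${++nextSessionNumber}`;
  const stateMachine = new EventEmitter();
  const session = { ws, stateMachine, inFlight: 0, backlog: [] };
  sessionStore.set(sessionId, session);

  ws.on('message', (data) => {
    let payload;
    try {
      payload = JSON.parse(data.toString());
    } catch {
      return; // Ignore frames that are not valid JSON instead of crashing.
    }
    if (payload.type === 'SUBSCRIBE') {
      ws.send(JSON.stringify({ type: 'SUBSCRIBED', channels: payload.channels }));
    }
    if (payload.type === 'PING') {
      ws.send(JSON.stringify({ type: 'PONG', timestamp: Date.now() }));
    }
  });

  // Send one frame; once it is flushed, release the next backlog entry.
  const sendFrame = (event) => {
    session.inFlight++;
    ws.send(JSON.stringify(event), () => {
      session.inFlight--;
      const next = session.backlog.shift();
      if (next !== undefined) sendFrame(next);
    });
  };

  // Test control: stateMachine.emit('PUSH_EVENT', event)
  stateMachine.on('PUSH_EVENT', (event) => {
    // Backpressure: allow at most 10 unflushed frames and queue the rest
    // in arrival order so delivery order is preserved.
    if (session.inFlight >= 10) {
      session.backlog.push(event);
    } else {
      sendFrame(event);
    }
  });

  ws.on('close', () => sessionStore.delete(sessionId));
}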

The Trap: Developers fire WebSocket events synchronously without backpressure handling. The test runner executes faster than the SDK event loop. Events arrive in bursts. The SDK drops frames due to internal queue limits. Your assertions verify that events were received, but the order is wrong or frames are missing. Tests become flaky. You waste hours debugging timing issues that do not exist in production because production network latency naturally throttles the stream.

Architectural Reasoning: You implement a deterministic event queue with explicit backpressure. You expose a pushEvent(event, delay) method in your test harness. You control the exact timing of each frame. You simulate network jitter by adding random delays within a bounded range. You verify that your application handles out-of-order events, duplicate frames, and missed heartbeats. You match platform behavior by implementing a heartbeat mechanism. The mock server sends PING frames every 15 seconds. If the client does not respond with PONG within 5 seconds, the server closes the connection. This forces your application to implement keep-alive logic.
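A deterministic queue behind pushEvent(event, delay) might be sketched as follows. Instead of real timers it uses a virtual timeline that the test advances explicitly, so frame order never depends on runner speed. The class name DeterministicEventQueue and the advance method are assumptions; only pushEvent(event, delay) is named in the text above.

```javascript
class DeterministicEventQueue {
  // `send` delivers one frame, e.g. (frame) => ws.send(JSON.stringify(frame)).
  constructor(send) {
    this.send = send;
    this.nowMs = 0;    // virtual time, advanced only by advance()
    this.pending = []; // entries of { dueAt, seq, event }
    this.seq = 0;      // tie-breaker that preserves push order
  }

  // Schedules a frame `delayMs` after the current virtual time.
  pushEvent(event, delayMs = 0) {
    this.pending.push({ dueAt: this.nowMs + delayMs, seq: this.seq++, event });
  }

  // Moves virtual time forward and delivers every frame that became due,
  // ordered by due time and then by push order. Returns the delivered count.
  advance(ms) {
    this.nowMs += ms;
    const due = this.pending
      .filter((p) => p.dueAt <= this.nowMs)
      .sort((a, b) => a.dueAt - b.dueAt || a.seq - b.seq);
    this.pending = this.pending.filter((p) => p.dueAt > this.nowMs);
    for (const p of due) this.send(p.event);
    return due.length;
  }
}
```

Bounded jitter can be layered on top by adding a seeded pseudo-random offset to delayMs, and the 15-second PING cadence can be scheduled through the same queue so heartbeat tests run without waiting in real time.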

4. Deterministic State Machine and Test Orchestration

CCaaS call flows follow strict state transitions. A typical call moves from INIT to RINGING to ANSWERED, optionally through HOLD, and finally to COMPLETE. Your mock server must track these states and prevent invalid transitions. You build a state machine that validates transitions before emitting events. You expose a test orchestration API that allows your unit tests to drive state changes explicitly.

// State Machine Transition Request (Test Orchestration API)
// POST /mock/admin/session/:sessionId/transition
{
  "from": "RINGING",
  "to": "ANSWERED",
  "metadata": {
    "callId": "c-123456",
    "direction": "outbound",
    "carrierResponse": "200 OK"
  }
}

You reject invalid transitions. If a test attempts to move from INIT directly to COMPLETE, the mock server returns 422 Unprocessable Entity. You force your application to handle state validation errors. You simulate platform delays by holding transitions until a release signal is sent. This allows you to test timeout logic, retry mechanisms, and user interface state synchronization.

The Trap: Developers allow free-form state transitions in their mock server. The test suite drives the mock server with arbitrary transitions. The application code receives events in impossible sequences. Your tests pass because the mock server accepts everything. Production fails when the platform rejects invalid transitions. You discover race conditions where your application assumes a call is ANSWERED before the platform confirms the handshake.

Architectural Reasoning: You enforce strict state validation. You mirror platform transition rules exactly. You document the allowed transitions in your test harness. You verify that your application waits for explicit platform confirmation before proceeding. You simulate carrier delays by holding the ANSWERED transition for a configurable duration. You test that your application handles timeout scenarios gracefully. You verify that your application releases resources when a call transitions to FAILED or TIMEOUT. This matches production behavior where carrier networks introduce latency and platform state machines enforce strict progression rules.
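The transition validation described in this section can be sketched with a lookup table. The state names come from the text; the specific set of allowed transitions, the class name CallStateMachine, and the error strings are assumptions that you would replace with the rules observed on your platform.

```javascript
// Assumed transition table; terminal states allow no further moves.
const ALLOWED_TRANSITIONS = {
  INIT: ['RINGING', 'FAILED'],
  RINGING: ['ANSWERED', 'FAILED', 'TIMEOUT'],
  ANSWERED: ['HOLD', 'COMPLETE', 'FAILED'],
  HOLD: ['ANSWERED', 'COMPLETE', 'FAILED'],
  COMPLETE: [],
  FAILED: [],
  TIMEOUT: [],
};

class CallStateMachine {
  constructor() {
    this.state = 'INIT';
  }

  // Mirrors the orchestration request body: { from, to }.
  // Returns the status the mock admin endpoint should answer with.
  transition(from, to) {
    if (from !== this.state) {
      return { status: 422, error: 'STALE_FROM_STATE', current: this.state };
    }
    const allowed = ALLOWED_TRANSITIONS[from] || [];
    if (!allowed.includes(to)) {
      return { status: 422, error: 'INVALID_TRANSITION', current: this.state };
    }
    this.state = to;
    return { status: 200, current: this.state };
  }
}
```

Keeping the table as plain data makes it easy to print into the test harness documentation and to diff when platform rules change.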

5. Failure Injection and Transport Degradation Patterns

Production environments experience network degradation, carrier failures, and platform outages. Your mock server must simulate these conditions deterministically. You implement failure injection at the transport layer. You simulate HTTP 503 Service Unavailable, WebSocket disconnects, and rate limiting. You control the injection window and recovery behavior.

// Failure Injection Middleware
function injectFailure(req, res, next) {
  const failureConfig = req.query.failure;
  if (failureConfig === 'timeout') {
    setTimeout(() => res.status(504).json({ error: 'GATEWAY_TIMEOUT' }), 5000);
    return;
  }
  if (failureConfig === 'rate_limit') {
    res.set('Retry-After', '10');
    res.status(429).json({ error: 'RATE_LIMIT_EXCEEDED' });
    return;
  }
  if (failureConfig === 'socket_drop') {
    next();
    // WebSocket handler will force disconnect after N frames
    return;
  }
  next();
}

You simulate carrier degradation by injecting latency into HTTP responses. You simulate WebSocket instability by closing the connection randomly after a configurable number of frames. You verify that your application implements exponential backoff, retry queues, and graceful degradation. You test that your application notifies users when the platform becomes unreachable.
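The forced disconnect after a configurable number of frames can be sketched as a wrapper around the socket's send and close functions. This version drops at a fixed frame count rather than a random one, which keeps tests reproducible. The name createFrameDropper, the close code 1011, and the reason string are assumptions.

```javascript
// Wraps send/close so the connection is closed after `maxFrames` frames.
// `send` and `close` are passed in, e.g.
//   (frame) => ws.send(JSON.stringify(frame)) and (code, reason) => ws.close(code, reason)
function createFrameDropper(send, close, maxFrames) {
  let sent = 0;
  let closed = false;
  return function sendFrame(frame) {
    if (closed) return false; // frames after the drop are discarded
    send(frame);
    sent += 1;
    if (sent >= maxFrames) {
      closed = true;
      close(1011, 'MOCK_SOCKET_DROP'); // 1011: server encountered an unexpected condition
    }
    return true;
  };
}
```

The test can then assert that the application reconnects and resubscribes after exactly maxFrames frames have been delivered.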

The Trap: Developers test only happy paths. The mock server returns success for every request. The application never encounters network failures. When deployed to production, your application hangs indefinitely during platform outages. Users see frozen interfaces. Resources leak. Support tickets spike. You discover that your application lacks circuit breaker logic and retry limits.

Architectural Reasoning: You treat failure injection as a first-class requirement. You simulate platform outages by returning 5xx errors consistently for a configurable window. You verify that your application implements circuit breaker patterns. You test that your application switches to offline mode when the platform becomes unreachable. You verify that your application queues actions and replays them when connectivity is restored. You match platform behavior by implementing retry-after headers and exponential backoff guidance. You force your application to handle degradation gracefully rather than failing catastrophically.
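A configurable outage window and the matching client-side backoff schedule might be sketched as below. Both function names, the shape of the outage object, and the default base and cap values are assumptions chosen for illustration.

```javascript
// Mock side: decide the response status for a request arriving at `nowMs`.
// `outage` is { startMs, durationMs, status }, e.g. status 503.
function outageStatus(nowMs, outage) {
  const endMs = outage.startMs + outage.durationMs;
  const active = nowMs >= outage.startMs && nowMs < endMs;
  if (!active) return { status: 200 };
  // Retry-After is expressed in whole seconds until the window closes.
  return { status: outage.status, retryAfterSeconds: Math.ceil((endMs - nowMs) / 1000) };
}

// Client side: exponential backoff with a cap. `attempt` starts at 0.
function backoffDelayMs(attempt, baseMs = 250, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

Driving outageStatus from a virtual clock lets a test open the window, assert that the application's circuit breaker trips, close the window, and assert that queued actions are replayed.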

Validation, Edge Cases & Troubleshooting

Edge Case 1: Virtual Clock Drift in Token Validation

The Failure Condition: Unit tests fail intermittently with 401 Unauthorized errors despite valid tokens. The mock server rejects tokens that appear unexpired.
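The Root Cause: Your mock server uses Date.now() for token expiry validation, while your test harness advances a virtual clock (for example, fake timers in the test runner). The two clocks diverge, and CI runners can add real clock drift on top. Date.now() returns UTC epoch milliseconds, so timezone settings are not the cause. From the test's point of view, tokens expire prematurely.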
The Solution: Decouple token validation from system time. Implement a virtual clock in your test harness. Pass the virtual timestamp to the mock server via a header or shared state. Validate token expiry against the virtual clock. This ensures deterministic behavior across all environments. You reference the WFM scheduling virtual clock pattern when designing time-sensitive test harnesses.
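A virtual clock shared between the test harness and the mock server can be as small as the sketch below. VirtualClock and isTokenExpired are hypothetical names; the only fixed convention used is that a JWT exp claim is expressed in seconds since the epoch.

```javascript
class VirtualClock {
  constructor(startMs = 0) {
    this.ms = startMs;
  }

  now() {
    return this.ms;
  }

  // Time only moves when the test says so, and never backwards.
  advance(deltaMs) {
    if (deltaMs < 0) throw new RangeError('deltaMs must be non-negative');
    this.ms += deltaMs;
    return this.ms;
  }
}

// `claims.exp` is in seconds (JWT convention); the clock is in milliseconds.
function isTokenExpired(claims, clock) {
  return clock.now() >= claims.exp * 1000;
}
```

For example, a test can start the clock at the token's iat, assert that a request succeeds, advance by exactly expires_in seconds, and assert that the next request receives 401.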

Edge Case 2: Concurrent Session Exhaustion in CI Pipelines

The Failure Condition: Parallel test suites fail with WebSocket connection refused or Session limit exceeded errors.
The Root Cause: Your mock server maintains session state in memory. Each test spawns a new session. CI pipelines run 50 parallel test files. The in-memory store exhausts available handles or triggers garbage collection pauses. WebSocket connections drop.
The Solution: Implement session pooling and explicit cleanup. Limit concurrent sessions to a configurable maximum. Reject new sessions when the limit is reached. Force tests to wait or reuse existing sessions. Implement automatic session teardown after test completion. Use weak references for session metadata to allow garbage collection. You verify that the mock server returns 503 when the limit is reached and that your application handles session exhaustion gracefully by queuing requests.
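A bounded session pool with explicit teardown could be sketched like this. SessionPool, its method names, and the SESSION_LIMIT_EXCEEDED error string are assumptions; the 503 status follows the text above.

```javascript
class SessionPool {
  constructor(maxSessions) {
    this.maxSessions = maxSessions;
    this.sessions = new Map(); // sessionId -> session object (e.g. { ws })
  }

  // Registers a session or reports that the pool is full.
  acquire(sessionId, session) {
    if (this.sessions.size >= this.maxSessions) {
      return { status: 503, error: 'SESSION_LIMIT_EXCEEDED' };
    }
    this.sessions.set(sessionId, session);
    return { status: 200 };
  }

  // Frees one slot; returns false if the session was not registered.
  release(sessionId) {
    return this.sessions.delete(sessionId);
  }

  // Teardown hook for afterEach/afterAll: runs `onClose` for every live
  // session (e.g. (s) => s.ws.close()) and empties the pool.
  releaseAll(onClose = () => {}) {
    const count = this.sessions.size;
    for (const session of this.sessions.values()) onClose(session);
    this.sessions.clear();
    return count;
  }
}
```

Calling releaseAll from the test runner's teardown hook keeps one leaked socket in one test file from starving the parallel files that follow.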

Edge Case 3: SDK Version Mismatch and Deprecated Endpoint Handling

The Failure Condition: Mock server returns 404 Not Found for endpoints that existed in previous SDK versions. Tests pass locally but fail in CI where SDK versions differ.
The Root Cause: Platform SDKs deprecate endpoints without backward compatibility. Your mock server implements only the latest API version. Older SDK versions in the test matrix call deprecated paths.
The Solution: Implement versioned routing in your mock server. Maintain parallel route handlers for supported SDK versions. Detect the SDK version from the User-Agent header or request metadata. Route requests to the appropriate handler. Document deprecation timelines. Force test suites to upgrade SDK versions before removing deprecated handlers. Once a handler is removed, return 410 Gone rather than 404 for its path, and verify that your application handles 410 Gone responses by falling back to alternative endpoints.
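Versioned routing keyed on the User-Agent header might be sketched as follows. The User-Agent format "mock-sdk/<major>.<minor>.<patch>", the function names, and the route table shape are all hypothetical; inspect the header your SDK actually sends before adopting a pattern like this.

```javascript
// Marker for a path that existed for a given SDK major version but was removed.
const GONE = Symbol('gone');

// Assumption: the SDK identifies itself as "mock-sdk/<major>.<minor>.<patch>".
function parseSdkMajor(userAgent) {
  const match = /mock-sdk\/(\d+)\.\d+\.\d+/.exec(userAgent || '');
  return match ? Number(match[1]) : null;
}

// `routes` maps path -> { [major]: handlerFunction | GONE }.
function resolveRoute(routes, path, userAgent) {
  const byVersion = routes[path];
  const major = parseSdkMajor(userAgent);
  if (!byVersion || major === null || !(major in byVersion)) {
    return { status: 404, error: 'MOCK_NOT_IMPLEMENTED' };
  }
  if (byVersion[major] === GONE) {
    return { status: 410, error: 'ENDPOINT_REMOVED' };
  }
  return { status: 200, handler: byVersion[major] };
}
```

Keeping removed paths in the table as GONE, instead of deleting them, documents the deprecation history and lets the mock distinguish "never existed" from "removed".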

Official References