Streaming Genesys Cloud LLM Gateway Responses to Web Clients via TypeScript SSE
What You Will Build
- This code establishes a Node.js endpoint that opens a persistent Server-Sent Events connection to a browser and forwards incremental LLM token chunks in real time.
- This implementation uses the Genesys Cloud LLM Gateway API streaming endpoint and the native HTTP stream interface.
- This tutorial covers TypeScript, Node.js, Express, and client-side EventSource integration.
Prerequisites
- OAuth 2.0 Machine-to-Machine client with the ai:llm:gateway:write scope
- Genesys Cloud API v2 base URL (e.g., https://api.mypurecloud.com)
- Node.js 18+ with TypeScript 5+
- npm install express axios dotenv
- A valid Genesys Cloud organization ID and LLM Gateway configuration ID
Authentication Setup
Genesys Cloud requires a bearer token for all API requests. The machine-to-machine (client credentials) flow exchanges a client ID and secret for an access token. Production systems should cache the token and refresh it before it expires. The following module handles token acquisition, expiry tracking, and retries after HTTP 429 rate-limit responses.
import axios, { AxiosResponse } from 'axios';
import * as dotenv from 'dotenv';
dotenv.config();
interface GenesysCredentials {
  environment: string;
  clientId: string;
  clientSecret: string;
}
interface TokenResponse {
  access_token: string;
  expires_in: number; // Token lifetime in seconds
}
const credentials: GenesysCredentials = {
  environment: process.env.GENESYS_ENVIRONMENT || 'api.mypurecloud.com',
  clientId: process.env.GENESYS_CLIENT_ID!,
  clientSecret: process.env.GENESYS_CLIENT_SECRET!,
};
const MAX_TOKEN_RETRIES = 3;
let cachedToken: string | null = null;
let tokenExpiry = 0; // Epoch milliseconds
async function fetchAccessToken(attempt = 0): Promise<string> {
  // Serve from the cache while the token is still valid
  if (cachedToken && Date.now() < tokenExpiry) {
    return cachedToken;
  }
  // Genesys Cloud issues OAuth tokens from the login host (for example
  // login.mypurecloud.com), not from the API host
  const loginHost = credentials.environment.replace(/^api\./, 'login.');
  const url = `https://${loginHost}/oauth/token`;
  const auth = Buffer.from(`${credentials.clientId}:${credentials.clientSecret}`).toString('base64');
  try {
    const response: AxiosResponse<TokenResponse> = await axios.post(
      url,
      'grant_type=client_credentials',
      {
        headers: {
          'Content-Type': 'application/x-www-form-urlencoded',
          Authorization: `Basic ${auth}`,
        },
        maxRedirects: 0, // Never follow a redirect with credentials attached
      }
    );
    cachedToken = response.data.access_token;
    tokenExpiry = Date.now() + (response.data.expires_in * 1000) - 5000; // Refresh 5s early
    return cachedToken;
  } catch (error: any) {
    // On 429, wait for Retry-After when present; otherwise back off 1s, 2s, 4s
    if (error.response?.status === 429 && attempt < MAX_TOKEN_RETRIES) {
      const retryAfter = parseInt(error.response.headers['retry-after'], 10);
      const delayMs = Number.isNaN(retryAfter) ? 2 ** attempt * 1000 : retryAfter * 1000;
      await new Promise(resolve => setTimeout(resolve, delayMs));
      return fetchAccessToken(attempt + 1);
    }
    throw new Error(`OAuth token acquisition failed: ${error.message}`);
  }
}
export { fetchAccessToken };
Implementation
Step 1: Configure the SSE endpoint and HTTP headers
Server-Sent Events require response headers that disable caching and keep the connection open. The browser's EventSource API only accepts the text/event-stream content type. Cache-Control: no-cache stops intermediaries from caching the stream, Connection: keep-alive keeps the socket open, and X-Accel-Buffering: no asks Nginx not to buffer the response.
import { Response } from 'express';
// Sends the SSE response headers immediately so the browser opens the stream
export function initializeSseHeaders(res: Response): void {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive',
    'Access-Control-Allow-Origin': '*', // Restrict to your own origin in production
    'X-Accel-Buffering': 'no', // Disables Nginx buffering
  });
  res.flushHeaders();
}
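Once these headers are sent, everything written to the response has to follow the event-stream framing: a data: line followed by a blank line. The following is a minimal sketch of that framing. The helper name formatSseMessage is hypothetical and is not part of Express or the Genesys Cloud API.

```typescript
// Hypothetical helper (not from Express or Genesys): frames a payload as one SSE message.
function formatSseMessage(payload: object): string {
  // JSON.stringify never emits raw newlines, so the payload fits on a single data: line.
  const dataLine = `data: ${JSON.stringify(payload)}`;
  // A blank line terminates the message, hence the two trailing newlines.
  return dataLine + '\n\n';
}

const frame = formatSseMessage({ type: 'token', content: 'Hi' });
```

Serializing through JSON.stringify rather than string concatenation also keeps quotes and line breaks inside the payload from corrupting the frame.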
Step 2: Call Genesys LLM Gateway with streaming enabled
The LLM Gateway accepts a POST request to /api/v2/ai/llm/gateway/chat. You must set stream: true in the request body to receive incremental chunks instead of a single completed response. The response body is a newline-delimited JSON stream. Each line contains a JSON object with token data.
import axios from 'axios';
import { Readable } from 'stream';
interface LlmGatewayRequest {
  configurationId: string;
  input: { messages: Array<{ role: string; content: string }> };
  stream: true;
}
// Resolves once Genesys has accepted the request; the returned stream yields text chunks
export async function streamLlmGateway(
  token: string,
  environment: string,
  payload: LlmGatewayRequest
): Promise<Readable> {
  const url = `https://${environment}/api/v2/ai/llm/gateway/chat`;
  const response = await axios.post<Readable>(
    url,
    payload,
    {
      headers: {
        'Authorization': `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      responseType: 'stream', // Hand back the raw Node.js stream instead of buffering the body
    }
  );
  // Decode as UTF-8 so multi-byte characters split across chunks are not corrupted
  response.data.setEncoding('utf8');
  return response.data;
}
Step 3: Parse incremental chunks and format as JSON Lines
A chunk read from the Genesys Cloud stream does not necessarily align with line boundaries: it may contain several complete JSON lines, or a single line may be split across two chunks. You must buffer the stream, split on newlines, parse each complete line as JSON, and forward it using the SSE data: prefix. Each SSE message must end with two newline characters (a blank line) so the client knows the message is complete.
import { Response } from 'express';
// Reads newline-delimited JSON from the gateway and re-emits each object as an SSE message
export function parseAndForwardSse(
  stream: AsyncIterable<string>,
  res: Response,
  onDone: () => void
): void {
  let buffer = '';
  // Ends the SSE response if it is still open, then notifies the caller
  const finish = (): void => {
    if (!res.writableEnded) res.end();
    onDone();
  };
  (async () => {
    try {
      for await (const chunk of stream) {
        buffer += chunk;
        const lines = buffer.split('\n');
        buffer = lines.pop() || ''; // Keep incomplete line in buffer
        for (const line of lines) {
          const trimmed = line.trim();
          if (!trimmed) continue;
          try {
            const parsed = JSON.parse(trimmed);
            // Handle Genesys termination signal
            if (parsed.type === 'stop' || parsed.type === 'done') {
              res.write('data: {"type":"stop"}\n\n');
              finish();
              return;
            }
            // Normalize each chunk into one JSON object per SSE message
            const ssePayload = {
              type: parsed.type || 'token',
              content: parsed.content || parsed.delta || '',
              index: parsed.index,
            };
            res.write(`data: ${JSON.stringify(ssePayload)}\n\n`);
          } catch (parseError) {
            // Skip lines that are not valid JSON
            console.warn('SSE parse warning:', parseError);
          }
        }
      }
      // Handle stream completion without explicit stop signal
      if (buffer.trim()) {
        try {
          const parsed = JSON.parse(buffer.trim());
          res.write(`data: ${JSON.stringify(parsed)}\n\n`);
        } catch {
          // Discard a trailing fragment that never became complete JSON
        }
      }
      // Tell the client the turn is over so it does not treat the close as a failure
      res.write('data: {"type":"stop"}\n\n');
      finish();
    } catch (error: any) {
      console.error('Gateway stream error:', error);
      if (!res.writableEnded && !res.destroyed) {
        // JSON.stringify escapes quotes and newlines in the error message
        res.write(`data: ${JSON.stringify({ type: 'error', message: error.message })}\n\n`);
      }
      finish();
    }
  })();
}
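The buffering step is easiest to reason about in isolation. The sketch below uses a hypothetical LineBuffer class (not part of any library) to show how a JSON object that arrives split across two chunks is held back until its newline arrives. The type and content fields are sample data in the shape used by this tutorial's parser.

```typescript
// Hypothetical helper: accumulates text chunks and returns only complete lines.
class LineBuffer {
  private pending = '';

  // Appends a chunk and returns every complete, non-empty line it finishes.
  push(chunk: string): string[] {
    this.pending += chunk;
    const parts = this.pending.split('\n');
    // The last element is either '' or an unfinished line, so hold it back.
    this.pending = parts.pop() ?? '';
    return parts.map((line) => line.trim()).filter((line) => line.length > 0);
  }

  // Returns whatever is left once the stream has ended.
  flush(): string {
    const rest = this.pending.trim();
    this.pending = '';
    return rest;
  }
}

const lineBuffer = new LineBuffer();
// The first object is split across two chunks; the third never completes.
const firstLines = lineBuffer.push('{"type":"token","con');
const secondLines = lineBuffer.push('tent":"Hel"}\n{"type":"token","content":"lo"}\n{"type":"st');
const leftover = lineBuffer.flush();
```

Here firstLines is empty because no newline has arrived yet, secondLines holds the two completed objects, and leftover holds the unfinished fragment that the parser must either parse at end of stream or discard.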
Step 4: Handle stream termination and connection cleanup
Browser clients may disconnect unexpectedly. Listen for the close event on the response object and use an AbortController to propagate the disconnect. If the client drops the connection, you must cancel the upstream Genesys request (for example by aborting it or destroying its response stream) to prevent orphaned server-side streams and resource leaks.
import { Response } from 'express';
// AbortController is a global in Node.js 15 and later, so it needs no import
export function setupConnectionCleanup(res: Response, abortController: AbortController): void {
  // 'close' also fires after a normal res.end(); aborting at that point is harmless
  res.on('close', () => {
    if (!res.writableEnded) {
      console.log('Client disconnected. Aborting gateway stream.');
    }
    abortController.abort();
  });
  res.on('error', (err) => {
    console.error('SSE connection error:', err);
    abortController.abort();
  });
}
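Aborting the controller only helps if something is listening for it. One edge case is a client that disconnects before the gateway call has returned, in which case the signal is already aborted by the time a listener is attached and the abort event never fires again. The following is a minimal sketch of a subscription that covers both cases. The helper name onAbort is hypothetical.

```typescript
// Hypothetical helper: runs `cancel` once when the signal aborts, including the
// case where the signal was already aborted before we subscribed.
function onAbort(signal: AbortSignal, cancel: () => void): void {
  if (signal.aborted) {
    cancel();
    return;
  }
  signal.addEventListener('abort', cancel, { once: true });
}

const controller = new AbortController();
let cancelCount = 0;
onAbort(controller.signal, () => { cancelCount += 1; });
controller.abort(); // Simulates the browser closing the connection
controller.abort(); // A repeated abort does not fire the event again

// Subscribing after the abort still triggers the callback immediately
let lateCount = 0;
onAbort(controller.signal, () => { lateCount += 1; });
```

In the proxy, the cancel callback would be whatever stops the upstream work, such as destroying the gateway response stream.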
Complete Working Example
The following Express application integrates authentication, streaming, SSE formatting, and connection cleanup into a single runnable module. Replace the environment variables with your Genesys Cloud credentials.
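import express, { Request, Response } from 'express';
import { Readable } from 'stream';
import { fetchAccessToken } from './auth';
import { streamLlmGateway } from './gateway';
// Assumes initializeSseHeaders from Step 1 is exported from ./sse alongside the Step 3 parser
import { initializeSseHeaders, parseAndForwardSse } from './sse';
import { setupConnectionCleanup } from './cleanup';
const app = express();
// EventSource can only issue GET requests, so the prompt arrives as query parameters
app.get('/api/llm/stream', async (req: Request, res: Response) => {
  const configurationId = String(req.query.configurationId ?? '');
  let messages: Array<{ role: string; content: string }> = [];
  try {
    messages = JSON.parse(String(req.query.messages ?? '[]'));
  } catch {
    // Leave messages empty so the validation below rejects the request
  }
  if (!configurationId || !Array.isArray(messages) || !messages.length) {
    res.status(400).json({ error: 'configurationId and messages are required' });
    return;
  }
  const abortController = new AbortController();
  setupConnectionCleanup(res, abortController);
  initializeSseHeaders(res);
  try {
    const token = await fetchAccessToken();
    const environment = process.env.GENESYS_ENVIRONMENT || 'api.mypurecloud.com';
    const stream = await streamLlmGateway(token, environment, {
      configurationId,
      input: { messages },
      stream: true,
    });
    // Stop reading from Genesys as soon as the client goes away
    const stopUpstream = (): void => {
      if (stream instanceof Readable) stream.destroy();
    };
    if (abortController.signal.aborted) {
      stopUpstream(); // The client left while the request was being set up
      return;
    }
    abortController.signal.addEventListener('abort', stopUpstream, { once: true });
    parseAndForwardSse(stream, res, () => {
      console.log('Stream finalized.');
    });
  } catch (error: any) {
    console.error('Setup failed:', error);
    // JSON.stringify escapes quotes and newlines in the error message
    res.write(`data: ${JSON.stringify({ type: 'error', message: error.message })}\n\n`);
    res.end();
  }
});
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`SSE proxy listening on port ${PORT}`);
});
Client-side integration uses the native EventSource API. Because EventSource can only issue GET requests, the configuration ID and messages travel in the query string, which suits short prompts but is subject to URL length limits. The following JavaScript snippet consumes the stream and renders incremental tokens to the DOM.
// Placeholder inputs: substitute your own configuration ID and conversation
const params = new URLSearchParams({
  configurationId: 'YOUR_CONFIGURATION_ID',
  messages: JSON.stringify([{ role: 'user', content: 'Hello' }]),
});
// Same-origin request, so withCredentials is not needed. A wildcard
// Access-Control-Allow-Origin cannot be combined with credentialed requests.
const eventSource = new EventSource(`/api/llm/stream?${params}`);
const outputElement = document.getElementById('ai-response');
let buffer = '';
eventSource.onmessage = (event) => {
  try {
    const payload = JSON.parse(event.data);
    if (payload.type === 'stop') {
      eventSource.close();
      console.log('Conversation turn complete.');
      return;
    }
    if (payload.type === 'error') {
      eventSource.close();
      console.error('Gateway error:', payload.message);
      return;
    }
    buffer += payload.content ?? '';
    outputElement.textContent = buffer;
  } catch (e) {
    console.warn('SSE parse error:', e);
  }
};
eventSource.onerror = (err) => {
  console.error('EventSource connection failed:', err);
  // Close explicitly; otherwise EventSource reconnects and repeats the request
  eventSource.close();
};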
Common Errors & Debugging
Error: 401 Unauthorized
- What causes it: The OAuth token is expired, malformed, or the client credentials are incorrect.
- How to fix it: Verify that GENESYS_CLIENT_ID and GENESYS_CLIENT_SECRET match a Machine-to-Machine client in the Admin console. Ensure the token cache logic subtracts a buffer from the expires_in value.
- Code showing the fix: The fetchAccessToken function already implements TTL tracking and automatic refresh. Add a log line such as console.log('Token expires at:', new Date(tokenExpiry).toISOString()) to verify cache behavior.
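To check the cache arithmetic without calling the API, the expiry calculation can be pulled into pure functions. This is a minimal sketch: the helper names computeExpiry and isTokenUsable are hypothetical, and the 3600-second lifetime is an example value rather than a documented Genesys Cloud default.

```typescript
// Hypothetical helpers that mirror the cache arithmetic; all times are epoch milliseconds.
function computeExpiry(nowMs: number, expiresInSeconds: number, skewMs = 5000): number {
  // Treat the token as expired slightly early so it is never used at the last moment.
  return nowMs + expiresInSeconds * 1000 - skewMs;
}

function isTokenUsable(nowMs: number, expiryMs: number): boolean {
  return nowMs < expiryMs;
}

const issuedAt = 1_000_000;
const expiry = computeExpiry(issuedAt, 3600); // 1_000_000 + 3_600_000 - 5_000
const usableEarly = isTokenUsable(issuedAt + 60_000, expiry); // One minute after issue
const usableLate = isTokenUsable(issuedAt + 3_596_000, expiry); // Inside the 5s safety margin
```

A token that is one minute old is still served from the cache, while one that falls inside the last five seconds of its lifetime is treated as expired and triggers a refresh.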
Error: 429 Too Many Requests
- What causes it: The Genesys Cloud API enforces rate limits per organization and per client. Streaming endpoints count against the same quota as batch endpoints.
- How to fix it: Implement exponential backoff on the initial POST request. The axios configuration in streamLlmGateway does not include retry logic by default because stream responses cannot be retried mid-flight. Add a pre-call check or queue requests to stay within limits.
- Code showing the fix: Wrap the streamLlmGateway call with a retry decorator for the initial handshake, but never retry after the stream begins.
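One way to build such a wrapper is sketched below. It retries only the call that opens the stream, so nothing is replayed once chunks have started to flow. The names retryHandshake and RetryOptions are hypothetical and come from neither axios nor Genesys Cloud, and the delays are illustrative.

```typescript
// Hypothetical options; none of these names come from axios or Genesys Cloud.
interface RetryOptions {
  retries: number; // Maximum number of retries after the first attempt
  baseDelayMs: number; // Delay before the first retry; doubles on each retry
  shouldRetry: (error: unknown) => boolean; // For example, true only for HTTP 429
  sleep?: (ms: number) => Promise<void>; // Injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retries `start` with exponential backoff and rethrows the last error
// when the retry budget is spent or the error is not retryable.
async function retryHandshake<T>(start: () => Promise<T>, options: RetryOptions): Promise<T> {
  const sleep = options.sleep ?? defaultSleep;
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await start();
    } catch (error) {
      if (attempt >= options.retries || !options.shouldRetry(error)) {
        throw error;
      }
      await sleep(options.baseDelayMs * 2 ** attempt);
    }
  }
}
```

In the proxy this could wrap the gateway call, with shouldRetry returning true only when axios.isAxiosError(error) holds and error.response?.status is 429. If the response carries a Retry-After header, honoring it instead of the computed delay is the safer choice.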
Error: 502 Bad Gateway or 504 Gateway Timeout
- What causes it: Reverse proxies such as Nginx buffer upstream responses by default, and load balancers such as AWS ALB close connections that stay idle longer than their timeout. When the timeout expires before the LLM finishes generating, the proxy closes the connection.
- How to fix it: Set X-Accel-Buffering: no in the response headers and configure your proxy to disable buffering for /api/llm/stream. In Nginx, add proxy_buffering off; to the location block.
- Code showing the fix: The initializeSseHeaders function already includes X-Accel-Buffering: no. Verify your load balancer configuration matches.
Error: Client disconnects mid-stream
- What causes it: Browser navigation, network loss, or manual page refresh terminates the EventSource connection.
- How to fix it: Attach a close listener to the Express response object and call abortController.abort() to terminate the upstream Genesys request immediately.
- Code showing the fix: The setupConnectionCleanup function binds res.on('close') to abortController.abort(), ensuring no orphaned API calls remain active.