Implementing Multi-Modal Bot Interactions (Voice + Digital) within a Single Interaction Lifecycle
What This Guide Covers
You are building a unified bot orchestration pattern where a single customer interaction can fluidly move between voice (IVR/Dialogflow/Lex) and digital channels (web messaging, SMS) without losing context, session state, or conversation history. When this is working, a customer who starts by calling, gets deflected to SMS for verification, and then escalates back to a voice agent is handled as one continuous interaction - one conversation record, one wrap-up, one CSAT survey, and full context available to the live agent before they accept the transfer.
Prerequisites, Roles & Licensing
Genesys Cloud
- Licensing: Genesys Cloud CX 2 or CX 3 with Genesys Dialog Engine Bot Flows or a third-party bot integration (Dialogflow CX, Amazon Lex, Nuance)
- Additional: Digital Channels entitlement (for web messaging / SMS), Agentless SMS license for outbound SMS deflection
- Permissions required:
  - Architect > Flow > Add/Edit (for both Inbound Call and Inbound Message flows)
  - Routing > Queue > Edit
  - Integrations > Integration > View/Edit (for bot connector configs)
  - Conversations > Communication > Create (for Agentless Messaging API)
- OAuth scopes:
  conversations:write, conversations:read, messaging:read
NICE CXone
- Licensing: CXone Digital First Omnichannel + Virtual Agent Hub entitlement
- Bot integrations: CXone Studio with VOICEBOT BEGIN / VOICEBOT END and DIGITALBOT BEGIN / DIGITALBOT END actions
- External dependency: A shared bot platform (Nuance Mix, Google CCAI) with both voice and digital channels configured as entry points
The Implementation Deep-Dive
1. Understanding the Session Continuity Problem
The fundamental challenge in multi-modal bot design is that voice and digital channels are architecturally separate systems. In Genesys Cloud, an ACD voice call and a web messaging conversation are different interaction types, with different conversationId values, different participant event streams, and different wrap-up code workflows.
When you deflect a customer from a call to SMS, you are not “converting” one interaction - you are spawning a second interaction. Unless you explicitly stitch them together via shared state, the agent receiving the escalated digital interaction starts blind.
The three session continuity mechanisms available are:
- Participant Data propagation - writing context as attributes on both interactions, keyed by a shared correlation ID
- External session state (Redis, DynamoDB, or a Data Action-accessible API) - a short-lived cache that both bot flows read and write
- Genesys Cloud Conversation Notes / Wrap-up injection - less reliable for real-time handoffs, better for post-call summaries
For production multi-modal deployments, use mechanisms 1 and 2 in combination.
2. Building the Voice-to-Digital Deflection Gateway
The most common multi-modal trigger is a voice IVR that offers the customer an SMS deflection option: “I can send you a link to complete this on your phone instead of waiting on hold. Say yes or press 1 to continue.”
Step 1: In the Inbound Call Architect flow, generate a correlation ID and send the SMS
[Action: Call Data Action]
Integration: AWS Lambda / Internal Correlation Service
Input: Flow.ANI, Flow.ConversationId
Output: Flow.CorrelationId (e.g., "SESS-20250314-abc123")
[Action: Send Agentless SMS]
From Number: +15055551000 (your SMS-enabled number)
To Number: Flow.ANI
Message: "Hi, this is Brand Support. Click here to continue: https://support.brand.com/chat?session={Flow.CorrelationId}"
Step 2: Set participant data on the voice interaction
[Action: Set Participant Data]
Attribute: correlationId
Value: Flow.CorrelationId
[Action: Set Participant Data]
Attribute: deflectionChannel
Value: "sms"
[Action: Set Participant Data]
Attribute: deflectionTimestamp
Value: Flow.DateTimeNow
Then park the voice call (route to a low-priority holding queue or disconnect with a “We’ll see you in the app” message, depending on your business requirement).
The Trap - using ANI as the correlation key: Some implementations skip the correlation ID and use ANI directly as the session key. This breaks for spoofed numbers, number recycling, and customers calling from work phones. Always generate a cryptographically random, short-lived correlation token. The token also lets you bind the session to a specific ANI without trusting the ANI presented in the digital channel.
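A minimal sketch of what the correlation service behind the Data Action might do, assuming a Node.js runtime. The `createSession` function, the in-memory `sessions` map, and the record field names are illustrative assumptions; a production service would write to Redis or DynamoDB instead.

```javascript
const { randomBytes } = require("node:crypto");

// Hypothetical in-memory store standing in for Redis or DynamoDB.
const sessions = new Map();

// Create a short-lived session bound to the caller's ANI and voice conversation.
// `now` is injectable so expiry can be exercised deterministically.
function createSession(ani, voiceConversationId, ttlSeconds = 3600, now = Date.now()) {
  // 16 random bytes (128 bits), base64url-encoded: unguessable and URL-safe.
  const correlationId = "SESS-" + randomBytes(16).toString("base64url");
  const record = {
    correlationId,
    originalAni: ani, // bound server-side, never read back from the digital channel
    voiceConversationId,
    createdAt: new Date(now).toISOString(),
    expiresAt: now + ttlSeconds * 1000,
  };
  sessions.set(correlationId, record);
  return record;
}

const session = createSession("+15055551234", "conv-voice-uuid-001");
console.log(`https://support.brand.com/chat?session=${session.correlationId}`);
```

Because the ANI is stored against the token on the server side, the digital flow can look it up from the token and never has to trust a phone number supplied by the browser.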
3. Receiving the Digital Interaction and Restoring Context
When the customer clicks the SMS link and opens the web messaging widget, the page loads the Genesys Messenger snippet and passes the correlation ID as a custom attribute:
// On the landing page: ?session=SESS-20250314-abc123
const sessionId = new URLSearchParams(window.location.search).get("session");
Genesys("command", "Database.set", {
  messaging: {
    customAttributes: {
      correlationId: sessionId,
      originChannel: "voice_deflection"
    }
  }
});
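Since the `session` query parameter is user-controlled, it may be worth validating it before handing it to Messenger. The sketch below assumes tokens look like `SESS-` followed by URL-safe characters; both `CORRELATION_ID_PATTERN` and `readCorrelationId` are hypothetical names, and the pattern should match whatever your correlation service actually issues.

```javascript
// Assumed token format; adjust to match your correlation service.
const CORRELATION_ID_PATTERN = /^SESS-[A-Za-z0-9_-]{8,64}$/;

// Returns the correlation ID from a query string, or null if it is
// missing or does not look like a token we issued.
function readCorrelationId(search) {
  const value = new URLSearchParams(search).get("session");
  return value !== null && CORRELATION_ID_PATTERN.test(value) ? value : null;
}

// In the browser, call it with window.location.search and only set the
// custom attributes when a well-formed token is present:
//   const sessionId = readCorrelationId(window.location.search);
//   if (sessionId) { Genesys("command", "Database.set", { ... }); }
```

A null result falls through to the standard new-customer flow, because the `correlationId` attribute is never set.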
In the Inbound Message Architect flow:
[Action: Get Participant Data]
Attribute: correlationId
Store in: Flow.CorrelationId
[Decision: Is Flow.CorrelationId NOT empty?]
|--- YES --> [Call Data Action: Fetch Voice Session Context]
|--- NO --> [Standard new-customer flow]
The Data Action calls your session cache (Redis, DynamoDB, or a custom API) with the correlationId and retrieves the full voice context:
{
  "correlationId": "SESS-20250314-abc123",
  "originalAni": "+15055551234",
  "voiceConversationId": "conv-voice-uuid-001",
  "intentAtDeflection": "billing_dispute",
  "authenticationStatus": "verified",
  "accountId": "ACC-98765",
  "createdAt": "2025-03-14T14:22:10Z",
  "ttlSeconds": 3600
}
Write all of this back as participant data on the digital interaction:
[Action: Set Participant Data]
Attribute: originVoiceConversationId
Value: Flow.VoiceContext.voiceConversationId
[Action: Set Participant Data]
Attribute: authenticationStatus
Value: Flow.VoiceContext.authenticationStatus
[Action: Set Participant Data]
Attribute: priorIntent
Value: Flow.VoiceContext.intentAtDeflection
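The bot can now skip re-authentication and resume the correct intent branch without re-asking anything the customer already answered on the call. To greet the customer by name, look the name up from accountId or add it to the cached context.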
The Trap - TTL too short, breaking returning customers: If the customer doesn’t click the SMS link immediately (they’re driving, they finish a meeting), a 5-minute TTL will expire before they engage. For deflection sessions, use a 60-90 minute TTL. For session data that contains verified PII (account ID, SSN last 4), ensure the cache is encrypted at rest and purge on first use if the data is sensitive.
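The lookup behind the "Fetch Voice Session Context" Data Action could enforce both the TTL and purge-on-first-use. This is a sketch only: `putContext`, `fetchContext`, and the in-memory `cache` are hypothetical stand-ins for a Redis- or DynamoDB-backed service.

```javascript
// Hypothetical in-memory cache standing in for Redis or DynamoDB.
const cache = new Map();

// Store a context record, deriving its absolute expiry from ttlSeconds.
function putContext(ctx, now = Date.now()) {
  cache.set(ctx.correlationId, { ...ctx, expiresAt: now + ctx.ttlSeconds * 1000 });
}

// Return the context, or null if it is unknown or expired.
// With purgeOnRead set, the record is deleted after the first successful read.
function fetchContext(correlationId, { purgeOnRead = false, now = Date.now() } = {}) {
  const record = cache.get(correlationId);
  if (!record) return null;
  if (now >= record.expiresAt) {
    cache.delete(correlationId); // lazily evict expired sessions
    return null;
  }
  if (purgeOnRead) cache.delete(correlationId);
  return record;
}
```

A null result should route the customer into the standard new-customer flow rather than failing the interaction.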
4. Bot Handoff and Context Enrichment at Agent Transfer
When the digital bot escalates to a live agent, the context assembled from both interactions is available as participant data. Ensure your agent-facing Agent Assist or Screen Pop integration reads these attributes.
Wrapping voice context into the agent transcript:
Before the Transfer to Queue action, write a summary of the cross-channel journey to a participant attribute so the agent can read it before accepting the interaction:
[Action: Set Participant Data]
Attribute: agentContextSummary
Value: "Customer deflected from voice (call {Flow.VoiceContext.voiceConversationId}) at 14:22. Intent: billing_dispute. Auth: verified. Account: ACC-98765. Sentiment at deflection: frustrated."
This surfaces in the Genesys Cloud Agent Desktop in the conversation’s Customer Details panel (via a custom attribute widget) or via an Interaction Widget iframe that calls GET /api/v2/conversations/{id} and renders the attribute as a context card.
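If the summary is assembled in a service or widget rather than in Architect, it can be built directly from the cached voice context. In this sketch `buildAgentContextSummary` is a hypothetical helper, and the field names follow the session context JSON shown earlier.

```javascript
// Build the one-line agent summary from the cached voice context.
function buildAgentContextSummary(ctx) {
  // createdAt is an ISO-8601 UTC timestamp; characters 11-15 are "HH:MM".
  const time = ctx.createdAt.slice(11, 16);
  return (
    `Customer deflected from voice (call ${ctx.voiceConversationId}) at ${time} UTC. ` +
    `Intent: ${ctx.intentAtDeflection}. Auth: ${ctx.authenticationStatus}. ` +
    `Account: ${ctx.accountId}.`
  );
}

console.log(
  buildAgentContextSummary({
    voiceConversationId: "conv-voice-uuid-001",
    intentAtDeflection: "billing_dispute",
    authenticationStatus: "verified",
    accountId: "ACC-98765",
    createdAt: "2025-03-14T14:22:10Z",
  })
);
```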
NICE CXone multi-modal context handoff:
In CXone Studio, the SET action writes agent-visible notes before the REQAGENT action:
SET ~CustomerContext "Origin: Voice call deflected to digital. Account: {AccountId}. Intent: {DetectedIntent}. Auth: {AuthStatus}"
REQAGENT ~CustomerContext
The ~CustomerContext variable populates the Custom Data panel in the MAX agent desktop.
5. Handling the Re-Escalation to Voice
If the digital interaction cannot resolve the issue and the customer requests to speak to someone, you must bridge back to voice without losing context. Two patterns exist:
Pattern A: Scheduled Callback (Recommended)
Trigger a Genesys Cloud callback using the Conversations API, pre-populating the callback with the digital interaction’s participant data:
POST /api/v2/conversations/callbacks
Authorization: Bearer {access_token}
Content-Type: application/json
{
  "routingData": {
    "queueId": "queue-uuid-billing-voice",
    "priority": 10
  },
  "callbackNumbers": ["+15055551234"],
  "callbackUserName": "Verified Customer",
  "data": {
    "correlationId": "SESS-20250314-abc123",
    "priorDigitalConversationId": "conv-digital-uuid-001",
    "resolvedIntent": "billing_dispute",
    "authenticationStatus": "verified"
  }
}
The callback conversation inherits the data fields as participant attributes. The agent receiving the callback sees the full context.
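One way to assemble and send this request from a session management service is sketched below, assuming Node.js 18+ for the global `fetch`. `buildCallbackRequest` and `createCallback` are hypothetical helpers, and `apiBase` is the API host for your Genesys Cloud region, which you must supply.

```javascript
// Build the callback body from the cached context. All `data` values are
// kept as strings so they map cleanly onto participant attributes.
function buildCallbackRequest(ctx, queueId, digitalConversationId) {
  return {
    routingData: { queueId, priority: 10 },
    callbackNumbers: [ctx.originalAni], // the ANI bound to the session, not user input
    callbackUserName: "Verified Customer",
    data: {
      correlationId: ctx.correlationId,
      priorDigitalConversationId: digitalConversationId,
      resolvedIntent: ctx.intentAtDeflection,
      authenticationStatus: ctx.authenticationStatus,
    },
  };
}

// POST the body to the callbacks endpoint; throws on any non-2xx response.
async function createCallback(apiBase, accessToken, body) {
  const res = await fetch(`${apiBase}/api/v2/conversations/callbacks`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${accessToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Callback request failed: HTTP ${res.status}`);
  return res.json();
}
```

Taking the callback number from the session record rather than from anything typed in the chat keeps the verified ANI as the only number the callback can dial.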
Pattern B: Warm Transfer via Agentless Voice Initiation
Less common, requires outbound dialing rights. The digital bot triggers an outbound call to the customer while simultaneously bridging to an available queue agent - a “bot-initiated voice conference.” This is complex to orchestrate and introduces TCPA compliance considerations for outbound dialing to US numbers. Pattern A is preferred for regulated environments.
The Trap - not linking the two conversation IDs in your reporting:
The two conversation IDs (voiceConversationId and digitalConversationId) are separate rows in your analytics tables. Without a linking key (the correlationId), your AHT and FCR metrics will be distorted - the voice interaction shows as handled (deflected), and the digital interaction shows as a new contact, making FCR calculations incorrect. Build a reporting layer that joins on correlationId to reconstruct the full journey as a single customer episode.
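The join can be illustrated with a small grouping function. The row shape here (`conversationId`, `channel`, `correlationId`, `handleSeconds`) is an assumption about your analytics export rather than a Genesys Cloud schema, and `groupByJourney` is a hypothetical name.

```javascript
// Collapse conversation rows into customer episodes keyed by correlationId.
// Rows without a correlationId are treated as single-conversation episodes.
function groupByJourney(rows) {
  const episodes = new Map();
  for (const row of rows) {
    const key = row.correlationId ?? row.conversationId;
    let episode = episodes.get(key);
    if (!episode) {
      episode = { key, conversationIds: [], channels: [], totalHandleSeconds: 0 };
      episodes.set(key, episode);
    }
    episode.conversationIds.push(row.conversationId);
    episode.channels.push(row.channel);
    episode.totalHandleSeconds += row.handleSeconds;
  }
  return [...episodes.values()];
}
```

With this grouping, a deflected voice call and its follow-up digital conversation count as one contact when computing FCR, and their handle times add up to one episode-level AHT.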
Validation, Edge Cases & Troubleshooting
Edge Case 1: Customer Opens the SMS Link Multiple Times
If the customer opens the SMS link twice (on different devices, or after a refresh), two digital conversations are created against the same correlationId. The session cache should track whether the session is ACTIVE or CONSUMED. On second claim, either redirect the customer to the existing active conversation or reject with an explanatory message. Do not allow two concurrent digital interactions to map to the same correlation session - agents will receive two transfer requests for the same customer.
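A sketch of the claim logic, under one reading of the states: ACTIVE means issued but not yet claimed, and CONSUMED means a digital conversation already owns the session. `claimSession` and the outcome names are hypothetical, and against Redis or DynamoDB the check-and-set must be atomic (for example a conditional write), which a plain Map does not demonstrate.

```javascript
// Attempt to claim a session for a digital conversation.
// Outcomes: CLAIMED (first claim), REDIRECT (already owned), UNKNOWN (no such session).
function claimSession(store, correlationId, conversationId) {
  const session = store.get(correlationId);
  if (!session) return { outcome: "UNKNOWN" };
  if (session.state === "ACTIVE") {
    session.state = "CONSUMED";
    session.claimedBy = conversationId;
    return { outcome: "CLAIMED", conversationId };
  }
  // Second device or page refresh: point back at the owning conversation
  // instead of creating a parallel one.
  return { outcome: "REDIRECT", conversationId: session.claimedBy };
}
```

On REDIRECT, the inbound message flow can end the duplicate conversation with an explanatory message rather than transferring it to a queue.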
Edge Case 2: Bot Platform Failover Between Channels
If you are using a hosted bot platform (Dialogflow CX), the voice channel and digital channel may use different Dialogflow agent integrations with different session namespaces. Derive both session IDs from your correlationId so that the two sessions can be tied together in logs and analytics:
Dialogflow Session ID (voice): {correlationId}-voice
Dialogflow Session ID (digital): {correlationId}-digital
Dialogflow treats these as two distinct sessions, so the shared prefix alone does not carry state across channels. For the agent to treat the digital session as a continuation, pass the prior context (intent at deflection, authentication status) as session parameters on the first detectIntent call of the new session.
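A sketch of deriving the per-channel session ID and carrying the prior context forward explicitly on the first turn. The request shape follows the Dialogflow CX REST detectIntent body (`queryParams.parameters`, `queryInput.text`) as I understand it; the 36-character session ID limit and the parameter names `priorIntent` and `authenticationStatus` are assumptions to verify against the current Dialogflow documentation and your agent design.

```javascript
// Assumed limit on Dialogflow CX session IDs; verify against current docs.
const MAX_SESSION_ID_LENGTH = 36;

// Derive a channel-specific session ID from the correlation ID.
function dialogflowSessionId(correlationId, channel) {
  const id = `${correlationId}-${channel}`;
  if (id.length > MAX_SESSION_ID_LENGTH) {
    throw new Error(`Session ID "${id}" exceeds ${MAX_SESSION_ID_LENGTH} characters`);
  }
  return id;
}

// Build the first detectIntent request of the new session, seeding it with
// the context captured on the other channel as session parameters.
function firstTurnRequest(agentPath, correlationId, channel, ctx, text) {
  return {
    session: `${agentPath}/sessions/${dialogflowSessionId(correlationId, channel)}`,
    queryParams: {
      parameters: {
        priorIntent: ctx.intentAtDeflection,
        authenticationStatus: ctx.authenticationStatus,
      },
    },
    queryInput: { text: { text }, languageCode: "en" },
  };
}
```

If your correlation tokens are long, the length check is the place to swap in a shorter derived ID, such as a truncated hash of the token.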
Edge Case 3: GDPR / CCPA - Correlation Data Retention
The session cache contains PII (ANI, account ID, authentication status). Under GDPR Article 17, this data must be erasable on request. Ensure your session cache TTL defaults to ≤ 24 hours, and your data deletion workflow (if you have a formal DSAR response process) includes a cache purge step keyed by ANI or customer ID.
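The cache purge step might look like the following sketch, where `purgeByAni` is a hypothetical helper and the store is an in-memory Map. A real Redis or DynamoDB deployment would typically keep a secondary index from ANI or customer ID to correlation IDs rather than scanning every record.

```javascript
// Delete every cached session bound to the given ANI.
// Returns the number of records removed so the DSAR workflow can log it.
function purgeByAni(store, ani) {
  let removed = 0;
  for (const [correlationId, record] of store) {
    if (record.originalAni === ani) {
      store.delete(correlationId); // deleting during Map iteration is safe in JavaScript
      removed += 1;
    }
  }
  return removed;
}
```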
Edge Case 4: Voice Call Disconnects Before SMS is Confirmed Received
If the carrier fails to deliver the SMS (incorrect ANI, carrier filtering), the customer is left with a disconnected call and no way to continue on digital. Implement a delivery receipt webhook from your SMS provider (Twilio status callbacks, Vonage DLR) that updates the session cache with sms_delivered: true/false. If the SMS fails, your session management service can trigger a retry or log the failure for manual follow-up.
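A provider-agnostic sketch of the receipt handler follows. It assumes your webhook endpoint has already mapped the provider's payload into a `{ correlationId, status }` object with status values such as "delivered", "failed", or "undelivered"; `handleDeliveryReceipt`, `MAX_SMS_ATTEMPTS`, and the returned action names are all hypothetical.

```javascript
// Assumed retry budget: the original send plus one retry.
const MAX_SMS_ATTEMPTS = 2;

// Update the session with the delivery outcome and decide what to do next.
// Returns one of: DELIVERED, RETRY, MANUAL_FOLLOW_UP, PENDING, IGNORED.
function handleDeliveryReceipt(store, receipt) {
  const session = store.get(receipt.correlationId);
  if (!session) return "IGNORED"; // expired or unknown session
  if (receipt.status === "delivered") {
    session.sms_delivered = true;
    return "DELIVERED";
  }
  if (receipt.status === "failed" || receipt.status === "undelivered") {
    session.sms_delivered = false;
    const attempts = session.smsAttempts ?? 1; // the original send counts as attempt 1
    if (attempts < MAX_SMS_ATTEMPTS) {
      session.smsAttempts = attempts + 1;
      return "RETRY";
    }
    return "MANUAL_FOLLOW_UP";
  }
  return "PENDING"; // intermediate states such as queued or sent
}
```

If the voice call is parked rather than disconnected, a RETRY or MANUAL_FOLLOW_UP result is also a signal to keep the caller in the voice queue instead of ending the call.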