How to architect a multi-language IVR with runtime language selection

We want a multi-language IVR, but I need it explained in business terms.

When a customer calls, they should hear ‘Press 1 for English, press 2 for Spanish, press 3 for French’ and then the entire IVR experience switches to that language. How does the dashboard show me which language had the most calls?

From a QM evaluator’s perspective, multi-language IVRs complicate the evaluation process.

If an agent handles a Spanish call but we evaluate them using our English-language evaluation form, the scoring rubric doesn’t align. You need separate evaluation forms per language, or at minimum, a bilingual QM supervisor who can evaluate calls in both languages.

If you have screen recording enabled for QM, be aware that Chrome’s language detection can interfere.

Our screen recording extension captures the agent’s browser. When the agent switches to a Spanish-language CRM page, Chrome sometimes triggers an automatic page translation popup that obscures the screen recording. Disable Chrome’s auto-translate via chrome://settings/languages on all agent workstations.
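
Changing that setting by hand on every workstation gets tedious at scale. Chrome also has a managed policy called TranslateEnabled that turns the translate prompt off centrally. Below is a minimal sketch that only renders the policy payload as JSON. How you distribute it (Group Policy on Windows, a managed policy file on Linux, or your MDM tool) depends on your environment and is not covered here, and the function names are made up for illustration.

```python
import json


def chrome_translate_policy(disable_translate=True):
    """Build a Chrome managed-policy dict that controls the translate prompt.

    TranslateEnabled is a Chrome enterprise policy; False turns off the
    automatic "translate this page" offer.
    """
    return {"TranslateEnabled": not disable_translate}


def policy_json(disable_translate=True):
    """Render the policy as JSON text, ready to hand to a deployment tool."""
    return json.dumps(chrome_translate_policy(disable_translate), indent=2)


if __name__ == "__main__":
    print(policy_json())
```

After rollout, chrome://policy on an agent workstation should list the policy if it was applied.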

If you are deploying this multi-language IVR on BYOC Premises Edges, each language’s TTS audio must be cached locally on the Edge appliance.

The Edge has limited storage. With 3 languages × 50 prompts × 2 versions (business hours and after hours), that is 300 audio files, so verify the Edge has sufficient disk space. Check the BIOS power settings too: some Edge hardware enters a low-power state during off-hours that throttles the audio rendering.
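
To turn that file count into a rough disk estimate, here is a small sketch. The 10-second average prompt length and the 8 kHz, 8-bit mono encoding (8000 bytes per second) are assumptions for illustration only. Substitute the real durations and the format your Edge actually stores.

```python
def prompt_file_count(languages, prompts_per_language, versions):
    """Total number of cached audio files across all languages and versions."""
    return languages * prompts_per_language * versions


def estimated_bytes(file_count, avg_seconds, bytes_per_second=8000):
    """Rough storage estimate.

    bytes_per_second=8000 assumes 8 kHz, 8-bit mono audio (an assumption,
    not a statement about what the Edge uses internally).
    """
    return file_count * avg_seconds * bytes_per_second


if __name__ == "__main__":
    files = prompt_file_count(languages=3, prompts_per_language=50, versions=2)
    total = estimated_bytes(files, avg_seconds=10)  # 10 s average is assumed
    print(f"{files} files, roughly {total / 1_000_000:.0f} MB")
```

Under those assumptions the 300 files come to about 24 MB, so the file count alone is unlikely to be the problem. Long prompts or higher-quality encodings change the picture quickly.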

For a scalable approach, use AWS Polly for dynamic TTS instead of pre-recorded prompts.

Create a Lambda function that generates the audio on the fly based on the selected language. Pass the language code from Architect to the Lambda via a Data Action.

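MultiLangTTSLambda:
  Type: AWS::Lambda::Function
  Properties:
    # File tts_handler.py, function generate
    Handler: tts_handler.generate
    Runtime: python3.11
    # Role and Code are also required by CloudFormation; omitted in this excerpt.
    # The role needs permission to call polly:SynthesizeSpeech.
    Environment:
      Variables:
        # Maps the language code sent from Architect to a Polly voice ID
        POLLY_VOICE_MAP: '{"en": "Joanna", "es": "Lupe", "fr": "Lea"}'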
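
For reference, here is a minimal sketch of what tts_handler.generate could look like. The event fields ("language", "text"), the base64 response shape and the pick_voice helper are assumptions for illustration, not a fixed contract with the Data Action. The only AWS call used is Polly's synthesize_speech via boto3. The client is injectable so the logic can be tried without AWS credentials.

```python
import base64
import json
import os

# Same mapping as the template's POLLY_VOICE_MAP, used if the variable is unset.
DEFAULT_VOICE_MAP = '{"en": "Joanna", "es": "Lupe", "fr": "Lea"}'


def pick_voice(language_code, voice_map, fallback="en"):
    """Map a code such as 'es' or 'es-MX' to a Polly voice ID."""
    base = language_code.split("-")[0].lower()
    return voice_map.get(base, voice_map[fallback])


def generate(event, context, polly_client=None):
    """Lambda entry point: synthesize event['text'] in the caller's language.

    Assumed (hypothetical) event shape: {"language": "es", "text": "..."}.
    """
    voice_map = json.loads(os.environ.get("POLLY_VOICE_MAP", DEFAULT_VOICE_MAP))
    voice_id = pick_voice(event.get("language", "en"), voice_map)

    if polly_client is None:
        import boto3  # bundled with the Lambda Python runtime
        polly_client = boto3.client("polly")

    response = polly_client.synthesize_speech(
        Text=event["text"],
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    audio = response["AudioStream"].read()
    # Returning base64 keeps the response JSON-safe; whether your flow can
    # consume audio this way is something to verify for your setup.
    return {"voice": voice_id, "audioBase64": base64.b64encode(audio).decode("ascii")}
```

Unknown language codes fall back to the English voice, so a bad value from the flow still produces audio instead of an error.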

I design IVRs for dozens of clients, and the key to a great multi-language experience is keeping the language selection menu as short as possible.

Don’t list eight languages sequentially. Instead, use ANI-based geo-detection: if the caller’s number is from Mexico, default to Spanish and offer English as an escape option. In my experience this improves containment rates by 20% because the caller hears their native language immediately.
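
A rough sketch of the ANI lookup, assuming the caller's number arrives in E.164 format (for example +52...). The prefix table and function names are illustrative only. A real deployment needs a fuller table and a decision about mixed-language regions such as +1.

```python
# Hypothetical country-code prefix -> default language table.
PREFIX_TO_LANGUAGE = {
    "+52": "es",  # Mexico
    "+34": "es",  # Spain
    "+33": "fr",  # France
    "+1": "en",   # US/Canada (coarse; ignores French-speaking Canada)
}


def default_language(ani, fallback="en"):
    """Pick a default language from the caller's number, longest prefix first."""
    number = ani.strip()
    for prefix in sorted(PREFIX_TO_LANGUAGE, key=len, reverse=True):
        if number.startswith(prefix):
            return PREFIX_TO_LANGUAGE[prefix]
    return fallback


def menu_order(ani, supported=("en", "es", "fr")):
    """Put the detected language first and keep the rest as escape options."""
    first = default_language(ani)
    return [first] + [lang for lang in supported if lang != first]


if __name__ == "__main__":
    print(menu_order("+525512345678"))
```

Keeping the other languages in the returned list preserves the escape option, so a caller who is roaming or using a foreign number can still switch.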

If you pair this multi-language IVR with Agent Assist, ensure your AI knowledge base is indexed per language.

The real-time transcription engine needs to know the conversation language before it can surface relevant articles. You must pass the language selection variable from Architect to the Agent Assist configuration, or the NLU will attempt English parsing on Spanish audio and return garbage suggestions.