Predictive Routing Model Training Stalling on High-Volume Queue

PlatformOps · March 3, 2026, 9:27pm

Environment: EU-FR Region, Genesys Cloud Standard Edition
Queue: “Premium Support” (Peak volume: 450 concurrent contacts)
Routing Strategy: Predictive Outbound + Inbound Hybrid
Agent Count: 120 Active Agents
Issue: Model training progress stuck at 15% for >48 hours

What in this setup prevents the predictive model from reaching the “Ready” state despite sufficient historical data?

The dashboard indicates that the model is still in the “Training” phase. According to the documentation, training should complete within 24 hours for queues of this size. We have verified that the queue has been active for over 6 months, with consistent inbound volume and clear skill assignments. The “Model Health” widget shows no errors, only a static progress bar.

We are concerned about the impact on routing efficiency. Predictive routing relies on agent skill proficiency and contact history to optimize assignment. If the model remains untrained, the system defaults to basic skills-based routing, which we believe is causing increased handle times and lower first-call resolution rates.

Has anyone encountered a scenario where the training process halts due to data quality issues rather than volume constraints? We have reviewed the agent activity logs and found no gaps in interaction history. The queue configuration remains unchanged since the initial deployment.

We need to understand if there is a hidden threshold or configuration parameter that triggers a training reset. The business stakeholders are questioning the ROI of the predictive routing license if the core engine cannot stabilize. Any insights into the underlying metrics that drive the training completion status would be appreciated. We prefer to resolve this through configuration adjustments before escalating to support.

greg_s · March 4, 2026, 6:27pm

My usual workaround is to check the historical data quality rather than just the volume. Predictive models need clean interaction data. If your Premium Support queue has lots of abandoned calls or very short interactions, the engine might be rejecting the batch. Also, make sure the model is not training on test data.

Check the API response for the model training job. Use the endpoint /api/v2/predictiverouting/models/{modelId}/trainingjobs and look at the status details. If it says data_validation_failed, that is your answer.

Clean up the interaction history by removing any calls with a duration under 30 seconds, then restart the training. This often resolves the stall. The engine needs consistent patterns to learn from, and noisy data breaks convergence.
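A rough sketch of those two checks in Python, in case it helps. The endpoint path and the data_validation_failed status are taken as written above; the response shape (a list of job objects with a "status" field) and the "durationSeconds" field on interaction records are assumptions for illustration, so adjust them to whatever your payloads actually contain. Fetching the jobs is left out on purpose, since auth and base URL depend on your org.

```python
from typing import Any

# Path as quoted in this thread; not verified against the official API reference.
TRAINING_JOBS_PATH = "/api/v2/predictiverouting/models/{modelId}/trainingjobs"

# Threshold suggested in this thread, not a documented platform limit.
MIN_DURATION_SECONDS = 30


def failed_validation_jobs(jobs: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return training jobs whose status is data_validation_failed.

    Assumes each job is a dict with a "status" string (hypothetical shape).
    The comparison is case-insensitive; jobs without a status are ignored.
    """
    return [
        job
        for job in jobs
        if str(job.get("status", "")).lower() == "data_validation_failed"
    ]


def split_short_interactions(
    interactions: list[dict[str, Any]],
    min_seconds: int = MIN_DURATION_SECONDS,
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split interactions into (kept, dropped) by duration.

    Assumes a "durationSeconds" field (hypothetical name). Records with a
    missing or null duration are treated as 0 seconds and therefore dropped.
    """
    kept: list[dict[str, Any]] = []
    dropped: list[dict[str, Any]] = []
    for item in interactions:
        duration = item.get("durationSeconds") or 0
        if duration >= min_seconds:
            kept.append(item)
        else:
            dropped.append(item)
    return kept, dropped


if __name__ == "__main__":
    # Made-up sample data, only to show the calls.
    sample_jobs = [
        {"id": "job-1", "status": "data_validation_failed"},
        {"id": "job-2", "status": "running"},
    ]
    sample_interactions = [
        {"id": "c1", "durationSeconds": 12},
        {"id": "c2", "durationSeconds": 240},
    ]
    print(TRAINING_JOBS_PATH.format(modelId="your-model-id"))
    print(failed_validation_jobs(sample_jobs))
    kept, dropped = split_short_interactions(sample_interactions)
    print(len(kept), "kept,", len(dropped), "dropped")
```

Returning the dropped records alongside the kept ones makes it easier to review what would be excluded before you restart training.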