Designing Capacity Planning Models for WebRTC Media Servers under Peak Holiday Load
What This Guide Covers
You are building a capacity planning model for Genesys Cloud’s WebRTC media infrastructure. Specifically, the model determines how many concurrent WebRTC agent sessions your media tier can support before audio quality degrades, and it automates pre-season capacity expansion for predictable peak events such as Black Friday, holiday retail spikes, or end-of-fiscal-year financial services surges. When complete, the model combines historical interaction data from the Genesys Cloud Analytics API with infrastructure telemetry to forecast the headroom you need, and it triggers pre-provisioning of additional BYOC Cloud Edge capacity or Genesys-managed media resources weeks before the peak event.
Prerequisites, Roles & Licensing
- Genesys Cloud: Any CX tier.
- Permissions required:
  - Analytics > Conversation Aggregate > View
  - Telephony > Edges > View (for BYOC Premise environments)
- Infrastructure:
- Time-series database (InfluxDB, CloudWatch Metrics, or Prometheus) for storing concurrent session telemetry.
- A capacity modeling script (Python with pandas and scipy).
The Implementation Deep-Dive
1. The WebRTC Capacity Constraint
Each WebRTC session (a browser-based agent connected to Genesys Cloud) consumes:
- Bandwidth: 64-128 kbps bidirectional (Opus codec at 48kHz).
- Media server CPU: Approximately 0.5% CPU per concurrent session on a reference media server.
- Port allocation: 1 UDP media port pair per session.
For Genesys Cloud, the managed media infrastructure scales automatically for most load levels. However, if you are using BYOC Premise Edges with on-site hardware handling WebRTC media, capacity is fixed by your hardware. A 4-core Edge appliance typically supports 200-250 concurrent WebRTC sessions before audio jitter becomes noticeable.
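The three per-session costs above imply three separate ceilings, and the lowest one is the binding constraint. A minimal sketch of that arithmetic follows. The 0.5% CPU and 128 kbps figures come from the list above; `uplink_mbps`, `udp_port_pairs`, the CPU ceiling, and the function name are hypothetical placeholders to replace with values measured on your own hardware.

```python
import math

def max_sessions_per_edge(
    cpu_pct_per_session: float = 0.5,   # per-session CPU cost quoted above
    cpu_ceiling_pct: float = 100.0,     # assumption: theoretical ceiling; lower it for a safety margin
    kbps_per_session: float = 128.0,    # upper end of the 64-128 kbps range above
    uplink_mbps: float = 1000.0,        # hypothetical network budget for media
    udp_port_pairs: int = 5000,         # hypothetical size of the media port-pair pool
) -> dict:
    """Return the session ceiling per resource and the one that binds first."""
    limits = {
        "cpu": math.floor(cpu_ceiling_pct / cpu_pct_per_session),
        "bandwidth": math.floor(uplink_mbps * 1000 / kbps_per_session),
        "ports": udp_port_pairs,
    }
    binding = min(limits, key=limits.get)
    return {
        "limits": limits,
        "binding_constraint": binding,
        "max_sessions": limits[binding],
    }

result = max_sessions_per_edge()
```

With these defaults CPU binds first at 200 sessions, which lines up with the lower end of the 200-250 range quoted for a 4-core appliance.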
2. Extracting Historical Concurrent Session Data
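import requests
from datetime import datetime, timedelta, timezone
import pandas as pd

GENESYS_API = "https://api.mypurecloud.com"  # Use the API host for your org's region

def extract_peak_concurrency_by_week(
    queue_ids: list[str],
    lookback_weeks: int = 12,
    access_token: str = ""
) -> pd.DataFrame:
    """
    Extracts answered-interaction counts per 15-min interval, per queue,
    over the last N weeks. These counts are the baseline input that the
    capacity model converts into concurrency estimates.
    """
    headers = {"Authorization": f"Bearer {access_token}", "Content-Type": "application/json"}
    records = []
    # Fix "now" once so the weekly windows tile without gaps or overlap
    now = datetime.now(timezone.utc).replace(microsecond=0)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    for week_offset in range(lookback_weeks):
        start = now - timedelta(weeks=week_offset + 1)
        end = now - timedelta(weeks=week_offset)
        payload = {
            "interval": f"{start.strftime(fmt)}/{end.strftime(fmt)}",
            "granularity": "PT15M",
            "groupBy": ["queueId"],
            "filter": {
                "type": "or",
                "predicates": [
                    {"type": "dimension", "dimension": "queueId", "operator": "matches", "value": qid}
                    for qid in queue_ids
                ]
            },
            # The count of tAnswered is the number of answered interactions
            "metrics": ["tHandle", "tAnswered"]
        }
        # Conversation aggregates grouped by queue; confirm the path and metric
        # names against the Analytics API reference for your org.
        resp = requests.post(
            f"{GENESYS_API}/api/v2/analytics/conversations/aggregates/query",
            headers=headers, json=payload, timeout=30
        )
        resp.raise_for_status()  # Fail loudly on auth or rate-limit errors
        for result in resp.json().get("results", []):
            queue_id = result.get("group", {}).get("queueId")
            for data_point in result.get("data", []):
                # Each interval carries a list of metrics, each with its own stats
                for metric in data_point.get("metrics", []):
                    if metric.get("metric") == "tAnswered":
                        records.append({
                            "interval": data_point.get("interval"),
                            "queue_id": queue_id,
                            "answered": metric.get("stats", {}).get("count", 0),
                            "week_of": start.isocalendar()[1]
                        })
    return pd.DataFrame(records)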
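Because the query groups by queueId, the same 15-minute interval can appear once per queue, while the media tier is shared by all of them. A small sketch of collapsing those rows into one total per interval is shown below. The helper name and the sample rows are illustrative, not part of the Genesys API.

```python
from collections import defaultdict

def total_answered_per_interval(records: list[dict]) -> dict[str, int]:
    """Sum 'answered' across queues so each interval appears exactly once."""
    totals: dict[str, int] = defaultdict(int)
    for row in records:
        totals[row["interval"]] += int(row.get("answered", 0))
    # ISO-8601 interval strings in a single format sort chronologically
    return dict(sorted(totals.items()))

# Illustrative rows for two queues (not real data)
sample_rows = [
    {"interval": "2024-11-01T09:15:00Z/2024-11-01T09:30:00Z", "answered": 30},
    {"interval": "2024-11-01T09:00:00Z/2024-11-01T09:15:00Z", "answered": 40},
    {"interval": "2024-11-01T09:00:00Z/2024-11-01T09:15:00Z", "answered": 25},
]
interval_totals = total_answered_per_interval(sample_rows)
```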
3. Building the Capacity Model
import pandas as pd

def build_capacity_model(df: pd.DataFrame, aht_seconds: float = 300) -> dict:
    """
    Builds a capacity model from historical data.
    Uses Little's Law (concurrency = arrival rate × AHT) to estimate
    concurrent sessions from volume + AHT.
    Args:
        df: DataFrame with 'interval' and 'answered' columns.
        aht_seconds: Average Handle Time in seconds.
    Returns:
        capacity_model: Dict with peak estimates and growth projections.
    """
    # Sum across queues so each interval appears once; this also avoids
    # mutating the caller's DataFrame.
    df = df.groupby('interval', as_index=False)['answered'].sum()
    df['datetime'] = pd.to_datetime(df['interval'].str.split('/').str[0])
    df = df.sort_values('datetime')  # Oldest first, so tail() is the most recent data
    df['hour'] = df['datetime'].dt.hour
    df['dayofweek'] = df['datetime'].dt.dayofweek

    # Find the historical peak concurrent session estimate
    # Concurrent ≈ Arrival Rate × AHT (Little's Law)
    # 15-min interval: arrival rate = answered / (15 * 60 seconds) interactions/sec
    df['concurrent_estimate'] = df['answered'] * (aht_seconds / 900)  # 900 = 15 min in seconds

    # Get the P95 peak (not absolute max, which might be a data anomaly)
    peak_p95 = float(df['concurrent_estimate'].quantile(0.95))
    peak_max = float(df['concurrent_estimate'].max())

    # P95 of the most recent 4 weeks (4 weeks × 7 days × 24 hours × 4 intervals/hour)
    recent_peak = float(df.tail(4 * 7 * 24 * 4)['concurrent_estimate'].quantile(0.95))
    # Placeholder: assume 15% YoY if historical data < 1 year. It is reported
    # below but not applied to the projection.
    yoy_growth_rate = 0.15

    return {
        "current_peak_p95_concurrent": round(peak_p95, 1),
        "current_peak_max_concurrent": round(peak_max, 1),
        "recent_4wk_peak_p95_concurrent": round(recent_peak, 1),
        "projected_holiday_peak": round(peak_p95 * 1.5, 1),  # Assume 50% above P95 for Black Friday
        "recommended_capacity": round(peak_p95 * 1.5 * 1.20, 1),  # +20% headroom over projected peak
        "yoy_growth_pct": yoy_growth_rate * 100,
        "headroom_at_current_capacity": None  # Filled in with hardware limit
    }
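To sanity-check the model's arithmetic by hand, the Little's Law step and the two multipliers can be worked through for a single interval. The input of 120 answered interactions is an illustrative number rather than measured data, and the helper name is hypothetical.

```python
def concurrent_sessions(answered: int, aht_seconds: float, interval_seconds: int = 900) -> float:
    """Little's Law: average concurrency = arrival rate × average handle time."""
    return answered * aht_seconds / interval_seconds

# 120 interactions answered in one 15-minute interval, AHT of 300 seconds
baseline = concurrent_sessions(answered=120, aht_seconds=300)  # 120 * 300 / 900 = 40.0
projected_holiday_peak = round(baseline * 1.5, 1)              # 50% holiday uplift -> 60.0
recommended_capacity = round(baseline * 1.5 * 1.20, 1)         # +20% headroom -> 72.0
```

Note that this is an average over the interval. A sharp burst inside the 15 minutes can push instantaneous concurrency above the estimate, which is one reason the headroom multiplier exists.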
4. Automating Pre-Season Capacity Expansion
For managed Genesys Cloud environments, capacity scales automatically. Document the model output and verify with Genesys Cloud Support that your projected peak is within their standard elasticity guarantees.
For BYOC Premise environments with fixed hardware:
- Calculate the projected peak concurrent sessions from the model.
- If projected_holiday_peak > edge_hardware_capacity * num_edge_pairs * 0.80 (the 80% threshold), trigger a provisioning order.
- Options: Temporarily add BYOC Cloud edges as overflow, or provision additional hardware appliances 8 weeks before peak.
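The threshold test and the 8-week lead time from the steps above can be sketched as two small helpers. Both function names and the example figures (a peak date of 2025-11-28, 450 projected sessions, 500 total capacity) are illustrative assumptions.

```python
from datetime import date, timedelta

def provisioning_deadline(peak_date: date, lead_weeks: int = 8) -> date:
    """Latest order date that still leaves the full lead time before the peak."""
    return peak_date - timedelta(weeks=lead_weeks)

def should_trigger_order(projected_peak: float, total_capacity: int, threshold: float = 0.80) -> bool:
    """True when the projected peak exceeds the utilization threshold."""
    return projected_peak > total_capacity * threshold

peak_day = date(2025, 11, 28)                 # illustrative peak date
deadline = provisioning_deadline(peak_day)    # 8 weeks earlier: 2025-10-03
trigger = should_trigger_order(projected_peak=450, total_capacity=500)  # 450 > 400
```

Running a check like this on a schedule well ahead of the deadline leaves time to place a hardware order rather than fall back to overflow only.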
def evaluate_byoc_headroom(
    model: dict,
    edge_hardware_capacity: int,  # Max concurrent sessions per Edge pair
    num_edge_pairs: int
) -> dict:
    total_capacity = edge_hardware_capacity * num_edge_pairs
    if total_capacity <= 0:
        raise ValueError("Total Edge capacity must be greater than zero")
    projected = model["projected_holiday_peak"]
    # Headroom is the share of total capacity left unused at the projected peak
    headroom_pct = (total_capacity - projected) / total_capacity * 100
    if headroom_pct < 5:
        recommendation = "CRITICAL - Immediate capacity expansion required"
    elif headroom_pct < 20:
        recommendation = "WARNING - Consider BYOC Cloud overflow"
    else:
        recommendation = "ADEQUATE"
    return {
        "total_capacity": total_capacity,
        "projected_peak": projected,
        "headroom_pct": round(headroom_pct, 1),
        "recommendation": recommendation
    }
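When the recommendation comes back as WARNING or CRITICAL, the follow-up question is how many Edge pairs would restore the 20% headroom target. A rough sketch of that calculation is below. The function name and the example numbers are hypothetical, and it assumes every pair has the same capacity.

```python
import math

def additional_edge_pairs(
    projected_peak: float,
    capacity_per_pair: int,
    current_pairs: int,
    target_headroom_pct: float = 20.0,
) -> int:
    """Edge pairs to add so the projected peak leaves the target headroom free."""
    # Capacity required so that projected_peak uses only (100 - headroom)% of it
    required_capacity = projected_peak * 100 / (100 - target_headroom_pct)
    pairs_needed = math.ceil(required_capacity / capacity_per_pair)
    return max(0, pairs_needed - current_pairs)

# 450 projected sessions, 200 sessions per pair, 2 pairs installed
extra_pairs = additional_edge_pairs(projected_peak=450, capacity_per_pair=200, current_pairs=2)
```

In this example the required capacity is 562.5 sessions, so three pairs are needed in total and one must be added.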
Validation, Edge Cases & Troubleshooting
Edge Case 1: Holiday Traffic Profile Differs from Baseline
Black Friday calls are typically shorter (agents are trained for speed) but arrive in a sharper spike (8 AM rush) vs. a typical day’s gradual ramp. The P95 of a normal week doesn’t capture this spike shape.
Solution: Segment your historical data by event type. Pull the actual interval-level data from the previous Black Friday specifically (not 12-week average P95) and use that as the holiday profile baseline. If no previous Black Friday data exists, use a safety factor of 2.0× P95 instead of 1.5×.
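A minimal sketch of that selection logic: find the previous Black Friday, use the peak observed on that day if any samples exist, and otherwise fall back to 2.0× the normal P95. The helper names and the `(timestamp, concurrent_estimate)` sample shape are assumptions for illustration.

```python
from datetime import date, datetime, timedelta

def black_friday(year: int) -> date:
    """Day after US Thanksgiving (the fourth Thursday of November)."""
    nov1 = date(year, 11, 1)
    # weekday(): Monday=0 ... Thursday=3
    first_thursday = nov1 + timedelta(days=(3 - nov1.weekday()) % 7)
    return first_thursday + timedelta(weeks=3, days=1)

def holiday_baseline(
    samples: list[tuple[datetime, float]],  # (interval start, concurrent estimate)
    normal_p95: float,
    year: int,
    fallback_factor: float = 2.0,           # safety factor from the text when no history exists
) -> float:
    """Peak from the given year's Black Friday, or fallback_factor × P95 if absent."""
    target = black_friday(year)
    day_values = [value for ts, value in samples if ts.date() == target]
    if day_values:
        return max(day_values)
    return normal_p95 * fallback_factor

# Illustrative samples, not measured data
history = [
    (datetime(2024, 11, 29, 8, 0), 310.0),
    (datetime(2024, 11, 29, 8, 15), 355.0),
    (datetime(2024, 11, 22, 8, 0), 140.0),
]
with_history = holiday_baseline(history, normal_p95=150.0, year=2024)
without_history = holiday_baseline([], normal_p95=150.0, year=2024)
```

Using the day's maximum rather than its P95 is a deliberate choice here, because the spike itself is what the holiday profile needs to capture.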
Edge Case 2: BYOC Edge CPU Saturation Before Port Exhaustion
On lightly resourced Edge hardware, CPU may become the binding constraint before UDP port capacity. A media server at 90% CPU will start dropping audio packets, causing jitter, even if port capacity is not exceeded.
Solution: Include CPU telemetry in your capacity model alongside concurrent session counts. Use Prometheus + Node Exporter (or Genesys Edge monitoring APIs) to track CPU utilization per session. If the measured CPU-per-session ratio is higher than expected (e.g., due to wideband audio or video), adjust your concurrent session capacity estimate downward accordingly.
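One way to turn that telemetry into an adjusted capacity figure is to fit a straight line through (sessions, CPU %) samples and solve for the session count at a chosen CPU ceiling. The sketch below uses a plain least-squares fit. The sample points, the 80% ceiling, and the function names are assumptions for illustration, and a linear relationship is itself an assumption worth checking against your own data.

```python
import math

def fit_cpu_per_session(samples: list[tuple[int, float]]) -> tuple[float, float]:
    """Least-squares fit of cpu_pct = intercept + slope * sessions."""
    n = len(samples)
    mean_x = sum(sessions for sessions, _ in samples) / n
    mean_y = sum(cpu for _, cpu in samples) / n
    sxx = sum((sessions - mean_x) ** 2 for sessions, _ in samples)
    sxy = sum((sessions - mean_x) * (cpu - mean_y) for sessions, cpu in samples)
    slope = sxy / sxx                      # CPU % added per concurrent session
    intercept = mean_y - slope * mean_x    # idle CPU % with zero sessions
    return slope, intercept

def sessions_at_cpu_ceiling(slope: float, intercept: float, cpu_ceiling_pct: float = 80.0) -> int:
    """Largest session count that keeps predicted CPU at or below the ceiling."""
    return math.floor((cpu_ceiling_pct - intercept) / slope)

# Illustrative telemetry: (concurrent sessions, CPU %)
telemetry = [(40, 40.0), (80, 70.0), (120, 100.0)]
slope, intercept = fit_cpu_per_session(telemetry)
adjusted_capacity = sessions_at_cpu_ceiling(slope, intercept)
```

Here the fitted cost is 0.75% CPU per session with 10% idle load, so the Edge supports 93 sessions at an 80% ceiling, well below what the 0.5% reference figure would suggest.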
Edge Case 3: Agents Using High-Quality Codecs Increasing Bandwidth
If your agents use wideband audio (Opus at 48kHz instead of 8kHz for voice) or add 720p video for screen sharing, bandwidth per session can be 4× higher, straining your WAN and media server simultaneously.
Solution: Audit the codec settings in your Genesys Cloud Phone policies. For high-volume capacity planning, standardize on narrowband voice (G.711/Opus at 8kHz) for voice-only interactions. Reserve HD audio for manager-level video conferencing use cases where session counts are low.