429 Too Many Requests on /api/v2/users bulk update - need backoff logic

routing_ranger · February 25, 2026, 6:30am

We’re trying to update 200 agents via the Genesys Cloud API using a simple Python loop with requests.put. It works for the first 50 or so, then we hit a wall. The server throws a 429 Too Many Requests error with a Retry-After header. We need to implement a proper exponential backoff in the code to handle this without crashing the script. Here’s what we have so far:

import requests
import time

headers = {"Authorization": f"Bearer {token}"}
for user in users_to_update:
    url = f"/api/v2/users/{user['id']}"
    response = requests.put(url, json=user['data'], headers=headers)
    if response.status_code == 429:
        time.sleep(1)  # This is too dumb, it fails again
        response = requests.put(url, json=user['data'], headers=headers)
    print(response.status_code)

The Retry-After header usually says 1 or 2 seconds. We can’t just sleep for a fixed amount because the load varies. How do we parse the Retry-After header and implement a clean backoff loop? We’ve seen examples with tenacity but aren’t sure if it’s overkill. Just need the right pattern to parse the header and retry with increasing delays. Don’t want to block the whole queue for 10 seconds if only one call fails. Any quick code snippet to handle this properly?

JsonJester90 · February 25, 2026, 7:40am

You’re hitting the rate limiter because Genesys Cloud enforces rate limits on API calls, and a tight loop burns through the allowance quickly. The Retry-After header is your best friend here, but parsing it manually can get messy. Since I’m usually managing this kind of state via Terraform, I appreciate when APIs behave predictably. For Python scripts, you don’t need to reinvent the wheel. requests does not retry on its own, but a Session can mount an HTTPAdapter carrying a urllib3 Retry policy, and that policy honors Retry-After on 429 responses if you configure it for that status.
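For reference, a minimal sketch of that session setup. It assumes urllib3 1.26 or newer (where the parameter is called allowed_methods), and make_retrying_session is just an illustrative helper name. PUT has to be listed explicitly here because the policy is restricted to the one method the script uses.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session(total_retries=5):
    """Build a Session that retries 429 responses at the transport level."""
    retry = Retry(
        total=total_retries,
        status_forcelist=[429],           # retry only on rate-limit responses
        allowed_methods=["PUT"],          # assumption: the script only sends PUT
        backoff_factor=1,                 # exponential fallback when no Retry-After is sent
        respect_retry_after_header=True,  # sleep for the server-provided delay when present
    )
    session = requests.Session()
    session.mount("https://", HTTPAdapter(max_retries=retry))
    return session
```

With this in place, session.put(...) retries transparently, so the calling loop needs no sleep logic of its own.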

Here’s how I handle it in my utility scripts:

1. Use tenacity for retries: It’s cleaner than writing raw sleep loops. It handles the exponential backoff logic automatically.
2. Check the Retry-After header: If the API sends one, use that exact value. Otherwise, fall back to a standard backoff.
3. Wrap in a session: Reusing the connection pool helps slightly with overhead, though the limit is mostly on the server side.

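import requests
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential


class RateLimited(Exception):
    """Raised on HTTP 429 so that only rate-limit responses are retried."""

    def __init__(self, retry_after):
        super().__init__(f"Rate limited. Server asked for {retry_after}s")
        self.retry_after = retry_after


backoff = wait_exponential(multiplier=1, min=2, max=30)


def wait_for_rate_limit(retry_state):
    # Wait for the longer of the server's Retry-After and the exponential backoff
    exc = retry_state.outcome.exception()
    return max(exc.retry_after, backoff(retry_state))


session = requests.Session()  # reuse the connection pool across calls


# Define the retry strategy
@retry(
    stop=stop_after_attempt(5),
    wait=wait_for_rate_limit,
    retry=retry_if_exception_type(RateLimited),
    reraise=True,  # surface the last RateLimited instead of tenacity's RetryError
)
def update_agent(agent_id, payload, base_url, token):
    url = f"{base_url}/api/v2/users/{agent_id}"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }

    response = session.put(url, json=payload, headers=headers, timeout=30)

    # Raise on 429 so tenacity retries; the wait function applies the delay
    if response.status_code == 429:
        value = response.headers.get("Retry-After", "")
        # Fall back to 5s if the header is missing or not a plain integer
        raise RateLimited(int(value) if value.isdigit() else 5)

    response.raise_for_status()  # other HTTP errors are raised, not retried
    return response.json()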

# Usage example
# update_agent("user-123", {"name": "New Name"}, "https://api.mypurecloud.com", "your_token")
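One note on the header itself: per the HTTP spec, Retry-After can be either a number of seconds or an HTTP date, so a slightly more defensive parser may be worth having. Below is a minimal sketch using only the standard library. parse_retry_after and its 5 second default are illustrative choices, not part of requests or tenacity.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, default=5.0, now=None):
    """Return the number of seconds to wait for a Retry-After header value."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        # Delay-seconds form, e.g. "2"
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # unparseable header: use the fallback delay
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no further wait is required
    return max(0.0, (when - now).total_seconds())
```

You would call it as parse_retry_after(response.headers.get("Retry-After")) wherever the delay is computed.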

The key here is letting the library handle the backoff schedule. One caveat: wait_exponential adds no jitter by itself; tenacity's wait_random_exponential does, which helps when several workers are retrying at once. If you hardcode sleeps, you’ll either wait too long or hit the wall again. Also, make sure you’re not parallelizing too aggressively. Even with backoff, firing 200 concurrent requests will melt the gateway. Stick to a small thread pool, like 5-10 workers, and let the retries do the heavy lifting.
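A rough sketch of the small-pool approach, so one rate-limited call only stalls its own worker rather than the whole queue. It assumes update_fn is whatever single-user update function you already have (for instance a wrapper around a retrying call) and that each user is a dict with an "id" key, as in the original loop. run_bulk is a hypothetical helper name.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_bulk(update_fn, users, max_workers=5):
    """Apply update_fn(user) to every user with a bounded pool; collect failures."""
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the user id it belongs to
        futures = {pool.submit(update_fn, user): user["id"] for user in users}
        for future in as_completed(futures):
            user_id = futures[future]
            try:
                future.result()
                succeeded.append(user_id)
            except Exception as exc:
                # Keep going; report the failures at the end
                failed[user_id] = exc
    return succeeded, failed
```

Collecting failures in a dict instead of raising means a single bad user does not abort the other updates, and the leftovers can be retried or logged afterwards.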