We’re trying to update 200 agents via the Genesys Cloud API using a simple Python loop with requests.put. It works for the first 50 or so, then we hit a wall. The server throws a 429 Too Many Requests error with a Retry-After header. We need to implement a proper exponential backoff in the code to handle this without crashing the script. Here’s what we have so far:
import requests
import time

headers = {"Authorization": f"Bearer {token}"}

for user in users_to_update:
    url = f"/api/v2/users/{user['id']}"
    response = requests.put(url, json=user['data'], headers=headers)
    if response.status_code == 429:
        time.sleep(1)  # This is too dumb, it fails again
        response = requests.put(url, json=user['data'], headers=headers)
    print(response.status_code)
The Retry-After header usually says 1 or 2 seconds. We can’t just sleep for a fixed amount because the load varies. How do we parse the Retry-After header and implement a clean backoff loop? We’ve seen examples with tenacity but aren’t sure if it’s overkill. Just need the right pattern to parse the header and retry with increasing delays. Don’t want to block the whole queue for 10 seconds if only one call fails. Any quick code snippet to handle this properly?
You’re hitting the rate limiter because Genesys Cloud throttles API requests, and a tight loop of PUTs burns through the allowance quickly. The Retry-After header is your best friend here, but parsing it manually can get messy. Since I’m usually managing this kind of state via Terraform, I appreciate when APIs behave predictably. For Python scripts, you don’t need to reinvent the wheel. requests has no retry logic of its own, but you can mount an HTTPAdapter with a urllib3 Retry policy on a Session, and that policy honors Retry-After on 429 responses by default.
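One way to get header-aware retries out of requests is to mount an HTTPAdapter carrying a urllib3 Retry policy on a Session. This is a minimal sketch, not a tuned configuration: build_retrying_session is a hypothetical helper name, and the retry count and backoff factor are assumptions you would adjust for your org.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_retrying_session(total=5, backoff_factor=1.0):
    """Hypothetical helper: a Session that retries 429s at the transport layer."""
    retries = Retry(
        total=total,                      # assumed cap on retries per request
        status_forcelist=[429],           # retry on Too Many Requests
        backoff_factor=backoff_factor,    # exponential fallback when no Retry-After is sent
        respect_retry_after_header=True,  # urllib3's default, spelled out for clarity
    )
    # PUT is in urllib3's default set of retryable (idempotent) methods,
    # so no method override is needed for this use case.
    session = requests.Session()
    session.mount("https://", HTTPAdapter(max_retries=retries))
    return session


session = build_retrying_session()
# session.put(url, json=payload, headers=headers, timeout=30)
```

With this in place the sleeping happens inside the adapter, so the calling loop only ever sees the final response or an exception once the retries run out.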
Here’s how I handle it in my utility scripts:
- Use tenacity for retries: It’s cleaner than writing raw sleep loops. It handles the exponential backoff logic automatically.
- Check the Retry-After header: If the API sends one, use that exact value. Otherwise, fall back to a standard backoff.
- Wrap in a session: Reusing the connection pool helps slightly with overhead, though the limit is mostly on the server side.
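import requests
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_random_exponential


class RateLimited(Exception):
    """Raised on HTTP 429; carries the server's Retry-After hint in seconds, if any."""

    def __init__(self, retry_after=None):
        super().__init__(f"Rate limited (Retry-After: {retry_after})")
        self.retry_after = retry_after


# Fallback when no usable Retry-After is sent: exponential backoff with jitter
fallback_wait = wait_random_exponential(multiplier=1, max=30)


def wait_for_retry_after(retry_state):
    # Prefer the server's hint; tenacity does the sleeping, so there is no time.sleep here
    hint = getattr(retry_state.outcome.exception(), "retry_after", None)
    return hint if hint is not None else fallback_wait(retry_state)


# Define the retry strategy
@retry(
    stop=stop_after_attempt(5),
    wait=wait_for_retry_after,
    retry=retry_if_exception_type(RateLimited),  # retry only 429s, not every HTTP error
    reraise=True,
)
def update_agent(agent_id, payload, base_url, token):
    url = f"{base_url}/api/v2/users/{agent_id}"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }
    response = requests.put(url, json=payload, headers=headers, timeout=30)
    if response.status_code == 429:
        value = response.headers.get("Retry-After", "")
        # Only the delay-in-seconds form is handled; anything else uses the fallback
        raise RateLimited(int(value) if value.isdigit() else None)
    response.raise_for_status()  # other 4xx/5xx errors fail immediately
    return response.json()

# Usage example
# update_agent("user-123", {"name": "New Name"}, "https://api.mypurecloud.com", "your_token")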
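On parsing the header itself: per the HTTP spec, Retry-After can be either a number of seconds or an HTTP date, so a bare int() can raise on a valid value. A more defensive parser might look like the sketch below. parse_retry_after is a hypothetical helper, and the 5-second default is an assumption, not a value taken from the Genesys docs.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, default=5.0, now=None):
    """Return the number of seconds to wait for a Retry-After header value.

    Handles both forms allowed by HTTP: delay-seconds ("2") and an HTTP date
    ("Wed, 21 Oct 2015 07:28:00 GMT"). Falls back to `default` (an assumed
    value) when the header is missing or unparseable.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Pythons raise TypeError on junk input, newer ones ValueError
        return default
    if when.tzinfo is None:
        # Treat a date without zone information as UTC
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", never a negative sleep
    return max(0.0, (when - now).total_seconds())
```

You would call it as parse_retry_after(response.headers.get("Retry-After")) and feed the result into whatever wait logic you use.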
The key here is letting the library handle the backoff instead of hardcoding sleeps, which either wait too long or hit the wall again. If you want jitter, note that plain wait_exponential is deterministic; tenacity’s wait_random_exponential adds the randomness. Also, make sure you’re not parallelizing too aggressively. Even with backoff, firing 200 concurrent requests will just earn you a flood of 429s. Stick to a small thread pool, like 5-10 workers, and let the retries do the heavy lifting.
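The small thread pool could be sketched roughly like this, using only the standard library. update_all and update_one are hypothetical names: update_one stands in for whatever retrying call you use per user, and the worker count of 5 is just the low end of the range above.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def update_all(users, update_one, max_workers=5):
    """Run update_one(user) for every user with bounded concurrency.

    Returns (succeeded, failed): a list of user ids that completed and a
    dict mapping user id -> exception for those that gave up.
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its user id so failures can be reported
        futures = {pool.submit(update_one, user): user["id"] for user in users}
        for future in as_completed(futures):
            user_id = futures[future]
            try:
                future.result()
                succeeded.append(user_id)
            except Exception as exc:  # one bad user should not stop the batch
                failed[user_id] = exc
    return succeeded, failed
```

A retry that is sleeping only ties up its own worker, so one throttled call does not block the rest of the queue, and the failed dict tells you which users to rerun afterwards.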