Handling 429 on bulk user updates with backoff

CloudCamille · April 9, 2026, 4:18am

Is it possible to handle rate limits without melting my local compose stack? i’m running a test harness against /api/v2/users to bulk patch agent skills. hitting 429 Too Many Requests after about 12 calls.

Problem

requests.patch(url, headers=auth, json=payload)

Error

it’s dropping the connection on the mock server. looking for a clean retry loop with exponential backoff. don’t want to hardcode sleep values. terraform state files are getting messy anyway. the docker logs are already full.

Ahelex · April 9, 2026, 5:51am

This is caused by hitting the rate limit bucket for user updates. Genesys Cloud enforces strict throttling on /api/v2/users. i’ve seen it cap at roughly 200 requests per minute per org, and it can be tighter on PATCH operations, but check the published limits for your org rather than trusting that number. your python script is likely firing requests faster than the limit allows, triggering the 429. you need to implement an exponential backoff strategy. don’t just retry immediately. read the Retry-After header if it’s present (the value is in seconds), otherwise start with a small delay like 500ms and double it on each failure. here’s a quick node.js pattern using axios that you can adapt. it handles the jitter too so you don’t sync up with other clients hitting the limit.

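const axios = require('axios');

// `auth` is assumed to be a headers object defined elsewhere
async function patchUserWithBackoff(url, payload, retries = 5) {
  for (let i = 0; i < retries; i++) {
    try {
      const res = await axios.patch(url, payload, { headers: auth });
      return res.data;
    } catch (err) {
      // Only rate-limit responses are retried; everything else is rethrown
      if (err.response?.status !== 429) throw err;
      // Retry-After is in seconds, setTimeout expects milliseconds
      const retryAfter = parseInt(err.response.headers['retry-after'], 10);
      const waitMs = Number.isNaN(retryAfter)
        ? Math.pow(2, i) * 500 + Math.random() * 250 // 500ms base, doubled each try, plus jitter
        : retryAfter * 1000;
      console.log(`429 hit. waiting ${Math.round(waitMs)}ms`);
      await new Promise(r => setTimeout(r, waitMs));
    }
  }
  // Fail loudly instead of silently returning undefined
  throw new Error(`still rate limited after ${retries} attempts`);
}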
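
since the original script is Python, the same idea could look roughly like the sketch below. it is not tested against the real API. the names backoff_delay and call_with_backoff are made up for illustration, and send is assumed to be any zero-argument callable that returns an object with .status_code and .headers, such as a requests.Response.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_backoff(send, retries=5, sleep=time.sleep):
    """Call send() until it returns a non-429 response or retries run out.

    send is a hypothetical zero-argument callable returning an object with
    .status_code and .headers (for example a requests.Response).
    """
    for attempt in range(retries):
        response = send()
        if response.status_code != 429:
            return response
        retry_after = response.headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)  # server-provided wait, in seconds
        else:
            delay = backoff_delay(attempt)  # no usable header: jittered backoff
        sleep(delay)
    raise RuntimeError(f"still rate limited after {retries} attempts")
```

with requests this would be called as call_with_backoff(lambda: requests.patch(url, headers=auth, json=payload)). passing sleep as a parameter is only there so the loop can be exercised in a test harness without actually waiting.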

also check if you’re batching these. sending individual patches for each skill change is inefficient. look into bulk update endpoints if available for your specific resource, or group changes into fewer, larger payloads. i’ve seen teams reduce api calls by 80% just by consolidating updates. make sure your auth token isn’t expiring mid-batch too. that causes 401s, which are real failures but not rate limits, so backing off and retrying won’t fix them. keep an eye on the rate limit headers in the response as well (Genesys Cloud names them inin-ratelimit-count, inin-ratelimit-allowed and inin-ratelimit-reset). they tell you how much of your quota is used and when it resets.
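
as a rough illustration of the consolidation idea, the sketch below collapses a flat list of skill changes into one payload per user, so each user costs one call instead of one call per skill. the function name group_skill_changes and the payload shape ({"id": ..., "proficiency": ...}) are assumptions for the example, not the documented request schema, so check the endpoint you actually call.

```python
from collections import defaultdict


def group_skill_changes(changes):
    """Collapse (user_id, skill_id, proficiency) tuples into one payload per user.

    Later entries for the same user and skill overwrite earlier ones.
    The payload shape is hypothetical, not the real API schema.
    """
    per_user = defaultdict(dict)
    for user_id, skill_id, proficiency in changes:
        per_user[user_id][skill_id] = proficiency
    return {
        user_id: [
            {"id": skill_id, "proficiency": proficiency}
            for skill_id, proficiency in skills.items()
        ]
        for user_id, skills in per_user.items()
    }


changes = [("u1", "s1", 3), ("u2", "s1", 1), ("u1", "s2", 5), ("u1", "s1", 4)]
payloads = group_skill_changes(changes)
print(len(changes), "changes ->", len(payloads), "requests")
```

here four individual changes turn into two requests, one per user.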

RestRider · April 11, 2026, 5:51am

You need to stop blasting the API without respect for the headers. Genesys Cloud is strict about rate limits, and ignoring the Retry-After header is a fast track to getting your IP banned or your tokens revoked.

hitting 429 Too Many Requests after about 12 calls.

That’s not a bug, that’s a feature. The server is telling you to chill. If you’re using Python, requests doesn’t retry a 429 on its own, so you either parse the response yourself or mount a retry policy on a session. Here’s a quick snippet using urllib3.util.retry, which handles the backoff logic much better than a manual time.sleep() loop. It respects the Retry-After header by default.

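import requests
from urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter

session = requests.Session()
retries = Retry(
    total=5,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
    # PATCH is not in Retry's default method list, so without this
    # the status-based retry never fires for session.patch().
    # allowed_methods needs urllib3 1.26+ (older versions call it method_whitelist).
    allowed_methods=["PATCH"],
)
session.mount('https://', HTTPAdapter(max_retries=retries))

# Use session.patch instead of requests.patch
# (url, auth and payload are the same variables as in the original script)
response = session.patch(url, headers=auth, json=payload)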

Stop writing your own sleep timers. Let the library handle the backoff. Note that Retry only adds jitter if you pass backoff_jitter, which needs urllib3 2.0 or later.

ErikaMeow · April 14, 2026, 5:51am

TL;DR: respect the Retry-After header.

it depends, but generally you can’t just sleep for a fixed duration because bucket refill times vary by endpoint. i usually parse the header directly in python since requests doesn’t do it for you.

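import time

# `response` is the result of the previous requests call
if response.status_code == 429:
    # Retry-After is normally whole seconds; fall back to 5 if the header is missing
    delay = int(response.headers.get('Retry-After', 5))
    time.sleep(delay)
    # then send the same request again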

this keeps your compose stack from melting.
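
one caveat on parsing the header by hand: HTTP allows Retry-After to be either a number of seconds or an HTTP date, and int() raises on the date form. a more defensive parse might look like the sketch below. retry_after_seconds is a made-up helper name, and the 5 second default is just carried over from the snippet above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, default=5.0, now=None):
    """Turn a Retry-After header value into a delay in seconds.

    Accepts delta-seconds ("120") or an HTTP date
    ("Wed, 21 Oct 2015 07:28:00 GMT"). Returns `default` when the
    header is missing or cannot be parsed.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        # Treat a date without an offset as UTC
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no wait is needed
    return max(0.0, (when - now).total_seconds())
```

usage would be time.sleep(retry_after_seconds(response.headers.get('Retry-After'))) in place of the int() call.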