Discord Webhook Rate Limits Explained (429, Retry-After, Best Practices)
Complete guide to Discord webhook rate limits. Learn how to read 429 responses, parse Retry-After headers, implement exponential backoff, and avoid global bans.
If your Discord webhook starts returning 429 Too Many Requests, it’s not a bug — it’s Discord telling you to slow down. Get this wrong and your bot gets a global ban (10 minutes of silence across all webhooks tied to your IP). Get it right and you can push thousands of messages per minute reliably.
This guide covers exactly how Discord’s rate limits work for webhooks in 2026, the headers you must read, and the retry strategy that survives bursts and sustained load.
TL;DR — The Numbers You Need
- Per-webhook limit: ~30 requests per 60 seconds (per webhook URL)
- Per-channel limit: 5 requests per 5 seconds (shared across all webhooks in the same channel)
- Global limit: 50 requests per second (per IP / token)
- On 429: read Retry-After and wait that many seconds before retrying
- On X-RateLimit-Global: true: stop all requests for the cooldown, not just the one that failed
- Cloudflare ban: more than ~10,000 invalid requests in 10 minutes → IP blocked for 1 hour
Always parse the response headers. Hardcoding sleeps is fragile.
How Rate Limit Headers Work
Every webhook response includes these headers:
X-RateLimit-Limit: 5
X-RateLimit-Remaining: 4
X-RateLimit-Reset: 1714499200.123
X-RateLimit-Reset-After: 1.234
X-RateLimit-Bucket: 80c17d2f203122d936070c88c8d10f33
| Header | Meaning |
|---|---|
| X-RateLimit-Limit | Total requests allowed in this bucket |
| X-RateLimit-Remaining | Requests left before hitting the limit |
| X-RateLimit-Reset | Unix timestamp (seconds) when the bucket refills |
| X-RateLimit-Reset-After | Seconds until refill (preferred; clock-skew-safe) |
| X-RateLimit-Bucket | Bucket hash; group requests by this, not by route |
Use X-RateLimit-Reset-After, not X-RateLimit-Reset. The former is computed by Discord and immune to clock drift on your machine.
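To make the header handling concrete, here is a minimal sketch of a parser for the headers listed above. The names `BucketState`, `parse_rate_limit_headers`, and `pause_needed` are hypothetical helpers introduced for illustration, and the sketch assumes the headers arrive as a string-to-string mapping.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BucketState:
    bucket: Optional[str]         # X-RateLimit-Bucket hash, if present
    limit: Optional[int]          # X-RateLimit-Limit
    remaining: Optional[int]      # X-RateLimit-Remaining
    reset_after: Optional[float]  # X-RateLimit-Reset-After, in seconds


def parse_rate_limit_headers(headers: Mapping[str, str]) -> BucketState:
    """Read the rate limit headers, tolerating any that are missing."""
    def _get(name, cast):
        raw = headers.get(name)
        try:
            return cast(raw) if raw is not None else None
        except ValueError:
            return None  # malformed value: treat it as absent

    return BucketState(
        bucket=headers.get("X-RateLimit-Bucket"),
        limit=_get("X-RateLimit-Limit", int),
        remaining=_get("X-RateLimit-Remaining", int),
        reset_after=_get("X-RateLimit-Reset-After", float),
    )


def pause_needed(state: BucketState) -> float:
    """Seconds to wait before the next send; 0.0 if the bucket has room."""
    if state.remaining == 0 and state.reset_after is not None:
        return state.reset_after
    return 0.0
```

With the requests library, `r.headers` is a case-insensitive mapping and can be passed in directly. A plain dict is case-sensitive, so its keys would need to match the spelling used here.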
The 429 Response Body
When you exceed a limit, you get HTTP 429 with a JSON body:
{
  "message": "You are being rate limited.",
  "retry_after": 0.523,
  "global": false,
  "code": 0
}
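- retry_after is in seconds as a float (millisecond precision since 2020)
- global: true means the global 50/s limit was hit, so back off everything
- code is a Discord JSON error code that is only set for some limits. It is not a ban flag: 30007, for example, is documented as "Maximum number of webhooks reached". A Cloudflare-level block may not return this JSON body at all, so be ready for a 429 that isn't JSON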
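As a sketch of defensive 429 handling, the function below takes the wait time from the JSON body when it can and falls back to the Retry-After header when the body is missing or not JSON. `parse_429` and its `default_wait` parameter are hypothetical names, and the one-second default is an assumption rather than a documented value.

```python
import json
from typing import Mapping, Tuple


def parse_429(body_text: str, headers: Mapping[str, str],
              default_wait: float = 1.0) -> Tuple[float, bool]:
    """Return (seconds_to_wait, is_global) for a 429 response."""
    wait = None
    is_global = headers.get("X-RateLimit-Global", "").lower() == "true"
    try:
        data = json.loads(body_text)
    except ValueError:
        data = None  # not JSON (for example an HTML error page)
    if isinstance(data, dict):
        is_global = is_global or bool(data.get("global", False))
        if isinstance(data.get("retry_after"), (int, float)):
            wait = float(data["retry_after"])  # already in seconds
    if wait is None:
        # Fall back to the Retry-After header, which is also in seconds
        try:
            wait = float(headers.get("Retry-After", ""))
        except ValueError:
            wait = default_wait
    return wait, is_global
```

With requests, this could be called as `parse_429(r.text, r.headers)`, which avoids the exception that `r.json()` raises on a non-JSON body.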
Minimal Safe Sender (Python)
import time
import requests

WEBHOOK_URL = "https://discord.com/api/webhooks/ID/TOKEN"

def send(content: str, max_retries: int = 5):
    payload = {"content": content}
    for attempt in range(max_retries):
        r = requests.post(WEBHOOK_URL, json=payload, timeout=10)
        # Success
        if r.status_code in (200, 204):
            remaining = r.headers.get("X-RateLimit-Remaining")
            reset_after = r.headers.get("X-RateLimit-Reset-After")
            # Proactively pause if we're about to hit the limit
            if remaining == "0" and reset_after:
                time.sleep(float(reset_after) + 0.05)
            return True
        # Rate limited
        if r.status_code == 429:
            data = r.json()
            wait = float(data.get("retry_after", 1))
            is_global = data.get("global", False)
            print(f"429: waiting {wait:.2f}s (global={is_global})")
            time.sleep(wait + 0.05)  # tiny buffer for jitter
            continue
        # Server error → exponential backoff
        if 500 <= r.status_code < 600:
            backoff = (2 ** attempt) + 0.5
            time.sleep(backoff)
            continue
        # Bad request: no point retrying
        r.raise_for_status()
    return False

send("Production deploy completed")
Key behaviors:
- Honors retry_after from the 429 body
- Adds 50ms buffer to avoid edge-case re-failures
- Pauses proactively when X-RateLimit-Remaining hits 0
- Falls back to exponential backoff on 5xx errors
Per-Channel vs Per-Webhook Buckets
This trips up most people. Two different webhooks pointing to the same channel share a per-channel rate limit. So if you have:
- Webhook A → #alerts (used by CI)
- Webhook B → #alerts (used by monitoring)
…and both burst-send simultaneously, you’ll hit 5 requests / 5 seconds faster than you’d expect.
Solution: route different message classes to different channels, or queue and serialize sends through a single worker per channel.
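The single-worker-per-channel idea can be sketched as follows. `ChannelDispatcher` is a hypothetical class, `send_fn` stands in for whatever function performs the HTTP POST, and the one-second default pacing is an assumption derived from the 5 requests / 5 seconds figure above.

```python
import queue
import threading
import time
from typing import Callable, Dict


class ChannelDispatcher:
    """Serializes sends per channel: one queue and one worker thread each."""

    def __init__(self, send_fn: Callable[[str, dict], None],
                 min_interval_s: float = 1.0):
        self._send_fn = send_fn              # hypothetical: does the HTTP POST
        self._min_interval = min_interval_s  # 5 per 5 s → at most 1 per second
        self._queues: Dict[str, queue.Queue] = {}
        self._lock = threading.Lock()

    def submit(self, channel: str, webhook_url: str, payload: dict) -> None:
        with self._lock:
            q = self._queues.get(channel)
            if q is None:
                # First message for this channel: create its queue and worker
                q = self._queues[channel] = queue.Queue()
                threading.Thread(target=self._run, args=(q,), daemon=True).start()
        q.put((webhook_url, payload))

    def _run(self, q: queue.Queue) -> None:
        while True:
            webhook_url, payload = q.get()
            try:
                self._send_fn(webhook_url, payload)
            except Exception as exc:  # keep the worker alive on a failed send
                print(f"send failed: {exc}")
            finally:
                q.task_done()
            time.sleep(self._min_interval)  # pace sends within the channel

    def drain(self) -> None:
        """Block until every queued message has been handed to send_fn."""
        with self._lock:
            queues = list(self._queues.values())
        for q in queues:
            q.join()
```

Because messages are keyed by channel rather than by webhook URL, Webhook A and Webhook B from the example above end up in the same queue and can no longer burst against each other.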
Exponential Backoff with Jitter (JavaScript)
For high-throughput systems, add jitter so multiple workers don’t retry in lockstep:
async function send(payload, attempt = 0) {
  const res = await fetch(process.env.WEBHOOK_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
  if (res.ok) return res;
  // Cap 429 retries too, so a persistent limit cannot recurse forever
  if (res.status === 429 && attempt < 5) {
    const body = await res.json();
    // retry_after is in seconds; setTimeout expects milliseconds
    const wait = (body.retry_after + Math.random() * 0.1) * 1000;
    await new Promise(r => setTimeout(r, wait));
    return send(payload, attempt + 1);
  }
  if (res.status >= 500 && attempt < 5) {
    const backoff = (2 ** attempt + Math.random()) * 1000;
    await new Promise(r => setTimeout(r, backoff));
    return send(payload, attempt + 1);
  }
  throw new Error(`Webhook failed: ${res.status}`);
}
Adding Math.random() to the wait time prevents the thundering herd when many clients retry the same 429.
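For readers working in Python, a rough counterpart of the 5xx delay calculation might look like this. `backoff_delay` and its parameters are hypothetical, and the 30-second cap is an added assumption to keep late attempts from waiting indefinitely; the JavaScript version above has no cap.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base_s: float = 1.0, cap_s: float = 30.0,
                  jitter_s: float = 1.0,
                  rng: Optional[random.Random] = None) -> float:
    """Capped exponential delay plus uniform additive jitter, in seconds."""
    rng = rng if rng is not None else random.Random()
    exponential = min(cap_s, base_s * (2 ** attempt))
    # The jitter spreads out workers that failed at the same moment
    return exponential + rng.uniform(0.0, jitter_s)
```

Passing a seeded `random.Random` makes the delays reproducible in tests, while production callers can omit `rng`.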
Token Bucket — The Production Pattern
For sustained sending, enforce the limit locally before each request. The limiter below records send timestamps in a sliding window, a close cousin of the classic token bucket:
import time
import threading
from collections import deque

import requests

class WebhookLimiter:
    def __init__(self, max_per_window: int = 30, window_s: float = 60.0):
        self.max = max_per_window
        self.window = window_s
        self.timestamps: deque[float] = deque()
        self.lock = threading.Lock()

    def acquire(self):
        while True:
            with self.lock:
                now = time.monotonic()
                # Drop expired timestamps
                while self.timestamps and now - self.timestamps[0] > self.window:
                    self.timestamps.popleft()
                if len(self.timestamps) < self.max:
                    self.timestamps.append(now)
                    return
                wait = self.window - (now - self.timestamps[0]) + 0.01
            # Sleep outside the lock, then re-check. Calling acquire() again
            # while holding a non-reentrant Lock would deadlock.
            time.sleep(wait)

limiter = WebhookLimiter(max_per_window=25)  # leave headroom

def send(content: str):
    limiter.acquire()
    # WEBHOOK_URL as defined in the earlier Python example
    requests.post(WEBHOOK_URL, json={"content": content}, timeout=10)
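This caps the process at 25 sends per rolling 60-second window regardless of network latency or thread count, as long as every request (retries included) goes through acquire(). It does not coordinate across separate processes or hosts.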
Checklist: Avoid the Cloudflare Ban
- Always parse Retry-After; never hardcode sleeps
- Treat global: true as a full-stop, not just for the failed request
- Validate payloads before sending (skip Discord rejection round-trips)
- Use a single queue per channel, not per webhook
- Log 429 frequency; if it’s > 1% of requests, your sender logic is wrong
- Cache the bucket from X-RateLimit-Bucket if you need shared state across processes
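The "full-stop on global: true" item can be sketched as a gate that every sender checks before each request. `GlobalCooldown` is a hypothetical class; the injectable `clock` and `sleep` parameters are there only to make it testable, and the sketch covers a single process.

```python
import threading
import time


class GlobalCooldown:
    """Shared gate: after a global 429, every sender waits out the cooldown."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self._clock = clock
        self._sleep = sleep
        self._resume_at = 0.0
        self._lock = threading.Lock()

    def trip(self, retry_after_s: float) -> None:
        """Call when a 429 reports global: true."""
        with self._lock:
            # Never shorten a cooldown that is already in effect
            self._resume_at = max(self._resume_at,
                                  self._clock() + retry_after_s)

    def remaining(self) -> float:
        """Seconds left in the current cooldown, or 0.0 if none."""
        with self._lock:
            return max(0.0, self._resume_at - self._clock())

    def wait(self) -> None:
        """Call before every request, on every webhook."""
        delay = self.remaining()
        if delay > 0:
            self._sleep(delay)
```

A sender would call `gate.wait()` before each POST and `gate.trip(retry_after)` whenever a 429 body has `global` set to true, so one worker's global 429 pauses all of them.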
Common Mistakes
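Mistake 1: Mixing up seconds and milliseconds. retry_after is in seconds (since 2020), and so is the X-RateLimit-Reset-After header. Pass the raw value to a millisecond timer such as setTimeout and you retry almost instantly, straight into another 429. Multiply it by 1000 and hand the result to a seconds-based sleep such as time.sleep and you wait so long it looks like you’re not retrying at all.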
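Mistake 2: Retrying 4xx errors. Only 429, 500, 502, 503, 504 are retryable. A 400 Bad Request means your payload is invalid, so fix it instead of resending it. Repeated 401, 403, and 429 responses are what count toward the invalid-request limit behind the Cloudflare ban, so blind retries on auth failures or rate limits bring it closer.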
Mistake 3: Spawning a thread per message. Threads share the same IP and same rate limit pool. You don’t get parallelism for free — you get more 429s.
Mistake 4: Ignoring X-RateLimit-Remaining: 0. The next request will fail. Pause proactively.
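The retry rule from Mistake 2 fits in a few lines. This is a minimal sketch: `should_retry` and `RETRYABLE_STATUSES` are hypothetical names, and the default of five attempts simply mirrors the examples earlier in this guide.

```python
# Status codes this guide treats as retryable; every other 4xx is final
RETRYABLE_STATUSES = frozenset({429, 500, 502, 503, 504})


def should_retry(status: int, attempt: int, max_retries: int = 5) -> bool:
    """True only for a retryable status with attempts left."""
    return status in RETRYABLE_STATUSES and attempt < max_retries
```

Here `attempt` is the zero-based count of attempts already made, so `should_retry(429, 0)` allows a retry while `should_retry(400, 0)` does not.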
Test It in Our Builder
Want to see exactly what payload your code is sending? Open the Discord Webhook builder, craft your embed, hit “Send”, and inspect the network panel — you’ll see the headers Discord returns and can validate your retry logic against real responses.
For higher-throughput automation, also check our guides on scheduled messages and automation workflows.
References
- Discord developer docs: Rate Limits
- Discord developer docs: Execute Webhook