# Rate limits

> Understand Foil API rate limits per organization and key type, read X-RateLimit response headers, handle 429 errors, and retry with Retry-After.

Foil applies rate limits per organization and key type at the edge. Publishable keys for the same organization share one bucket, and secret keys for the same organization share another. Every authenticated response carries headers that describe current usage for that bucket, and rate-limited responses return `429 Too Many Requests` with a `Retry-After` directive.

## Default limits

Organization defaults are applied unless Foil support has configured an organization-level override.

| Key type                                        | Default organization bucket | Typical use                                                                        |
| ----------------------------------------------- | --------------------------- | ---------------------------------------------------------------------------------- |
| Publishable (`pk_*`)                            | `120` requests per window   | Browser SDK traffic: session setup, batch streaming, sealed-handoff minting        |
| Secret (`sk_*`)                                 | `600` requests per window   | Server-side verification, session and fingerprint readback, management endpoints   |
| Gate workflow credentials (`gtpoll_*`, `agt_*`) | Scoped per session          | Per-session polling caps; see [Gate Signup sessions](/api-reference/gate-sessions) |

"Per window" here refers to a rolling bucket. Burst traffic that fits within the bucket is fine; sustained traffic above the per-second equivalent of the ceiling gets throttled.

For high-volume integrations (fingerprint sweeps, scoring backfills, bulk audits), contact us to raise the ceiling for your organization.

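To make the rolling-bucket behavior concrete, here is a minimal client-side pacing sketch. It is not part of any Foil SDK: the `RollingWindowThrottle` class is hypothetical, and because this page does not state the window length, `window_seconds` is an assumption you would need to set yourself. The sketch uses a simple sliding-window log, which may not match how the edge accounts for requests.

```python
import time
from collections import deque


class RollingWindowThrottle:
    """Hypothetical helper: allow at most `limit` calls per `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window_seconds = window_seconds  # assumption: not documented on this page
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of calls still inside the window

    def _expire(self, now):
        # Drop timestamps that have aged out of the rolling window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self._clock()
            self._expire(now)
            if len(self._sent) < self.limit:
                break
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.window_seconds - (now - self._sent[0]))
        self._sent.append(now)
```

Calling `acquire()` before each request lets bursts through until the local count reaches `limit`, then blocks until the oldest call ages out. Local pacing like this complements, but does not replace, honoring `Retry-After` on a 429, since other processes using the same organization's keys share the bucket.
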
## Response headers

Every response from an authenticated endpoint includes:

| Header                  | Value                                                                                       |
| ----------------------- | ------------------------------------------------------------------------------------------- |
| `X-RateLimit-Limit`     | The ceiling for this organization and key type.                                             |
| `X-RateLimit-Remaining` | Requests left in the current window. Hits `0` on the request that would exceed the ceiling. |

Rate-limited responses additionally include:

| Header        | Value                                                                                                                                            |
| ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| `Retry-After` | Seconds until the next acceptable retry. Always honor this: further requests during the window count against the limit and can extend cooldown. |

## 429 response shape

Rate-limited responses use the standard [error envelope](/api-reference/errors):

```json theme={"dark"}
{
  "error": {
    "code": "rate_limit.exceeded",
    "message": "Rate limit exceeded for this API key. Wait for the Retry-After window to expire before retrying.",
    "status": 429,
    "retryable": true,
    "request_id": "req_..."
  }
}
```

`error.retryable` is `true` on every 429. The envelope does not tell you *when* to retry; that is in the `Retry-After` header.

## Retry strategy

A correct retry for a 429 does three things: honor `Retry-After`, add jitter, and cap attempts.

```javascript Node.js theme={"dark"}
async function requestWithRetry(url, options, maxAttempts = 4) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) return response;
    if (attempt === maxAttempts) break; // out of attempts: don't sleep again

    // Honor Retry-After (seconds); fall back to 1s if missing or unparseable.
    const retryAfter = Number(response.headers.get("retry-after")) || 1;
    const jitterMs = Math.floor(Math.random() * 500);
    await new Promise((r) => setTimeout(r, retryAfter * 1000 + jitterMs));
  }
  throw new Error("Rate-limited after retries");
}
```

```python Python theme={"dark"}
import random
import time

import requests


def request_with_retry(method, url, max_attempts=4, **kwargs):
    for attempt in range(max_attempts):
        response = requests.request(method, url, **kwargs)
        if response.status_code != 429:
            return response
        if attempt == max_attempts - 1:
            break  # out of attempts: don't sleep again

        # Honor Retry-After (seconds); fall back to 1s if missing or unparseable.
        try:
            retry_after = max(1, int(response.headers.get("retry-after", "1")))
        except ValueError:
            retry_after = 1
        time.sleep(retry_after + random.uniform(0, 0.5))
    raise RuntimeError("Rate-limited after retries")
```

```go Go theme={"dark"}
import (
	"fmt"
	"math/rand"
	"net/http"
	"strconv"
	"time"
)

// Note: a request with a body must be rebuilt (or rewound via req.GetBody) before each retry.
func requestWithRetry(client *http.Client, req *http.Request, maxAttempts int) (*http.Response, error) {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusTooManyRequests {
			return resp, nil
		}
		resp.Body.Close() // release the connection before waiting
		if attempt == maxAttempts-1 {
			break // out of attempts: don't sleep again
		}

		// Honor Retry-After (seconds); fall back to 1s if missing or unparseable.
		retryAfter, _ := strconv.Atoi(resp.Header.Get("Retry-After"))
		if retryAfter < 1 {
			retryAfter = 1
		}
		jitter := time.Duration(rand.Intn(500)) * time.Millisecond
		time.Sleep(time.Duration(retryAfter)*time.Second + jitter)
	}
	return nil, fmt.Errorf("rate-limited after retries")
}
```

The server SDKs ship with this logic built in and honor `Retry-After` by default. You only need to write your own loop if you're calling the REST API with a custom HTTP client.

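If you do write your own client, it can help to read the rate-limit headers on every response rather than only reacting to a 429. The following is a minimal sketch, assuming the three headers carry plain integer values as described under "Response headers". `RateLimitState` and `read_rate_limit` are hypothetical names for illustration, not part of a Foil SDK.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitState:
    """Hypothetical container for the rate-limit headers on one response."""
    limit: Optional[int]
    remaining: Optional[int]
    retry_after_seconds: Optional[int]
    limited: bool


def _int_header(headers: Mapping[str, str], name: str) -> Optional[int]:
    # Look the header up case-insensitively; None if missing or non-numeric.
    for key, value in headers.items():
        if key.lower() == name.lower():
            try:
                return int(value.strip())
            except ValueError:
                return None
    return None


def read_rate_limit(status: int, headers: Mapping[str, str]) -> RateLimitState:
    """Summarize rate-limit information from a status code and header mapping."""
    return RateLimitState(
        limit=_int_header(headers, "X-RateLimit-Limit"),
        remaining=_int_header(headers, "X-RateLimit-Remaining"),
        retry_after_seconds=_int_header(headers, "Retry-After"),
        limited=(status == 429),
    )
```

A caller might log `remaining` as a metric, or slow a background job down when it drops near zero, so that interactive traffic sharing the same bucket is less likely to see a 429.
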
## What counts against the limit

Every authenticated request counts, including ones that fail:

* Successful 2xx responses ✓
* 4xx client errors (wrong field, not found) ✓
* 429 responses themselves ✓
* 5xx server errors ✓
* Preflight `OPTIONS` requests ✗ (not counted)

In particular, a burst of 422 validation errors during development can eat into your window the same way as a burst of successful calls. If your integration is hammering the API in a loop because of a bug, fix the bug rather than raising the limit.

## Reducing your request rate

If you're brushing up against the ceiling, these are the usual culprits and their fixes:

* **Per-request session verification.** If you're calling `GET /v1/sessions/:sessionId` on every authenticated API hit, you're doing too much work. Cache the verified sealed token locally (the token already carries the verdict), and only fall back to the durable session API for audits, escalations, or stale sessions. See [API abuse → Session reuse](/use-cases/api-abuse).
* **Manual pagination.** Many integrations iterate lists by page in a `for` loop. Switch to the server SDK's [auto-pagination](/api-reference/pagination#auto-pagination-in-the-sdks) helper, which streams items and respects retries automatically.
* **Unbatched management calls.** If you're provisioning many API keys or organizations, space the calls out with backoff rather than firing them concurrently.
* **Polling that could be webhooks.** Gate session polling uses short-lived `gtpoll_*` tokens with their own caps. If you're polling for Gate approval in a tight loop, look at the recommended polling cadence documented on [Gate Signup sessions](/api-reference/gate-sessions).

## Raising a limit

Production integrations with sustained, legitimate load can have organization-level limits raised above the defaults.

Contact support with:

* The organization whose limit you want raised
* Whether the increase is for publishable keys, secret keys, or both
* Your expected steady-state and peak RPS
* The endpoints you need headroom on

Publishable-key ceilings can generally go significantly higher than secret-key ceilings, since `pk_*` serves browser traffic.

## What's next

* The full error envelope, including 429 details.
* Auto-pagination and filters: the main way to reduce request count.
* Session reuse patterns that reduce server-side verification rate.
* Organization defaults and key lifecycle.