Foil applies rate limits per organization and key type at the edge. Publishable keys for the same organization share one bucket, and secret keys for the same organization share another. Every authenticated response carries headers that describe current usage for that bucket, and rate-limited responses return 429 Too Many Requests with a Retry-After directive.

Default limits

Organization defaults are applied unless Foil support has configured an organization-level override.
Key type | Default organization bucket | Typical use
Publishable (pk_*) | 120 requests per window | Browser SDK traffic: session setup, batch streaming, sealed-handoff minting
Secret (sk_*) | 600 requests per window | Server-side verification, session and fingerprint readback, management endpoints
Gate workflow credentials (gtpoll_*, agt_*) | Scoped per session | Per-session polling caps; see Gate Signup sessions
“Per window” here refers to a rolling bucket. Burst traffic that fits within the bucket is fine; sustained traffic above the per-second equivalent of the ceiling gets throttled. For high-volume integrations — fingerprint sweeps, scoring backfills, bulk audits — contact us to raise the ceiling for your organization.
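To stay under a rolling bucket from the client side, one option is to count your own sends over a sliding window and hold back once the count reaches the ceiling. The sketch below is only an illustration: the class name `SlidingWindow` is hypothetical, and `windowMs` is an assumption you must supply, because this page does not state the length of the server's window.

```javascript
// Client-side sliding-window counter (illustrative, not part of any Foil SDK).
// `limit` and `windowMs` are caller-supplied assumptions; the server's real
// window length is not documented on this page.
class SlidingWindow {
  constructor(limit, windowMs, now = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, handy for tests
    this.timestamps = []; // send times still inside the window, oldest first
  }

  // Returns 0 if a request may be sent now (and records it). Otherwise returns
  // the number of milliseconds until the oldest recorded send leaves the window.
  tryAcquire() {
    const t = this.now();
    while (this.timestamps.length > 0 && t - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.limit) {
      this.timestamps.push(t);
      return 0;
    }
    return this.timestamps[0] + this.windowMs - t;
  }
}
```

A caller would check `tryAcquire()` before each request and sleep for the returned number of milliseconds when it is non-zero. This only smooths your own traffic; the server's headers remain the authoritative view of the bucket.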

Response headers

Every response from an authenticated endpoint includes:
Header | Value
X-RateLimit-Limit | The ceiling for this organization and key type.
X-RateLimit-Remaining | Requests left in the current window. Hits 0 on the request that would exceed the ceiling.
Rate-limited responses additionally include:
Header | Value
Retry-After | Seconds until the next acceptable retry. Always honor this: further requests during the window count against the limit and can extend the cooldown.
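The headers above can be collected in one place so that logging or pacing code does not parse them ad hoc. This is a minimal sketch: the helper name `readRateLimit` and the `usedFraction` field are hypothetical, and it assumes all three headers carry plain numeric values.

```javascript
// Reads the rate-limit headers from any object exposing `get(name)`:
// a fetch `Headers` instance in production, or a Map in tests.
// Missing or non-numeric headers come back as null rather than a guess.
function readRateLimit(headers) {
  const toNumber = (name) => {
    const raw = headers.get(name);
    if (raw === null || raw === undefined || raw === "") return null;
    const value = Number(raw);
    return Number.isFinite(value) ? value : null;
  };

  const limit = toNumber("x-ratelimit-limit");
  const remaining = toNumber("x-ratelimit-remaining");
  return {
    limit,
    remaining,
    // Only present on rate-limited responses, so usually null.
    retryAfterSeconds: toNumber("retry-after"),
    // Fraction of the bucket already used; null if either header is absent.
    usedFraction:
      limit !== null && remaining !== null && limit > 0
        ? (limit - remaining) / limit
        : null,
  };
}
```

With a fetch response this would be called as `readRateLimit(response.headers)`. Logging `usedFraction` over time is one way to notice an integration drifting toward the ceiling before it starts receiving 429s.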

429 response shape

Rate-limited responses use the standard error envelope:
{
  "error": {
    "code": "rate_limit.exceeded",
    "message": "Rate limit exceeded for this API key. Wait for the Retry-After window to expire before retrying.",
    "status": 429,
    "retryable": true,
    "request_id": "req_..."
  }
}
error.retryable is true on every 429. The envelope does not tell you when to retry; that information is in the Retry-After header.
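Because the wait time lives in a header and the error details live in the body, a client usually has to combine the two. The sketch below shows one way to do that. The function name `describeRateLimit` and the 1-second fallback are assumptions, not documented behavior.

```javascript
// Combines the status, the parsed error envelope, and the raw Retry-After
// header into one result. `body` may be null if the response was not JSON.
function describeRateLimit(status, body, retryAfterHeader) {
  const error = body && body.error ? body.error : {};
  const isRateLimited = status === 429 || error.code === "rate_limit.exceeded";
  if (!isRateLimited) return { rateLimited: false };

  const hasHeader = retryAfterHeader != null && retryAfterHeader !== "";
  const seconds = Number(retryAfterHeader);
  return {
    rateLimited: true,
    // Worth logging: support can trace a request by this ID.
    requestId: error.request_id ?? null,
    // Assumption: wait 1 second when the header is missing or unparsable.
    waitSeconds: hasHeader && Number.isFinite(seconds) && seconds >= 0 ? seconds : 1,
  };
}
```

A caller would pass `response.status`, the parsed JSON body, and `response.headers.get("retry-after")`.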

Retry strategy

A correct retry for a 429 does three things: honor Retry-After, add jitter, and cap attempts.
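// Retries a request on 429: honors Retry-After, adds jitter, and caps attempts.
async function requestWithRetry(url, options, maxAttempts = 4) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) return response;
    // Out of attempts: fail now instead of sleeping once more for nothing.
    if (attempt === maxAttempts) break;

    // Retry-After is in seconds; use 1 second if it is missing, zero, or unparsable.
    const retryAfter = Number(response.headers.get("retry-after")) || 1;
    // Up to 499 ms of random jitter so concurrent clients do not retry in lockstep.
    const jitterMs = Math.floor(Math.random() * 500);
    await new Promise((r) => setTimeout(r, retryAfter * 1000 + jitterMs));
  }
  throw new Error("Rate-limited after retries");
}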
The server SDKs ship with this logic built in and honor Retry-After by default. You only need to write your own loop if you’re calling the REST API with a custom HTTP client.

What counts against the limit

Every authenticated request counts, including ones that fail:
  • Successful 2xx responses ✓
  • 4xx client errors (wrong field, not found) ✓
  • 429 responses themselves ✓
  • 5xx server errors ✓
  • Preflight OPTIONS requests ✗ (not counted)
In particular, a burst of 422 validation errors during development can eat into your window the same way as a burst of successful calls. If your integration is hammering the API in a loop because of a bug, fix the bug rather than raising the limit.

Reducing your request rate

If you’re brushing up against the ceiling, the usual culprits and their fixes:
  • Per-request session verification. If you’re calling GET /v1/sessions/:sessionId on every authenticated API hit, you’re doing too much work. Cache the verified sealed token locally (the token already carries the verdict), and only fall back to the durable session API for audits, escalations, or stale sessions. See API abuse → Session reuse.
  • Manual pagination. Many integrations iterate lists by page in a for loop. Switch to the server SDK’s auto-pagination helper, which streams items and respects retries automatically.
  • Unbatched management calls. If you’re provisioning many API keys or organizations, space the calls out with backoff rather than firing them concurrently.
  • Polling that could be webhooks. Gate session polling uses short-lived gtpoll_* tokens with their own caps. If you’re polling for Gate approval in a tight loop, look at the recommended polling cadence documented on Gate Signup sessions.
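The caching advice in the first item above can be sketched as a small time-to-live cache in front of whatever lookup you use today. Everything named here is hypothetical: `cachedSessionLookup`, the `fetchSession` callback, and the TTL value are illustrative assumptions, and a suitable TTL depends on how long a verdict stays trustworthy in your integration.

```javascript
// Wraps a session lookup in an in-memory TTL cache so repeated checks for the
// same session do not each cost an API request.
// `fetchSession` is a hypothetical async function supplied by the caller,
// for example one that calls GET /v1/sessions/:sessionId.
function cachedSessionLookup(fetchSession, ttlMs, now = () => Date.now()) {
  const cache = new Map(); // sessionId -> { value, expiresAt }

  return async function lookup(sessionId) {
    const hit = cache.get(sessionId);
    if (hit && hit.expiresAt > now()) return hit.value;

    // Cache miss or stale entry: fall back to the real lookup.
    const value = await fetchSession(sessionId);
    cache.set(sessionId, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

This version never evicts entries and lets concurrent misses for the same session each call `fetchSession`. A production cache would likely bound its size and share in-flight lookups.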

Raising a limit

Production integrations with sustained, legitimate load can have organization-level limits raised above the defaults. Contact support with:
  • The organization you want raised
  • Whether the increase is for publishable keys, secret keys, or both
  • Your expected steady-state and peak RPS
  • The endpoints you need headroom on
Publishable-key ceilings can generally go significantly higher than secret-key ceilings, since pk_* serves browser traffic.

What’s next

  • Errors: The full error envelope, including 429 details.
  • Pagination: Auto-pagination and filters, the main way to reduce request count.
  • API abuse: Session reuse patterns that reduce server-side verification rate.
  • Authentication: Organization defaults and key lifecycle.