
# API abuse & scraping

> Stop API abuse and LLM scraping while allowing legitimate crawlers. Verify Foil sessions across API calls and split policy by attribution category.

<Info>
  APIs are different from form-submit surfaces. You don't want a sealed handoff per request — you want one Foil session that covers a user's browsing session, reused across many API calls. And you want to distinguish LLM scrapers (`ai-agent`) from legitimate crawlers (`verified-bot`, `crawler`), because the right answer for those is different.
</Info>

## The threat

Two problems get lumped together as "API abuse," and they want different responses:

* **LLM scraping.** Cloud agents like browser-use and computer-use, plus self-hosted equivalents, scrape the web for model training, retrieval pipelines, or end-user automation. They look like browsers because they *are* browsers, just driven by an agent loop. Foil attributes these as `ai-agent`.
* **Scripted scraping.** Classic headless Chrome or HTTP-level clients iterate over your listing endpoints, pricing APIs, or search results. Foil attributes these as `automation`.

Against both, you want something stronger than a per-user-account rate limit — attackers rotate accounts — and stronger than a per-IP rate limit — attackers rotate residential proxies. The durable visitor fingerprint is the axis that holds.

At the same time, you almost certainly *do* want some non-human traffic:

* **Search crawlers** (Googlebot, Bingbot) — you want them indexing your public content.
* **Web-Bot-Auth–authenticated agents** — a growing set of AI products sign their requests with HTTP message signatures declaring themselves. Foil surfaces this as the `verified-bot` category and carries the domain/key identity through to your backend.
* **Your own internal tooling** — health checks, monitoring, analytics pipelines.

The integration on this page separates "block" from "allow" on the attribution category, not just the verdict.

## The flow

<Steps>
  <Step title="Start Foil once at app boot">
    For authenticated SPAs, start the client when the shell mounts. For public APIs accessed directly (no browser), see the bottom of this page.
  </Step>

  <Step title="Acquire a session once and reuse it">
    Call `getSession()` at the start of an API-consuming flow — not on every request. Pass `sessionId` as a header on subsequent XHRs.
  </Step>

  <Step title="Verify on the first protected request">
    Either verify a fresh sealed token, or look up the durable session via `GET /v1/sessions/:sessionId` and cache the verdict.
  </Step>

  <Step title="Re-verify on high-value actions or periodically">
    For mutations or sensitive reads, request a fresh handoff. For long-lived sessions, refresh the cached verdict every few minutes.
  </Step>

  <Step title="Split policy by attribution category">
    Allow `verified-bot` and `crawler`; block `automation` and `ai-agent`; treat `human` normally.
  </Step>
</Steps>

## Client integration

Request a sealed handoff at the start of a session, cache it, and send both the session ID and the sealed token as headers on every API call.

```html theme={"dark"}
<script type="module">
  const foilPromise = import("https://cdn.usefoil.com/t.js").then(
    (Foil) =>
      Foil.start({
        publishableKey: "pk_live_your_publishable_key",
      }),
  );

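  // Memoized handoff promise, shared by every API call in this page session
  let sessionHandoffPromise = null;

  async function getFoilHandoff() {
    if (!sessionHandoffPromise) {
      sessionHandoffPromise = foilPromise
        .then((foil) => foil.getSession())
        .catch((err) => {
          // Don't cache a failed handoff; let the next call retry
          sessionHandoffPromise = null;
          throw err;
        });
    }
    return sessionHandoffPromise;
  }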

  async function apiFetch(path, init = {}) {
    const { sessionId, sealedToken } = await getFoilHandoff();
    const headers = new Headers(init.headers);
    headers.set("X-Foil-Session", sessionId);
    headers.set("X-Foil-Token", sealedToken);
    return fetch(path, { ...init, headers });
  }

  // Refresh the handoff for high-value mutations
  async function refreshHandoff() {
    const foil = await foilPromise;
    sessionHandoffPromise = Promise.resolve(foil.getSession());
    return sessionHandoffPromise;
  }
</script>
```
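
The snippet above defines `refreshHandoff()` but does not show it wired into a request. One possible way to combine the reuse path and the refresh path is sketched below as a standalone helper. `makeFoilFetch`, its `fresh` option, and the injected `getSession` and `fetchImpl` parameters are hypothetical names chosen for illustration; they are not part of the Foil SDK. Headers are passed as a plain object to keep the sketch short.

```javascript
// Hypothetical wrapper: reuses one handoff across calls, and requests a new
// one when the caller passes { fresh: true } (for example on a mutation).
function makeFoilFetch(getSession, fetchImpl) {
  let handoff = null; // memoized promise for the current handoff

  function currentHandoff(fresh) {
    if (fresh || !handoff) {
      const p = Promise.resolve().then(() => getSession());
      // Drop a failed handoff so the next call retries instead of
      // replaying the cached rejection.
      p.catch(() => {
        if (handoff === p) handoff = null;
      });
      handoff = p;
    }
    return handoff;
  }

  return async function foilFetch(path, init = {}, { fresh = false } = {}) {
    const { sessionId, sealedToken } = await currentHandoff(fresh);
    const headers = {
      ...(init.headers || {}),
      "X-Foil-Session": sessionId,
      "X-Foil-Token": sealedToken,
    };
    return fetchImpl(path, { ...init, headers });
  };
}
```

In a browser this might be constructed as `makeFoilFetch(() => foilPromise.then((foil) => foil.getSession()), fetch)`, with ordinary reads calling `foilFetch(path)` and high-value mutations calling `foilFetch(path, init, { fresh: true })`.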

## Server verification: two patterns

For long-lived API sessions you have a choice. Either verify the sealed token on every request (cheap — it's a local crypto operation) and cache the decoded result, or call `GET /v1/sessions/:sessionId` once and cache the durable verdict for a few minutes.

### Pattern A: verify the sealed token per request

<CodeGroup>
  ```javascript Node.js theme={"dark"}
  const { safeVerifyFoilToken } = require("@abxy/foil-server");

  async function foilGuard(req, res, next) {
    const sealedToken = req.get("X-Foil-Token");
    if (!sealedToken) {
      return res.status(401).json({ error: "Missing Foil token" });
    }

    const result = safeVerifyFoilToken(sealedToken, process.env.FOIL_SECRET_KEY);
    if (!result.ok) {
      return res.status(401).json({ error: "Invalid Foil token" });
    }

    const { decision, attribution } = result.data;
    const category = attribution?.bot?.facets?.category?.value;

    // Allow verified bots and search crawlers on read endpoints
    if (req.method === "GET" && (category === "verified-bot" || category === "crawler")) {
      req.foilVerdict = { verdict: "allowed_bot", category };
      return next();
    }

    // Block any remaining bot verdict: automation, ai-agent, and verified bots or crawlers on non-GET methods
    if (decision.verdict === "bot") {
      return res.status(403).json({ error: "Blocked" });
    }

    req.foilVerdict = { verdict: decision.verdict, category: category ?? null };
    next();
  }

  app.use("/api", foilGuard);
  ```

  ```python Python theme={"dark"}
  from foil_server import safe_verify_foil_token
  import os

  def foil_guard(request):
      sealed = request.headers.get("X-Foil-Token")
      if not sealed:
          return {"error": "Missing Foil token"}, 401

      result = safe_verify_foil_token(sealed, os.environ["FOIL_SECRET_KEY"])
      if not result.ok:
          return {"error": "Invalid Foil token"}, 401

      decision = result.data.decision
      attribution = result.data.attribution or {}
      category = (
          attribution.get("bot", {}).get("facets", {}).get("category", {}).get("value")
      )

      if request.method == "GET" and category in ("verified-bot", "crawler"):
          request.state.foil = {"verdict": "allowed_bot", "category": category}
          return None

      if decision.verdict == "bot":
          return {"error": "Blocked"}, 403

      request.state.foil = {"verdict": decision.verdict, "category": category}
      return None
  ```

  ```go Go theme={"dark"}
  import (
      "net/http"
      "os"

      foil "github.com/abxy-labs/foil-server-go"
  )

  func FoilGuard(next http.Handler) http.Handler {
      return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
          sealed := r.Header.Get("X-Foil-Token")
          if sealed == "" {
              http.Error(w, "Missing Foil token", 401)
              return
          }

          tr := foil.SafeVerifyFoilToken(sealed, os.Getenv("FOIL_SECRET_KEY"))
          if !tr.OK {
              http.Error(w, "Invalid Foil token", 401)
              return
          }

          // botCategory is a helper you define; it reads attribution.bot.facets.category.value
          category := botCategory(tr.Data)

          if r.Method == http.MethodGet && (category == "verified-bot" || category == "crawler") {
              next.ServeHTTP(w, r)
              return
          }

          if tr.Data.Decision.Verdict == "bot" {
              http.Error(w, "Blocked", 403)
              return
          }

          next.ServeHTTP(w, r)
      })
  }
  ```

  ```ruby Ruby theme={"dark"}
  require "foil/server"

  before "/api/*" do
    sealed = request.env["HTTP_X_FOIL_TOKEN"]
    halt 401, { error: "Missing Foil token" }.to_json unless sealed

    tr = Foil::Server::SealedToken.safe_verify_foil_token(sealed)
    halt 401, { error: "Invalid Foil token" }.to_json unless tr[:ok]

    decision  = tr[:data][:decision]
    category  = tr[:data].dig(:attribution, :bot, :facets, :category, :value)

    next if request.get? && %w[verified-bot crawler].include?(category)

    halt 403, { error: "Blocked" }.to_json if decision[:verdict] == "bot"
  end
  ```

  ```php PHP theme={"dark"}
  use Foil\Server\SealedToken;

  function foilGuard(): void {
      $sealed = $_SERVER["HTTP_X_FOIL_TOKEN"] ?? null;
      if (!$sealed) { http_response_code(401); exit; }

      $tr = SealedToken::safeVerify($sealed, getenv("FOIL_SECRET_KEY"));
      if (!$tr->ok) { http_response_code(401); exit; }

      $category = $tr->data->attribution["bot"]["facets"]["category"]["value"] ?? null;

      if ($_SERVER["REQUEST_METHOD"] === "GET" &&
          in_array($category, ["verified-bot", "crawler"])) {
          return;
      }

      if ($tr->data->decision["verdict"] === "bot") {
          http_response_code(403);
          exit;
      }
  }
  ```

  ```bash cURL theme={"dark"}
  # Durable readback for cached-verdict pattern (see below)
  curl https://api.usefoil.com/v1/sessions/sid_... \
    -H "Authorization: Bearer sk_live_..."
  ```
</CodeGroup>

### Pattern B: cache the durable verdict

For very hot endpoints, verify once per session and cache the result.

```javascript Node.js theme={"dark"}
const { Foil } = require("@abxy/foil-server");
const client = new Foil({ secretKey: process.env.FOIL_SECRET_KEY });
const verdictCache = new Map(); // sessionId → { verdict, category, expiresAt }

async function getCachedVerdict(sessionId) {
  const cached = verdictCache.get(sessionId);
  if (cached && cached.expiresAt > Date.now()) return cached;

  const session = await client.sessions.get(sessionId);
  const category = session.automation?.category ?? null;
  const entry = {
    verdict: session.decision.automation_status, // "automated" | "human" | "uncertain"
    category,
    expiresAt: Date.now() + 60_000, // 1 minute
  };
  verdictCache.set(sessionId, entry);
  return entry;
}
```

The durable readback returns `decision.automation_status` (`"automated" | "human" | "uncertain"`) rather than the sealed-token `verdict` field; see [Server verification](/server-verification) for the full shape.
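
A plain `Map` like the one above never evicts entries, so on a busy API it grows with every session ever seen. A minimal sketch of a bounded alternative follows. `VerdictCache`, `maxEntries`, and the injectable `now` clock are hypothetical names for illustration, not part of `@abxy/foil-server`; a shared store such as Redis would be the usual choice once you run more than one process.

```javascript
// Hypothetical bounded TTL cache for durable verdicts.
class VerdictCache {
  constructor({ ttlMs = 60_000, maxEntries = 10_000, now = Date.now } = {}) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
    this.entries = new Map(); // Map keeps insertion order: oldest entry first
  }

  get(sessionId) {
    const hit = this.entries.get(sessionId);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(sessionId); // expired: force a fresh readback
      return undefined;
    }
    return hit.value;
  }

  set(sessionId, value) {
    // Delete first so a rewritten key moves to the newest position.
    this.entries.delete(sessionId);
    this.entries.set(sessionId, { value, expiresAt: this.now() + this.ttlMs });
    // Evict the oldest entries once the size cap is exceeded.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}
```

Under these assumptions, `getCachedVerdict` would call `cache.get(sessionId)` first and fall back to the durable readback only on a miss.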

## When to re-verify

Session reuse is great for throughput, but the verdict attached to a long-lived session goes stale. Three refresh triggers are worth coding:

* **High-value mutations** — a user changing their email, deleting data, exporting an archive. Always ask for a fresh sealed handoff from the client.
* **Periodic refresh** — every 5–15 minutes for long sessions, re-fetch `GET /v1/sessions/:sessionId`. Foil's server-side behavioral scoring can move a verdict between `snapshot` and `behavioral` phases as evidence accumulates, and you want the latest.
* **Suspicious pattern observed** — your own application logic (burst of identical queries, geographic jump) can trigger a client-side `foil.getSession()` that produces a new, freshly-attested handoff.
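
The three triggers above can be folded into one server-side decision function. This is a sketch under stated assumptions: `reverifyAction`, its return values, and the `highValue` and `suspicious` flags are hypothetical names, and the 5-minute interval is simply the lower end of the range mentioned above.

```javascript
// Lower bound of the 5-15 minute periodic refresh window.
const REFRESH_INTERVAL_MS = 5 * 60_000;

// Returns what the server should do before trusting a cached verdict:
//   "fresh-handoff"   -> ask the client for a new sealed handoff
//   "refetch-session" -> re-read the durable session and update the cache
//   "none"            -> the cached verdict is still acceptable
function reverifyAction({ highValue = false, suspicious = false, verifiedAt, now = Date.now() }) {
  // High-value actions and suspicious patterns both call for a new attestation.
  if (highValue || suspicious) return "fresh-handoff";
  // Otherwise refresh only when the cached verdict is older than the interval.
  if (now - verifiedAt >= REFRESH_INTERVAL_MS) return "refetch-session";
  return "none";
}
```

How the server signals "fresh-handoff" to the client is left open here; one option is a dedicated error response that the client handles by requesting a new handoff and retrying.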

## Splitting policy by attribution

A policy matrix that works on most sites, split by read and write methods:

| Category                       | GET               | POST / PUT / DELETE                     |
| ------------------------------ | ----------------- | --------------------------------------- |
| `human`                        | Allow             | Allow                                   |
| `verified-bot`                 | Allow             | Block (bots generally shouldn't mutate) |
| `crawler`                      | Allow             | Block                                   |
| `automation`                   | Block or throttle | Block                                   |
| `ai-agent`                     | Block or throttle | Block                                   |
| `unknown` (with `bot` verdict) | Throttle          | Block                                   |
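
The matrix can be encoded as a lookup so that the guard middleware stays declarative. The sketch below makes a few assumptions worth flagging: `policyFor` and the `"allow" | "throttle" | "block"` return values are hypothetical names, `automation` is set to block and `ai-agent` to throttle on reads (the table permits either), and an unlisted category without a `bot` verdict is allowed, mirroring the Pattern A guard.

```javascript
// One row per attribution category from the table above.
const POLICY = new Map([
  ["human", { read: "allow", write: "allow" }],
  ["verified-bot", { read: "allow", write: "block" }],
  ["crawler", { read: "allow", write: "block" }],
  ["automation", { read: "block", write: "block" }],
  ["ai-agent", { read: "throttle", write: "block" }],
]);

function policyFor(category, method, verdict) {
  // Only GET counts as a read; every other method is treated as a write.
  const column = method.toUpperCase() === "GET" ? "read" : "write";
  const row = POLICY.get(category);
  if (row) return row[column];
  // Category missing or unrecognized: fall back to the verdict.
  if (verdict === "bot") return column === "read" ? "throttle" : "block";
  return "allow";
}
```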

"Allow" doesn't mean unlimited. Keep a generous rate limit on verified bots — a badly-written crawler can still hurt you — but don't return 403s.

If you want LLM agents to read your content but not scrape it, set different caps for `ai-agent`: a low QPS limit that's fine for an agent answering one user's question and painful for a training-data crawler.
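
Per-category caps like these are commonly implemented as token buckets keyed by category plus a visitor key. The following is a minimal in-memory sketch, not a production limiter: `makeCategoryLimiter`, the `CAPS` numbers, and the `visitorKey` argument are all illustrative assumptions, and the bucket map is unbounded and local to one process.

```javascript
// Illustrative caps only; tune them against your own traffic.
const CAPS = {
  "verified-bot": { ratePerSec: 10, burst: 50 },
  crawler: { ratePerSec: 10, burst: 50 },
  "ai-agent": { ratePerSec: 1, burst: 2 },
};

function makeCategoryLimiter(caps, now = () => Date.now()) {
  const buckets = new Map(); // "category:visitorKey" -> { tokens, updatedAt }

  return function allow(category, visitorKey) {
    const cap = caps[category];
    if (!cap) return true; // no cap configured for this category

    const key = `${category}:${visitorKey}`;
    const t = now();
    const bucket = buckets.get(key) ?? { tokens: cap.burst, updatedAt: t };

    // Refill in proportion to elapsed time, never exceeding the burst size.
    const refill = ((t - bucket.updatedAt) / 1000) * cap.ratePerSec;
    bucket.tokens = Math.min(cap.burst, bucket.tokens + refill);
    bucket.updatedAt = t;
    buckets.set(key, bucket);

    if (bucket.tokens < 1) return false; // over the cap: throttle this request
    bucket.tokens -= 1;
    return true;
  };
}
```

A low `burst` with a low `ratePerSec` lets an agent fetch a couple of pages for one question while making a bulk crawl impractically slow.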

## APIs called directly (no browser)

Some API consumers don't run a browser at all — a mobile app, a server-to-server integration, a CLI tool. Foil's browser SDK doesn't apply there. Two options:

* **Require pre-issued API keys** for non-browser traffic. Route traffic without a Foil session header down the API-key path, and apply Foil only to browser-origin requests.
* **Rely on network-level attribution.** Even without the browser bundle, Foil's HTTP edge sees JA4 TLS fingerprints, HTTP/2 SETTINGS, and Web-Bot-Auth signatures, and the durable session API can surface these for direct requests. See [Detection categories](/detection-categories) for what's available without the client SDK.
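
For the first option, the routing decision can be made before either guard runs. This sketch assumes lowercased header names, as Node's `req.headers` provides, and a hypothetical `X-Api-Key` header for pre-issued keys; `chooseAuthPath` and its return values are illustrative names, not a Foil API.

```javascript
// Decide which verification path a request takes.
// `headers` is a plain object with lowercased names.
function chooseAuthPath(headers) {
  // Browser-origin traffic carries the Foil headers set by the client snippet.
  if (headers["x-foil-token"] || headers["x-foil-session"]) return "foil";
  // Non-browser traffic must present a pre-issued API key (hypothetical header).
  if (headers["x-api-key"]) return "api-key";
  // Neither credential present: reject rather than guess.
  return "reject";
}
```

In an Express-style app, the result would select between the Foil guard and your existing API-key middleware, with `"reject"` mapped to a 401 response.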

## What's next

<CardGroup cols={2}>
  <Card title="User-generated content" icon="messages-square" href="/use-cases/user-generated-content">
    The write-side counterpart: stop LLM posts at the composer.
  </Card>

  <Card title="Server verification" icon="shield-check" href="/server-verification">
    Reference for both sealed-token and durable-readback paths.
  </Card>

  <Card title="Detection categories" icon="list-tree" href="/detection-categories">
    What Foil detects, with and without the browser SDK.
  </Card>

  <Card title="Going to production" icon="rocket" href="/going-to-production">
    Rollout plan for API-wide enforcement.
  </Card>
</CardGroup>
