Browser fingerprinting techniques

Browser fingerprinting is a stack of roughly a dozen techniques sitting at different layers of the page load, each producing low-entropy signals that combine into something high-entropy. This article walks through each one with enough detail that an engineer building or evaluating a fingerprinting system has a working mental model.

If you are new to the topic, start with what is a browser fingerprint for the conceptual frame. The code-first companion is device fingerprinting in JavaScript, which implements most of these probes as runnable collectors. This post assumes you already know the goal and want to know how the work actually gets done.

How techniques are usually grouped

Three buckets are useful in practice.

Passive signals. Things the browser sends without being asked, on every request. HTTP headers, the User-Agent, TLS handshake bytes, Sec-Fetch and Client Hints. The server collects these whether or not JavaScript runs.

Active JavaScript probes. Code that asks the browser to do something, then reads back the result. Canvas rendering, WebGL queries, AudioContext synthesis, font enumeration, performance timing. These produce the largest single chunks of entropy.

Interaction and behavioral signals. Things only observable while the user is on the page. Mouse trajectories, scroll dynamics, keyboard cadence, copy-paste timing. These will not identify a device, but they tell you whether a human is driving it.

A production fingerprint pulls from all three and treats each layer as evidence that has to be internally consistent.

Passive signals

User-Agent and HTTP headers

The User-Agent string is the oldest fingerprinting input and the least reliable one. It is trivially spoofable, and Chrome’s User-Agent Reduction initiative has been collapsing it toward a minimal frozen form since 2022. By 2026 a freshly installed Chrome reports a string that is almost identical across versions, with the per-version detail moved into Client Hints.

What still helps is the combination of headers and the order they arrive in. The Accept, Accept-Language, and Accept-Encoding lists, the casing of header names, the inclusion or omission of headers like DNT and Upgrade-Insecure-Requests, and the order Chrome sends Connection versus Cache-Control are all things a careless automation framework gets wrong. A current Chrome on macOS (137 at the time of writing) sends a specific header tuple that a Python requests call cannot reproduce without effort.
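One way to make header order and casing usable as a signal is to reduce each request to a signature of header names. The sketch below assumes a Node.js server, where req.rawHeaders preserves arrival order and original casing as a flat [name, value, name, value, ...] array. headerOrderSignature is a hypothetical helper, and the sample input is illustrative rather than a capture from any real client.

```javascript
// Build an order- and case-preserving signature from Node's req.rawHeaders,
// which is a flat array: [name1, value1, name2, value2, ...].
function headerOrderSignature(rawHeaders) {
  const names = [];
  for (let i = 0; i < rawHeaders.length; i += 2) {
    names.push(rawHeaders[i]);
  }
  return {
    // Exact order and casing as received.
    ordered: names.join(','),
    // Sorted, lowercased set: which headers are present, regardless of order.
    present: [...new Set(names.map((n) => n.toLowerCase()))].sort().join(','),
  };
}

// Illustrative input only; not a claim about what any real client sends.
const sample = ['Host', 'example.com', 'accept-encoding', 'gzip', 'User-Agent', 'demo/1.0'];
const signature = headerOrderSignature(sample);
```

Keeping the ordered and present forms separate lets you tell "right headers, wrong order" apart from "missing headers". Over HTTP/2 header names are always lowercase, so casing only discriminates on HTTP/1.1 traffic.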

Sec-Fetch headers

The Sec-Fetch-Site, Sec-Fetch-Mode, Sec-Fetch-Dest and Sec-Fetch-User headers were introduced as a CSRF defense, and they happen to be very good fingerprinting and bot-detection inputs. They describe how the request was initiated, where from, and what kind of resource is expected. A form POST that navigates to your login endpoint and arrives without Sec-Fetch-User: ?1 did not come from a user submitting a real form inside a Chromium browser (the header is only sent on user-activated navigations, so the check does not apply to fetch() or XHR calls). Headless and scripted clients regularly forget to set them, which is why these headers feature prominently in headless browser detection. Sec-Fetch checks alone catch a meaningful fraction of low-effort automation (Sec-Fetch and Client Hints inconsistencies in headless browsers, Sicuranext).
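A minimal server-side sketch of such checks follows. It assumes headers arrive as a plain object with lowercased names (as in Node's req.headers); secFetchFindings and the finding labels are hypothetical, and each finding is meant as evidence to weigh rather than a verdict.

```javascript
// Flag Sec-Fetch inconsistencies on a request to a sensitive endpoint.
// `headers` is a plain object with lowercased header names.
function secFetchFindings(method, headers) {
  const findings = [];
  const site = headers['sec-fetch-site'];
  const mode = headers['sec-fetch-mode'];

  if (site === undefined) {
    // Current major browsers send Sec-Fetch-Site on HTTPS requests.
    findings.push('missing-sec-fetch-site');
  } else if (method === 'POST' && site === 'none') {
    // "none" marks a user-initiated navigation such as a typed URL,
    // which does not match a form POST coming from your own page.
    findings.push('post-with-site-none');
  }
  if (mode === 'navigate' && headers['sec-fetch-user'] !== '?1') {
    // Navigation without user activation: also happens for redirects and
    // script-driven submits, so treat it as weak evidence only.
    findings.push('navigate-without-user-activation');
  }
  return findings;
}
```

Returning labels instead of a boolean keeps each finding available for later weighting alongside the other layers.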

User-Agent Client Hints

Client Hints are Chrome’s replacement plan for the User-Agent string. The browser sends low-entropy hints by default (Sec-CH-UA, Sec-CH-UA-Mobile, Sec-CH-UA-Platform) and gates higher-entropy details (Sec-CH-UA-Arch, Sec-CH-UA-Bitness, Sec-CH-UA-Full-Version-List, Sec-CH-UA-Model, Sec-CH-UA-Platform-Version) behind an explicit Accept-CH opt-in from the server (MDN: HTTP Client Hints).

From a fingerprinting standpoint, Client Hints are interesting for two reasons. The high-entropy values, when requested, give you precise platform and version information that the reduced User-Agent string no longer carries. And the consistency between Client Hints and other signals catches automation: a headless Chromium that claims Sec-CH-UA-Platform: "Windows" while reporting a TLS fingerprint typical of Linux is failing a cross-check that a real browser would never fail.

The JavaScript equivalent is navigator.userAgentData.getHighEntropyValues(...). This is gated and returns a Promise, but the values it returns are the same ones that flow into the headers (MDN: NavigatorUAData.getHighEntropyValues()).
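A hedged sketch of the client-side call is below. collectClientHints is a hypothetical wrapper; it takes a navigator-like object so the feature detection is explicit, and it returns null where the API is absent (Firefox and Safari do not expose navigator.userAgentData at the time of writing).

```javascript
// Collect high-entropy Client Hints from a navigator-like object.
// Resolves to null when the API is missing or access is refused.
async function collectClientHints(nav) {
  if (!nav.userAgentData || typeof nav.userAgentData.getHighEntropyValues !== 'function') {
    return null;
  }
  const hints = ['architecture', 'bitness', 'fullVersionList', 'model', 'platformVersion'];
  try {
    return await nav.userAgentData.getHighEntropyValues(hints);
  } catch (err) {
    // The promise rejects with NotAllowedError when the browser declines.
    return null;
  }
}

// In a page: collectClientHints(navigator).then((values) => { /* send to server */ });
```

The absence of the API is itself worth recording: a client whose User-Agent claims current Chrome but which has no userAgentData is inconsistent.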

TLS fingerprinting: JA3 and JA4

TLS fingerprinting happens before any HTTP byte is exchanged. The client sends a ClientHello that lists the TLS versions it supports, the cipher suites it offers, the extensions it sends and the order of those extensions. Real browsers have distinctive ClientHello shapes. Standard libraries in Python, Go, Node and curl have their own. So do automation stacks built on top of them.

JA3 was the first widely adopted format. It concatenates the version, cipher list, extension list, elliptic curves and elliptic curve point formats from the ClientHello and takes an MD5 hash of the result. JA4 is the newer standard. It addresses two problems with JA3: it sorts the extension list (which Chrome started randomising in 2023 to defeat naive JA3 fingerprinting), and it splits the fingerprint into a human-readable prefix (protocol, version, ALPN) plus truncated hashes of the cipher and extension lists, which is easier to reason about and harder to false-positive on (packet.guru, TLS Fingerprinting JA3 JA4; Cloudflare bots: JA3/JA4 fingerprint).
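The JA3 construction can be sketched in a few lines, assuming the ClientHello has already been parsed. The hello object shape and its values are hypothetical and illustrative (not a real capture), and the code uses Node's crypto module for MD5. The sortedView helper only illustrates why sorting defeats extension shuffling; it is not the JA4 format.

```javascript
const crypto = require('crypto');

// GREASE values (RFC 8701) look like 0x0A0A, 0x1A1A, ... 0xFAFA; JA3 ignores them.
function isGrease(value) {
  return (value & 0x0f0f) === 0x0a0a && (value >> 8) === (value & 0xff);
}

// Build the JA3 string and its MD5 from already-parsed ClientHello fields.
// Fields are decimal integers in the order they appeared on the wire.
function ja3(hello) {
  const join = (values) => values.filter((v) => !isGrease(v)).join('-');
  const text = [
    String(hello.version),
    join(hello.ciphers),
    join(hello.extensions),
    join(hello.curves),
    join(hello.pointFormats),
  ].join(',');
  return { text, hash: crypto.createHash('md5').update(text).digest('hex') };
}

// Illustrative values only.
const hello = {
  version: 771,
  ciphers: [0x1a1a, 4865, 4866, 4867],
  extensions: [0, 23, 65281, 10, 11],
  curves: [29, 23, 24],
  pointFormats: [0],
};
// Same client, extensions sent in a different order.
const shuffled = { ...hello, extensions: [11, 10, 65281, 23, 0] };
// Sorting the extension list first makes the result order-independent.
const sortedView = (h) => ({ ...h, extensions: [...h.extensions].sort((a, b) => a - b) });
```

Here ja3(hello) and ja3(shuffled) produce different hashes, while hashing sortedView(hello) and sortedView(shuffled) produces the same one.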

JA4+ extends the family. JA4H fingerprints HTTP headers, JA4S fingerprints the server side of the TLS handshake, and JA4X covers X.509 certificates. The combination lets you fingerprint not just the client library, but the whole network stack and (when you control the server) verify that the server responses you logged came from the infrastructure you think they did. TLS fingerprinting covers the formats, the spoofing tools, and the limits in depth.

The reason TLS fingerprinting matters for bot detection is that it is the earliest signal available, and it is one of the few signals that JavaScript-level spoofing cannot touch. A bot author can set navigator.userAgent to anything they like, but they cannot rewrite the cipher suite list emitted by their underlying language runtime without significant work. Tools like curl-impersonate and tls-client exist precisely because that work is non-trivial.

IP, ASN and infrastructure

The IP address is not part of the browser, but every production fingerprinting pipeline treats it as part of the session representation. The ASN, the hosting classification (datacenter, residential ISP, mobile carrier, VPN, anonymising proxy), the geographic country implied by the IP, and the gap between IP-implied geography and the browser-reported time zone are all signals.

This layer is also where datacenter-proxy detection lives. A browser claiming to be a residential user from Texas that arrives from an AWS us-east-1 IP is making a claim the network layer disagrees with. We cover that detail in datacenter proxy detection.
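The gap between IP-implied geography and the browser-reported time zone can be measured as a difference in UTC offsets. The sketch below assumes you already have an IANA zone name for the IP from whatever geolocation source you use; utcOffsetMinutes and timezoneGapMinutes are hypothetical helpers built only on the standard Intl API.

```javascript
// UTC offset in minutes for an IANA time zone at a given instant.
function utcOffsetMinutes(timeZone, date) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    year: 'numeric', month: 'numeric', day: 'numeric',
    hour: 'numeric', minute: 'numeric', second: 'numeric',
  }).formatToParts(date);
  const get = (type) => Number(parts.find((p) => p.type === type).value);
  // Read the zone's wall-clock time as if it were UTC, then diff with the instant.
  const wallAsUtc = Date.UTC(get('year'), get('month') - 1, get('day'),
    get('hour'), get('minute'), get('second'));
  const instant = Math.floor(date.getTime() / 1000) * 1000;
  return Math.round((wallAsUtc - instant) / 60000);
}

// Absolute offset gap between the browser-reported zone and the IP-implied zone.
function timezoneGapMinutes(browserTimeZone, ipTimeZone, date = new Date()) {
  return Math.abs(utcOffsetMinutes(browserTimeZone, date) - utcOffsetMinutes(ipTimeZone, date));
}
```

In a page the browser zone comes from Intl.DateTimeFormat().resolvedOptions().timeZone. Comparing offsets rather than zone names avoids flagging neighbouring zones that share a clock.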

Active JavaScript probes

Canvas fingerprinting

Canvas is one of the highest-entropy probes in 2026, and the technique itself is short.

const canvas = document.createElement('canvas');
canvas.width = 220;
canvas.height = 60;
const ctx = canvas.getContext('2d');

ctx.textBaseline = 'top';
ctx.font = '14px Arial';
ctx.fillStyle = '#f60';
ctx.fillRect(125, 1, 62, 20);
ctx.fillStyle = '#069';
ctx.fillText('Cwm fjordbank glyphs vext quiz, \u{1F600}', 2, 15);
ctx.fillStyle = 'rgba(102, 204, 0, 0.7)';
ctx.fillText('Cwm fjordbank glyphs vext quiz, \u{1F600}', 4, 17);

const dataUrl = canvas.toDataURL();

canvas.toDataURL() returns a base64 PNG of the rendered output. Two devices render that string differently because the GPU, the GPU driver, the OS font rasterizer, the installed font set, the antialiasing settings and the subpixel rendering geometry are all in the loop. Hashing the data URL gives you a value that is stable across page loads on one device and varies meaningfully across devices.
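For the hashing step, a minimal sketch using 32-bit FNV-1a is shown below. FNV-1a is a stand-in chosen for brevity, not necessarily what any production system uses; a cryptographic hash such as SHA-256 would be the more usual choice for stored values. fnv1a32 is a hypothetical helper name.

```javascript
// 32-bit FNV-1a over a string, returned as 8 hex characters.
// Not cryptographic; it only needs to be deterministic and cheap.
// Data URLs are ASCII, so hashing UTF-16 code units is equivalent to hashing bytes here.
function fnv1a32(text) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

// In a page: const canvasHash = fnv1a32(canvas.toDataURL());
```

Keep the hash as one component of the fingerprint record rather than folding it into a single combined digest.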

Why this works is more subtle than it looks. The Unicode test string at the end (an emoji or an unusual diacritic combination) forces the browser to fall back to whatever emoji font is installed, which is platform-specific. The translucent overlay forces blending math, which depends on the GPU. The fillRect provides a flat-color anchor that catches gamma-correction differences. Production canvas fingerprints use longer, more carefully chosen strings to maximize entropy. The same logic applies.

Canvas fingerprinting covers this in more depth, including how spoofing works and how to detect it.

WebGL fingerprinting

WebGL is a 3D rendering API. From a fingerprinting standpoint there are two angles.

The first is metadata. The WEBGL_debug_renderer_info extension lets you read UNMASKED_RENDERER_WEBGL and UNMASKED_VENDOR_WEBGL, which are strings like ANGLE (Apple, ANGLE Metal Renderer: Apple M2 Pro, Unspecified Version). The exact form of these strings is platform-specific and high-entropy.

const canvas = document.createElement('canvas');
const gl = canvas.getContext('webgl') || canvas.getContext('experimental-webgl');

let renderer = null;
let vendor = null;
let params = null;
if (gl) { // gl is null when WebGL is disabled or unavailable
  const dbg = gl.getExtension('WEBGL_debug_renderer_info');
  renderer = dbg ? gl.getParameter(dbg.UNMASKED_RENDERER_WEBGL) : null;
  vendor = dbg ? gl.getParameter(dbg.UNMASKED_VENDOR_WEBGL) : null;

  // Query the anisotropy limit only when the extension exists; passing
  // undefined to getParameter raises an INVALID_ENUM error.
  const aniso = gl.getExtension('EXT_texture_filter_anisotropic');
  params = {
    maxAnisotropy: aniso ? gl.getParameter(aniso.MAX_TEXTURE_MAX_ANISOTROPY_EXT) : null,
    maxTextureSize: gl.getParameter(gl.MAX_TEXTURE_SIZE),
    maxRenderbufferSize: gl.getParameter(gl.MAX_RENDERBUFFER_SIZE),
    shadingLanguageVersion: gl.getParameter(gl.SHADING_LANGUAGE_VERSION),
  };
}

Firefox and Safari have privacy-mode behavior that masks UNMASKED_RENDERER_WEBGL, but the rest of the parameters are still readable and combine into a useful signal.

The second angle is active rendering. Drawing a 3D scene with a specific shader and reading back the pixel buffer produces a hash that depends on the GPU’s actual computation. WebGL renders are higher-entropy than canvas renders for current-generation hardware because the floating-point fragment shader output exposes more variation than the 2D context’s text rasterizer does.

AudioContext fingerprinting

The trick is to ask the browser to synthesize audio in an offline context, then read back the samples. The hardware-specific floating-point math in the audio pipeline produces results that vary by device.

const ctx = new (window.OfflineAudioContext || window.webkitOfflineAudioContext)(1, 44100, 44100);
const oscillator = ctx.createOscillator();
oscillator.type = 'triangle';
oscillator.frequency.value = 10000;

const compressor = ctx.createDynamicsCompressor();
compressor.threshold.value = -50;
compressor.knee.value = 40;
compressor.ratio.value = 12;
compressor.attack.value = 0;
compressor.release.value = 0.25;

oscillator.connect(compressor);
compressor.connect(ctx.destination);
oscillator.start(0);

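// Resolves to the fingerprint value once offline rendering completes.
const audioFingerprint = ctx.startRendering().then((buffer) => {
  const samples = buffer.getChannelData(0);
  // Sum the absolute values of a small slice of samples.
  let fp = 0;
  for (let i = 4500; i < 5000; i++) fp += Math.abs(samples[i]);
  return fp;
});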

The reason this works is that the DynamicsCompressor node’s output depends on the underlying audio implementation, which differs between Chrome, Firefox and Safari, and within each browser differs by platform and (occasionally) by build.

Safari 17+ adds randomisation in Private Browsing mode that defeats this technique on that surface specifically. Chrome and Firefox on default settings remain measurable.

Font enumeration

There is no permissionless JavaScript API to list installed fonts directly (Chromium’s Local Font Access API, window.queryLocalFonts(), sits behind an explicit user permission prompt). In practice you discover them by rendering. The technique measures the bounding box of a sample string in a candidate font and compares it to the bounding box in a known fallback font (typically monospace, serif, sans-serif). If the box differs, the candidate font is installed. If it matches the fallback exactly, it is not.

Modern implementations do this with the canvas API or the FontFace API and a list of a few hundred candidate fonts. The result is a sorted list of present fonts, hashed. Times New Roman and Arial are present on nearly every Windows and macOS install, so they tell you little. The presence of Skia, LiHei Pro, Hiragino Kaku Gothic Pro or Segoe UI Variable tells you something specific about the platform and locale.
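The width-comparison technique can be sketched as follows. detectFonts and canvasMeasurer are hypothetical helpers, and the sample string and font size are arbitrary choices. The detection logic takes a measure function as a parameter so it can be exercised outside a browser; canvasMeasurer is the browser-only piece built on the 2D canvas measureText call.

```javascript
const FALLBACKS = ['monospace', 'serif', 'sans-serif'];

// `measure(fontFamily)` returns the rendered width of a fixed sample string.
function detectFonts(candidates, measure) {
  const baseline = new Map(FALLBACKS.map((f) => [f, measure(f)]));
  return candidates
    .filter((font) =>
      // Present if the width differs from the bare fallback for any fallback family.
      FALLBACKS.some((f) => measure(`"${font}", ${f}`) !== baseline.get(f)))
    .sort();
}

// Browser-only measurer built on the 2D canvas API.
function canvasMeasurer(doc, sample = 'mmmmmmmmmmlli', size = '72px') {
  const ctx = doc.createElement('canvas').getContext('2d');
  return (fontFamily) => {
    ctx.font = `${size} ${fontFamily}`;
    return ctx.measureText(sample).width;
  };
}

// In a page: detectFonts(['Skia', 'Segoe UI Variable'], canvasMeasurer(document));
```

Testing against all three fallbacks reduces false negatives from a candidate that happens to share metrics with one of them, though a font metrically identical to every fallback would still be missed.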

Newer privacy-conscious browsers (Safari with Lockdown Mode, Tor Browser, Brave with strict shields) restrict the set of fonts visible to the page to a predefined system list to defeat this technique. Most production browsers do not.

Screen, viewport and device pixel ratio

screen.width, screen.height, screen.availWidth, screen.availHeight, screen.colorDepth, window.devicePixelRatio and window.matchMedia('(prefers-color-scheme: dark)').matches all contribute. Mobile devices have distinctive joint distributions of these values. So do laptops with retina displays.

screen.height - screen.availHeight reveals the height of the taskbar or menu bar, which differs across OS versions and configurations.
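A small collector for these values might look like the sketch below. collectScreenSignals is a hypothetical helper that takes a window-like object; the reserved-space fields are computed as total size minus available size, so a taskbar or menu bar shows up as a positive number.

```javascript
// Gather screen and viewport signals from a window-like object.
function collectScreenSignals(win) {
  const s = win.screen;
  return {
    width: s.width,
    height: s.height,
    availWidth: s.availWidth,
    availHeight: s.availHeight,
    colorDepth: s.colorDepth,
    devicePixelRatio: win.devicePixelRatio,
    prefersDark: typeof win.matchMedia === 'function'
      ? win.matchMedia('(prefers-color-scheme: dark)').matches
      : null,
    // Space reserved by OS chrome such as a taskbar, menu bar or dock.
    reservedHeight: s.height - s.availHeight,
    reservedWidth: s.width - s.availWidth,
  };
}

// In a page: collectScreenSignals(window);
```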

Hardware properties

A small set of navigator properties expose hardware:

navigator.hardwareConcurrency: number of logical CPU cores
navigator.deviceMemory: approximate GiB of RAM, rounded to a power of 2 and capped at 8 (Chromium only)
navigator.maxTouchPoints: 0 for non-touch, 5 or 10 on touch devices
navigator.platform: legacy string, still readable
navigator.languages: ordered list of preferred languages

Each is low-entropy individually. The joint distribution is more useful, and the cross-check with TLS and Client Hints (a TLS fingerprint typical of Linux paired with navigator.platform: 'Win32') catches spoofing.
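The point about joint distributions can be made concrete with a Shannon entropy calculation. The four-device population below is invented purely for illustration, and shannonEntropyBits is a hypothetical helper.

```javascript
// Shannon entropy, in bits, of the empirical distribution of `values`.
function shannonEntropyBits(values) {
  const counts = new Map();
  for (const v of values) counts.set(v, (counts.get(v) || 0) + 1);
  let bits = 0;
  for (const count of counts.values()) {
    const p = count / values.length;
    bits -= p * Math.log2(p);
  }
  return bits;
}

// Invented population: each attribute alone only splits it in half.
const devices = [
  { hardwareConcurrency: 4, deviceMemory: 4 },
  { hardwareConcurrency: 4, deviceMemory: 8 },
  { hardwareConcurrency: 8, deviceMemory: 4 },
  { hardwareConcurrency: 8, deviceMemory: 8 },
];

const coresBits = shannonEntropyBits(devices.map((d) => d.hardwareConcurrency)); // 1
const memoryBits = shannonEntropyBits(devices.map((d) => d.deviceMemory)); // 1
const jointBits = shannonEntropyBits(
  devices.map((d) => `${d.hardwareConcurrency}|${d.deviceMemory}`)); // 2
```

The joint entropy equals the sum here only because the toy attributes are independent. Real hardware attributes are correlated, so the joint is usually smaller than the sum, which is why it has to be measured on actual traffic.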

WebRTC and local IP leakage

WebRTC’s ICE candidate gathering exposes the host’s local IPs as host candidates, and a STUN lookup adds the public address as a server-reflexive candidate. In a typical desktop setup the host candidate reveals the LAN IP behind the router, which is uncorrelated with the public IP and is therefore a useful cross-check against VPNs and proxies. A user who claims to be on a Texas residential IP but whose ICE candidates list a 10.0.0.x address consistent with a known cloud provider’s internal range is making a claim that does not hold up.

Most browsers now hide local IPs by default, replacing them with random mDNS .local hostnames unless the page has been granted camera or microphone access, but the public IP reported by a STUN server can still differ from the IP visible at the TCP layer, which is itself a signal.
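Reading these signals means parsing the candidate lines the browser emits. The sketch below assumes the standard ICE candidate attribute layout (foundation, component, transport, priority, address, port, then "typ" and the candidate type). parseIceCandidate and classifyAddress are hypothetical helpers, and the address classification is deliberately simplified: it ignores loopback, link-local and carrier-grade NAT ranges and does not classify IPv6.

```javascript
// Parse one ICE candidate line into the fields useful for fingerprinting.
function parseIceCandidate(line) {
  const fields = line.replace(/^a=/, '').replace(/^candidate:/, '').trim().split(/\s+/);
  if (fields.length < 8 || fields[6] !== 'typ') return null;
  const address = fields[4];
  return {
    transport: fields[2].toLowerCase(),
    address,
    port: Number(fields[5]),
    type: fields[7], // host, srflx, prflx or relay
    addressKind: classifyAddress(address),
  };
}

// Rough address classification: mDNS name, RFC 1918 private IPv4, or public IPv4.
function classifyAddress(address) {
  if (address.endsWith('.local')) return 'mdns'; // obfuscated host candidate
  const m = /^(\d+)\.(\d+)\.\d+\.\d+$/.exec(address);
  if (!m) return 'other'; // IPv6 or anything unparsed
  const a = Number(m[1]);
  const b = Number(m[2]);
  const isPrivate = a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168);
  return isPrivate ? 'private-ipv4' : 'public-ipv4';
}
```

In a page the input is the candidate string on each RTCIceCandidate delivered to an RTCPeerConnection's icecandidate handler. A srflx candidate whose address differs from the IP the server saw at the TCP layer is the mismatch worth recording.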

Timing and clock skew

The browser exposes performance.now() with millisecond or sub-millisecond precision (clamped on most browsers for Spectre mitigation reasons). The drift between performance.now() and Date.now() over a few seconds, and the precision of the underlying clock, are detectable.

More usefully, the offset between the system clock reported by the browser and the server’s authoritative time is a stable per-device signal as long as the device’s NTP behavior is consistent. A device that is consistently +47ms off the server clock is recognizable.
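A simple way to estimate that offset is to timestamp a request on both sides and assume the server handled it halfway through the round trip, as NTP-style estimators do. The helper names and the sign convention (positive means the client clock is ahead of the server) are assumptions of this sketch.

```javascript
// One offset sample from a single request/response exchange.
// clientSendMs and clientReceiveMs are Date.now() readings on the client
// taken just before the request and just after the response.
// serverMs is the server's clock reading while handling the request.
function clockOffsetSample(clientSendMs, serverMs, clientReceiveMs) {
  const roundTripMs = clientReceiveMs - clientSendMs;
  // Assume the server stamped the request at the midpoint of the round trip.
  const offsetMs = (clientSendMs + clientReceiveMs) / 2 - serverMs;
  return { offsetMs, roundTripMs };
}

// The median of several samples damps network jitter.
function medianOffset(samples) {
  const sorted = samples.map((s) => s.offsetMs).sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

Asymmetric network paths bias the midpoint assumption, so samples with a large roundTripMs are better discarded than averaged in.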

Battery and sensor APIs

Where present, the Battery Status API and motion sensors (accelerometer, gyroscope) are very high-entropy. Firefox removed the Battery Status API from web content in 2017 and Safari does not expose it, specifically because the joint distribution of charge level and discharge rate could re-identify a browser across sessions; Chromium-based browsers still expose it. Motion sensors are still available on mobile devices and contribute meaningful entropy when allowed.

Behavioral and interaction signals

Pure device fingerprints answer “is this the same device?” Behavioral signals answer “is this a human?”

The list is long, but the main categories are:

Pointer dynamics. Mouse-move velocity profiles, jerk (rate of change of acceleration), micro-movements during dwell, idle drift. Humans produce trajectories with characteristic noise. Headless scripts produce straight lines and discrete jumps.
Touch dynamics. Multi-touch pressure curves, touch radius, finger arrival order on a multi-finger gesture, the angle between consecutive taps.
Keyboard dynamics. Inter-keystroke timing, the distribution of dwell times per key, the symmetry of keydown and keyup events for repeated keys. Bots that send synthetic keystroke events have flat, regular distributions that humans do not.
Scroll and focus. Scroll-velocity envelopes, time spent at idle, focus and blur transitions, tab visibility changes.
Form behavior. Whether a field was clicked or tabbed into, whether values were typed or pasted in, whether the page received a mousemove event before the submit fired.

None of these are unique to a person, but they cluster very cleanly into human-like and bot-like populations, and they are extremely cheap to collect.
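As one illustration of how cheap these features are, the sketch below summarizes the gaps between event timestamps (for example keydown times). intervalStats is a hypothetical helper; it deliberately sets no human-versus-bot threshold, since any cutoff would need calibrating on real traffic.

```javascript
// Summarize inter-event gaps from a list of event timestamps in milliseconds.
function intervalStats(timestamps) {
  const gaps = [];
  for (let i = 1; i < timestamps.length; i++) {
    gaps.push(timestamps[i] - timestamps[i - 1]);
  }
  if (gaps.length === 0) return null;
  const mean = gaps.reduce((sum, g) => sum + g, 0) / gaps.length;
  const variance = gaps.reduce((sum, g) => sum + (g - mean) ** 2, 0) / gaps.length;
  const stdDev = Math.sqrt(variance);
  // Coefficient of variation: spread relative to the mean gap.
  return { mean, stdDev, coefficientOfVariation: mean === 0 ? 0 : stdDev / mean };
}

// Perfectly regular synthetic input yields a coefficient of variation of 0.
const synthetic = intervalStats([0, 100, 200, 300]);
```

Irregular, human-like timing produces a clearly non-zero coefficient of variation, which is the "flat versus noisy" distinction described above in numeric form.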

Cross-checks: why the joint matters

The single most important insight from a decade of fingerprinting research is that the signals matter less than the cross-checks between them. A skilled attacker can spoof any individual value, but spoofing every value in a way that holds up across layers is far harder.

Concrete examples of cross-checks that catch real automation:

TLS vs User-Agent. The TLS fingerprint says this is Go’s net/http standard library. The User-Agent says this is a current Chrome on macOS. Those facts cannot both be true.
Platform vs hardware. navigator.platform says Windows. hardwareConcurrency is 8, deviceMemory is 8, the GPU renderer string is Apple M2. The hardware says Mac.
Timezone vs IP geography. The browser says America/New_York. The IP geolocates to Bangalore. Either the user is traveling or the traffic is exiting through a proxy far from the operator.
Sec-Fetch vs request shape. The request hit /api/login as a POST with Sec-Fetch-Site: none. A real form submission from the same origin would have sent same-origin. This request did not come from the page.
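Client Hints vs JS environment. The Sec-CH-UA-Platform-Version header says "14.7.0". navigator.userAgentData.getHighEntropyValues() in the page reports platformVersion 13. A real browser reads both from the same source, so the two disagree because one of them was spoofed.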

The fingerprinting value is in the cross-checks. Production systems weight them as separate signals and combine them probabilistically. That is the difference between a fingerprint that catches a stealth-mode Puppeteer and a fingerprint that does not.
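A minimal sketch of how such cross-checks might be organised is shown below. The session field names, the check names and runCrossChecks are all hypothetical, and the checks are simplified versions of the examples above. Each result is kept separate so a downstream model can weight it.

```javascript
// Each check returns true when two layers of the session disagree.
const crossChecks = [
  {
    name: 'tls-vs-user-agent',
    failed: (s) => s.tlsClientFamily !== s.userAgentFamily,
  },
  {
    name: 'platform-vs-gpu',
    failed: (s) => s.navigatorPlatform === 'Win32' && /Apple/.test(s.webglRenderer || ''),
  },
  {
    name: 'sec-fetch-vs-request',
    failed: (s) => s.method === 'POST' && s.secFetchSite === 'none',
  },
];

// Return the names of all failed checks, preserving them as separate signals.
function runCrossChecks(session) {
  return crossChecks.filter((check) => check.failed(session)).map((check) => check.name);
}
```

A table of named checks is easy to extend and audit: adding a layer means adding an entry, and every flagged session carries the list of exactly which claims disagreed.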

Common mistakes when implementing

We see three patterns repeatedly in homegrown fingerprinting code.

Hashing too aggressively. Collapsing 30 signals into one SHA-256 is convenient, and it throws away every cross-check you might have wanted. Store the components.

Trusting the client to score itself. Anything computed in the browser can be replayed. The fingerprint should be collected client-side and scored server-side, with the raw inputs preserved for audit.

Ignoring stability. A fingerprint that changes when Chrome updates is a fingerprint that does not work. Either tolerate per-attribute drift in your matching, or resolve to a stable server-side identity that survives small changes. The math is covered in what is a browser fingerprint.
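Tolerating per-attribute drift presupposes that the components were stored rather than collapsed into one hash. The sketch below scores two component records by the weighted fraction of attributes that still agree. The attribute names and weights are placeholders for illustration; real weights would be fitted to how often each attribute changes on a genuine device.

```javascript
// Weighted fraction of attributes on which two component records agree.
function similarity(a, b, weights) {
  let matched = 0;
  let total = 0;
  for (const [key, weight] of Object.entries(weights)) {
    total += weight;
    if (a[key] !== undefined && a[key] === b[key]) matched += weight;
  }
  return total === 0 ? 0 : matched / total;
}

// Placeholder weights: volatile attributes count for less.
const weights = { canvasHash: 3, webglRenderer: 2, fontsHash: 2, browserVersion: 1 };

const beforeUpdate = { canvasHash: 'a1', webglRenderer: 'r1', fontsHash: 'f1', browserVersion: '136' };
const afterUpdate = { ...beforeUpdate, browserVersion: '137' };
const score = similarity(beforeUpdate, afterUpdate, weights); // 0.875
```

With a single combined SHA-256 the two records above would simply look unrelated; with components, a browser update costs one low-weight attribute.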