How to bypass anti-bots in 2026: 6 methods that work


You write a scraper. It runs fine on your test pages. Then you point it at a real target and get a 403 before a single byte of HTML comes back.

That wall is an anti-bot system. Cloudflare alone sits in front of roughly a fifth of the web, and DataDome, Akamai, and Kasada cover most of what's left worth scraping.

I'll walk through six methods to bypass anti-bots, from a header fix you can ship in two minutes to defeating the protocol-level detection that breaks every Playwright fork.

Every method here is something you run and control. No black-box scraping API where you paste a URL and pray. You own the stack, so you can debug it when a target changes.

How anti-bot detection works in 2026

To bypass anti-bots in 2026, your scraper has to look like a real browser across up to four layers: the TLS handshake, the HTTP/2 frame order, the JavaScript fingerprint, and your behavior. Miss one layer your target checks and you're flagged, even if the others are perfect. The trick is matching only the layers your target actually checks.

Here's what each layer inspects.

TLS fingerprint. The moment you open an HTTPS connection, your cipher suites and extension order form a hash (JA3, and now JA4). Python's requests produces a hash that screams "script."

HTTP/2 fingerprint. Real browsers send frames and pseudo-headers in a specific order. Default HTTP clients don't, so the request gets flagged before any header is even read.

JavaScript fingerprint. Once the page loads, scripts read navigator.webdriver, your canvas, WebGL renderer, audio context, and installed fonts. Headless Chrome leaks automation markers all over this layer.

Behavior. Real users don't pull 50 pages in 10 seconds from one IP. Rate, timing, and navigation order all feed a risk score.

JA4 arrived in 2023 and made fingerprinting harder to dodge. It sorts TLS extensions before hashing, which kills the old trick of randomizing extension order to slip past JA3.
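To see why sorting matters, here is a minimal sketch of both ideas. The JA3 string layout (five comma-separated fields, values joined by dashes, then MD5) follows the public JA3 description. The `sorted_style_hash` function is a simplified stand-in for JA4's sorting step, not the real JA4 format, and the numeric values are sample data rather than a real browser's ClientHello.

```python
import hashlib

def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """MD5 over the JA3 string: fields joined by ',', values joined by '-'."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return hashlib.md5(",".join(fields).encode()).hexdigest()

def sorted_style_hash(version, ciphers, extensions):
    """Simplified JA4-style idea (NOT the real JA4 format): sort, then hash."""
    raw = f"{version}|{sorted(ciphers)}|{sorted(extensions)}"
    return hashlib.sha256(raw.encode()).hexdigest()[:12]

# Sample values for illustration only
version = 771
ciphers = [4865, 4866, 4867, 49195]
extensions = [0, 23, 65281, 10, 11, 35, 16]
curves = [29, 23, 24]
point_formats = [0]

# Same extensions, different order (the old JA3 evasion trick)
reordered = list(reversed(extensions))

ja3_changed = (
    ja3_hash(version, ciphers, extensions, curves, point_formats)
    != ja3_hash(version, ciphers, reordered, curves, point_formats)
)
sorted_changed = (
    sorted_style_hash(version, ciphers, extensions)
    != sorted_style_hash(version, ciphers, reordered)
)

print("JA3 changes when order changes:", ja3_changed)
print("Sorted hash changes when order changes:", sorted_changed)
```

Reordering changes the order-sensitive hash but leaves the sorted one untouched, which is why shuffling extensions no longer helps against a JA4-style check.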

The takeaway: there's no single switch. You match the layers a given target gates on, and nothing more.

6 methods to bypass anti-bots

Here's the full lineup, easiest to hardest. Start at the top and only escalate when you actually hit a wall.

| Method | Difficulty | Cost | Best for | Success rate |
|---|---|---|---|---|
| 1. Fix headers and request shape | Easy | Free | Light protection, internal APIs | Low–Medium |
| 2. Match the TLS fingerprint (curl_cffi) | Easy | Free | Network-layer gates, no JS needed | Medium–High |
| 3. Rotate residential proxies | Medium | $ | IP reputation and rate blocks | Medium–High |
| 4. Stealth browser (Camoufox / Patchright) | Medium | Free | JavaScript challenges, Turnstile | High |
| 5. Defeat CDP detection (nodriver) | Hard | Free | Targets that catch every automated browser | High |
| 6. Human behavior + session warming | Hard | Free | Behavioral scoring at scale | High |

Quick recommendation: if the data shows up in the raw HTML, start with method 2. If the page needs JavaScript to render, jump to method 4.

Basic methods (start here)

1. Fix your headers and request shape

The cheapest win. Most blocked-on-day-one scrapers are sending a header set no browser would ever produce.

Best for: Light protection, internal JSON APIs
Difficulty: Easy
Cost: Free
Success rate against anti-bots: Low–Medium

How it works

A default requests call sends almost no headers and a dead-giveaway User-Agent. You want the full set a real Chrome session sends, in roughly the right order.

Implementation

Send the headers a browser actually sends, including Accept, Accept-Language, and Sec-Fetch-*.

import requests

# A realistic Chrome header set, not just a spoofed User-Agent
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/138.0.0.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    # "br" needs the brotli package installed, or the body can't be decoded
    "Accept-Encoding": "gzip, deflate, br",
    "Upgrade-Insecure-Requests": "1",
    "Sec-Fetch-Site": "none",     # set "same-origin" once you have a referrer
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-User": "?1",
    "Sec-Fetch-Dest": "document",
    "Connection": "keep-alive",
}

resp = requests.get("https://example.com", headers=headers, timeout=20)
print(resp.status_code)

Watch for one thing: this fixes the application layer but not the TLS handshake. A site that fingerprints TLS will still block you here, which is exactly what method 2 fixes.

Pros and cons

Pros:

  • Two-minute change, works on a surprising number of sites
  • No new dependencies
  • Fast and cheap to run

Cons:

  • Useless against any system that checks TLS or JS
  • Headers alone are a weak signal in 2026

Use this when the target is lightly protected or you're hitting an undocumented internal API. Skip it the moment you see a Cloudflare challenge page.

2. Match the TLS fingerprint with curl_cffi

This is where most people should actually start. It fixes the layer that gets you blocked before your headers are even read.

Best for: Network-layer gates where the data lives in raw HTML
Difficulty: Easy
Cost: Free
Success rate against anti-bots: Medium–High

How it works

curl_cffi wraps curl-impersonate and replicates a real browser's TLS handshake, JA3/JA4 hash, cipher order, and HTTP/2 frame order. The API mirrors requests, so migration is nearly copy-paste.

Implementation

Install with pip install curl_cffi, then pass impersonate to copy a real Chrome fingerprint.

from curl_cffi import requests

# impersonate="chrome" copies the latest Chrome TLS + HTTP/2 fingerprint
resp = requests.get(
    "https://example.com",
    impersonate="chrome",
    timeout=20,
)

print(resp.status_code)
print(resp.text[:300])

Notice you didn't touch headers. curl_cffi handles the full network signature, which is the part requests can never get right.

It also keeps sessions, so cookies and connection reuse survive across requests.

from curl_cffi import requests

session = requests.Session()
session.get("https://example.com/login", impersonate="chrome")
# cookies from the first call carry into the second
resp = session.get("https://example.com/dashboard", impersonate="chrome")
print(resp.status_code)

The catch: this still can't run JavaScript. If the target gates on a JS challenge or renders content client-side, you'll get the challenge HTML, not your data. That's the signal to move up to a browser.
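It helps to detect that case in code rather than by eye. The sketch below is a rough heuristic, not a reliable classifier: the marker strings and the size cutoff are assumptions you would tune against what your own target returns, and `looks_like_challenge` is a hypothetical helper name.

```python
def looks_like_challenge(status_code: int, html: str) -> bool:
    """Rough heuristic: is this a challenge page rather than real content?"""
    # Marker strings are assumptions; adjust them per target.
    markers = ("just a moment", "challenge-platform", "enable javascript")
    body = html.lower()
    has_marker = any(marker in body for marker in markers)

    # Blocked outright with a challenge body
    if status_code in (403, 503) and has_marker:
        return True
    # A 200 with a very small body plus a marker is also suspicious
    return status_code == 200 and len(body) < 5000 and has_marker

sample_block = "<html><title>Just a moment...</title></html>"
sample_ok = "<html><h1>Listings</h1>" + "<p>row</p>" * 1000 + "</html>"

print(looks_like_challenge(403, sample_block))
print(looks_like_challenge(200, sample_ok))
```

When this returns true for a curl_cffi response, that is the cue to escalate to a browser instead of retrying the same request.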

Pros and cons

Pros:

  • Beats TLS and HTTP/2 fingerprinting, the two layers headers can't fix
  • Almost as fast as plain requests
  • Drop-in replacement, so existing code barely changes

Cons:

  • No JavaScript execution
  • Fingerprints lag new browser releases by a version or two

Use curl_cffi for any target whose data is in the initial HTML response. For a deeper walkthrough, see our guide to web scraping in Python.

3. Rotate residential proxies

A perfect fingerprint won't save you if every request comes from one datacenter IP. IP reputation is its own detection layer.

Best for: IP bans, rate limiting, geo-gated content
Difficulty: Medium
Cost: $ (proxy bandwidth)
Success rate against anti-bots: Medium–High

How it works

Anti-bots score IPs by ASN and request rate. Datacenter ranges carry a poor reputation; residential IPs look like ordinary home connections. Rotating across a pool spreads your requests so no single IP trips a rate threshold.
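To make the rate half of that concrete, here is a minimal sketch of a per-IP sliding-window counter. It is an illustration of the general idea, not how any particular vendor scores traffic; the window size, threshold, and `RateWindow` class are assumptions, and the IPs come from a documentation-only range.

```python
from collections import defaultdict, deque

class RateWindow:
    """Counts requests per IP inside a sliding time window."""

    def __init__(self, window_seconds: float, max_requests: int):
        self.window = window_seconds
        self.max_requests = max_requests
        self.hits = defaultdict(deque)

    def record(self, ip: str, now: float) -> bool:
        """Record a hit; return True if this IP is over the threshold."""
        queue = self.hits[ip]
        queue.append(now)
        # Drop hits that have aged out of the window
        while queue and now - queue[0] > self.window:
            queue.popleft()
        return len(queue) > self.max_requests

# 50 requests in about 10 seconds from one IP trips the threshold
limiter = RateWindow(window_seconds=10, max_requests=20)
single_ip_flagged = any(limiter.record("203.0.113.1", t * 0.2) for t in range(50))

# The same 50 requests spread across 5 IPs stay under it
limiter = RateWindow(window_seconds=10, max_requests=20)
spread_flagged = any(
    limiter.record(f"203.0.113.{t % 5 + 10}", t * 0.2) for t in range(50)
)

print(single_ip_flagged, spread_flagged)
```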

Implementation

Point any client at a rotating endpoint. Here it is with curl_cffi from method 2.

from curl_cffi import requests

# a rotating residential endpoint hands you a fresh IP per request
proxies = {
    "http": "http://user:pass@gate.roundproxies.com:8000",
    "https": "http://user:pass@gate.roundproxies.com:8000",
}

resp = requests.get(
    "https://example.com",
    impersonate="chrome",
    proxies=proxies,
    timeout=30,
)
print(resp.status_code)

I run a rotating residential pool for anything past a few hundred requests, since datacenter IPs get burned fast on protected targets. Roundproxies is what I reach for, but any reputable residential network works the same way.

One gotcha: rotating too aggressively can hurt you. If a site ties a session to one IP, jumping IPs mid-session looks broken. Pin a "sticky" session for stateful flows, rotate for stateless ones.
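A minimal client-side sketch of that split, assuming you hold a list of proxy URLs yourself. Many providers handle stickiness on the gateway instead, so treat the `ProxyPool` class and the placeholder URLs as hypothetical.

```python
import itertools

class ProxyPool:
    """Round-robin for stateless requests, one pinned proxy per session key."""

    def __init__(self, proxy_urls):
        self._cycle = itertools.cycle(proxy_urls)
        self._pinned = {}

    def rotating(self) -> str:
        """Next proxy in the cycle, for stateless requests."""
        return next(self._cycle)

    def sticky(self, session_key: str) -> str:
        """Same proxy every time for a given stateful flow."""
        if session_key not in self._pinned:
            self._pinned[session_key] = next(self._cycle)
        return self._pinned[session_key]

# Placeholder URLs, not real endpoints
pool = ProxyPool([
    "http://proxy-a.example:8000",
    "http://proxy-b.example:8000",
    "http://proxy-c.example:8000",
])

login_proxy = pool.sticky("checkout-flow")
same_proxy = pool.sticky("checkout-flow")   # stays on the pinned proxy
first, second = pool.rotating(), pool.rotating()

print(login_proxy == same_proxy, first != second)
```

Pass the returned URL as both the "http" and "https" entries of the proxies dict shown earlier.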

Pros and cons

Pros:

  • Solves IP bans and rate limits that no fingerprint fix can touch
  • Lets you parallelize across many IPs
  • Works with every other method here

Cons:

  • Costs money per gigabyte
  • Sticky vs. rotating is a decision you have to get right

Use residential proxies once you're scraping at volume or seeing 429s. Pair them with method 2 or 4. More detail in our proxy rotation and residential proxies explainers.

4. Use a stealth browser for JavaScript challenges

When a target runs a JS challenge or renders content client-side, you need a real browser that doesn't leak automation markers. In 2026, that means Camoufox or Patchright, not the old stealth plugins.

Best for: JavaScript challenges, Turnstile, client-rendered pages
Difficulty: Medium
Cost: Free
Success rate against anti-bots: High

How it works

Camoufox is a patched Firefox built for scraping. It spoofs canvas, WebGL, fonts, and screen properties, and adds human-like cursor movement. Firefox helps here because most anti-bots tune their hardest checks for Chromium.

A quick deprecation note, because the old advice is everywhere: puppeteer-extra-plugin-stealth was deprecated in February 2025 and current Cloudflare checks detect it. undetected-chromedriver is in the same boat. Don't waste a day on either in 2026.

Implementation

Install with pip install camoufox[geoip], then run python -m camoufox fetch to pull the browser.

from camoufox.sync_api import Camoufox

# humanize adds realistic cursor motion; os spoofs a macOS fingerprint
with Camoufox(headless=True, humanize=True, os="macos") as browser:
    page = browser.new_page()
    page.goto("https://example.com", timeout=30000)
    page.wait_for_load_state("networkidle")  # let the challenge resolve
    html = page.content()
    print(len(html))

The API is just Playwright, so any Playwright code you already have ports over by swapping the launch line.

For a Cloudflare Turnstile checkbox, you need cross-origin iframes to be clickable, which disable_coop=True handles.

from camoufox.sync_api import Camoufox

with Camoufox(disable_coop=True, humanize=True, window=(1280, 720)) as browser:
    page = browser.new_page()
    page.goto("https://example.com")
    page.wait_for_load_state("networkidle")
    page.wait_for_timeout(4000)   # give Turnstile time to settle
    page.mouse.click(210, 290)    # click the checkbox at its rendered spot
    page.wait_for_timeout(3000)

If you'd rather stay on Chromium, Patchright is the maintained patched-Playwright option. Install it (pip install patchright), run patchright install chromium, and launch with channel="chrome" to use your real Chrome build.

Pros and cons

Pros:

  • Runs JavaScript, so client-rendered sites work
  • Beats most JS fingerprinting out of the box
  • Handles many Turnstile and JS challenges unattended

Cons:

  • Slow and memory-hungry next to curl_cffi
  • Each browser instance eats real RAM, so scaling costs you

Use a stealth browser when the data only exists after JavaScript runs, or when method 2 returns a challenge page. For Cloudflare specifically, see our Cloudflare bypass guide.

Advanced methods (for tough cases)

5. Defeat automation-protocol detection with nodriver

Here's the wall that trips up almost everyone: some targets don't fingerprint your TLS or your JS. They detect the automation protocol controlling the browser. Every Playwright fork, Camoufox and Patchright included, fails this check, because the detection looks at how the browser is being driven.

Best for: Targets that block every automated browser but work fine when you click manually
Difficulty: Hard
Cost: Free
Success rate against anti-bots: High

How it works

Playwright and Selenium leave traces of the automation channel they ride on, whether that is the Chrome DevTools Protocol or WebDriver. nodriver is the successor to undetected-chromedriver and talks to Chrome through its own DevTools Protocol implementation, so it isn't driven through the standard automation interface that detectors look for.

An independent 2026 benchmark of seven stealth tools across dozens of Cloudflare targets found the same split: TLS and JS layers fall to Camoufox and curl_cffi, but automation-protocol targets are a cliff where Playwright forks fail regardless of how well they're patched (Paterson, 2026).

Implementation

Install with pip install nodriver. It's async and needs only Python plus a Chrome-based browser.

import nodriver as uc

async def main():
    # headful (headless=False) passes more checks than headless
    browser = await uc.start(headless=False)
    page = await browser.get("https://example.com")

    await page.select("h1")          # waits until the element appears
    html = await page.get_content()
    print(len(html))

    browser.stop()

uc.loop().run_until_complete(main())

Notice there's no Selenium and no ChromeDriver. nodriver talks to Chrome directly, which is the whole reason it slips past automation-protocol checks.

To route it through proxies, start the browser with a proxy argument.

import nodriver as uc

async def main():
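    browser = await uc.start(
        headless=False,
        # Chrome's --proxy-server flag takes scheme://host:port; it does not accept inline credentials
        browser_args=["--proxy-server=http://gate.roundproxies.com:8000"],
    )
    page = await browser.get("https://example.com")
    html = await page.get_content()   # await first, then slice the string
    print(html[:200])
    browser.stop()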

uc.loop().run_until_complete(main())

The tradeoff: nodriver is younger than Playwright, so some conveniences are still maturing and the docs are thin. You trade polish for the one thing it does that nothing else free does well.

Pros and cons

Pros:

  • Passes automation-protocol detection that breaks every Playwright fork
  • No Selenium, no ChromeDriver, minimal setup
  • Async by default, so it scales across tabs

Cons:

  • Less mature than Playwright; expect rough edges
  • Smaller community, sparser examples

Use nodriver only when a target blocks your stealth browser but works fine in a normal manual session. That asymmetry is the fingerprint of automation-protocol detection.

6. Add human behavior and session warming

You can have a flawless fingerprint and still get blocked on request number 40. Behavior is the last layer, and it's the one people forget.

Best for: Behavioral scoring, scraping at volume
Difficulty: Hard
Cost: Free
Success rate against anti-bots: High

How it works

The pattern that breaks most scrapers is treating every request as stateless. Real users land on a homepage, pick up cookies, then navigate. They pause. They don't fire identical requests on a metronome.

Session warming means visiting the homepage first so you carry a real referrer and cookie set into your target page.

Implementation

Warm the session, then add randomized pacing between requests.

import random
import time
from curl_cffi import requests

session = requests.Session()

# 1. land on the homepage to collect cookies and a referrer
session.get("https://example.com/", impersonate="chrome")
time.sleep(random.uniform(2, 5))

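# 2. now hit the real target like a returning visitor
referer = "https://example.com/"
for page_num in range(1, 6):
    url = f"https://example.com/listings?page={page_num}"
    # send the previous page as Referer so the navigation chain looks continuous
    resp = session.get(url, impersonate="chrome", headers={"Referer": referer})
    print(page_num, resp.status_code)
    referer = url
    time.sleep(random.uniform(3, 8))  # human-ish gap, never fixed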

The key is random.uniform, not a constant sleep. Fixed delays are their own pattern, and detectors catch the metronome fast.

In a browser, the humanize=True flag from method 4 covers cursor movement; you still add the navigation pacing yourself.

Pros and cons

Pros:

  • Stops the slow blocks that fire after N requests
  • Costs nothing, just wall-clock time
  • Stacks on top of every other method

Cons:

  • Slows your throughput on purpose
  • Tuning the right delay is trial and error per target

Use behavioral pacing on anything you scrape at volume. It's the difference between a scraper that dies in an hour and one that runs for weeks.

Which method should you use to bypass anti-bots?

The mistake is grabbing the heaviest tool first. A headless browser to scrape a static JSON endpoint is slow, fragile, and overkill. Match the method to the layer your target actually checks.

This is the table I wish every other guide led with. Find your symptom on the left, read across to the fix.

| Detection layer | What it checks | How you know you hit it | Tool that beats it |
|---|---|---|---|
| TLS fingerprint | Cipher and extension order in the handshake (JA3/JA4) | 403 before any HTML loads | curl_cffi (impersonate), Camoufox |
| HTTP/2 fingerprint | Frame and pseudo-header order | 403 even with perfect headers | curl_cffi, any real browser |
| JavaScript fingerprint | navigator.webdriver, canvas, WebGL, fonts | Challenge page or block after the page loads | Camoufox, Patchright |
| Automation protocol | Traces of DevTools driving the browser | Block on automation, fine when you click manually | nodriver |
| IP reputation | Datacenter ASN, request rate | 429, or a block that clears on your home IP | Residential proxies |
| Behavior | Timing, mouse, navigation order | Block after a clean run of N requests | humanize + delays + warming |

Here's the decision path I run on a new target.

Does the data show up in the raw HTML?
├── Yes → curl_cffi (impersonate="chrome") + residential proxies
│         Blocked? → it checks JS even for data. Go to a browser.
└── No (needs JavaScript) → Does it block on the very first load?
        ├── Yes → TLS/JS fingerprint gate → Camoufox or Patchright
        └── Only after a few requests → behavior/IP gate
                 → add proxies, warming, and random delays
                 Still blocked on automation but fine when you click by hand?
                 └── automation-protocol detection → switch to nodriver

Start at the cheapest method that covers your layer. Escalate only when you've confirmed the simpler one fails.
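The decision path above can be sketched as a small function, which is handy if you triage many targets and want the choice logged. The function name, parameters, and return strings are my own labels for illustration, not part of any library.

```python
def pick_method(data_in_raw_html: bool,
                blocked_on_first_load: bool = False,
                fine_when_clicked_by_hand: bool = False) -> str:
    """Return a starting tool for a target, following the decision path above."""
    if data_in_raw_html:
        return "curl_cffi + residential proxies"
    if blocked_on_first_load:
        # TLS/JS fingerprint gate
        return "Camoufox or Patchright"
    if fine_when_clicked_by_hand:
        # Automated sessions blocked, manual sessions fine
        return "nodriver"
    # Blocks that appear only after a few requests
    return "proxies + session warming + random delays"

print(pick_method(data_in_raw_html=True))
print(pick_method(data_in_raw_html=False, blocked_on_first_load=True))
print(pick_method(data_in_raw_html=False, fine_when_clicked_by_hand=True))
print(pick_method(data_in_raw_html=False))
```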

Troubleshooting common issues

"403 Forbidden" on the very first request

What it means: You were blocked at the network layer, before headers mattered. This is almost always TLS or HTTP/2 fingerprinting.

How to fix it: Swap requests for curl_cffi with impersonate="chrome" (method 2). If you're already using it, move to a real browser; the target is checking JS too.

"429 Too Many Requests"

What it means: Rate limit. Your fingerprint is probably fine, but you're hitting one IP too hard.

How to fix it: Add residential proxy rotation (method 3) and randomized delays (method 6). Back off exponentially on repeated 429s instead of hammering.
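A minimal sketch of that backoff, using exponential delays with full jitter. The `fetch` argument is an assumption: any callable that takes a URL and returns an object with a `status_code` attribute, such as a thin wrapper around your session's get method. The stub below stands in for the network so the example runs offline.

```python
import random
import time

def backoff_delays(retries, base=1.0, cap=60.0):
    """Delay n is drawn uniformly from [0, min(cap, base * 2**n)]."""
    return [random.uniform(0, min(cap, base * 2 ** n)) for n in range(retries)]

def fetch_with_backoff(fetch, url, retries=5, sleep=time.sleep):
    """Retry fetch(url) while it returns 429, waiting longer each time."""
    resp = fetch(url)
    for delay in backoff_delays(retries):
        if resp.status_code != 429:
            break
        sleep(delay)
        resp = fetch(url)
    return resp

# Stub standing in for a real HTTP call: two 429s, then a 200
class FakeResponse:
    def __init__(self, status_code):
        self.status_code = status_code

calls = []

def fake_fetch(url):
    calls.append(url)
    return FakeResponse(429 if len(calls) < 3 else 200)

# sleep is swapped for a no-op so the demo finishes instantly
result = fetch_with_backoff(fake_fetch, "https://example.com", sleep=lambda s: None)
print(result.status_code, len(calls))
```

If the retries run out, the last 429 response is returned, so the caller still has to check the status before parsing.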

Stuck in a challenge loop

What it means: The JS challenge runs but never clears, usually because cookies aren't persisting or the page isn't fully loading.

How to fix it: Use a session so cookies carry over, and wait_for_load_state("networkidle") so the challenge finishes before you read the HTML. In Camoufox, give Turnstile a few extra seconds.

Blocked only when automated, fine in a manual browser

What it means: Automation-protocol detection. Your fingerprint is good, but the way the browser is driven gives you away.

How to fix it: Switch to nodriver (method 5). This is the one symptom that points to exactly one fix.

General debugging tips

Test against a fingerprint checker like tls.browserleaks.com to see what you're actually sending. Change one variable at a time. And watch for pattern changes; targets update detection constantly, so a method that worked last month can quietly break.

A note on responsible use

Bypassing an anti-bot doesn't make scraping legal or ethical by default. Before you run any of this at scale, think it through.

Most sites prohibit scraping in their terms of service. Bypassing protection can run into the Computer Fraud and Abuse Act in the US, and collecting personal data falls under GDPR rules in the EU. None of this is legal advice; check your own situation.

Scrape public data for legitimate purposes. Respect rate limits even when you can blow past them, cache aggressively to cut requests, and stay off personal data and government, health, or financial systems. If a site offers an API, use it; it's faster and you won't be playing this game at all.
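Caching is the easiest of these to automate. Here is a minimal in-memory sketch with a time-to-live, assuming `fetch` is any callable that takes a URL and returns the page body; the `CachedFetcher` class and the one-hour default are illustrative choices, and a real job would likely persist the cache to disk.

```python
import time

class CachedFetcher:
    """Wraps a fetch callable so repeat URLs inside the TTL skip the network."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # url -> (fetched_at, body)

    def get(self, url):
        now = self._clock()
        hit = self._store.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]            # fresh enough, no request sent
        body = self._fetch(url)
        self._store[url] = (now, body)
        return body

# Stub standing in for a real HTTP call
network_calls = []

def stub_fetch(url):
    network_calls.append(url)
    return f"<html>{url}</html>"

cached = CachedFetcher(stub_fetch, ttl_seconds=60)
cached.get("https://example.com/a")
cached.get("https://example.com/a")   # served from cache
cached.get("https://example.com/b")

print(len(network_calls))
```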

Bypassing anti-bots in 2026: quick reference

Anti-bots are good, but they're not unbeatable. The whole game is matching your method to the layer your target checks, and nothing heavier.

| Situation | Start with |
|---|---|
| Data in raw HTML, light protection | Method 1, then 2 |
| Network-layer block, no JS needed | Method 2 + 3 |
| JavaScript challenge or Turnstile | Method 4 |
| Blocks every automated browser | Method 5 |
| Slow blocks at volume | Method 6 |

My honest default: curl_cffi first, because it's fast and beats the layer most people get blocked on. Add residential proxies the moment you scale. Reach for a browser only when JavaScript forces it, and reach for nodriver only when a browser still gets caught.

Pick the lightest thing that works. Your future self, debugging this at 2am when a target changes its detection, will thank you.

This article was originally published in April 2025, written by Marius Bernard. It was most recently updated in June 2026.
