Anti-bot & legal · 8 min read

Bright Data v Meta 2026: what it means for your scrapers

Judge Chen's January 2024 summary judgment for Bright Data, which held that Meta's ToS bind only logged-in users, survived 2025 without challenge. Meta dismissed the residual claim and waived appeal. The bigger 2025 docket is NYT v. OpenAI, which advanced past the motion to dismiss.

By Chris Walker · May 1, 2026 · Scraping · Case LAW

On January 23, 2024, Judge Edward Chen of the Northern District of California granted summary judgment to Bright Data in the case Meta brought against the company over scraping and resale of public Facebook and Instagram data. The ruling held that Meta’s Terms of Service “do not bar logged-off scraping of public data” — meaning the ToS contractually bind only logged-in users, and a scraper accessing the public-facing surface without a session is not bound by them.

Meta dismissed its residual tortious-interference claim a month later and waived appeal. The ruling stands.

Two years on, the doctrine has survived. No follow-on case has materially narrowed it. The conclusion enterprise procurement teams reached in 2024, that resale of logged-out scraped public data is legally defensible in the US, has held through 2025 and into 2026.

Chen’s actual holding

Meta’s case ran on contract, not the Computer Fraud and Abuse Act. By 2024 the statutory route against public-data scraping was a spent weapon, so Meta pleaded breach instead: Bright Data had agreed to Facebook and Instagram’s Terms of Service, the argument went, and then violated the anti-scraping provisions. The case turned on one contract question — who the terms bind — and both sides moved for summary judgment on it.

Chen read the terms the way a court reads any contract: by their words. Facebook and Instagram’s terms govern “your use” of the products, and a person who pulls a public page while logged out does not “use” the service in the contractual sense. He is a visitor, not a user, and the line is the one the terms themselves draw.

The decisive evidence was a deletion. Before 2009, Facebook’s terms stated that by merely “accessing” the site a person agreed to be bound, “whether or not [a] registered member.” Meta struck that clause. Chen treated the removal as intent: a drafter who deletes the one sentence binding non-members signals an agreement that binds members only. Under ordinary canons of construction, the logged-out scraper sits outside it.

The contract ruling did not dispose of the whole case by itself. Meta had also pleaded tortious interference — the theory that Bright Data induced Meta’s own logged-in users to breach the terms by supplying the data Bright Data resold. That count survived summary judgment on paper. Meta abandoned it anyway, dismissing the residual claim a month later and declining to appeal the contract holding. The choice is the tell. A plaintiff that thought its interference theory could carry the case would have pressed it; Meta instead read the contract ruling as dispositive of the only question that mattered commercially — whether Bright Data may sell logged-out public data — and walked rather than pay for discovery on a claim it expected to lose. The summary judgment became, in practice, the final word.

Logged-out doctrine: the ruling turns on when platform terms attach, not on whether public data has value. The contract hook fails before the court reaches a broad theory of data ownership.

The perpetuity problem

The harder fact for Bright Data was that it had once held Facebook and Instagram accounts. A former user is still a user, and Meta argued the anti-scraping covenant survived account closure — that one click on “agree,” however brief, bound the company forever.

Chen rejected the perpetual reading. The court found Bright Data’s accounts “entirely incidental and unrelated to its scraping,” and held Facebook’s survival clause — which purported to bar scraping in perpetuity after termination — unenforceable. A reading under which “a user who had a Facebook account for a brief moment could be forever bound” was not one the court would adopt; a covenant that runs forever, with no durational limit, is the kind of perpetual obligation contract law disfavors. Instagram carried no survival clause at all, so on that platform the question never arose.

This is the part of the ruling most often dropped from the summary, and it is the part that makes the doctrine durable. A holding that protected only scrapers who had never touched the platform would be narrow and trivial to engineer around — a forced click-through would defeat it. A holding that also voids perpetual survival covenants reaches the harder case: the operator with a dormant account somewhere in its history. That describes most commercial operators.

From CFAA to contract

The ruling reads as the contract-law sequel to a decade of CFAA fights, and the same judge wrote both halves. Chen presided over hiQ Labs v LinkedIn, the case in which the Ninth Circuit twice affirmed that scraping public data does not “exceed authorized access” under the CFAA. The Supreme Court’s 2021 decision in Van Buren v United States narrowed the statute further, reading “exceeds authorized access” as a gates-up-or-down question about areas of a system a person may not enter at all — not about what an otherwise-authorized visitor does with the data. After hiQ and Van Buren, the federal anti-hacking statute no longer reached the public-data scraper.

Platforms migrated their theory to contract, where Register.com v Verio and a line of trespass-to-chattels cases had once given site operators leverage. Bright Data v Meta is the point where that migration stalls too. If the CFAA does not bind the logged-out scraper and the terms of service do not either, the platform is left with tort theories that demand proof of concrete harm — server load, interference with named business relationships — which surface-level public scraping rarely supplies. The same “information monopolies” concern Chen voiced in hiQ animates the contract reading: terms that would let a platform privatise public data by fiat get construed narrowly.

The contract front is not fully settled. hiQ itself lost on a contract theory in late 2022, when Chen held on summary judgment that it had breached LinkedIn’s User Agreement and the case ended in a consent judgment. The difference is that hiQ held an active account and a direct dealing history with LinkedIn that Bright Data never had with Meta. The Ninth Circuit’s post-hiQ doctrine now turns on exactly that distinction: public access is protected, a prior contractual relationship is not.

The doctrine’s actual reach

The Chen ruling has been followed by district courts in other Ninth Circuit cases involving similar fact patterns — public data, logged-out access, resale to third parties. There has been no published Ninth Circuit appellate ruling overturning or narrowing it. Meta has not refiled.

Doctrine boundary: Bright Data protects the public logged-out surface, not authenticated scraping. Inside: logged-out, public page, no session. Outside: authenticated, login-gated, user cookies.

The doctrine’s reach is, however, narrower than it is sometimes characterized. It applies specifically to:

Logged-out access only. The moment a scraper authenticates with a session token, the ToS attach and the protections of the Bright Data ruling do not apply.
Public-facing surfaces only. Data that requires login to view is not covered. Profile data behind a “view full profile” wall on LinkedIn is not the same as profile data shown to logged-out visitors.
Resale-side facts only. The ruling does not address every CFAA argument, every state-law tort, or every copyright theory that might be advanced against a scraper. It addresses the contract-breach theory specifically.
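
The first boundary is mechanical enough to check in code. A minimal sketch, assuming a scraper that builds its own request headers: the helper name and the set of headers treated as session credentials are illustrative assumptions, and passing the check says nothing about whether the page itself is public.

```python
# Hypothetical helper. Which headers count as "session credentials" is an
# assumption made for illustration, not a test taken from the ruling.
SESSION_HEADERS = {"cookie", "authorization"}


def is_logged_out_request(headers: dict[str, str]) -> bool:
    """Return True when the request carries no non-empty session header."""
    present = {
        name.strip().lower()
        for name, value in headers.items()
        if value.strip()
    }
    return present.isdisjoint(SESSION_HEADERS)


anonymous = {"User-Agent": "example-bot/1.0", "Accept": "text/html"}
with_session = {"User-Agent": "example-bot/1.0", "Cookie": "sessionid=abc123"}

print(is_logged_out_request(anonymous))     # True
print(is_logged_out_request(with_session))  # False
```

Header names are compared case-insensitively because HTTP header names are case-insensitive. A check like this could run as a guard before every outgoing request, so that a session token added by mistake fails loudly rather than silently changing the operation's posture.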

Practical risk map: the protected zone is narrow but commercially important

Posture	Contract risk	Access risk	Buyer demand
Logged-out public pages	Low	Low	Healthy
Logged-in session scraping	High	Medium	Niche
Login-gated data	High	High	Constrained
AI training resale	Mixed	Low	Shifting

Bright Data helps the first row. It does not launder authenticated access, login-gated data, or downstream copyright exposure.
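
For teams that want the risk map in a reviewable form, it can be encoded as data. The sketch below copies the four rows from the table. The `classify` function and its precedence order (the riskiest matching row wins) are assumptions added for illustration, not something the ruling or the table prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posture:
    name: str
    contract_risk: str
    access_risk: str
    buyer_demand: str


# Rows copied from the risk map above.
RISK_MAP = [
    Posture("Logged-out public pages", "Low", "Low", "Healthy"),
    Posture("Logged-in session scraping", "High", "Medium", "Niche"),
    Posture("Login-gated data", "High", "High", "Constrained"),
    Posture("AI training resale", "Mixed", "Low", "Shifting"),
]
BY_NAME = {posture.name: posture for posture in RISK_MAP}


def classify(uses_session: bool, login_gated: bool, sold_for_training: bool) -> Posture:
    """Pick a row. Precedence is an assumption: the riskiest match wins."""
    if login_gated:
        return BY_NAME["Login-gated data"]
    if uses_session:
        return BY_NAME["Logged-in session scraping"]
    if sold_for_training:
        return BY_NAME["AI training resale"]
    return BY_NAME["Logged-out public pages"]


row = classify(uses_session=False, login_gated=False, sold_for_training=False)
print(row.name, row.contract_risk, row.access_risk, row.buyer_demand)
```

Only an operation that answers "no" to all three questions lands in the first row, which is the row the ruling helps.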

For practical purposes, that is enough to support the major commercial scraping operations targeting public web data, which is why the Apify Store ecosystem and larger commercial vendors like Bright Data operate without an existential legal threat in 2026. But it does not protect every operation. Scrapers that authenticate, scrapers that access non-public surfaces, and scrapers that touch copyrighted content (rather than facts) all face residual exposure that the Chen ruling does not cover.

Why nothing has materially narrowed it

The expected challenges did not arrive. Several plausible scenarios for narrowing the ruling were available to Meta and to other platform operators:

A new ToS that explicitly addresses logged-out scraping (Meta could update the FB/IG terms to attempt to bind anonymous visitors via header-injected acceptance flows). No such update has been published.
An en banc Ninth Circuit appeal of a related case (REX v. Zillow / NAR was the most plausible vehicle). REX lost at the district level on the antitrust theory, narrowing the appellate path; no related case has produced a vehicle large enough to relitigate the contract-formation question Chen decided.
A new state-law theory advanced in California or another permissive jurisdiction. None has surfaced at the level required to threaten the federal-law conclusion.

The most likely explanation for the silence is that Meta and the other platform operators have concluded the litigation cost is not worth the marginal benefit. Bright Data has demonstrated that aggressive defense plus willingness to litigate to summary judgment produces favorable rulings. Other platforms have moved enforcement to non-litigation channels: API restrictions, IP blocks, ToS enforcement against authenticated users only, and structural moves like Zillow’s Listing Access Standards.

That is not a ruling that the scraping side definitively won. It is a tactical retreat by the platform operators from a legal channel that has become unfavorable.

The NYT v. OpenAI docket and what it means

While the scraping-side doctrine held, the AI-training-side doctrine became the more active legal frontier. NYT v. OpenAI advanced past the motion to dismiss in March 2025. The court denied OpenAI’s motion to dismiss the main copyright claims and narrowed but preserved the DMCA §1202 claims, and discovery has since extended to approximately 20 million ChatGPT user logs.

The NYT case is not a scraping case in the narrow sense. The NYT does not allege OpenAI scraped its content directly; it alleges OpenAI trained models on its content (sourced through Common Crawl and other intermediaries) and that the trained models reproduce protected content in user-facing output. The legal theory is downstream of the scraping question.

But the NYT case matters for the scraping market for one specific reason: it tests whether the buyer of scraped data has independent legal exposure for using that data, even if the scraping itself was lawful under the Bright Data doctrine. If the answer is yes, and AI labs face copyright liability for training on scraped public content, then demand from the largest buyers of scraped data shrinks sharply. That is the same dynamic visible in the Reddit licensing era: buyers preferring contracted access over scraped access for liability insulation.

The NYT case has not yet produced a fair-use ruling at the appellate level. Two district-court rulings in the broader AI-training docket, Bartz v. Anthropic and Kadrey v. Meta (partial), went favorably for AI labs on fair use, but neither is binding precedent: district-court decisions do not bind other courts. The 2026 question is whether NYT v. OpenAI produces a clear answer one way or the other.

Buyer mix: the legal win preserves public-data scraping, while AI-training buyers become more cautious.

Durable demand: lead generation, market intelligence, price monitoring. Public facts, logged-out collection, operational buyers.

Contracting demand: AI training corpora, copyright-heavy archives, indemnity-sensitive deals. Buyer-side exposure shifts procurement toward licensed feeds.

The legal frame for actor strategy

For Apify Store publishers, the legal landscape in 2026 is more favorable than it has been in five years for scraping public web data, and progressively less favorable for selling that data into AI training pipelines.

The implication for actor strategy:

Lead generation, market intelligence, price comparison, real-time competitive monitoring: legally well-protected use cases. The Bright Data doctrine holds. The buyer demand is healthy.

AI training data sales: legally exposed downstream of the scraping itself. The Apify-class actor publisher may not have direct exposure if they are selling raw output, but their buyers increasingly do. That dynamic is suppressing demand for scraped data sold for training.

Authenticated-session scraping: continues to be the highest-risk segment. The Bright Data doctrine does not cover it. Actors that require user-supplied cookies to operate sit outside the legal protections that cover the larger logged-out segment of the market.

The Q1 2026 lead and contact extractors census showed that the dominant positioning phrase in the segment is “No Cookies” — actors that operate without authenticated sessions. That phrase is not just a technical preference. It is the positioning that lines up most cleanly with the legal protections established by the Bright Data ruling. The leaders in the segment have been pricing for legal defensibility as well as technical capability, and that pricing is correct given the case-law evolution of 2024–2026.
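
A “No Cookies” claim can also be checked at the input boundary rather than taken on trust. A minimal sketch, assuming actor input arrives as nested dictionaries and lists: the key names below are hypothetical, since real input schemas vary, and the scan only flags fields that are actually populated.

```python
# Hypothetical key names. Real actor input schemas vary, so this set is an
# assumption to adapt, not a standard.
CREDENTIAL_KEYS = {
    "cookies", "cookie", "sessionid", "session_token", "password", "auth_token",
}


def find_credential_fields(actor_input, path=""):
    """Return dotted paths of non-empty fields whose key looks like a credential."""
    hits = []
    if isinstance(actor_input, dict):
        for key, value in actor_input.items():
            child = f"{path}.{key}" if path else str(key)
            if str(key).lower() in CREDENTIAL_KEYS and value not in (None, "", [], {}):
                hits.append(child)
            else:
                # Not a populated credential field: keep looking deeper.
                hits.extend(find_credential_fields(value, child))
    elif isinstance(actor_input, list):
        for index, item in enumerate(actor_input):
            hits.extend(find_credential_fields(item, f"{path}[{index}]"))
    return hits


no_cookies = {"startUrls": [{"url": "https://example.com"}], "maxItems": 100}
with_cookies = {"login": {"cookies": [{"name": "sessionid", "value": "abc"}]}}

print(find_credential_fields(no_cookies))    # []
print(find_credential_fields(with_cookies))  # ['login.cookies']
```

An empty result is consistent with the “No Cookies” positioning. Any hit means the run depends on user-supplied session material and belongs to the authenticated segment described above.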

The doctrine holds. The buyer mix is shifting. The publishers who position for the durable buyer demand — rather than the contracting AI-training buyer demand — will be the ones who continue to ship through the next legal cycle.
