Cloudflare Now Blocks AI Web Scraping by Default

July 2, 2025

Cloudflare, one of the world’s largest internet infrastructure providers, has begun blocking AI web crawlers by default unless they receive direct permission from site owners.

The new policy upends the longstanding practice in which AI developers could freely scrape the web to train large language models (LLMs).

A Default Block on AI Crawling

Previously, Cloudflare allowed website owners to opt out of AI crawling. Now, blocking is automatic. This reversal comes after more than 1 million customers chose to restrict AI bots under the former optional system.

AI vendors must now explicitly seek permission to access content and state whether they intend to use it for training, inference or search.

“This long-awaited feature by Cloudflare is a true disaster for many GenAI vendors, which may be fatal to the current business models of GenAI,” said Dr Ilia Kolochenko, CEO at ImmuniWeb and a Fellow at the British Computer Society (BCS).

“This security feature will elegantly prevent data-greedy bots from unwarrantedly scraping human-created content without permission and without paying for it.”

A New Economic Model for Web Content

The updated policy introduces a “Pay Per Crawl” program, which lets a select group of publishers set pricing terms for AI scrapers. AI companies can then either pay for access to the content or be denied entry. This permission-based approach contrasts with the previous model, in which web scraping was governed by loosely enforced rules such as robots.txt.
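To illustrate why robots.txt counts as loosely enforced, the sketch below shows the check from the crawler's side, using Python's standard `urllib.robotparser`. The robots.txt content, the `is_allowed` helper and the example URL are assumptions made for illustration, not Cloudflare's implementation. The point is that the crawler runs this check itself and is free to skip it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opts one AI crawler out and allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"  # placeholder URL
    # A well-behaved crawler consults the rules before fetching;
    # nothing in robots.txt itself stops a crawler that does not.
    print("GPTBot allowed:", is_allowed("GPTBot", url))
    print("SomeOtherBot allowed:", is_allowed("SomeOtherBot", url))
```

A default block at the network edge moves that decision from the crawler to the infrastructure provider, which is the shift the article describes.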

During the Axios Live event last week, Cloudflare CEO Matthew Prince emphasized the broader implications.

“If the internet is going to survive the age of AI, we need to give publishers the control they deserve and build a new economic model that works for everyone,” Prince explained.

“In sum, most GenAI vendors will soon face a tough reality: paying a fair price for high-quality training data while staying profitable. In view of the formidable competition emanating from China, many Western GenAI companies may simply quit the business as economically unviable,” Kolochenko added.

Legal Gray Areas and Social Media Exemptions

The legality of scraping remains murky. In May 2025, Irish and German regulators declined to block Meta from using Facebook and Instagram data to train its Llama model, despite opposition from privacy and consumer groups. These developments highlight the gap between fast-moving technologies and slower regulatory systems.

“In some jurisdictions, a deliberate bypass of anti-bot protection and massive data scraping may constitute a criminal offense,” Kolochenko said, adding that breach of contract claims, not copyright, could pose the most serious legal threat to GenAI companies.
