Cloudflare Blocks AI Crawlers by Default, Launches Pay Per Crawl Model

Introduction

Cloudflare, which serves about 20% of websites, announced Tuesday that it is now blocking web-scraping AI bots from accessing those sites by default. Unless a website explicitly turns off the default, an AI crawler will need to obtain permission from the website to scrape its content. Website owners can choose whether they want AI crawlers to access their content, and decide how AI companies can use it, Cloudflare explained in a statement. AI companies can now clearly state their purpose, whether their crawlers are used for training, inference, or search, to help website owners decide which crawlers to allow.

Cloudflare explained that for decades, the internet has operated on a simple exchange: search engines index content and direct users back to original websites, generating traffic and ad revenue for websites of all sizes. This cycle rewards creators who produce high-quality content with financial compensation and a growing following, while helping users discover new and relevant information.

Cloudflare Introduces Pay-Per-Crawl Model

In addition to blocking AI bot scraping by default, Cloudflare also announced Pay Per Crawl, which lets each website owner decide whether to allow AI crawlers to scrape the site at a set rate: a micropayment for every single “crawl.”

“Cloudflare’s primary goal is to help site owners and publishers decide which crawlers can access their content and create the conditions for a market to develop,” Cloudflare’s Head of AI Control, Privacy and Media Products, Will Allen, told TechNewsWorld.

“With the development of Pay Per Crawl,” he said, “Cloudflare is experimenting with a way to help content creators be compensated for their contributions to the AI economy. Pay Per Crawl will let creators control access and get paid, ensuring AI companies can use quality content the right way with permission and compensation.”

“Just like ChatGPT charges users fractions of a penny per token, a similar model could be used to compensate websites that opt in to scraping of their content,” he explained.

“Handling compensation for creators in an AI-augmented world is a sticky issue,” added Allie Mellen, a senior analyst with Forrester Research, a national market research company headquartered in Cambridge, Mass.

“This is one potential solution; however, it’s unclear how AI providers will handle this cost or if they will look to scrape content elsewhere,” she told TechNewsWorld. “It may also result in a few highly trusted websites being offered compensation per crawl, while others stagnate.”

However, Andy Jung, associate counsel for TechFreedom, a technology advocacy group in Washington, D.C., argued that AI companies may settle for the Pay Per Crawl scheme without much resistance to ensure they don’t get accused of “pirating” content, as Anthropic was in the Bartz v. Anthropic case.

“AI companies might agree to pay to crawl websites just to avoid website owners analogizing unpaid crawling to pirating, thereby casting a shadow of doubt over the data AI companies use to train their models,” he said.

Publisher Controls & Bot Verification

Cloudflare gives publishers fine-grained control over AI bot access. Site owners can choose to allow, block, or charge specific bots. To verify their identity, bots must register, provide public keys, and sign their requests with cryptographic signatures (Ed25519). The aim is that only approved, authenticated crawlers access content: publishers can whitelist trusted bots and reject unknown ones. Cloudflare also works with AI companies to disclose crawler intent (e.g., training vs. search), helping sites make informed choices. Together, these controls are meant to provide security and transparency while letting publishers manage AI access on their own terms.
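The allow, block, or charge choice described above can be pictured as a small lookup that denies by default. The following is a minimal sketch, not Cloudflare's actual implementation or API: the bot names, the `CrawlRequest` type, and the `decide` function are all hypothetical, and the Ed25519 signature check is assumed to have happened elsewhere, with only its result passed in.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    CHARGE = "charge"


@dataclass(frozen=True)
class CrawlRequest:
    bot_name: str          # identity the crawler claims (hypothetical field)
    signature_valid: bool  # outcome of an Ed25519 check performed elsewhere
    purpose: str           # declared intent: "training", "inference", or "search"


# Hypothetical per-site policy keyed by (bot name, declared purpose).
# The names and choices are illustrative only.
SITE_POLICY = {
    ("ExampleSearchBot", "search"): Action.ALLOW,
    ("ExampleTrainingBot", "training"): Action.CHARGE,
}


def decide(request: CrawlRequest, policy: dict) -> Action:
    """Return the action for a crawl request, blocking by default."""
    # An unverified crawler is rejected no matter what it claims to be.
    if not request.signature_valid:
        return Action.BLOCK
    # Anything the site owner has not explicitly listed stays blocked.
    return policy.get((request.bot_name, request.purpose), Action.BLOCK)
```

In this sketch, a verified `ExampleSearchBot` declaring `search` is allowed, while the same bot with a failed signature, or any bot the owner has not listed, is blocked.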

Potential Big Deal

Greg Sterling, co-founder of Near Media, a market research firm based in San Francisco, argued that Cloudflare’s move is “potentially a big deal,” as the company powers about 20% of the internet and a third of the higher-profile sites.

“It’s an effort to reclaim power and give publishers control over whether and how their content is used by AI, and it seeks to compensate publishers in a time of declining traffic and clicks, which puts their business models at risk,” he said. “But it may ultimately not have a significant impact on AI.”

“It remains to be seen how many sites choose to use this,” he said. “There’s a potential FOMO [fear of missing out] problem or prisoner’s dilemma that advantages the AI companies: ‘If I’m not there, my competitors will be.’

“Yet, it’s still an important step that potentially shifts the terms of debate and power dynamics between content publishers and AI platforms,” he added.

In Cloudflare’s statement, it listed more than 50 companies supporting a permission-based model for AI web crawling, including Adweek, The Associated Press, The Atlantic, BuzzFeed, Condé Nast, Fortune, Gannett Media, O’Reilly Media, Pinterest, Reddit, Sky News Group, Snopes, Time, Universal Music Group, and Ziff Davis.

Mark N. Vena, president and principal analyst at SmartTech Research in Las Vegas, maintained that permission-based AI web crawling could be a significant curveball for AI companies, especially those relying on scraping massive amounts of web data to train their models. “If large swaths of the internet go dark to bots overnight, it limits the diversity and freshness of the training data,” he said. “Big players might pivot to more licensing deals, but smaller startups could be left scrambling.”

Rob Enderle, president and principal analyst of the Enderle Group, an advisory services firm in Bend, Ore., noted that Cloudflare’s permissions play will significantly affect both established and new market players. “For existing AIs that already have their training sets, this will reduce their ability to remain current,” he told TechNewsWorld. “For new AIs, it will potentially reduce their initial training sets, making the result less performant.”

“It also looks like they are getting creative with how to deal with AI revenue loss and what many believe is data theft,” he added. “This effort is early yet, and I expect it will evolve significantly over the years, but it is an impressive initial start.”

Why This Matters

AI bots increasingly extract content without driving traffic back to creators, hurting ad revenue and visibility. For instance, Google AI Overviews and AI tools whose models draw on content gathered by crawlers such as GPTBot can deliver answers without users ever visiting the source websites. This leads to lost traffic, fewer referrals, and declining monetization for publishers. Cloudflare’s default block and Pay Per Crawl model aim to rebalance the ecosystem, giving content owners the ability to protect, charge for, or control access to their data. The approach lets publishers set terms, monetize AI usage, and ensure AI development doesn’t come at their expense.

The Road Ahead

Cloudflare plans to expand Pay Per Crawl with advanced features like dynamic pricing, where fees vary by content type or value, and purpose-based access, allowing differentiated pricing for training, search, or inference. Future updates may enable crawler-agent negotiations, automated licensing, and more granular controls. As more AI companies register bots and respect these protocols, the model could become a new standard for ethical and compensated web scraping. Cloudflare aims to create a sustainable, transparent ecosystem where AI innovation coexists with fair compensation for content creators.
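To make the idea of dynamic, purpose-based pricing concrete, here is a minimal sketch of how a per-crawl fee could vary by declared purpose and content type. Every number, category name, and function here is a made-up placeholder chosen for illustration; nothing in it reflects Cloudflare's actual rates or design. Prices are kept as integer micro-dollars to avoid floating-point rounding.

```python
# Hypothetical base price per crawl, in micro-dollars, by declared purpose.
BASE_PRICE_BY_PURPOSE = {"search": 0, "inference": 500, "training": 2000}

# Hypothetical multiplier, in percent, by content type.
CONTENT_MULTIPLIER_PERCENT = {"article": 100, "archive": 50, "premium": 300}


def price_per_crawl(purpose: str, content_type: str) -> int:
    """Price of a single crawl in micro-dollars under the assumed tables."""
    base = BASE_PRICE_BY_PURPOSE[purpose]
    percent = CONTENT_MULTIPLIER_PERCENT[content_type]
    return base * percent // 100


def total_cost(crawls: list) -> int:
    """Sum the price of (purpose, content_type) pairs, in micro-dollars."""
    return sum(price_per_crawl(purpose, kind) for purpose, kind in crawls)
```

Under these assumed tables, a training crawl of premium content costs 6,000 micro-dollars ($0.006), an inference crawl of archived content costs 250, and a search crawl is free regardless of content type.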
