Cloudflare’s New Settings for AI Bots Simplify Protections Against Content Scraping
July 9, 2025
Cloudflare has made it easier for its clients to block the AI bots that engage in content scraping for training material, and it is promising a new “pay to scrape” system that has the potential to seriously impact the whole AI training industry.
The widely used CDN and DDoS protection service recently introduced an “Easy button” setting that will screen out all attempted content scraping, even for users of its free tier of services. The “pay to scrape” feature, promised for the near future, would demand a toll from AI bots when they approach a site looking to harvest training data.
More aggressive behavior from AI bots as free troves of content dry up
The new setting is found under the security menu options for all Cloudflare users, including the most basic “free” tier. New users will be prompted to choose a setting when they set up their accounts, but existing users will have to enable it manually.
The move comes as major sources of the reams of content AI models need to train on, such as Reddit and the world’s major newspapers, lock themselves off from content scraping and require substantial payments for usage rights. The law remains unclear in terms of copyright violations and usage rights, but this has not deterred the major AI players from grabbing everything available with their constantly roaming AI bots and claiming “fair use” as a legal defense.
Suits from the likes of the BBC and New York Times will likely establish firm standards regarding AI content scraping when they are settled, but some of the key cases have been going on for years now and it is unclear when these rulings will emerge. In the meantime, the “pay to scrape” concept has the potential to upend the market. It would extend the same protections and terms that Reddit and social media sites have to the web of smaller and more poorly defended sites that AI bots are turning to. With the tap of free content shut off, some AI developers may actually end up being forced to shut their doors due to lack of economic viability.
Most content scraping comes from the world’s largest AI models
The impact of “pay to scrape” will likely depend on adoption of the concept by more than just Cloudflare. Other major CDNs like Akamai and Fastly adding this would be a game-changer, as would adoption by social media platforms and commonly used content management systems like WordPress. One of the central issues for web properties is that the major sources of content scraping are often tied up with the major search engine providers that they rely on for traffic, meaning that “robots.txt” rules have had to be carefully configured to allow the search crawlers but deny the AI bots. Data collected by Cloudflare indicates that websites are overwhelmingly choosing to block content scraping in all its forms, with only 10% of the top one million sites that it protects opting to allow these bots through.
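To illustrate the kind of “robots.txt” balancing act described above, here is a minimal sketch in Python that checks a sample rule set with the standard library’s urllib.robotparser. The user-agent tokens (GPTBot, ClaudeBot, Bytespider, Amazonbot, Googlebot) and the example.com URL are assumptions chosen for illustration and do not come from Cloudflare’s data; site owners should confirm the tokens each crawler actually publishes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: deny a handful of AI crawlers, allow everyone else.
# The user-agent tokens below are illustrative assumptions, not a complete list.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Bytespider
User-agent: Amazonbot
Disallow: /

User-agent: *
Allow: /
"""


def crawl_permissions(robots_txt, agents, url="https://example.com/articles/1"):
    """Return {agent: allowed?} for the given robots.txt text and URL."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    agents = ["Googlebot", "GPTBot", "ClaudeBot", "Bytespider", "Amazonbot"]
    for agent, allowed in crawl_permissions(ROBOTS_TXT, agents).items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Rules like these are only a request: a crawler that ignores “robots.txt” is not stopped by them, which is why enforcement at the network edge matters to site owners.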
Though nearly all of the big AI firms are guilty of widespread content scraping, Cloudflare singles out four whose bots are the most active: ByteDance, Amazon, Anthropic and OpenAI. But as Cloudflare notes, the limited number of players and the similar tools and tactics they use (for things like robots.txt evasion) have produced common patterns that make AI bots and content scraping fairly easy to track. This in turn has produced a body of intelligence over the last few years that makes it possible to identify when AI bots change their tactics and adopt new tools.



