Cloudflare, one of the largest network infrastructure companies in the world, has announced AI Labyrinth, a new tool to combat web-crawling bots that scrape sites for AI training data without permission. The company says in a blog post that when it detects “inappropriate bot behavior,” the free, opt-in tool lures crawlers down a path of links to AI-generated decoy pages that “slow down, confuse, and waste the resources” of those acting in bad faith.
Websites have long relied on the honor-system approach of robots.txt, a text file that grants or denies permission to scrapers, but which AI companies, even well-known ones like Anthropic and Perplexity AI, have been accused of ignoring. Cloudflare writes that it sees more than 50 billion web crawler requests per day, and although it has tools to identify and block the malicious ones, doing so often prompts attackers to change tactics in “a never-ending arms race.”
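For readers unfamiliar with the honor system described above, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard `urllib.robotparser`. The bot name `ExampleAIBot`, the rules, and the URLs are made up for illustration; nothing here forces a crawler to run this check, which is exactly the weakness the article describes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named AI crawler entirely and
# keeps every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "OtherBot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "https://example.com/article")
        print(agent, "allowed" if allowed else "disallowed")
```

The check is purely voluntary: a scraper that never calls anything like `is_allowed` simply fetches the page anyway.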
Cloudflare says that rather than blocking bots, AI Labyrinth fights back by making them process data that has nothing to do with a given website's actual data. The company says it also functions as “a next-generation honeypot,” drawing in AI crawlers that keep following links deeper into fake pages, whereas an ordinary human visitor would not. It says this makes it easier to fingerprint bad bots for Cloudflare's list of bad actors, as well as to identify “new bot patterns and signatures” that it would not have detected otherwise. According to the post, these links should not be visible to human visitors.
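The honeypot idea can be illustrated with a short sketch. This is not Cloudflare's implementation, whose details the article does not give; it only shows the general principle that a client repeatedly requesting decoy pages, which humans should never see, is probably a bot. The `/labyrinth/` path prefix, the threshold of three hits, and the `DecoyTracker` class are all hypothetical.

```python
from collections import defaultdict

DECOY_PREFIX = "/labyrinth/"  # hypothetical path under which decoy pages live
DEPTH_THRESHOLD = 3           # hypothetical number of decoy hits before flagging


class DecoyTracker:
    """Counts decoy-page requests per client and flags likely bots."""

    def __init__(self, threshold: int = DEPTH_THRESHOLD) -> None:
        self.threshold = threshold
        self.decoy_hits = defaultdict(int)

    def record(self, client_id: str, path: str) -> bool:
        """Record a request; return True once the client looks like a bot."""
        if path.startswith(DECOY_PREFIX):
            self.decoy_hits[client_id] += 1
        return self.decoy_hits[client_id] >= self.threshold


if __name__ == "__main__":
    tracker = DecoyTracker()
    requests = [
        ("visitor", "/"),
        ("crawler", "/labyrinth/a"),
        ("crawler", "/labyrinth/a/b"),
        ("crawler", "/labyrinth/a/b/c"),
    ]
    for client, path in requests:
        if tracker.record(client, path):
            print(f"{client} flagged after requesting {path}")
```

A real system would combine a signal like this with many others before adding a client to a list of bad actors.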
You can read more about how AI Labyrinth works on Cloudflare’s blog, but here is a little more detail from the post:
We found that generating a diverse set of topics first, then creating content for each topic, produced more varied and convincing results. It is important to us that we don’t generate inaccurate content that contributes to the spread of misinformation on the Internet, so the content we generate is real and related to scientific facts, just not relevant or proprietary to the site being crawled.
Website administrators can opt in to AI Labyrinth by going to the Bot Management section of their site's Cloudflare dashboard settings and toggling it on. The company says that this “is only the first iteration of using generative AI to thwart bots.” It plans to create “whole networks of linked URLs” that bots caught in them will have trouble recognizing as fake. As Ars Technica notes, AI Labyrinth resembles Nepenthes, a tool designed to trap crawlers for “months” in AI-generated junk data.