2024/07/29

Robots.txt Can’t Stop Relentless AI Web Scrapers

“The ecosystem of agents is changing quickly, so it’s basically impossible for website owners to manually keep up.”
– The anonymous entity behind Dark Visitors, describing how protecting web content from training-data-hungry AI companies is a losing battle. In a check-in on the state of robots.txt, journalist Jason Koebler outlines how the once-reliable protocol for protecting web content from bots no longer works because of industrial-scale scraping operations by AI companies like Anthropic.