AI companies' content crawlers are saddling website owners with unexpected, crushing bandwidth charges
AI bots that crawl websites and scrape their content are becoming harder to block, and it's costing website owners thousands of dollars.
By looting large amounts of data without permission, AI crawlers can leave smaller sites that produce their own original content on the hook for large, unexpected bandwidth charges.
One coding documentation site said a crawler accessed 73 terabytes of files in May, including 10 terabytes in a single day, costing it more than $5,000 in bandwidth fees.
The problem: AI-focused companies keep launching new crawlers that website owners haven't blocked yet, and some appear to be bypassing attempts to block their bots altogether. Last month, OpenAI and Anthropic were accused of ignoring attempts to block their crawlers. Wired reporting found that Perplexity AI, an AI search startup backed by Jeff Bezos and Nvidia, has similarly ignored scraping blockers, prompting an investigation from Amazon's cloud division.