AI Crawler Index
Live tracker of web crawler activity across the internet. Updated daily with data from Cloudflare Radar, covering billions of requests to millions of websites.
GPTBot (OpenAI): 16.21%
Googlebot (Google): 22.44%
ClaudeBot (Anthropic): 14.65%
Source: Cloudflare Radar — normalized request volumes across Cloudflare's global network
Google's primary web crawler for indexing pages in Google Search. The most active crawler on the web.
Block via robots.txt:
User-agent: Googlebot
Disallow: /
OpenAI's web crawler used to train and improve ChatGPT and other models. Respects robots.txt directives.
Block via robots.txt:
User-agent: GPTBot
Disallow: /
Anthropic's web crawler used to train Claude AI models. Respects robots.txt directives.
Block via robots.txt:
User-agent: ClaudeBot
Disallow: /
Meta's AI training crawler used for Llama models and Meta AI products.
Block via robots.txt:
User-agent: Meta-ExternalAgent
Disallow: /
Microsoft's web crawler for indexing pages in Bing Search and powering Copilot answers.
Block via robots.txt:
User-agent: Bingbot
Disallow: /
Apple's web crawler supporting Siri, Spotlight, and Apple Intelligence features.
Block via robots.txt:
User-agent: Applebot
Disallow: /
Amazon's crawler, used for Alexa AI and the company's machine learning services.
Block via robots.txt:
User-agent: Amazonbot
Disallow: /
Yandex's web crawler for indexing pages in Russia's largest search engine.
Block via robots.txt:
User-agent: YandexBot
Disallow: /
Meta's crawler that fetches page previews when URLs are shared on Facebook and Instagram.
Block via robots.txt:
User-agent: facebookexternalhit
Disallow: /
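The per-crawler rules above can be combined into one robots.txt file and checked before deployment. The following is a minimal sketch, assuming you want to block only the AI training crawlers and leave search crawlers alone. The helper names build_block_rules and is_allowed and the example.com URL are illustrative and not part of the index. The check uses Python's standard urllib.robotparser, which models how a compliant crawler reads the file; it does not prove that any particular crawler actually obeys it.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the listings above (AI training crawlers only).
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "Meta-ExternalAgent"]


def build_block_rules(user_agents):
    """Return robots.txt text that disallows the whole site for each token."""
    groups = [f"User-agent: {ua}\nDisallow: /" for ua in user_agents]
    # Groups are separated by a blank line, one group per crawler.
    return "\n\n".join(groups) + "\n"


def is_allowed(robots_txt, user_agent, url):
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots_txt = build_block_rules(AI_CRAWLERS)
print(robots_txt)

# example.com is a placeholder; substitute a URL from your own site.
page = "https://example.com/article"
for ua in AI_CRAWLERS + ["Googlebot"]:
    status = "allowed" if is_allowed(robots_txt, ua, page) else "blocked"
    print(ua, status)
```

Crawlers without a matching group, such as Googlebot in this sketch, stay allowed because the generated file has no wildcard (User-agent: *) group.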