AI SEO · High severity

AI Bot Directives

What are AI bot directives and how do you control which AI crawlers access your content?

By eiSEO Team · Published Jun 15, 2025 · Updated Feb 27, 2026

What are AI bot directives?

AI bot directives are rules you set in robots.txt and meta robots tags to control how AI company crawlers (GPTBot, Google-Extended, ClaudeBot, Bytespider, and others) access and use your content. These directives let you decide whether your pages can be crawled for AI training data, used in AI search results, or blocked entirely from specific AI systems.
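Both mechanisms look like this in practice. The robots.txt form is widely documented; the bot-specific meta tag follows the convention Google documents for googlebot, and whether any given AI crawler honors page-level meta tags is an assumption that varies by vendor:

```
# robots.txt (site-wide): block one AI crawler from everything
User-agent: GPTBot
Disallow: /

<!-- meta robots (page-level, placed in <head>); the bot-specific name
     follows the googlebot convention -- AI-crawler support varies -->
<meta name="GPTBot" content="noindex">
```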

Why do AI bot directives matter?

As AI search engines become major traffic sources, controlling AI bot access is a critical strategic decision. Blocking all AI crawlers means your content will not appear in AI-generated answers, potentially costing significant traffic. Allowing all AI crawlers gives away your content for training without guaranteed attribution. The right approach is selective — allow AI search crawlers that drive traffic and cite sources while potentially restricting pure training crawlers that do not send users back to your site.

Key statistics

As of 2024, over 35% of the top 1,000 websites have added specific AI bot directives to their robots.txt files.

Source: Originality.ai

AI-driven search engines now account for an estimated 10-15% of referral traffic for content-heavy websites.

Source: SparkToro

How to fix it

  1. Identify which AI crawlers are accessing your site by reviewing your server logs for user agents like GPTBot, ClaudeBot, Google-Extended, Bytespider, CCBot, and PerplexityBot.

  2. Add explicit Allow or Disallow rules for each AI bot in your robots.txt file based on your content strategy and whether you want to appear in their AI answers.

  3. Consider allowing AI search crawlers (GPTBot for ChatGPT search, PerplexityBot for Perplexity) that cite and link to sources while evaluating pure training crawlers on a case-by-case basis.

  4. Use meta robots or X-Robots-Tag headers for page-level AI bot control when you want different rules for different sections of your site.

  5. Monitor the evolving AI crawler landscape — new bots appear frequently, and their behavior (search vs. training) may change.
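Step 1 above can be sketched as a short script. This is a minimal example, not a full log parser: the sample log lines and the user-agent strings in them are illustrative, and in practice you would read your real access log (its path depends on your server):

```python
from collections import Counter

# Illustrative access-log lines; real UA strings and log format vary.
SAMPLE_LOG = [
    '1.2.3.4 - - [01/Mar/2026:10:00:00 +0000] "GET /blog/post HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0; compatible; GPTBot/1.0; +https://openai.com/gptbot"',
    '5.6.7.8 - - [01/Mar/2026:10:00:05 +0000] "GET /learn/guide HTTP/1.1" '
    '200 1024 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '9.8.7.6 - - [01/Mar/2026:10:00:09 +0000] "GET / HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"',
]

# User-agent tokens for the AI crawlers named in step 1.
AI_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended",
           "Bytespider", "CCBot", "PerplexityBot"]

def count_ai_hits(lines):
    """Count requests per AI crawler by substring match on each log line."""
    hits = Counter()
    for line in lines:
        for bot in AI_BOTS:
            if bot in line:
                hits[bot] += 1
    return hits

print(count_ai_hits(SAMPLE_LOG))
```

A per-path breakdown (which sections each bot hits most) is the natural next step before writing Allow/Disallow rules for those sections.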

Code example

Bad
# robots.txt — blocks ALL bots including AI search crawlers
User-agent: *
Disallow: /

# You are invisible to both traditional and AI search engines

Good
# robots.txt — strategic AI bot management
User-agent: GPTBot
Allow: /blog/
Allow: /learn/
Disallow: /app/

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: CCBot
Disallow: /
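The "Good" rules above can be sanity-checked before deployment with Python's standard-library robots.txt parser; the paths tested here are illustrative:

```python
import urllib.robotparser

# The strategic robots.txt from the "Good" example above.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /blog/
Allow: /learn/
Disallow: /app/

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: CCBot
Disallow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# GPTBot may crawl the blog but not the app.
print(rp.can_fetch("GPTBot", "/blog/post"))   # True
print(rp.can_fetch("GPTBot", "/app/secret"))  # False
# CCBot is blocked everywhere.
print(rp.can_fetch("CCBot", "/blog/post"))    # False
```

Note that this only checks what the file *says*; as the FAQ below points out, honoring it is up to each crawler.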

Frequently asked questions

Should I block all AI crawlers?
It depends on your strategy. Blocking AI search crawlers (GPTBot, PerplexityBot) means your content will not appear in AI-generated answers, costing traffic. Consider selective blocking — allow AI search crawlers that cite sources and block pure training crawlers.

What is the difference between GPTBot and Google-Extended?
GPTBot is OpenAI's crawler used for both ChatGPT search and model training. Google-Extended controls Google's use of your content for AI features like Gemini but does not affect regular Google Search indexing.

Do AI crawlers actually respect robots.txt?
Major AI crawlers from OpenAI, Google, Anthropic, and Perplexity have committed to respecting robots.txt. However, compliance is voluntary and not all crawlers may honor your directives.

Check your AI search visibility

eiSEO automatically detects and helps you fix issues like this across your entire site.