Technical accessibility is the degree to which AI search crawlers can access, fetch and interpret a website's content — a prerequisite for any AI Search visibility, regardless of content quality or entity clarity.
Why it matters
AI search visibility starts with access. If a system cannot crawl, fetch or interpret your content, it cannot reliably cite it — regardless of how well the content is structured or how strong the entity signals are.
Not every AI crawler has the same purpose. Understanding the distinction between search crawlers and model training crawlers is essential before making robots.txt decisions that affect AI Search visibility.
AI crawlers — search vs training
OAI-SearchBot
ChatGPT Search discovery. Used by OpenAI to surface websites in ChatGPT Search features. Allowing this crawler enables your content to appear as a cited source in ChatGPT Search results.
→ Allow if AI Search visibility is your goal
GPTBot
OpenAI model training. Used by OpenAI to collect data for potential model training. It is independent of OAI-SearchBot: blocking GPTBot does not affect ChatGPT Search visibility.
→ Evaluate separately based on your training data policy
PerplexityBot
Perplexity search indexing. Used by Perplexity to surface websites in search results. Per Perplexity's documentation, it respects robots.txt and does not use blocked content to pre-train foundation models.
→ Allow if Perplexity visibility is your goal
ClaudeBot
Anthropic crawling. Used by Anthropic; check Anthropic's current documentation for its crawl purpose and recommended robots.txt configuration.
→ Review Anthropic documentation for current guidance
robots.txt configuration example
# Allow ChatGPT Search crawler
User-agent: OAI-SearchBot
Allow: /

# Block model training crawler
User-agent: GPTBot
Disallow: /

# Allow Perplexity search crawler
User-agent: PerplexityBot
Allow: /
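A configuration like the one above can be sanity-checked before deployment. A minimal sketch using Python's standard-library robots.txt parser (the robots.txt content is inlined here for illustration; in practice you would fetch your live file):

```python
from urllib.robotparser import RobotFileParser

# The example policy from above: allow search crawlers, block the training crawler.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Confirm each crawler's access matches the intended policy.
for agent in ("OAI-SearchBot", "GPTBot", "PerplexityBot"):
    allowed = parser.can_fetch(agent, "https://example.com/any-page")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Running this should report OAI-SearchBot and PerplexityBot as allowed and GPTBot as blocked, matching the intent of the example file.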
llms.txt
llms.txt is a plain-text file placed at the root of a website that signals to AI agents which content is authoritative and how to interpret the site structure. It is a proposed standard — not a confirmed universal requirement for AI Search visibility — but it is a low-effort addition worth deploying alongside robots.txt, sitemap.xml, schema markup and strong internal linking.
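Under the llms.txt proposal, the file is markdown: an H1 with the site name, an optional blockquote summary, and H2 sections listing annotated links, with an "Optional" section for lower-priority material. A minimal sketch (site name and URLs are placeholders):

```markdown
# Example Store

> Example Store sells modular shelving. Product specs and compatibility
> guides are the authoritative references for this site.

## Docs

- [Product specifications](https://example.com/docs/specs.md): dimensions and load ratings
- [Compatibility guide](https://example.com/docs/compat.md): which components fit together

## Optional

- [Company history](https://example.com/about.md): background material, lower priority
```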
Implementation checklist
- Check robots.txt — confirm OAI-SearchBot and PerplexityBot are not blocked
- Evaluate GPTBot separately based on your organization's model training data policy
- Verify that core content pages are available as indexable HTML — not hidden behind JavaScript or login walls
- Ensure canonical URLs are clean and consistent — duplicate content creates retrieval ambiguity
- Monitor Core Web Vitals — performance affects crawlability and overall discoverability
- Implement llms.txt as an experimental AI-readiness layer alongside sitemap.xml and robots.txt
- Verify sitemap.xml is submitted to Google Search Console and up to date
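Some checklist items lend themselves to quick scripted spot-checks. As one illustration, the canonical-URL item: a minimal sketch, using only Python's standard library, that extracts the rel="canonical" URL a page declares (the HTML here is a placeholder; in practice you would feed it each fetched page and compare the result against the URL you expect):

```python
from html.parser import HTMLParser

class CanonicalExtractor(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag seen."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "link" and self.canonical is None:
            d = dict(attrs)
            if d.get("rel") == "canonical":
                self.canonical = d.get("href")

# Placeholder page: stands in for HTML fetched from your own site.
HTML = """<html><head>
<link rel="canonical" href="https://example.com/widgets/">
</head><body>...</body></html>"""

extractor = CanonicalExtractor()
extractor.feed(HTML)
print(extractor.canonical)
```

Running the same check across a site's core pages makes inconsistent or missing canonicals easy to spot before they create retrieval ambiguity.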