Free Tool · Generator
Robots.txt Generator
Build a clean robots.txt — including AI bots like GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and CCBot.
Welcome every bot — search, AI training, AI search, and social previews. The default and recommended posture for content sites that want maximum reach.
Use * for the catch-all rule, or pick from the AI bot list below.
Optional. Google and most modern crawlers ignore Crawl-delay; Bing and Yandex still honor it.
Absolute URLs only — e.g., https://example.com/sitemap.xml.
Non-standard. Only Yandex respects this; Google ignores it.
1 hint
- No Sitemap declared. Adding one helps Google + Bing discover all your pages — it should be the absolute URL to your sitemap.xml.
robots.txt
Place at the root of your domain.

User-agent: *
Allow: /
Upload your robots.txt to your domain root, then test it with Google's robots.txt tester in Search Console.
AI + search crawler reference (65 bots)
Owner, purpose, and category for every AI / search bot we track.
Allowing AI bots ≠ getting cited by AI.
Letting GPTBot in is step one. Step two is structuring your content so it actually gets cited. Rankrize generates llms.txt, ai.txt, FAQ schema, and answer-capsule content built for AI citation. Run a free GEO audit.
How to use
How to use the Robots.txt Generator
Decide your AI posture before you decide your rules
Three options: (1) Allow all — best for content sites that want maximum reach including AI citations. (2) Allow search, block AI training — keep ranking on Google but opt out of AI model training. (3) Block all AI — keep ranking but never appear in ChatGPT, Claude, or Perplexity answers. Start with the matching template, then customize.
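As a sketch, posture (2) — allow search, block AI training — might look like this (the bot list is illustrative, not exhaustive, and the domain is a placeholder; adjust both to your site):

```txt
# Search crawlers: allowed by the catch-all
User-agent: *
Allow: /

# Opt out of AI model training (illustrative list)
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
```

Each blocked bot gets its own User-agent group; the catch-all `*` group still lets Googlebot and Bingbot crawl everything.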
Allowing GPTBot ≠ getting cited by ChatGPT
Robots.txt is the door — content quality is the conversation. Allowing GPTBot lets OpenAI crawl and train on your pages; actually getting cited requires your content to be discoverable, well-structured (FAQ schema, HowTo schema, clean answer capsules), and authoritative. The Rankrize platform is built around this — schema generation + answer-capsule writing on every article.
Always include your sitemap URL — even if you also link it from <head>
Search engines discover sitemaps from two reliable places: the robots.txt Sitemap directive and direct submission (Google Search Console, Bing Webmaster Tools). Robots.txt is the most universal — Google, Bing, Yandex, DuckDuckGo, and AI search engines all read it. Use absolute URLs (https://example.com/sitemap.xml), not relative paths.
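For reference, the Sitemap directive stands outside any User-agent group and may appear multiple times — one line per sitemap or sitemap index (URLs here are placeholders):

```txt
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-news.xml
```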
Most-specific match wins — order doesn't matter
Per RFC 9309, when multiple rules apply to the same crawler, the most specific path match wins (longest matching pattern). Order in the file is irrelevant. So 'Disallow: /admin/' followed later by 'Allow: /admin/help' will let Googlebot read /admin/help even though /admin/ is broadly disallowed.
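A minimal sketch of the longest-match rule (paths are hypothetical):

```txt
User-agent: *
Disallow: /admin/
Allow: /admin/help
```

For /admin/help, the Allow pattern (11 characters) is longer than the Disallow pattern (7 characters), so the fetch is allowed; every other /admin/ URL stays blocked.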
Test before deploying — and re-test after every change
Once your robots.txt is live, test it with Google Search Console's robots.txt tester (link in the validation panel). Paste a few sample URLs, pick a user-agent, and confirm it returns the expected allow / disallow. A typo can deindex your entire site overnight — verifying takes 30 seconds.
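You can also sanity-check rules locally before uploading. Here's a minimal sketch using Python's standard-library urllib.robotparser (the rules and URLs are hypothetical; note this parser predates RFC 9309, so its Allow/Disallow precedence can differ from Google's longest-match behavior — treat it as a smoke test, not the final word):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft rules, pasted in as a string
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Check a few sample URLs against the draft rules
print(rp.can_fetch("*", "https://example.com/admin/users"))  # False
print(rp.can_fetch("*", "https://example.com/blog/post"))    # True
```

Re-run the same checks after every edit; Search Console's tester remains the authority on how Googlebot itself interprets the file.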
Don't use robots.txt to hide private pages
Disallow tells crawlers not to fetch a URL — it doesn't tell them not to index it. If other sites link to your /private-page, Google can still list it (with a thin snippet). For real privacy, use a noindex meta tag on the page or put it behind authentication — and don't also Disallow it, since Google must be able to crawl the page to see the noindex. Robots.txt is for crawl-budget management, not access control.
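For the noindex route, the tag goes in the page's head (a minimal sketch):

```html
<head>
  <!-- Tells crawlers: you may fetch this page, but don't index it -->
  <meta name="robots" content="noindex">
</head>
```

For non-HTML resources such as PDFs, the equivalent is an X-Robots-Tag: noindex HTTP response header.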
Frequently asked questions
About the Robots.txt Generator
More free tools
The full Rankrize free toolkit
Schema Markup Generator
Generate validated JSON-LD schema markup for 9 types — paste-ready for your site.
Google SERP Preview
See how your title and description render on Google desktop, mobile, and as a social card — with real pixel-width truncation.
Core Web Vitals Checker
Real Core Web Vitals from Google PageSpeed Insights — LCP, INP, CLS, FCP, TTFB. Lab + field data, mobile + desktop.
Meta Tag Generator
Build a complete, validated set of HTML meta tags — title, description, robots, canonical, Open Graph, Twitter cards, and more.
Robots.txt Generator
Build a clean robots.txt — including AI bots like GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and CCBot.
SEO + AI Search Checklist
A modern 50-item SEO checklist covering technical, on-page, content, and AI-search (GEO) — with progress saved in your browser.