Robots.txt Generator

Generate a robots.txt file for your website. Control search engine crawling with an easy visual builder.

Free to use. Runs in your browser.

Configure your crawling rules below and copy or download the generated robots.txt file.

Save the file as robots.txt at your domain root (example.com/robots.txt); any other location is ignored by crawlers.

Sitemap URL

Crawling Rules

Allow all pages

Block /admin

Block /api

Block /login

Block /search

Block /images

Custom Disallow Paths (one per line)

Crawl Delay (seconds, 0 = none)

Generated robots.txt

User-agent: *
Allow: /
Disallow: /admin
Disallow: /api
Disallow: /login

Sitemap: https://example.com/sitemap.xml
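
As a rough illustration of how the form fields above could map to the generated file, here is a minimal Python sketch. The function names (parse_custom_paths, build_robots_txt) and their parameters are hypothetical and are not the tool's actual code; the output layout simply mirrors the sample shown above.

```python
def parse_custom_paths(text: str) -> list[str]:
    """Turn a 'one path per line' text box into a clean list of paths."""
    paths = []
    for line in text.splitlines():
        path = line.strip()
        if not path:
            continue  # skip blank lines
        if not path.startswith("/"):
            path = "/" + path  # robots.txt paths are relative to the site root
        paths.append(path)
    return paths


def build_robots_txt(disallow_paths, sitemap_url="", crawl_delay=0, user_agent="*"):
    """Assemble robots.txt text in the same layout as the sample output."""
    lines = [f"User-agent: {user_agent}", "Allow: /"]
    lines += [f"Disallow: {path}" for path in disallow_paths]
    if crawl_delay > 0:  # 0 means "no Crawl-delay line"
        lines.append(f"Crawl-delay: {crawl_delay}")
    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


print(build_robots_txt(["/admin", "/api", "/login"], "https://example.com/sitemap.xml"))
```

Called with the three default blocks and the example sitemap URL, the sketch prints the same seven lines as the sample above.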

What Is robots.txt?

The robots.txt file is a plain text file at your website's root that tells search engine crawlers which pages they're allowed to visit and which they should skip. It's the first file a well-behaved crawler checks before accessing any page on your site.

Think of it as a "staff only" sign for web crawlers. It's a polite request, not a security measure: crawlers choose whether to respect it. Googlebot and Bingbot follow the rules reliably. Malicious bots and scrapers typically ignore it entirely. Never use robots.txt as your only defence for sensitive content.
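
To make the "polite request" concrete, this is roughly what a well-behaved crawler does before fetching a page, sketched with Python's standard urllib.robotparser module. The bot name ExampleBot and the URLs are made up for illustration. Note that this parser applies rules in file order, so the sample below uses Disallow lines only.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules; a real crawler would download /robots.txt first.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
Disallow: /api
Disallow: /login
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/admin/users"))  # False
print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/blog/post"))    # True
```

Nothing forces a crawler to run a check like this, which is why the file cannot protect sensitive content.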

Directive Reference

Directive	Syntax	What It Does
User-agent	User-agent: *	Specifies which crawler the rules apply to (* = all)
Allow	Allow: /public/	Permits crawling of a specific path
Disallow	Disallow: /admin/	Blocks crawling of a specific path
Sitemap	Sitemap: https://…/sitemap.xml	Points crawlers to your XML sitemap
Crawl-delay	Crawl-delay: 10	Requests N seconds between requests (not respected by Google)

What this means for you: Google ignores Crawl-delay and sets its crawl rate automatically (the legacy Search Console crawl rate setting has been retired). Bing and Yandex do respect it. The Sitemap directive is the most important after Allow/Disallow.
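
For crawlers that do read these directives, the values can be pulled out of a file with Python's standard urllib.robotparser, as in this small sketch. The sample file and the ExampleBot name are assumptions for illustration; site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file containing both directives.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

delay = parser.crawl_delay("ExampleBot")  # seconds requested between fetches
sitemaps = parser.site_maps()             # list of Sitemap URLs, or None

print(delay)     # 10
print(sitemaps)  # ['https://example.com/sitemap.xml']
```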

Common robots.txt Patterns

Scenario	robots.txt (one directive per line)	Notes
Allow everything	User-agent: * Allow: /	Default for most sites; lets crawlers see everything
Block everything	User-agent: * Disallow: /	Staging/dev sites only; never do this in production
Block admin paths	Disallow: /admin/ Disallow: /api/	Keeps crawlers out of backend routes; it does not secure them
Block a specific bot	User-agent: AhrefsBot Disallow: /	Blocks aggressive SEO crawlers that waste bandwidth

Common Mistakes

Blocking CSS and JS Files

Google needs to render your pages to understand them. Blocking CSS/JS files in robots.txt prevents rendering and can hurt your rankings. Only block truly private resources.

Using robots.txt for Security

Disallow doesn't hide pages; it just asks crawlers not to visit them. The URLs are still visible in the file, which anyone can read. Use authentication and proper access controls for sensitive content.

Forgetting to Update After Redesign

Site redesigns often change URL structures. If your old robots.txt blocks paths that are now important, those pages won't get crawled. Review robots.txt after every major change.

Confusing Disallow with Noindex

Disallow prevents crawling. Noindex prevents indexing. A page blocked by robots.txt can still appear in search results if other sites link to it. To prevent indexing, use a noindex meta tag and leave the page crawlable so the tag can be read.

AI Crawlers You Should Know About

Bot Name	Company	User-Agent	What It Does
GPTBot	OpenAI	GPTBot	Crawls content for training ChatGPT models
Google-Extended	Google	Google-Extended	Controls use of your content as training data for Gemini (formerly Bard) AI models
CCBot	Common Crawl	CCBot	Open dataset used by many AI companies
anthropic-ai	Anthropic	anthropic-ai	Crawls for Claude model training
ClaudeBot	Anthropic	ClaudeBot	Crawls web content that may be used for Claude model training

To block AI training crawlers, add a separate block for each bot: a User-agent line naming the bot (for example, User-agent: GPTBot) followed by Disallow: /. Blocking search engine crawlers (Googlebot, Bingbot) is a separate decision: those affect your search rankings, not AI training. You can block AI training while keeping your search presence.
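
The per-bot blocks could be generated as in the sketch below. The bot list simply copies the User-Agent column of the table above, and block_bots is a hypothetical helper name. The check at the end uses Python's standard urllib.robotparser to confirm that the listed bots are blocked while Googlebot is unaffected.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the table above.
AI_TRAINING_BOTS = ["GPTBot", "Google-Extended", "CCBot", "anthropic-ai", "ClaudeBot"]


def block_bots(bots):
    """Build one 'User-agent / Disallow: /' block per bot, separated by blank lines."""
    blocks = [f"User-agent: {bot}\nDisallow: /" for bot in bots]
    return "\n\n".join(blocks) + "\n"


robots_txt = block_bots(AI_TRAINING_BOTS)
print(robots_txt)

# Sanity check: AI bots are blocked, search crawlers are not.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("GPTBot", "https://example.com/blog/post"))     # False
print(parser.can_fetch("Googlebot", "https://example.com/blog/post"))  # True
```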

Related Tools

Sitemap Generator

Generate the sitemap.xml referenced in your robots.txt.

Meta Tag Generator

Add noindex tags to pages that need more than robots.txt blocking.

Schema Markup Generator

Add structured data to pages that crawlers are allowed to access.

Google SERP Preview

Preview how your crawlable pages appear in search results.

User Agent Parser

Identify the bot user agents you want to control in robots.txt.

HTTP Status Code Lookup

Understand status codes in your crawl logs and error reports.

How to use this tool

Enter your sitemap URL and toggle common blocking rules (admin, API, login)

Add custom disallow paths for any additional routes to block

Copy or download the generated robots.txt file and upload it to your site root

Common uses

Blocking admin and login pages from search engine crawling
Preventing API endpoints from appearing in search results
Setting up robots.txt for new website deployments
Adding sitemap references for improved search engine discovery

Frequently Asked Questions

What is robots.txt?

A robots.txt file is a plain text file at your website root that tells search engine crawlers which pages they can or cannot access. It's the first file well-behaved crawlers check.

Where should I place robots.txt?

The file must be at the root of your domain: https://example.com/robots.txt. Any other location will be ignored by crawlers.
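
Because the location is fixed, the robots.txt URL for any page can be derived from the page URL alone. A minimal sketch using Python's standard urllib.parse follows; robots_url is a hypothetical helper name. Each host has its own file, so a subdomain needs its own robots.txt.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post?id=1"))  # https://example.com/robots.txt
print(robots_url("https://shop.example.com/cart"))       # https://shop.example.com/robots.txt
```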

Does robots.txt block pages from Google?

It prevents crawling, but pages may still appear in search results if other sites link to them. To prevent indexing, add a noindex meta tag to the page, but make sure crawling is allowed so Google can actually read the tag. Blocking a page with robots.txt and adding noindex achieves nothing, because Google cannot see a tag on a page it is not permitted to crawl.
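
One way to catch this conflict is to compare the pages you have tagged noindex against your robots.txt rules. The sketch below assumes you already know which pages carry the tag (the PAGES mapping is made-up sample data) and uses Python's standard urllib.robotparser; unreadable_noindex_pages is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def unreadable_noindex_pages(robots_txt, pages, user_agent="Googlebot"):
    """Return URLs that carry noindex but are blocked from crawling.

    pages maps each URL to True if that page has a noindex meta tag.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        url
        for url, has_noindex in pages.items()
        if has_noindex and not parser.can_fetch(user_agent, url)
    ]


ROBOTS_TXT = "User-agent: *\nDisallow: /search\n"
PAGES = {
    "https://example.com/search/results": True,   # noindex, but blocked
    "https://example.com/thank-you": True,        # noindex and crawlable
    "https://example.com/blog/post": False,       # no noindex tag
}

print(unreadable_noindex_pages(ROBOTS_TXT, PAGES))
# ['https://example.com/search/results']
```

Any URL this returns needs either its Disallow rule removed or a different way to keep it out of the index.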

Should I block /admin pages?

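Yes. Blocking admin, login, API, and internal tool routes from crawling is standard practice and keeps crawlers from spending requests on them. It does not secure those routes, and a blocked URL can still be indexed if other sites link to it, so protect them with authentication as well.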

Is robots.txt a security measure?

No. It's a polite request that legitimate crawlers choose to respect. Malicious bots ignore it entirely. Never rely on robots.txt to hide sensitive content; use proper authentication instead.

What does Disallow: / mean?

It blocks all crawling of your entire site. Only use this for staging or development environments. In production, this would prevent your site from appearing in search results.

Does Google respect the Crawl-delay directive?

No. Google ignores Crawl-delay and sets its crawl rate automatically (the legacy Search Console crawl rate setting has been retired). Bing and Yandex do respect the directive.

Should I reference my sitemap in robots.txt?

Yes. Adding 'Sitemap: https://yoursite.com/sitemap.xml' at the end of robots.txt helps all crawlers discover your sitemap, not just those you've submitted it to directly.

Can I block specific bots?

Yes. Use 'User-agent: BotName' followed by 'Disallow: /' to block a specific crawler. Common targets include aggressive SEO bots like AhrefsBot or SemrushBot that waste bandwidth.

What's the difference between Allow and Disallow?

Disallow blocks crawling of a path. Allow explicitly permits crawling, which is useful for overriding a broader Disallow. For example, Disallow: /private/ followed by Allow: /private/public-page blocks the directory but leaves that one page crawlable.
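
The override works because crawlers that follow the current Robots Exclusion Protocol standard (RFC 9309), Google among them, apply the most specific (longest) matching rule, with Allow winning a tie. The sketch below implements that precedence for plain path prefixes only, with no wildcard support; is_allowed and the rule tuples are hypothetical names for illustration. Older parsers that apply the first matching rule in file order can behave differently.

```python
def is_allowed(path, rules):
    """Decide whether path may be crawled under longest-match precedence.

    rules is a list of (directive, prefix) pairs, for example
    ("Disallow", "/private/"). Paths with no matching rule are allowed.
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty or non-matching rule has no effect
        is_allow = directive.lower() == "allow"
        longer = len(prefix) > best_len
        tie_goes_to_allow = len(prefix) == best_len and is_allow
        if longer or tie_goes_to_allow:
            best_len = len(prefix)
            allowed = is_allow
    return allowed


RULES = [("Disallow", "/private/"), ("Allow", "/private/public-page")]

print(is_allowed("/private/public-page", RULES))  # True
print(is_allowed("/private/secret", RULES))       # False
print(is_allowed("/blog/post", RULES))            # True
```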

Can I use wildcards in robots.txt?

Google and Bing support * and $ wildcards. * matches any sequence of characters. $ matches the end of the URL. For example: Disallow: /*.pdf$ blocks all PDF files.
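
To see how the two wildcards behave, a robots.txt pattern can be translated into a regular expression. This is a minimal sketch of the matching described above, not any search engine's actual implementation; pattern_to_regex and matches are hypothetical helper names.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any sequence of characters; a trailing '$' anchors the
    match at the end of the URL. Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def matches(pattern: str, path: str) -> bool:
    """Return True if the rule pattern applies to the path (prefix match)."""
    return pattern_to_regex(pattern).match(path) is not None


print(matches("/*.pdf$", "/docs/report.pdf"))             # True
print(matches("/*.pdf$", "/docs/report.pdf?download=1"))  # False
print(matches("/admin", "/admin/users"))                  # True
```

The second call shows the effect of $: the URL contains .pdf but does not end with it, so the rule does not apply.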

Is my generated robots.txt stored anywhere?

No. The file is generated in your browser and never leaves your device. Copy or download it to keep a copy.

Results are for general informational purposes only and should be checked before use. They are not professional advice. See our Disclaimer and Terms of Service.