
    Robots.txt Generator

    Generate a robots.txt file for your website. Control search engine crawling with an easy visual builder.

    Free to use. Runs in your browser.

    Configure your crawling rules below and copy or download the generated robots.txt file.

    Crawling Rules

    Generated robots.txt

    User-agent: *
    Allow: /
    Disallow: /admin
    Disallow: /api
    Disallow: /login
    
    Sitemap: https://example.com/sitemap.xml
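
A generator like this one could assemble the output above from a list of blocked paths and a sitemap URL. The following is a minimal sketch, not the tool's actual code; the function name `build_robots_txt` and its parameters are hypothetical.

```python
def build_robots_txt(disallow_paths, sitemap_url=None, user_agent="*"):
    """Assemble a robots.txt string with one rule group and an optional sitemap."""
    lines = [f"User-agent: {user_agent}", "Allow: /"]
    # One Disallow line per blocked path, in the order given.
    lines += [f"Disallow: {path}" for path in disallow_paths]
    if sitemap_url:
        # A blank line separates the rule group from the Sitemap line.
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(
    ["/admin", "/api", "/login"],
    sitemap_url="https://example.com/sitemap.xml",
)
print(robots_txt)
```

Calling it with the three paths shown reproduces the example file above.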

    What Is robots.txt?

    The robots.txt file is a plain text file at your website's root that tells search engine crawlers which pages they're allowed to visit and which they should skip. It's the first file a well-behaved crawler checks before accessing any page on your site.

    Think of it as a "staff only" sign for web crawlers. It's a polite request, not a security measure: crawlers choose whether to respect it. Googlebot and Bingbot follow the rules reliably. Malicious bots and scrapers typically ignore it entirely. Never use robots.txt as your only defence for sensitive content.

    Directive Reference

    Directive | Syntax | What It Does
    User-agent | User-agent: * | Specifies which crawler the rules apply to (* = all)
    Allow | Allow: /public/ | Permits crawling of a specific path
    Disallow | Disallow: /admin/ | Blocks crawling of a specific path
    Sitemap | Sitemap: https://…/sitemap.xml | Points crawlers to your XML sitemap
    Crawl-delay | Crawl-delay: 10 | Requests N seconds between requests (not respected by Google)

    What this means for you: Google ignores Crawl-delay and sets its own crawl rate based on how your server responds (the Search Console crawl rate limiter was retired in early 2024). Bing and Yandex do respect it. The Sitemap directive is the most important after Allow/Disallow.
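
To see how a crawler might read Crawl-delay and Sitemap, here is a small sketch using Python's standard `urllib.robotparser` module (`site_maps()` needs Python 3.8 or later). The file contents are an illustrative assumption, not output from this tool.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file: a delay for one crawler only, plus a sitemap reference.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 10
Disallow: /admin

User-agent: *
Disallow: /admin

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

bing_delay = parser.crawl_delay("bingbot")        # 10
other_delay = parser.crawl_delay("SomeOtherBot")  # None: the * group sets no delay
sitemaps = parser.site_maps()                     # ['https://example.com/sitemap.xml']

print(bing_delay, other_delay, sitemaps)
```

Whether a real crawler honours the delay is up to that crawler; the parser only reports what the file asks for.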

    Common robots.txt Patterns

    Allow everything: the default for most sites; lets crawlers see everything.

        User-agent: *
        Allow: /

    Block everything: for staging/dev sites only; never do this in production.

        User-agent: *
        Disallow: /

    Block admin paths: keeps crawlers off backend routes. This is crawl hygiene, not access control, because the paths are readable in the file.

        User-agent: *
        Disallow: /admin/
        Disallow: /api/

    Block a specific bot: blocks aggressive SEO crawlers that waste bandwidth.

        User-agent: AhrefsBot
        Disallow: /
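
Patterns like these can be checked before deployment. The sketch below uses Python's standard `urllib.robotparser`; the helper name `is_allowed` and the example URLs are hypothetical. Note that this parser applies the first matching rule in a group, which can differ from the longest-match behaviour of Google and Bing when a group mixes Allow and Disallow, so the examples keep each group to a single rule.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "Block a specific bot": one crawler is shut out, everyone else is allowed.
BLOCK_ONE_BOT = """\
User-agent: AhrefsBot
Disallow: /

User-agent: *
Allow: /
"""

# "Block everything": suitable for staging or development only.
BLOCK_EVERYTHING = "User-agent: *\nDisallow: /\n"

print(is_allowed(BLOCK_ONE_BOT, "AhrefsBot", "https://example.com/pricing"))    # False
print(is_allowed(BLOCK_ONE_BOT, "Googlebot", "https://example.com/pricing"))    # True
print(is_allowed(BLOCK_EVERYTHING, "Googlebot", "https://example.com/pricing")) # False
```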

    Common Mistakes

    Blocking CSS and JS Files

    Google needs to render your pages to understand them. Blocking CSS/JS files in robots.txt prevents rendering and can hurt your rankings. Only block truly private resources.

    Using robots.txt for Security

    Disallow doesn't hide pages; it just asks crawlers not to visit them. The URLs are still visible to anyone who opens the file. Use authentication and proper access controls for sensitive content.

    Forgetting to Update After Redesign

    Site redesigns often change URL structures. If your old robots.txt blocks paths that are now important, those pages won't get crawled. Review robots.txt after every major change.

    Confusing Disallow with Noindex

    Disallow prevents crawling. Noindex prevents indexing. A page blocked by robots.txt can still appear in search results if other sites link to it. Use a noindex meta tag to prevent indexing, and leave that page crawlable: a crawler that is blocked from a page never sees its noindex tag.
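
As a rough illustration of what a crawler looks for once it is allowed to fetch a page, this sketch scans HTML for a robots meta tag containing noindex. It uses only Python's standard `html.parser`; the class and function names are hypothetical, and real crawlers also consider other signals.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, nofollow".
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(page))  # True
```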

    AI Crawlers You Should Know About

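    Bot Name | Company | User-Agent | What It Does
    GPTBot | OpenAI | GPTBot | Crawls content for training OpenAI models such as those behind ChatGPT
    Google-Extended | Google | Google-Extended | Controls whether content is used as training data for Gemini (formerly Bard); a robots.txt token rather than a separate crawler
    CCBot | Common Crawl | CCBot | Builds an open dataset used by many AI companies
    anthropic-ai | Anthropic | anthropic-ai | Older Anthropic token associated with Claude model training
    ClaudeBot | Anthropic | ClaudeBot | Anthropic's web crawler; collects content that may be used for Claude model training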

    To block all AI training crawlers, add a User-agent line followed by Disallow: / for each bot (for example, User-agent: GPTBot). Blocking search engine crawlers (Googlebot, Bingbot) is a separate decision: those affect your search rankings, not AI training. You can block AI training while keeping your search presence.
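
The per-bot blocks can be generated rather than typed by hand. This is a minimal sketch; the function name `build_ai_block_rules` is hypothetical, and the bot list simply mirrors the table above, so it should be reviewed against each vendor's current documentation before use.

```python
# Tokens taken from the table above; the list is an assumption and may go stale.
AI_TRAINING_BOTS = ("GPTBot", "Google-Extended", "CCBot", "anthropic-ai", "ClaudeBot")


def build_ai_block_rules(bots=AI_TRAINING_BOTS):
    """Return one 'User-agent / Disallow: /' group per bot, separated by blank lines."""
    groups = [f"User-agent: {bot}\nDisallow: /" for bot in bots]
    return "\n\n".join(groups) + "\n"


ai_rules = build_ai_block_rules()
print(ai_rules)
```

Appending this text to an existing robots.txt leaves the Googlebot and Bingbot rules untouched, because each group names only the bot it applies to.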


    How to use this tool

    1. Enter your sitemap URL and toggle common blocking rules (admin, API, login).

    2. Add custom disallow paths for any additional routes to block.

    3. Copy or download the generated robots.txt file and upload it to your site root.

    Common uses

    • Blocking admin and login pages from search engine crawling
    • Keeping search engine crawlers away from API endpoints
    • Setting up robots.txt for new website deployments
    • Adding sitemap references for improved search engine discovery


    Frequently Asked Questions

    What is robots.txt?
    A robots.txt file is a plain text file at your website root that tells search engine crawlers which pages they can or cannot access. It's the first file well-behaved crawlers check.
    Where should I place robots.txt?
    The file must be at the root of your domain: https://example.com/robots.txt. Any other location will be ignored by crawlers.
    Does robots.txt block pages from Google?
    It prevents crawling, but pages may still appear in search results if other sites link to them. To keep a page out of the index, leave it crawlable and add a noindex meta tag; crawlers can only see the tag on pages they are allowed to fetch.
    Should I block /admin pages?
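    Yes. Blocking admin, login, API, and internal tool routes from crawling is standard practice. It keeps well-behaved crawlers from fetching these paths, but it is not access control and does not guarantee the URLs stay out of search results, so protect them with authentication as well.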
    Is robots.txt a security measure?
    No. It's a polite request that legitimate crawlers choose to respect. Malicious bots ignore it entirely. Never rely on robots.txt to hide sensitive content; use proper authentication instead.
    What does Disallow: / mean?
    It blocks all crawling of your entire site. Only use this for staging or development environments. In production, this would prevent your site from appearing in search results.
    Does Google respect the Crawl-delay directive?
    No. Google ignores Crawl-delay and sets its own crawl rate based on how your server responds; the Search Console crawl rate limiter was retired in early 2024. Bing and Yandex do respect the directive.
    Should I reference my sitemap in robots.txt?
    Yes. Adding 'Sitemap: https://yoursite.com/sitemap.xml' at the end of robots.txt helps all crawlers discover your sitemap, not just those you've submitted it to directly.
    Can I block specific bots?
    Yes. Use 'User-agent: BotName' followed by 'Disallow: /' to block a specific crawler. Common targets include aggressive SEO bots like AhrefsBot or SemrushBot that waste bandwidth.
    What's the difference between Allow and Disallow?
    Disallow blocks crawling of a path. Allow explicitly permits crawling, useful to override a broader Disallow. For example, Disallow: /private/ then Allow: /private/public-page.
    Can I use wildcards in robots.txt?
    Google and Bing support * and $ wildcards. * matches any sequence of characters. $ matches the end of the URL. For example: Disallow: /*.pdf$ blocks all PDF files.
    Is my generated robots.txt stored anywhere?
    No. The file is generated in your browser and never leaves your device. Copy it or download to save it.
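
The answers above on Allow versus Disallow and on wildcards can be sketched in code. The following assumes the longest-match rule used by Google and Bing (the most specific matching pattern wins, and Allow wins a tie); it is a simplified illustration for a single user-agent group, and the function names are hypothetical.

```python
import re


def _pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern: '*' is any run, a trailing '$' ends the URL."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_path_allowed(rules, path: str) -> bool:
    """rules: list of ("allow" | "disallow", pattern) pairs from one user-agent group."""
    best_length, allowed = -1, True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if _pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            # Longer pattern is more specific; on equal length, prefer Allow.
            if len(pattern) > best_length or (len(pattern) == best_length and is_allow):
                best_length, allowed = len(pattern), is_allow
    return allowed


override = [("disallow", "/private/"), ("allow", "/private/public-page")]
print(is_path_allowed(override, "/private/public-page"))  # True
print(is_path_allowed(override, "/private/secret"))       # False

pdf_rule = [("disallow", "/*.pdf$")]
print(is_path_allowed(pdf_rule, "/files/report.pdf"))             # False
print(is_path_allowed(pdf_rule, "/files/report.pdf?download=1"))  # True
```

The last line shows a consequence of the $ anchor: a PDF URL with a query string no longer ends in .pdf, so the rule does not match it.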

    Results are for general informational purposes only and should be checked before use. They are not professional advice. See our Disclaimer and Terms of Service.