Using robots.txt files

What is a robots.txt file?

A robots.txt file allows you to tell search engine crawlers and bots which URLs of your website should or should not be accessed, via the Robots Exclusion Protocol. You can use it in combination with a sitemap.xml file and robots meta tags for more granular control over which parts of your website get crawled. The robots.txt file must be located in the root directory of the website.

Important: The rules in the robots.txt file rely on the voluntary compliance of the crawlers and bots visiting your website. If you wish to fully block access to specific pages or files of your website, or to prevent specific bots from accessing your website at all, you should consider using an .htaccess file instead. Various examples of applying such restrictions are available in our How to use .htaccess files article.

How to create a robots.txt file?

Some applications, like Joomla, come with a robots.txt file by default, while others, like WordPress, may generate the robots.txt file dynamically. Dynamically generated robots.txt files do not exist on the server, so how you edit them depends on the specific software application you are using. For WordPress, you can use a plugin that manages the default robots.txt file, or you can manually create a new robots.txt file.

You can create a robots.txt file for your website via the File Manager section of the hosting Control Panel. Alternatively, you can create a robots.txt file locally in a text editor of your choice and then upload it via an FTP client. You can find step-by-step instructions on how to set up the most popular FTP clients in the Uploading files category of our online documentation.

What can you use in the robots.txt file?

The rules in the robots.txt file are defined by directives. The following directives are supported for use in a robots.txt file:
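- User-agent - defines which crawler or bot the rules in the group below it apply to; the * value matches all bots.
- Disallow - defines a path that the matched bots should not access.
- Allow - defines a path that the matched bots may access, even when a broader Disallow rule would otherwise cover it.
- Sitemap - defines the full URL of the website's sitemap file; it is independent of the User-agent groups.
- Crawl-delay - asks bots to wait the given number of seconds between requests; note that this directive is not part of the official standard, and some major crawlers (Googlebot, for example) ignore it.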
The path values used in the Allow and Disallow directives are case-sensitive, so you need to make sure you enter them with the correct capitalization - a rule for /Folder/ will not match /folder/. User-agent values, on the other hand, are matched case-insensitively by most major crawlers ("Googlebot" and "googlebot" target the same bot), but it is still safest to copy a bot's name exactly as its operator documents it.

Comments in the robots.txt file

You can use the # (hash) character to add comments for better readability by humans. Everything after a # character is treated as a comment when the character is at the start of a line or when it follows a properly defined directive and a space. Examples of valid and invalid comments can be found below:

    # This is a valid comment.
    User-agent: * # This is also a valid comment.
    // This is not a valid comment.

What is the default content of the robots.txt file?

The content of the robots.txt file depends on your website and the applications/scripts you are using on it. By default, all User-agents are allowed to access all pages of your website unless a custom robots.txt file contains other instructions.

Joomla

You can find the default content of the robots.txt file for Joomla in its official documentation. At the time of writing, it looks like this:

    User-agent: *
    Disallow: /administrator/
    Disallow: /api/
    Disallow: /bin/
    Disallow: /cache/
    Disallow: /cli/
    Disallow: /components/
    Disallow: /includes/
    Disallow: /installation/
    Disallow: /language/
    Disallow: /layouts/
    Disallow: /libraries/
    Disallow: /logs/
    Disallow: /modules/
    Disallow: /plugins/
    Disallow: /tmp/

WordPress

The default, dynamically generated WordPress robots.txt file has the following content (WordPress serves it only when no physical robots.txt file exists in the site root, and recent WordPress versions may also append a Sitemap directive pointing to the built-in sitemap):

    User-agent: *
    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php

Examples

You can find sample uses of robots.txt files listed below:
How to inform robots they can fully access your website?

If there is no robots.txt file available for your website, the default behavior is to allow all User-agents to access all pages of your domain. You can state this explicitly with the following content:

    User-agent: *
    Allow: /

The same can be achieved by specifying an empty Disallow directive:

    User-agent: *
    Disallow:

How to inform all robots they should not access your website?

To disallow all robots access to your website, you can use the following content in your website's robots.txt file:

    User-agent: *
    Disallow: /

How to inform robots they can fully access your website except for a specific directory/file?

You can inform bots to crawl your website with the exception of a specific directory or file by using a code block like this one in your robots.txt file (the paths below are examples - replace them with the actual paths on your site):

    User-agent: *
    Disallow: /example-directory/
    Disallow: /example-file.html

How to allow full access to your website to all robots except for a specific one?

If you would like to allow all robots except one to crawl your website, you can use a code block like this in your website's robots.txt file ("BlockedBot" is a placeholder for the name of the bot you want to block):

    User-agent: BlockedBot
    Disallow: /

    User-agent: *
    Allow: /

How to allow full access to your website only to a single robot?

To instruct all bots except one not to crawl the website, use this code block ("AllowedBot" is a placeholder for the name of the bot you want to allow):

    User-agent: AllowedBot
    Allow: /

    User-agent: *
    Disallow: /

How to disallow robots from accessing specific file types?

Adding the following code block to your robots.txt file will instruct compliant robots not to crawl .pdf files. The * wildcard matches any sequence of characters, and the $ character anchors the match to the end of the URL:

    User-agent: *
    Disallow: /*.pdf$

How to disallow robots from accessing URLs with a dollar sign?

The $ (dollar sign) character has a special meaning in robots.txt files - it defines the end of a matching value in the Allow/Disallow directive. To add a rule that takes effect for URLs that contain the $ character, you need a pattern that stops the $ from being treated as an end-of-URL anchor.
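A commonly used pattern for this - it relies on the wildcard handling documented by Google and other major search engines, so bots without wildcard support may interpret it differently - looks like this:

    User-agent: *
    Disallow: /*$*

The trailing * wildcard keeps the $ character from being read as an end-of-URL anchor, so the rule matches any URL that contains a dollar sign anywhere in its path.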
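To put the pieces together, here is a minimal sketch of a complete robots.txt file combining several of the directives covered above. The paths, the bot name, and the sitemap URL are placeholders - replace them with the actual values for your website:

    # Keep all bots out of a private area of the site.
    User-agent: *
    Disallow: /private/

    # Keep one specific (hypothetical) bot away from the whole site.
    User-agent: BlockedBot
    Disallow: /

    # Advertise the sitemap; the Sitemap directive is independent of the User-agent groups above.
    Sitemap: https://www.example.com/sitemap.xml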