Robots.txt is the file that tells bots how to treat your blog: which directories to crawl and which to leave alone. A wrong setup can stop search engines from crawling, and therefore indexing, your blog, or can advertise the location of sensitive directories to malicious bots.
You and everybody else can see your robots.txt file by typing your site’s URL and adding /robots.txt at the end, for example http://www.mysite.com/robots.txt. So don’t assume that you can hide from others how you have set up your robots.txt file.
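As a small illustration of where the file lives, the sketch below derives the robots.txt address from any page URL on a site. The helper name robots_url and the sample post URL are assumptions made for this example; only the placeholder domain comes from the text above.

```python
from urllib.parse import urljoin


def robots_url(site_url: str) -> str:
    """Return the robots.txt address for the site that hosts site_url."""
    # The leading slash makes urljoin resolve against the site root,
    # no matter how deep the page in site_url is.
    return urljoin(site_url, "/robots.txt")


# Hypothetical post URL on the article's placeholder domain.
print(robots_url("http://www.mysite.com/blog/some-post/"))
```

Opening the printed address in a browser shows exactly what every visitor and every bot can read.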
It is also important to note that bots, especially malicious ones, are known to disregard the permissions you set. This is why, if you host sensitive information such as credit card details, you must not rely on robots.txt alone and need to implement additional protection, for example antivirus software and SSL.
You don’t create the original robots.txt file yourself; WordPress generates it. In most cases it doesn’t need tweaking, but sometimes things go wrong and you end up with a file that disallows the whole site ("Disallow: /") for every user agent.
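A minimal sketch of such a faulty file, checked with Python’s standard urllib.robotparser module. The rules are reconstructed from the description of the problem, and the sample URLs and the Googlebot user agent are assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# The problematic file: every bot ("*") is barred from the whole site ("/").
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

block_parser = RobotFileParser()
block_parser.parse(BLOCK_ALL.splitlines())

# Hypothetical URLs on the article's placeholder domain.
for url in ("http://www.mysite.com/", "http://www.mysite.com/hello-world/"):
    # can_fetch returns False for both: nothing may be crawled.
    print(url, block_parser.can_fetch("Googlebot", url))
```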
Such a setting instructs bots not to crawl your site or any of its files. If you instead leave the Disallow value empty, you’ll be giving all bots full access to all your files and directories.
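The fully open variant can be sketched the same way. An empty Disallow value is the conventional way to say "nothing is off limits"; the sample URLs and the Googlebot user agent are again assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# An empty Disallow value places no restriction on any bot.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

open_parser = RobotFileParser()
open_parser.parse(ALLOW_ALL.splitlines())

# Hypothetical URLs on the article's placeholder domain.
for url in ("http://www.mysite.com/", "http://www.mysite.com/wp-admin/"):
    # can_fetch returns True for both: everything may be crawled.
    print(url, open_parser.can_fetch("Googlebot", url))
```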
In most cases, it is best to allow access to some directories and restrict others that contain important information. A classic WordPress robots.txt file, for example, blocks core directories such as wp-includes.
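One plausible version of such a classic file is sketched below. The exact rules are an assumption: blocking wp-admin and wp-includes is a common older WordPress setup, but your own file may differ. The sample URLs are hypothetical as well.

```python
from urllib.robotparser import RobotFileParser

# Assumed "classic" rules: keep bots out of the WordPress core directories.
CLASSIC = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""

classic_parser = RobotFileParser()
classic_parser.parse(CLASSIC.splitlines())

# Hypothetical URLs: a regular post and a script shipped in wp-includes.
post = "http://www.mysite.com/hello-world/"
script = "http://www.mysite.com/wp-includes/js/jquery/jquery.js"

print(classic_parser.can_fetch("Googlebot", post))    # posts stay crawlable
print(classic_parser.can_fetch("Googlebot", script))  # core scripts are blocked
```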
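This robots.txt setting, however, interferes with the way Google has rendered your site since the Panda update, because wp-includes holds scripts and stylesheets that Google needs in order to see your pages the way visitors do. From a search engine optimization standpoint, it is therefore good practice to allow access to wp-includes, so an optimized version simply drops the rule that blocks it.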
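A sketch of what an optimized file might look like. That wp-includes is no longer blocked follows from the text; keeping a rule for wp-admin is an assumption, as are the sample URLs and the Googlebot user agent.

```python
from urllib.robotparser import RobotFileParser

# Assumed optimized rules: wp-includes is no longer blocked,
# while wp-admin (an illustrative choice) stays off limits.
OPTIMIZED = """\
User-agent: *
Disallow: /wp-admin/
"""

optimized_parser = RobotFileParser()
optimized_parser.parse(OPTIMIZED.splitlines())

# Hypothetical URLs on the article's placeholder domain.
script = "http://www.mysite.com/wp-includes/js/jquery/jquery.js"
admin = "http://www.mysite.com/wp-admin/"

print(optimized_parser.can_fetch("Googlebot", script))  # now crawlable
print(optimized_parser.can_fetch("Googlebot", admin))   # still blocked
```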
The only difficulty in editing the file is finding and accessing it. Sometimes it sits in the root directory of your site on the server, but sometimes it is missing there altogether. The reason is that WordPress serves robots.txt as a virtual file; a physical copy exists on the server only if someone has already created or edited it.
Also, if you’ve chosen to host your site on WordPress.com, for example, you don’t have the file management access you would have on your own server, which makes finding and editing files a bit challenging.
Basically, you have two options for editing robots.txt: via a plugin, or by uploading or editing the file on your server.
If you choose the first option, you can use an SEO plugin such as All in One SEO Pack, which allows you to create or edit your robots.txt file directly from the control panel of your site. A possible downside is incompatibility between the plugin and your theme. If you host your site on WordPress.com or a similar service, this is also the only way to edit robots.txt.
If you choose to edit the file on your server, you have to create a robots.txt file and upload it to the root directory of your site. The root directory is usually called public_html and always contains the directories wp-admin, wp-includes, and wp-content.
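If you have direct filesystem access to the site (for example over SSH, or a local copy that you upload afterwards), the step above can be sketched as follows. The function name write_robots_txt and the constant WORDPRESS_DIRS are hypothetical names for this example; the check simply uses the three directories named above to make sure the file lands in the WordPress root.

```python
from pathlib import Path

# Directories that mark a WordPress root, as listed in the text.
WORDPRESS_DIRS = ("wp-admin", "wp-includes", "wp-content")


def write_robots_txt(root: Path, rules: str) -> Path:
    """Write rules to <root>/robots.txt if root looks like a WordPress root."""
    missing = [name for name in WORDPRESS_DIRS if not (root / name).is_dir()]
    if missing:
        # Refuse to write into a directory that is probably not the site root.
        raise ValueError(f"{root} is missing {missing}; not a WordPress root?")
    target = root / "robots.txt"
    target.write_text(rules, encoding="utf-8")
    return target
```

A call such as write_robots_txt(Path("public_html"), rules) would then create or overwrite the physical file, assuming public_html is the root directory on your host.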
You can either format robots.txt as suggested above or choose a configuration that fits your site. Just remember to check with Google that you are not disallowing too much, or something you shouldn’t be disallowing. You do this by running a robots.txt test in Google Search Console.
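Before uploading, a rough local pre-check is also possible. The sketch below lists which of your important URLs a draft file would block; the helper name blocked_urls, the draft rules, and the sample URLs are all assumptions for illustration. Python’s parser applies the first matching rule, whereas Google picks the most specific one, so treat this as a sanity check and not as a replacement for the test in Google Search Console.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: list[str], agent: str = "Googlebot") -> list[str]:
    """Return the URLs from urls that robots_txt forbids agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


# Hypothetical draft rules and URLs that should remain crawlable.
draft = "User-agent: *\nDisallow: /wp-admin/\nDisallow: /wp-includes/\n"
must_crawl = [
    "http://www.mysite.com/",
    "http://www.mysite.com/wp-includes/js/jquery/jquery.js",
]

# Prints only the jquery.js URL, which the draft blocks.
print(blocked_urls(draft, must_crawl))
```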