robots.txt

The robots.txt file is a plain text file placed in a website’s root directory. It tells web-spiders, such as search engine bots, which pages or sections of the site they may or may not access.
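As a minimal sketch of how a crawler might consult these rules, the example below uses Python's standard `urllib.robotparser` module. The file content, the `ExampleBot` user agent and the `example.com` URLs are made up for illustration; they do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example needs no network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, as they would appear in the file.
parser.parse(ROBOTS_TXT.splitlines())

# A compliant bot asks before fetching each URL.
blog_allowed = parser.can_fetch("ExampleBot", "https://example.com/blog/post")
admin_allowed = parser.can_fetch("ExampleBot", "https://example.com/admin/login")

print("blog:", blog_allowed)
print("admin:", admin_allowed)
```

Against a live site, the same parser can load the file with `set_url()` followed by `read()` instead of `parse()`.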

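robots.txt helps control bot traffic, to avoid overloading servers or having private or irrelevant pages indexed. It is advisory only: well-behaved bots follow it, but nothing forces a bot to comply.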
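One way a site can ask bots to slow down is the `Crawl-delay` directive. It is an extension rather than part of the core rules, and not every crawler honors it, so the sketch below is only an illustration. It again uses Python's `urllib.robotparser`; the file content and the `ExampleBot` name are assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every bot to wait 10 seconds between requests.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay() returns the delay for this user agent, or None if none is set.
delay = parser.crawl_delay("ExampleBot")
print("delay between requests:", delay)
```

A polite crawler would sleep for that many seconds between two requests to the same host.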

Because it is publicly readable, robots.txt tends to leak the very paths that robots are asked not to index, and those are often the places where attackers will try to log in.
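To see what the file gives away, a site owner can list its `Disallow` entries the same way a curious visitor would. The helper below is a rough sketch: the function name `disallowed_paths` and the sample content are hypothetical, and the parsing is deliberately simplified (it ignores which user agent each rule belongs to).

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return every non-empty Disallow value found in a robots.txt text."""
    paths = []
    for raw in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical file content, for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/  # old dumps
Disallow:
"""

exposed = disallowed_paths(SAMPLE)
print(exposed)
```

Anything that shows up in this list is visible to everyone, so sensitive areas need real access control in addition to a `Disallow` rule.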

Documentation

Related : Webscraping, Web-spider, Search Engine

Related packages : fkrzski/robots-txt