Robots.txt File

The robots.txt file is an important concept to understand because it controls how search engine crawlers access your website.

What is the Robots.txt file?

Robots.txt is the file used to implement the Robots Exclusion Protocol (REP). By placing a robots.txt file on your website, you can stop crawlers from crawling specific pages of your website.

When a crawler visits your website, it first checks for the robots.txt file. If the file is not present, the crawler will crawl all the crawlable content on your website.
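The check described above can be sketched in Python with the standard library's urllib.robotparser module. This is only an illustration of the logic, not how any particular search engine implements it; the helper name may_crawl and the sample rules are assumptions made for the example, and a missing file is modelled by passing None.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt: Optional[str], user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url.

    robots_txt is the text of the file, or None when the site has no robots.txt.
    (Hypothetical helper, written for this illustration.)
    """
    if robots_txt is None:
        # No robots.txt present: nothing is excluded, so the URL is crawlable.
        return True
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample rules (an assumption for this example): block the /services/ section.
rules = "User-agent: *\nDisallow: /services/"

print(may_crawl(None, "Googlebot", "https://www.example.com/services/"))
print(may_crawl(rules, "Googlebot", "https://www.example.com/services/"))
```

The first call models a site without a robots.txt file, so the page is crawlable; the second applies the sample rules and the same page is blocked.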

Syntax for Robots.txt files:

User-agent: name of the user agent (use * to address all crawlers)

Disallow: URL path not to be crawled

Meaning of the technical terms used in robots.txt syntax:

User-agent: Here you specify the name of the user agent (the bot name, for example Googlebot or Bingbot)

Disallow: Here you specify the URL path or file you do not want crawled

Allow: Lets the crawler crawl a subfolder even though its parent folder is disallowed (supported by Googlebot and most other major crawlers)

Crawl-delay: The number of seconds a crawler should wait between requests (Googlebot ignores this directive, though some other crawlers honor it)

Sitemap: Tells the crawler the location of the sitemap for the website
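To see how these directives behave together, here is a small sketch using Python's urllib.robotparser (Python 3.8 or later, since it calls site_maps()). The rules, the /services/seo/ subfolder and the sitemap URL are made up for the example. One caveat: Python's parser applies the first rule that matches, so Allow is listed before Disallow here, whereas Google picks the most specific matching rule regardless of order.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only.
ROBOTS_TXT = """\
User-agent: *
Allow: /services/seo/
Disallow: /services/
Crawl-delay: 5
Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "https://www.example.com"

# Allow opens up a subfolder inside a disallowed folder.
allowed_subfolder = parser.can_fetch("*", base + "/services/seo/")
blocked_folder = parser.can_fetch("*", base + "/services/pricing/")

# Crawl-delay and Sitemap are exposed through their own accessors.
delay = parser.crawl_delay("*")
sitemaps = parser.site_maps()

print(allowed_subfolder, blocked_folder, delay, sitemaps)
```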

Now let’s take some examples to better understand the robots.txt file:

Case 1: Allow all crawlers to crawl the entire website (Never do this)

User-agent: *
Disallow:

Case 2: Block all crawlers from the services section of our website, www.example.com/services

User-agent: *
Disallow: /services/

Case 3: Block all crawlers from crawling the entire website

User-agent: *
Disallow: /

Case 4: Block a specific crawler from crawling your website

User-agent: Bingbot
Disallow: /
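The four cases above can be checked with a short Python sketch built on urllib.robotparser. The helper name allowed, the test paths and the use of www.example.com are assumptions for the example. In each rule set the User-agent and Disallow lines sit on consecutive lines, because Python's parser treats a blank line as the end of a group.

```python
from urllib.robotparser import RobotFileParser

SITE = "https://www.example.com"

# The four rule sets from the cases above, keyed by case number.
CASES = {
    1: "User-agent: *\nDisallow:",
    2: "User-agent: *\nDisallow: /services/",
    3: "User-agent: *\nDisallow: /",
    4: "User-agent: Bingbot\nDisallow: /",
}


def allowed(case: int, user_agent: str, path: str) -> bool:
    """Return True if user_agent may crawl path under the given case's rules."""
    parser = RobotFileParser()
    parser.parse(CASES[case].splitlines())
    return parser.can_fetch(user_agent, SITE + path)


# Print the verdict for two crawlers and two paths under every case.
for case in CASES:
    for agent in ("Googlebot", "Bingbot"):
        for path in ("/", "/services/seo/"):
            print(case, agent, path, allowed(case, agent, path))
```

Under Case 4 only Bingbot is blocked; Googlebot matches no group in that file, so it may crawl everything.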

Where to insert the robots.txt file:

After creating the robots.txt file (typically in a plain-text editor such as Notepad, saved in .txt format), you must upload it to the root directory of your website so that it is reachable at www.example.com/robots.txt. Crawlers look for robots.txt only in the root directory; if they don’t find it there, they assume the website does not have one.
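Because the file always lives at the root, its address can be derived from any page URL on the same site. A minimal sketch in Python, where the function name robots_txt_url and the sample page URL are made up for the example:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Build the robots.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path is always /robots.txt at the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.example.com/services/seo?page=2"))
```

A robots.txt file uploaded anywhere else, such as inside /services/, would never be requested by a crawler that follows this convention.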

How to insert the robots.txt file?

To learn how to insert the robots.txt file, click on this link.

Publisher: Jatin Srivastava

Jatin

I am a blogger in the field of Digital Marketing where my focus area is Search Engine Optimization and Yoast Readability analysis.
