Adding a robots.txt File to a Django Application

Robots.txt is a standard used by websites to communicate with web crawlers and other web robots. The standard specifies how to inform the web robot about which areas of the website should not be processed or scanned.


Why robots.txt is important:

Before a search engine crawls your site, it reads your robots.txt file for instructions on which parts of the site it is allowed to crawl and index.

If you want search engines to ignore certain pages on your website, list them in your robots.txt file.


Basic Format:

User-agent: [user-agent name]
Disallow: [URL string not to be crawled]

Example:

      User-agent: Mediapartners-Google
      Disallow:

      User-agent: TruliaBot
      Disallow: /

      User-agent: *
      Disallow: /search.html
      Disallow: /comments/*
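
To see how rules like the ones above are interpreted, here is a minimal sketch using Python's standard-library urllib.robotparser. ExampleBot is a hypothetical crawler name chosen for illustration, and the /comments/ rule is written as a plain prefix because this parser matches path prefixes and does not expand "*" inside a path. Real crawlers may interpret wildcards differently.

```python
from urllib.robotparser import RobotFileParser

# Rules mirroring the example above. The wildcard rule is written as a plain
# prefix (/comments/) because the standard-library parser matches path
# prefixes and does not expand "*" inside a path.
RULES = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: TruliaBot
Disallow: /

User-agent: *
Disallow: /search.html
Disallow: /comments/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(user_agent, path):
    """Return True if the rules permit user_agent to fetch path."""
    return parser.can_fetch(user_agent, path)


# ExampleBot is a hypothetical crawler that falls under "User-agent: *".
for agent, path in [
    ("Mediapartners-Google", "/search.html"),
    ("TruliaBot", "/index.html"),
    ("ExampleBot", "/search.html"),
    ("ExampleBot", "/comments/42"),
    ("ExampleBot", "/about.html"),
]:
    print(agent, path, is_allowed(agent, path))
```

With these rules, an empty Disallow line permits everything for that user agent, while "Disallow: /" blocks the whole site for it.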


Steps to add robots.txt in Your Django Project:

Let's say your project's name is myproject.

Create a directory named templates in the root of your project.

Inside templates, create another directory with the same name as your project.

Place a text file named robots.txt in it.

Your project structure should look something like this:

myproject
 |
 |--myapp
 |--myproject
 |    |--settings.py
 |    |--urls.py
 |    |--wsgi.py
 |--templates
 |    |--myproject
 |    |   |--robots.txt
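
For Django to find this file, the project-level templates directory has to be on the template search path. The original steps do not show this, so the following settings.py fragment is only a sketch under assumptions: it assumes Django 3.1 or later (where BASE_DIR is a pathlib.Path) and the default context processors generated by startproject. On older versions, an os.path.join based path would be used instead.

```python
# settings.py (fragment)
from pathlib import Path

# Project root: the outer myproject directory that contains templates/.
BASE_DIR = Path(__file__).resolve().parent.parent

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        # Project-level template directory holding myproject/robots.txt.
        "DIRS": [BASE_DIR / "templates"],
        "APP_DIRS": True,
        "OPTIONS": {
            "context_processors": [
                "django.template.context_processors.request",
                "django.contrib.auth.context_processors.auth",
                "django.contrib.messages.context_processors.messages",
            ],
        },
    },
]
```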


Add the user-agent and disallow rules to robots.txt:

User-agent: *
Disallow: /admin/
Disallow: /accounts/


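Now go to your project's urls.py file and add the following import statements. The url() helper was removed in Django 4.0, so path() (available since Django 2.0) is used here.

from django.urls import path
from django.views.generic import TemplateView


Add the following URL pattern:

urlpatterns += [
    # Render templates/myproject/robots.txt as plain text at /robots.txt
    path('robots.txt', TemplateView.as_view(template_name="myproject/robots.txt", content_type='text/plain')),
]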


Now restart the server and open localhost:8000/robots.txt in your browser. You should see the contents of the robots.txt file.
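
The same check can be scripted instead of done in the browser. The sketch below uses only the standard library; fetch_robots and looks_valid are hypothetical helper names, and the base URL assumes the development server is running on localhost:8000 as described above.

```python
from urllib.request import urlopen


def fetch_robots(base_url):
    """Fetch <base_url>/robots.txt and return (status, content_type, body).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a
    missing URL pattern surfaces as an exception rather than a return value.
    """
    with urlopen(base_url.rstrip("/") + "/robots.txt", timeout=5) as response:
        status = response.status
        content_type = response.headers.get_content_type()
        body = response.read().decode("utf-8")
    return status, content_type, body


def looks_valid(status, content_type, body):
    """Rough sanity check: 200, served as text/plain, has a User-agent line."""
    return (
        status == 200
        and content_type == "text/plain"
        and "user-agent:" in body.lower()
    )
```

With the server running, looks_valid(*fetch_robots("http://localhost:8000")) should return True if the URL pattern and template are wired up as described.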


Serving robots.txt from web server:

You can also serve robots.txt directly from your web server, so the request never reaches Django.

Below is a sample configuration for Apache.

<Location "/robots.txt">
 SetHandler None
 Require all granted
</Location>
Alias /robots.txt /var/www/html/project/robots.txt

Quick Tips:
  1. robots.txt is case sensitive. The file must be named robots.txt, not Robots.txt or robots.TXT.
  2. The robots.txt file must be placed in the website’s top-level directory.
  3. Make sure you’re not blocking any content or sections of your website you want crawled as this will not be good for SEO.


© pythoncircle.com 2018-2019