
Thread: [How To] - Block Bad Bots from accessing your Website

  1. #1
    Posted by GlowHost-James (Administrator)


    What is a Bad Bot?
    Bad bots are the bots or spiders that do more harm than good to your website.

    One example of a bad bot is an email harvester, which scans your web page code for email addresses to send spam to. Another is an unwanted bot that consumes too much bandwidth or drives up the load on your server, making it slow or, at worst, knocking it completely offline.

    While the worst of the "Bad Bots" will ignore your robots.txt directives completely, some bots do not intend to be bad but may still be unwanted by you. Bots that ignore your robots.txt file need to be blocked with user-agent rules in your .htaccess file, but that topic is beyond the scope of this simple guide.

    Bots that are not intentionally malicious, but are sometimes a nuisance, can be handled in your robots.txt file. For example, if your site is based in the USA, you may not want bots from non-English-speaking countries coming in and eating up your bandwidth or other resources. Many bots will follow your rules, and this simple guide can help you control which bots access your site.

    How to block Bad Bots
    Follow these steps to block the bad bots and spiders from accessing your website.

    Step 1:
    Open your favorite text editor and create a file called robots.txt.

    Step 2:
    Place the following code in this file.
    Code:
    # Deny all robots that we do not specifically want to allow
    User-agent: *
    Disallow: /
    
    # Allow these robots only
    User-agent: googlebot
    Allow: /
    The code above tells every bot that obeys robots.txt to stay away from your website, with the exception of Google (googlebot).
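Before uploading, the effect of these rules can be checked locally. The following is a minimal sketch using Python's standard `urllib.robotparser` module; the helper name `is_allowed`, the bot name "SomeOtherBot" and the example.com address are placeholders chosen for illustration. It only shows how a rule-following parser reads the file, and real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The rules from Step 2, exactly as they would appear in robots.txt.
ROBOTS_TXT = """\
# Deny all robots that we do not specifically want to allow
User-agent: *
Disallow: /

# Allow these robots only
User-agent: googlebot
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a rule-following bot named user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "SomeOtherBot" and example.com are placeholders, not real crawlers or sites.
page = "http://www.example.com/index.html"
print(is_allowed(ROBOTS_TXT, "googlebot", page))     # googlebot has its own Allow group
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", page))  # falls back to the * group
```

With the rules above this should print True for googlebot and False for the placeholder bot. A bot that ignores robots.txt is unaffected either way.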

    Note: See the end of this post for more search engines / robots that are safe to add to your robots.txt file.

    Step 3:
    Save the file and upload it to your public_html directory. You can upload it via FTP or through the cPanel file manager.
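Bots look for the file at the root of your domain, which is where public_html normally maps to. As a rough sketch, assuming Python is available on your own computer, the helpers below build that address and download the live file so you can compare it with what you uploaded. The function names `robots_url` and `fetch_robots` are made up for this example, and www.example.com stands in for your own domain.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def robots_url(site_url: str) -> str:
    """Build the address crawlers request: /robots.txt at the site root."""
    return urljoin(site_url, "/robots.txt")


def fetch_robots(site_url: str, timeout: float = 10.0) -> str:
    """Download the live robots.txt so it can be compared with the local copy."""
    with urlopen(robots_url(site_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# www.example.com is a placeholder; substitute your own domain.
print(robots_url("http://www.example.com/some/page.html"))
```

Calling `fetch_robots("http://www.example.com/")` with your real domain should return the same text you saved in Step 2. If it raises an error or returns something else, the file is probably not in the right directory.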

    More Good Bots to allow
    The example above allows only Googlebot. There are other bots that you may want to add to your robots.txt file. Here are a few.

    • Googlebot-News - Google News
    • Googlebot-Image - Google Images
    • Googlebot-Mobile - Google Mobile
    • MSNBot - Microsoft MSN
    • Teoma - Teoma Search
    • bingbot - Bing Search
    • Slurp - Yahoo! Search
    • Scooter - AltaVista Search
    • Scrubby - Scrub the Web


    You can add them into the robots.txt file in the following format:
    Code:
    User-agent: BOTNAME
    Allow: /
    Where BOTNAME is the name of the bot listed above.
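If you maintain a longer list of allowed bots, the BOTNAME format above can be generated instead of typed by hand. This is a small sketch, not part of the original guide: the function name `build_robots_txt` is hypothetical, and the bot names passed in are taken from the list above.

```python
def build_robots_txt(allowed_bots: list[str]) -> str:
    """Return robots.txt text that denies every bot except those listed."""
    lines = [
        "# Deny all robots that we do not specifically want to allow",
        "User-agent: *",
        "Disallow: /",
        "",
        "# Allow these robots only",
    ]
    # One "User-agent / Allow" group per permitted bot, separated by blank lines.
    for bot in allowed_bots:
        lines += [f"User-agent: {bot}", "Allow: /", ""]
    return "\n".join(lines).rstrip("\n") + "\n"


# Bot names come from the "Good Bots" list; add or remove entries as needed.
print(build_robots_txt(["Slurp", "bingbot", "Googlebot"]))
```

The printed text can be saved as robots.txt and uploaded as described in Step 3.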

    For example, a robots.txt file that bans all robots except Yahoo, Bing, and Google might look like this:

    Code:
    # Deny all robots that we do not specifically want to allow
    User-agent: *
    Disallow: /
    
    # Allow these robots only
    User-agent: slurp
    Allow: /
    
    User-agent: bingbot
    Allow: /
    
    User-agent: googlebot
    Allow: /
    If you have any further questions, please feel free to register and post a reply in this thread.
    Last edited by Matt; 04-30-2013 at 09:26 PM.
