Thread: Conserving Resources

  1. #1
    RCull is offline Nearly a Glow Sage
    Join Date
    Dec 2014
    Posts
    19

    Conserving Resources

    I need to find a way to reduce CPU and memory use. My site is based on a vBulletin forum plus a fair number of static reference pages.

    In the stats I see robot traffic, and some of the robots are listed as "Unknown". Is there a way to stop unknown robots, and are there disadvantages to stopping them?

    In "Hosts, Top 25" I see a Ukrainian IP using a lot of resources, far more than the rare Ukrainian user of the site would account for. Should I just ban an IP like that? I don't want to make the site difficult for international users.

    I have activated CloudFlare.

    Thanks for pointing me down the right road on this problem.

    Bob

  2. #2
    AndrewGlow is offline Master Glow Jedi
    Join Date
    Sep 2009
    Posts
    1,243

    Hello Bob,

    If one specific IP is causing a lot of load, you can ban it. Blocking a single IP should not cause problems for many visitors.

    The main problem is that Cloudflare only covers www.yourdomain.com. Bots can still reach the server directly through the bare yourdomain.com address.

    I suggest creating a robots.txt file in your public_html folder and blocking all bots except the ones you'd like to visit your website.

    For example, to allow only the Google, Yahoo and MSN bots to visit your website, you can create a robots.txt file with the following contents:

    User-agent: googlebot
    Disallow:

    User-agent: Slurp
    Disallow:

    User-agent: msnbot
    Disallow:

    User-agent: *
    Disallow: /

    Also, you should be able to see in the stats exactly which files the bots usually request. If something like login.php is being attacked by bad bots, we can provide you with instructions for renaming that file properly. Please open a support ticket for this.
    Have no fear,
    GlowHost is Here!

  3. #3
    RCull is offline Nearly a Glow Sage
    Join Date
    Dec 2014
    Posts
    19

    So something like this would allow googlebot and slurp into all directories except the ones listed as "Disallow" and all other agents would not be allowed anywhere:

    User-agent: googlebot
    Disallow: /forums/classifieds
    Disallow: /cgi-bin/
    Disallow: /data/
    Disallow: /download/
    Disallow: /facebook/
    Disallow: /generator/
    Disallow: /include/
    Disallow: /reference/library/images
    Disallow: /temp/
    Disallow: /test/
    Disallow: /upload/
    Disallow: /webstats/
    Sitemap: http://www.teambuick.com/sitemap.xml

    User-agent: Slurp
    Disallow: /forums/classifieds
    Disallow: /cgi-bin/
    Disallow: /data/
    Disallow: /download/
    Disallow: /facebook/
    Disallow: /generator/
    Disallow: /include/
    Disallow: /moderators/
    Disallow: /openx/
    Disallow: /reference/library/images
    Disallow: /temp/
    Disallow: /test/
    Disallow: /upload/
    Disallow: /webstats/
    Sitemap: http://www.teambuick.com/sitemap.xml

    User-agent: *
    Disallow: /
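
    A rule set like the one above can be checked offline before it is uploaded. The sketch below uses Python's standard urllib.robotparser on a shortened copy of the rules (only a few of the Disallow lines are kept for brevity). The helper names build_parser and is_allowed and the bot name SomeUnknownBot are made up for illustration. It shows how Python's parser reads the file; individual crawlers may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules from the post above (assumption: trimmed
# for brevity, the real file lists more Disallow lines per bot).
ROBOTS_TXT = """\
User-agent: googlebot
Disallow: /forums/classifieds
Disallow: /cgi-bin/

User-agent: Slurp
Disallow: /forums/classifieds
Disallow: /cgi-bin/
Disallow: /openx/

User-agent: *
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, agent, path):
    """Return True if `agent` may fetch `path` under the parsed rules."""
    return parser.can_fetch(agent, "http://www.teambuick.com" + path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    checks = [
        ("googlebot", "/forums/"),
        ("googlebot", "/cgi-bin/test.cgi"),
        ("Slurp", "/openx/"),
        ("SomeUnknownBot", "/forums/"),  # hypothetical unlisted bot
    ]
    for agent, path in checks:
        print(agent, path, is_allowed(parser, agent, path))
```

    A bot that ignores robots.txt is not affected by any of this; the check only confirms what well-behaved crawlers are being told.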

  4. #4
    AndrewGlow is offline Master Glow Jedi
    Join Date
    Sep 2009
    Posts
    1,243

    Yes, this looks like a solution that should work.

  5. #5
    Matt is offline GlowHost Administrator
    Join Date
    Jan 2005
    Location
    Behind your monitor
    Posts
    5,960

    Keep in mind that "bad" bots do not always honor the instructions in your robots.txt, so you may end up having to ban those IPs directly.

    "Good" bots like Slurp and googlebot will honor your instructions.
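
    To decide which IPs are worth banning, it can help to tally requests per client from the raw access log, much like the "Hosts, Top 25" report mentioned earlier in the thread. This is a minimal sketch, assuming the log uses Apache's common log format with the client address as the first field. The sample lines and the function name top_clients are hypothetical, and the IPs come from ranges reserved for documentation. If the site sits behind a proxy such as Cloudflare, the first field may be the proxy's address rather than the visitor's, depending on the server setup.

```python
from collections import Counter

# Hypothetical access-log lines (assumption: Apache common log format).
# The addresses are documentation-only IPs, not taken from the thread.
SAMPLE_LOG = [
    '203.0.113.7 - - [19/Sep/2015:06:34:08 +0000] "GET /forums/index.php HTTP/1.1" 200 5120',
    '203.0.113.7 - - [19/Sep/2015:06:34:09 +0000] "GET /forums/login.php HTTP/1.1" 200 2048',
    '198.51.100.4 - - [19/Sep/2015:06:34:10 +0000] "GET /reference/ HTTP/1.1" 200 10240',
    '203.0.113.7 - - [19/Sep/2015:06:34:11 +0000] "GET /forums/login.php HTTP/1.1" 200 2048',
]


def top_clients(log_lines, limit=25):
    """Count requests per client IP (first field) and return the busiest."""
    counts = Counter(line.split()[0] for line in log_lines if line.strip())
    return counts.most_common(limit)


if __name__ == "__main__":
    for ip, hits in top_clients(SAMPLE_LOG):
        print(ip, hits)
```

    An address that dominates this list without matching any known crawler is a candidate for a direct ban.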

  6. #6
    RCull is offline Nearly a Glow Sage
    Join Date
    Dec 2014
    Posts
    19

    I have added this to my .htaccess, but I am getting this error:
    [Sat Sep 19 06:34:08 2015] [alert] [client 217.73.208.152] /home/teambuic/public_html/.htaccess: Invalid command 'User-agent:', perhaps misspelled or defined by a module not included in the server configuration

    I don't think it is related to spelling.

  7. #7
    RCull is offline Nearly a Glow Sage
    Join Date
    Dec 2014
    Posts
    19

    It also brought the site down with a 500 error.

  8. #8
    RCull is offline Nearly a Glow Sage
    Join Date
    Dec 2014
    Posts
    19

    Oops, figured it out.

  9. #9
    AndrewGlow is offline Master Glow Jedi
    Join Date
    Sep 2009
    Posts
    1,243

    Nice to hear that!

  10. #10
    AndrewGlow is offline Master Glow Jedi
    Join Date
    Sep 2009
    Posts
    1,243

    Just out of curiosity, what exactly was the problem?
