Security Blocking bad or aggressive bots

Discussion in 'System Administration' started by eva2000, Feb 28, 2016.

  1. Pepe

    Pepe New Member

    5
    1
    3
    Dec 22, 2016
    Ratings:
    +1
    Local Time:
    8:09 AM
    Hi, I've got a question:
    Nothing stops a bot from faking its user agent, right? It can say it's Googlebot and bypass the block, can't it?
     
  2. eva2000

    eva2000 Administrator Staff Member

    42,376
    9,569
    113
    May 24, 2014
    Brisbane, Australia
    Ratings:
    +14,747
    Local Time:
    5:09 PM
    Nginx 1.17.x
    MariaDB 5.5/10.x
    Yes, if bots fake their user agent, user agent matching can't stop them. For Googlebot or msnbot, though, there are more advanced/involved measures you can take, like comparing their IPs to the known IP ranges owned by Google and Microsoft respectively. Then again, IPs can be spoofed too heh
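
    The IP range comparison mentioned above could look roughly like the following. This is only a sketch, not the setup from the first post: the variable names ($from_trusted_crawler_range, $claims_googlebot, $fake_googlebot), the example.com server, and the 203.0.113.0/24 range are placeholders made up for illustration, and the real ranges would have to come from whatever the search engine publishes.

```nginx
# Goes in the http {} context. Every name and the CIDR are illustrative placeholders.

# 1 when the client address is inside a range you trust for the crawler.
geo $from_trusted_crawler_range {
    default         0;
    203.0.113.0/24  1;   # placeholder documentation range, NOT a real Google range
}

# 1 when the User-Agent claims to be Googlebot (case-insensitive match).
map $http_user_agent $claims_googlebot {
    default      0;
    ~*googlebot  1;
}

# Combine both flags: "10" = claims Googlebot but is not in a trusted range.
map "$claims_googlebot$from_trusted_crawler_range" $fake_googlebot {
    default  0;
    "10"     1;
}

server {
    listen 80;
    server_name example.com;

    # Close the connection without a response for impostors.
    if ($fake_googlebot) {
        return 444;
    }
}
```

    Note that geo looks at $remote_addr by default, so behind a proxy or CDN this would see the proxy's address unless the real client IP is restored first.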
     
  3. inthecloudblog

    inthecloudblog Active Member

    198
    36
    28
    Jan 26, 2016
    Ratings:
    +83
    Local Time:
    4:09 AM
    1.4.6
    Hi George, any particular reason to block Majestic?
    I help with that project by crawling, and we're not evil. Is it the bandwidth consumption that puts you off?
    Also, if you like Google crawling your website: I'm aware of people who fake that user agent, and browser user agents too, to scrape data.
    With "us" you can check whether a bot that claims to be ours is legit or not.

    BTW: I'm about to reach 7 PB of backlinks crawled since 2008 (I've been crawling since then) :)

    EDIT: I haven't read the full thread, but I saw Majestic being listed. I forgot to mention that we obey robots.txt, so if you really don't want us, it's pretty simple.
     
  4. eva2000

    eva2000 Administrator Staff Member

    Just lessening the bot traffic :)
     
    • Informative x 1
  5. inthecloudblog

    inthecloudblog Active Member

    I've just edited my post. But do you happen to block Googlebot, for example? (Just curious, and trying to understand why people block us.)
     
  6. eva2000

    eva2000 Administrator Staff Member

    I don't personally, but the setup in the 1st post of this thread can easily be switched to block or rate limit Google too, if folks choose to heh
     
    • Like x 1
  7. RB1

    RB1 Active Member

    285
    75
    28
    Nov 11, 2016
    California
    Ratings:
    +122
    Local Time:
    11:09 PM
    Nginx 1.18.x
    MariaDB 10.1.x
    Cool! The software I have installed on my server lets me "block" bots from contaminating visitor data, but this is preferable because you can block them from crawling your sites in the first place :)
     
    • Agree x 1
  8. eva2000

    eva2000 Administrator Staff Member

    Last edited: May 17, 2017
    • Like x 1
  9. Matt

    Matt Moderator Staff Member

    848
    372
    63
    May 25, 2014
    Rotherham, UK
    Ratings:
    +579
    Local Time:
    7:09 AM
    1.5.15
    MariaDB 10.2
    I tried implementing this behind Cloudflare, and rate limiting the bots caused Cloudflare to show errors to multiple legitimate visitors, telling them the site was offline (Cloudflare sees the response codes and marks the site as offline).
     
  10. rdan

    rdan Well-Known Member

    4,741
    1,144
    113
    May 25, 2014
    Ratings:
    +1,711
    Local Time:
    3:09 PM
    Mainline
    10.2
  11. eva2000

    eva2000 Administrator Staff Member

    Yes, that was my incorrect comment. It was corrected somewhere on the forums and in the default config in 123.09beta01 :)
     
    • Informative x 1
  12. rdan

    rdan Well-Known Member

    I redirect bad bots to my FB page :).
    Instead of:
    Code:
    if ($bot_agent = '3') {
      return 444;
    }
    That way a legit user who runs into it can report it or PM me on FB, instead of the connection just being dropped.
    Code:
    if ($bot_agent = '3') {
      # return 444;
      # redirect instead of dropping the connection (302 is what a bare "return URL;" sends)
      return 302 https://www.facebook.com/myfbpage;
    }
     
    • Like x 1
    • Funny x 1
  13. rdan

    rdan Well-Known Member

    • Informative x 2
  14. eva2000

    eva2000 Administrator Staff Member

    I've offloaded mine to Sucuri to handle now :)
     
    • Informative x 1
  15. rdan

    rdan Well-Known Member

    I created a specific note for it :).
    Credit for the phrase goes to Google :D
    [Attached image: upload_2017-6-3_15-56-28.png]
     
    • Useful x 1
  16. eva2000

    eva2000 Administrator Staff Member

    LOL, didn't know you could do that with Facebook pages!
     
  17. pamamolf

    pamamolf Premium Member Premium Member

    3,584
    345
    83
    May 31, 2014
    Ratings:
    +667
    Local Time:
    9:09 AM
    Nginx-1.17.x
    MariaDB 10.3.x
    Another one that we must have on the bot block list is Netsparker...

    I think this is the user agent:

    Code:
    Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; Netsparker)
    So can we use this one?

    Code:
        if ($http_user_agent ~ "Netsparker") {
            set $block_user_agents 1;
        }
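
    For comparison, the same check could be written with a map instead of an if/set pair, which also shows where the flag might be acted on. This is a rough sketch only: it reuses the $block_user_agents name from the snippet above, but the example.com server block and the choice of 444 as the response are assumptions, not something taken from the existing config files.

```nginx
# Goes in the http {} context.
map $http_user_agent $block_user_agents {
    default       0;
    ~*netsparker  1;   # ~* makes the regex match case-insensitive
}

server {
    listen 80;
    server_name example.com;   # placeholder

    # Any value other than "" or "0" counts as true here.
    if ($block_user_agents) {
        return 444;            # close the connection without sending a response
    }
}
```

    Using ~* rather than ~ means a scanner that sends "netsparker" in lower case would still match.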
     
    • Like x 1
  18. eva2000

    eva2000 Administrator Staff Member

  19. pamamolf

    pamamolf Premium Member Premium Member

    So is it better to use:

    Code:
    botlimit.conf
    or

    Code:
    block.conf
    I think we need to totally block such requests and not just limit them ....
     
  20. eva2000

    eva2000 Administrator Staff Member

    See the first post: botlimit.conf can rate limit or block depending on the assigned value of 2 or 3, so it's the more advanced option.
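
    To illustrate how one variable with a value of 2 or 3 might drive both behaviours, here is a minimal sketch. It is not a copy of botlimit.conf from the first post: the user agent patterns, the $bot_limit_key variable, the botzone zone name, the rate and burst numbers, and the example.com server are all assumptions chosen for illustration. Only $bot_agent and the 444 response for a value of 3 come from snippets earlier in this thread.

```nginx
# Goes in the http {} context. Patterns, zone name and rates are illustrative.

# Classify by User-Agent: 0 = no action, 2 = rate limit, 3 = block.
map $http_user_agent $bot_agent {
    default       0;
    ~*mj12bot     2;
    ~*netsparker  3;
}

# Requests with an empty key are not counted against the zone,
# so only class 2 agents end up rate limited.
map $bot_agent $bot_limit_key {
    default  "";
    2        $binary_remote_addr;
}

limit_req_zone $bot_limit_key zone=botzone:10m rate=1r/s;

server {
    listen 80;
    server_name example.com;   # placeholder

    # Class 3: drop the connection outright.
    if ($bot_agent = '3') {
        return 444;
    }

    location / {
        # Class 2: throttle, answering excess requests with 429.
        limit_req zone=botzone burst=5 nodelay;
        limit_req_status 429;
    }
}
```

    Moving an agent between rate limited and blocked is then a one character change in the first map, which is presumably what makes the single file approach convenient.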