
Security Blocking bad or aggressive bots

Discussion in 'System Administration' started by eva2000, Feb 28, 2016.

  1. Pepe

    Pepe New Member

    5
    1
    3
    Dec 22, 2016
    Ratings:
    +1
    Local Time:
    9:34 AM
    Hi, I've got a question:
    Nothing stops a bot from faking its user agent, right? It can claim to be Googlebot and bypass the block, can't it?

     
  2. eva2000

    eva2000 Administrator Staff Member

    54,336
    12,198
    113
    May 24, 2014
    Brisbane, Australia
    Ratings:
    +18,763
    Local Time:
    6:34 PM
    Nginx 1.27.x
    MariaDB 10.x/11.4+
    Yes, if bots fake their user agent, a user-agent block can't stop them. For Googlebot or msnbot, though, there are more advanced/involved measures you can take, like comparing their IPs to the known IP ranges owned by Google and Microsoft respectively. Then again, IPs can be spoofed too, heh.
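
    A rough sketch of the IP-range comparison, in nginx terms. The variable names ($from_google_range, $claims_googlebot, $fake_googlebot) are made up for illustration, and the single CIDR is only a placeholder: a real list should be built from the ranges Google publishes. It also assumes nginx sees the real client IP in $remote_addr, which is not the case behind a CDN or proxy unless the realip module is configured.

```nginx
# http {} context

# 1 when the client IP falls inside a listed range.
# Placeholder entry only; replace with the currently published ranges.
geo $from_google_range {
    default         0;
    66.249.64.0/19  1;
}

# 1 when the request claims to be Googlebot.
map $http_user_agent $claims_googlebot {
    default        0;
    "~*googlebot"  1;
}

# Combine both flags: "10" means the user agent says Googlebot
# but the IP is outside the listed ranges.
map "$claims_googlebot$from_google_range" $fake_googlebot {
    default  0;
    "10"     1;
}

server {
    listen 80;
    server_name example.com;   # hypothetical host

    # Drop the connection for impostors.
    if ($fake_googlebot) {
        return 444;
    }
}
```

    The same pattern would apply to msnbot/bingbot with Microsoft's ranges. An outdated range list would block the real crawler, so the list needs refreshing.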
     
  3. inthecloudblog

    inthecloudblog Active Member

    199
    36
    28
    Jan 26, 2016
    Ratings:
    +83
    Local Time:
    5:34 AM
    1.4.6
    Hi George, any particular reason to block Majestic?
    I help that project by crawling, and we are not evil. Is it the bandwidth consumption that puts you off?
    Also, if you want Google to crawl your website, be aware that people fake its user agent (and browser user agents too) to scrape data.
    With "us" you can check whether a bot that claims to be ours is legit or not.

    BTW: I'm about to reach 7 PB of backlinks crawled since 2008 (I've been crawling since then) :)

    EDIT: I haven't read the full thread, but I saw Majestic being listed. I forgot to mention that we obey robots.txt, so if you really don't want us, it's pretty simple.
     
  4. eva2000

    eva2000 Administrator Staff Member

    Just lessening the bot traffic :)
     
  5. inthecloudblog

    inthecloudblog Active Member

    I've just edited my post. But do you happen to block Googlebot, for example? (Just curious, and trying to understand why people block us.)
     
  6. eva2000

    eva2000 Administrator Staff Member

    I don't personally, but the setup in the 1st post of this thread can easily be switched to block or rate limit Google too, if folks choose to, heh.
     
  7. RB1

    RB1 Active Member

    292
    75
    28
    Nov 11, 2016
    California
    Ratings:
    +122
    Local Time:
    12:34 AM
    Nginx 1.21.x
    MariaDB 10.1.x
    Cool! The software I have installed on my server lets me "block" bots from contaminating visitor data, but this is preferable because you can block them from crawling your sites in the first place :)
     
  8. eva2000

    eva2000 Administrator Staff Member

    Last edited: May 17, 2017
  9. Matt

    Matt Well-Known Member

    929
    415
    63
    May 25, 2014
    Rotherham, UK
    Ratings:
    +671
    Local Time:
    8:34 AM
    1.5.15
    MariaDB 10.2
    I tried implementing this behind Cloudflare, and rate limiting the bots causes Cloudflare to show errors to multiple legitimate visitors, telling them the site is offline (Cloudflare sees the response codes and marks the site offline).
     
  10. rdan

    rdan Well-Known Member

    5,444
    1,408
    113
    May 25, 2014
    Ratings:
    +2,201
    Local Time:
    4:34 PM
    Mainline
    10.2
  11. eva2000

    eva2000 Administrator Staff Member

    Yes, that was my incorrect comment. It was corrected somewhere on the forums and in the default config in 123.09beta01 :)
     
  12. rdan

    rdan Well-Known Member

    I redirect bad bots to my FB page :).
    Instead of:
    Code:
    if ($bot_agent = '3') {
      return 444;
    }
    I use the following, so a legit user who runs into it can report it or PM me on FB, rather than having the connection dropped outright:
    Code:
    if ($bot_agent = '3') {
      # return 444;
      return https://www.facebook.com/myfbpage;
    }
     
  13. rdan

    rdan Well-Known Member

  14. eva2000

    eva2000 Administrator Staff Member

    I've offloaded mine to Sucuri to handle now :)
     
  15. rdan

    rdan Well-Known Member

    I created a specific note for it :).
    Phrase credits to Google :D
    upload_2017-6-3_15-56-28.png
     
  16. eva2000

    eva2000 Administrator Staff Member

    LOL, didn't know you could do that with Facebook pages!
     
  17. pamamolf

    pamamolf Premium Member Premium Member

    4,077
    427
    83
    May 31, 2014
    Ratings:
    +833
    Local Time:
    10:34 AM
    Nginx-1.25.x
    MariaDB 10.3.x
    Another one we should have on the blocked bots list is Netsparker...

    I think this is the user agent:

    Code:
    Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; Netsparker)
    So can we use this one?

    Code:
        if ($http_user_agent ~ "Netsparker") {
            set $block_user_agents 1;
        }
     
  18. eva2000

    eva2000 Administrator Staff Member

  19. pamamolf

    pamamolf Premium Member Premium Member

    So is it better to use:

    Code:
    botlimit.conf
    or

    Code:
    block.conf
    I think we need to block such requests completely, not just limit them...
     
  20. eva2000

    eva2000 Administrator Staff Member

    See the first post: botlimit.conf can rate limit or block depending on the assigned value of 2 or 3, so it's the more advanced option.
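
    For illustration, a minimal sketch of how a value of 2 or 3 could drive rate limiting versus blocking. It is not copied from the first post: only $bot_agent and the `= '3'` check with `return 444` appear in this thread. That 2 means "rate limit", the $bot_limit_key variable, the zone name, the rate, the burst and the choice of user agents are all assumptions.

```nginx
# http {} context

# Classify user agents: 0 = leave alone, 2 = rate limit, 3 = block.
map $http_user_agent $bot_agent {
    default         0;
    "~*MJ12bot"     2;   # example entry: rate limit
    "~*Netsparker"  3;   # example entry: block outright
}

# Only requests classed as 2 get a non-empty key.
# Requests with an empty key are not counted against the zone.
map $bot_agent $bot_limit_key {
    default  "";
    2        $binary_remote_addr;
}

# Hypothetical zone name and rate.
limit_req_zone $bot_limit_key zone=bots:10m rate=1r/s;

server {
    listen 80;
    server_name example.com;   # hypothetical host

    # Applies only to requests with a non-empty key (class 2).
    limit_req zone=bots burst=5;

    # Class 3: close the connection without a response.
    if ($bot_agent = '3') {
        return 444;
    }
}
```

    With this layout, moving a bot from rate limited to blocked means changing its value in the first map from 2 to 3.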