
Easy way to find bots and block them from logs

Discussion in 'System Administration' started by pamamolf, Mar 23, 2015.

  1. pamamolf

    pamamolf Premium Member

    Hi

    I am looking for an easy way to find bots or weird user agents in my log file so I can block them, but my access.log is very big and it's not so easy to search each line :(

    Any other way or any useful grep command?

    Thanks

     
    Last edited: Mar 23, 2015
  2. Steve Tozer

    Steve Tozer Member

    Hello,

    The closest I can get at the moment is the below, which will extract them into the bots.txt file:

    Code:
    grep 'spider\|bot' access.log | sort -u -f >> bots.txt
    Still trying to work out how to print out just the spider/bot name and remove the duplicates.
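    One way to get closer to that, assuming the default nginx combined log format where the user agent is the sixth double-quoted field, would be to pull out just that field before deduplicating (the bot-agents.txt name is just an example):

    Code:
    awk -F'"' '{print $6}' access.log | grep -i 'spider\|bot' | sort -u -f > bot-agents.txt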
     
  3. pamamolf

    pamamolf Premium Member

    Thanks Steve :)
    Is there anything else I can use alongside spider/bot so I can also catch other known bad entries?

    I thought that -u was for unique ......
     
  4. Steve Tozer

    Steve Tozer Member

    I think spider and bot are the most common ones. If you can find a list of other ones, you could add them into the grep command, or even grep the file against a list.
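    For example, the extra keywords could live in a pattern file and be fed to grep with -f; a rough sketch, assuming a hypothetical bot-patterns.txt with one keyword per line (e.g. bot, spider, crawl, slurp):

    Code:
    # match any line containing one of the keywords in bot-patterns.txt, case-insensitively
    grep -i -f bot-patterns.txt access.log | sort -u -f >> bots.txt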

    I have a solution; it's a bit hacky, but it works:

    Code:
    grep 'spider\|bot' access.log | sort -u -f >> bots.txt
    grep -o -E '\w+' bots.txt | sort -u -f >> bots2.txt
    grep 'spider\|bot' bots2.txt
    To put the final results into a file to make it easier:

    Code:
    grep 'spider\|bot' bots2.txt >> botsfinal.txt
    There's probably a cleaner way, but it works ;)
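    Since sort -u only collapses lines that are completely identical, and log lines differ by IP and timestamp, extracting just the matching token first avoids most of the duplicates. A rough single-pipeline sketch (it will also catch words like "robots", so treat the output as a starting point):

    Code:
    # print only the token containing bot/spider, then dedupe case-insensitively
    grep -o -i -E '[a-z0-9]*(bot|spider)[a-z0-9]*' access.log | sort -u -f > botsfinal.txt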
     
    Last edited: Mar 23, 2015
  5. Steve Tozer

    Steve Tozer Member

    You could even make the above into a quick script:

    Code:
    #!/bin/bash
    grep 'spider\|bot' access.log | sort -u -f >> bots.txt
    grep -o -E '\w+' bots.txt | sort -u -f >> bots2.txt
    grep 'spider\|bot' bots2.txt >> botsfinal.txt
    rm -f bots.txt bots2.txt
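    To run it, the script could be saved next to the log and made executable; a minimal example, assuming it is saved as findbots.sh in the same directory as access.log:

    Code:
    chmod +x findbots.sh
    ./findbots.sh
    cat botsfinal.txt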
    For the results in botsfinal.txt that you want to block, you could add them to:
    Code:
    /usr/local/nginx/conf/block.conf
    With an additional entry like:
    Code:
    if ($http_user_agent ~ "Name of user agent") {
        set $block_user_agents 1;
    }
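    If botsfinal.txt holds a lot of names, a small loop could generate those entries instead of typing them by hand; a rough sketch, assuming one user-agent name per line and that block.conf already acts on $block_user_agents:

    Code:
    # append one if-block per user agent name found in botsfinal.txt (illustrative only)
    while read -r agent; do
        printf 'if ($http_user_agent ~ "%s") { set $block_user_agents 1; }\n' "$agent"
    done < botsfinal.txt >> /usr/local/nginx/conf/block.conf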
    Also, as long as the below line is uncommented in your vhost:

    Code:
    # block common exploits, sql injections etc
    include /usr/local/nginx/conf/block.conf;
    Once added to block.conf and that line is uncommented, restart nginx for the changes to take effect. Sorry if that's a bit long-winded :woot:
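    For the restart step, something like the following could be used; a minimal sketch, assuming nginx is on the PATH and managed as a regular service:

    Code:
    # test the configuration first, then restart so the new block rules apply
    nginx -t && service nginx restart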