
WebPerf Block Bad Bots – New Security Feature From KeyCDN

Discussion in 'All Internet & Web Performance News' started by eva2000, Mar 10, 2016.

    KeyCDN is always looking for ways to improve its service, so we are excited to announce a new security feature: the ability to block bad bots. It lets customers save on bandwidth costs by stopping bad bots, spiders, and scrapers from crawling their CDN assets. The feature is now available to all customers and can be enabled from the KeyCDN dashboard. No more bots draining your credits!


    Bad Bots


    When it comes to the web, there are good bots and bad bots. An example of a good bot is Googlebot, Google’s web crawler, which crawls new content and adds it to Google’s search index. An example of a bad bot is Cheesebot. Bad bots can include spiders, crawlers, and scrapers. They are not always malicious, but most of the time there is no need for them to crawl your site. They consume your CDN bandwidth, take up server resources, and can steal your content. You can see a more comprehensive list of bots at botreports.com.


    Typically you can ask bots to stay away with your robots.txt file (which you can edit from the KeyCDN dashboard). However, not all robots honor this file, so reliable blocking has to happen at the server level. KeyCDN uses a comprehensive list of known bad bots and blocks them based on their User-Agent string. We have had this implemented in our own environment for a while now, and we wanted to open it up to all KeyCDN customers so that everyone can benefit from it.
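
    As a rough illustration of User-Agent based blocking, here is a minimal Python sketch. It is not KeyCDN's implementation: the blocklist entries are simply the example names used elsewhere in this post, and the function name status_for_user_agent is made up for the example. It returns the 451 status that this post says blocked bots receive.

```python
import re

# Hypothetical blocklist for illustration; KeyCDN's real list is not shown in this post.
BAD_BOT_PATTERNS = ["agent1", "Cheesebot", "Catall Spider"]

# One case-insensitive regex; re.escape keeps spaces and punctuation literal.
_BAD_BOT_RE = re.compile(
    "|".join(re.escape(p) for p in BAD_BOT_PATTERNS), re.IGNORECASE
)


def status_for_user_agent(user_agent):
    """Return 451 for a blocklisted User-Agent string, otherwise 200."""
    if user_agent and _BAD_BOT_RE.search(user_agent):
        return 451
    return 200


print(status_for_user_agent("Mozilla/5.0 (compatible; Googlebot/2.1)"))  # 200
print(status_for_user_agent("cheesebot/1.0"))  # 451
```

    Matching on a substring of the header is simple, but a bot can evade it by sending a different User-Agent, so this kind of list needs regular maintenance.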

    How to Enable the Block Bad Bots Feature


    The block bad bots feature is enabled by default on new zones. You can enable it on your existing zones by following the steps below.

    1. Log in to the KeyCDN dashboard and click into zones.
    2. Click “Edit” on the zone you want to enable this new feature on.
    3. Select “Show Advanced Features.”
    4. Scroll down to “Block Bad Bots” and select “enabled.” Then make sure to save your changes.

    451 HTTP Error Status Code


    When a blocked bot hits our edge servers, a 451 HTTP status code is returned. Don’t forget that you can run a live tail on a single zone or your whole account using our real-time logs, for example with the filter {"zone":"yourzonename","status":"451"}. The HTTP 451 status code was approved by the IESG on December 18, 2015 and is intended for cases where access to a resource is denied for legal reasons, e.g. censorship or government-mandated blocking. We chose 451 rather than 403, 404, or 405 because those codes are commonly relied on for troubleshooting, and we thought it best to keep bot blocks separate from them. Read our more in-depth post on analyzing CDN traffic to your website.
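
    If you export log entries and want to apply the same kind of filter offline, a sketch along these lines may help. It assumes one JSON object per line with "zone" and "status" fields, as in the filter above; the "user_agent" field and the sample lines are invented for the example and may not match the real log format.

```python
import json


def matches(record, wanted):
    """True when every key in `wanted` has the same value (as text) in `record`."""
    return all(str(record.get(key)) == str(value) for key, value in wanted.items())


def filter_log_lines(lines, wanted):
    """Yield parsed records from JSON log lines that match the filter."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(record, dict) and matches(record, wanted):
            yield record


# Made-up sample lines; only "zone" and "status" come from the filter in the post.
sample = [
    '{"zone": "yourzonename", "status": "451", "user_agent": "Cheesebot"}',
    '{"zone": "yourzonename", "status": "200", "user_agent": "Mozilla/5.0"}',
    '{"zone": "otherzone", "status": "451", "user_agent": "agent1"}',
]
wanted = json.loads('{"zone":"yourzonename","status":"451"}')
blocked = list(filter_log_lines(sample, wanted))
print(len(blocked))  # 1
```

    Comparing values as text means the filter matches whether the status is logged as the string "451" or the number 451.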

    Blocking Bad Bots on Your Origin Server


    The new feature above only blocks bad bots on your CDN assets. You can also block bad bots from accessing your origin server. To block multiple User-Agent strings at once on Apache, you could add the following to your .htaccess file.

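    # Requires mod_rewrite
    RewriteEngine On
    # Quote the pattern because it contains a space; NC makes the match case-insensitive
    RewriteCond %{HTTP_USER_AGENT} "(agent1|Cheesebot|Catall Spider)" [NC]
    # F returns 403 Forbidden and implies L
    RewriteRule .* - [F]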

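    Alternatively, you can use the BrowserMatchNoCase directive (from mod_setenvif) like this:

    BrowserMatchNoCase "agent1" bots
    BrowserMatchNoCase "Cheesebot" bots
    BrowserMatchNoCase "Catall Spider" bots

    # Apache 2.2 access-control syntax; Apache 2.4 needs mod_access_compat for these directives
    Order Allow,Deny
    Allow from ALL
    Deny from env=bots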

    And here is an example on Nginx.

    # ~* is a case-insensitive match; the quotes are required because the pattern contains a space
    if ($http_user_agent ~* "(agent1|Cheesebot|Catall Spider)") {
        return 403;
    }
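
    To check that an origin rule like the ones above behaves as expected, you can send a request with a chosen User-Agent and look at the status code. The following is a small sketch using only the Python standard library; the function name fetch_status and the URL in the usage comment are placeholders, not part of any KeyCDN tooling.

```python
import urllib.error
import urllib.request


def fetch_status(url, user_agent, timeout=10):
    """Request `url` with the given User-Agent and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx/5xx responses; the code is what we want.
        return error.code


# Example usage (placeholder URL, replace with your own origin):
#   fetch_status("https://origin.example.com/", "Cheesebot/1.0")  # expect 403 if blocked
#   fetch_status("https://origin.example.com/", "Mozilla/5.0")    # expect 200
```

    Testing with both a blocked and a normal User-Agent helps confirm that the rule does not also lock out regular visitors.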


    If you are running a popular CMS, there are also extensions and plugins available that can be used to block bots; see our security guides.

    Summary


    KeyCDN is committed to giving you further ways to decrease your bandwidth costs while providing additional security. We are excited to open up this new feature to all customers. If you have any questions, please feel free to comment below or join us in the community for a longer discussion on blocking bad bots.

    The post Block Bad Bots – New Security Feature From KeyCDN appeared first on KeyCDN Blog.
