
Sysadmin Compressed Log Files Rotation With Facebook zstd For Smaller Log Sizes

Discussion in 'System Administration' started by eva2000, Jan 1, 2019.

Thread Status:
Not open for further replies.
  1. eva2000

    Centmin Mod LEMP stack sets up automatic log rotation for all its server logs, including Nginx, PHP-FPM and MySQL. Rotated logs are usually compressed with gzip at its default level 6. On high-traffic servers, though, the logs grow very large and take up a lot of disk space; some sites have logs in the hundreds of gigabytes. An alternative is to switch from gzip to Facebook's zstd, which compresses files both better and faster than gzip. You can see my zstd vs gzip vs bzip2 vs xz vs brotli benchmarks here & here, as well as how tar + zstd produces smaller compressed backups here.

    On an example server, I have the following logrotate profiles set up in /etc/logrotate.d/:
    Code (Text):
    ls -lAh /etc/logrotate.d/      
    total 72K
    -rw-r--r--  1 root root  122 Jul 26 14:39 amplify-agent
    -rw-r--r--  1 root root  160 Sep 15  2017 chrony
    -rw-r--r--  1 root root  187 May 10  2017 cyrus-imapd
    -rw-r--r--. 1 root root  172 Aug  6  2017 lfd
    -rw-r--r--. 1 root root  452 Apr  4  2017 mysql
    -rw-r--r--. 1 root root  603 Apr  4  2017 mysql-slowlog
    -rw-r-----  1 root named 514 Oct 31 00:29 named
    -rw-r--r--. 1 root root  512 Jan  1 06:50 nginx
    -rw-r--r--  1 root root  237 Dec  5 13:08 php70-php-fpm
    -rw-r--r--  1 root root  237 Dec  8 12:33 php71-php-fpm
    -rw-r--r--  1 root root  237 Dec 18 13:12 php72-php-fpm
    -rw-r--r--  1 root root  237 Dec 18 16:10 php73-php-fpm
    -rw-r--r--. 1 root root  316 Jan 15  2018 php-fpm
    -rw-r--r--  1 root root  136 Sep  9  2004 ppp
    -rw-r--r--. 1 root root   66 Dec 23  2015 pure-ftpd
    -rw-r--r--  1 root root  127 Mar 23  2017 redis
    -rw-r--r--  1 root root  224 Oct 30 14:49 syslog
    -rw-r--r--  1 root root  103 Nov  5 01:53 yum
    
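    Before editing anything, it can help to see which of these profiles still rely on logrotate's default gzip. The following is a minimal sketch, not part of Centmin Mod: the function name list_gzip_profiles is made up for illustration, and it only looks at the profile files themselves, so a compress setting inherited from /etc/logrotate.conf would not be detected.

```shell
#!/usr/bin/env bash
# List logrotate profiles that enable "compress" but do not set "compresscmd",
# i.e. profiles that would still be compressed with the default gzip.
# The directory is an argument so the function can be tried on a copy first.
list_gzip_profiles() {
    local dir="${1:-/etc/logrotate.d}"
    local f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # a bare "compress" line, and no "compresscmd ..." line
        if grep -Eq '^[[:space:]]*compress[[:space:]]*$' "$f" &&
           ! grep -Eq '^[[:space:]]*compresscmd[[:space:]]' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

list_gzip_profiles /etc/logrotate.d
```

    Each path printed is a candidate for the zstd settings described further down.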

    To switch logrotate from gzip to zstd compression, you first need to have zstd installed. Users of Centmin Mod 123.09beta01 and higher can install zstd via centmin.sh menu option 17, which installs all the multi-threaded compression tools.
    Code (Text):
    --------------------------------------------------------
        Centmin Mod Menu 123.09beta01 centminmod.com
    --------------------------------------------------------
    1).  Centmin Install
    2).  Add Nginx vhost domain
    3).  NSD setup domain name DNS
    4).  Nginx Upgrade / Downgrade
    5).  PHP Upgrade / Downgrade
    6).  XCache Re-install
    7).  APC Cache Re-install
    8).  XCache Install
    9).  APC Cache Install
    10). Memcached Server Re-install
    11). MariaDB MySQL Upgrade & Management
    12). Zend OpCache Install/Re-install
    13). Install/Reinstall Redis PHP Extension
    14). SELinux disable
    15). Install/Reinstall ImagicK PHP Extension
    16). Change SSHD Port Number
    17). Multi-thread compression: pigz,pbzip2,lbzip2...
    18). Suhosin PHP Extension install
    19). Install FFMPEG and FFMPEG PHP Extension
    20). NSD Install/Re-Install
    21). Update - Nginx + PHP-FPM + Siege
    22). Add Wordpress Nginx vhost + Cache Plugin
    23). Update Centmin Mod Code Base
    24). Exit
    --------------------------------------------------------
    Enter option [ 1 - 24 ] 17
    

    Code (Text):
    zstd -V
    *** zstd command line interface 64-bits v1.3.8, by Yann Collet ***
    
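    The logrotate settings below hard-code /usr/local/bin/zstd and /usr/local/bin/unzstd, so it is worth confirming where the tools actually live on your system. This is a small sketch under that assumption; tool_path is a hypothetical helper name, and if zstd was installed some other way the paths may differ (for example /usr/bin).

```shell
#!/usr/bin/env bash
# Print the full path of a tool, or a message on stderr if it is missing,
# so compresscmd/uncompresscmd can be matched to the real install location.
tool_path() {
    local tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        command -v "$tool"
    else
        echo "$tool: not found" >&2
        return 1
    fi
}

for tool in zstd unzstd zstdcat zstdgrep; do
    tool_path "$tool" || true
done
```

    If the printed paths are not under /usr/local/bin, adjust the compresscmd and uncompresscmd lines accordingly.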


    Modify Log Rotation To Use Zstd Compression



    Then edit your logrotate profiles and add the following settings right after the delaycompress setting:
    Code (Text):
    compresscmd /usr/local/bin/zstd
    uncompresscmd /usr/local/bin/unzstd
    compressoptions -9 --long -T0
    compressext .zst
    

    • Compression level 9 is used for zstd, which produces smaller files than both gzip's default level 6 and its maximum level 9. If you want even smaller compressed logs at roughly gzip's compression speed, you can change level 9 (-9) to level 12 (-12).
    • --long enables zstd's long range mode for better compression ratios, as outlined here. By default --long uses a 128MB window. If you have more than 2GB of free usable memory, you can reduce the compressed log size further by changing --long to --long=31, which gives zstd a 2GB window to work with.
    • -T0 tells zstd to use multi-threaded compression with a thread count equal to the number of CPU cores available.
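    To judge whether level 12 or --long=31 is worth it for your own logs, you can measure the compressed size for each option set before changing any profile. A rough sketch, assuming zstd is on the PATH; compare_zstd_options is an illustrative name and the three option sets are just the ones discussed above.

```shell
#!/usr/bin/env bash
# Print the compressed size in bytes of one file for several zstd option sets.
# Output is streamed to wc, so no compressed files are left behind.
compare_zstd_options() {
    local f="$1" opts
    for opts in "-9 --long -T0" "-12 --long -T0" "-12 --long=31 -T0"; do
        # $opts is deliberately unquoted so it splits into separate flags
        printf '%-20s %s bytes\n' "$opts" "$(zstd -q -c $opts "$f" | wc -c)"
    done
}

sample=/var/log/nginx/localhost.access.log
if command -v zstd >/dev/null 2>&1 && [ -r "$sample" ]; then
    compare_zstd_options "$sample"
fi
```

    Small logs will show little difference between the option sets; the long range window mostly pays off on large files.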
    For example, my nginx logrotate profile /etc/logrotate.d/nginx becomes:
    Code (Text):
    /var/log/nginx/*.log /usr/local/nginx/logs/*.log /home/nginx/domains/*/log/*.log {
            daily
            dateext
            missingok
            rotate 10
            maxsize 500M
            compress
            delaycompress
            compresscmd /usr/local/bin/zstd
            uncompresscmd /usr/local/bin/unzstd
            compressoptions -9 --long -T0
            compressext .zst
            notifempty
            postrotate
            /bin/kill -SIGUSR1 $(cat /usr/local/nginx/logs/nginx.pid 2>/dev/null) 2>/dev/null || true
            endscript
    }
    
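    With many profiles, the same edit can be scripted instead of made by hand. The sketch below is one possible approach rather than anything Centmin Mod ships: add_zstd_settings is a hypothetical helper that prints a patched copy to stdout and leaves the original file alone, so the result can be reviewed before replacing anything.

```shell
#!/usr/bin/env bash
# Print a logrotate profile with the four zstd settings inserted after every
# "delaycompress" line. The input file itself is not modified.
add_zstd_settings() {
    # already patched (compresscmd/uncompresscmd present): print unchanged
    if grep -q 'compresscmd' "$1"; then
        cat "$1"
        return 0
    fi
    awk '
        { print }
        /^[ \t]*delaycompress[ \t]*$/ {
            # reuse the indentation of the delaycompress line
            indent = $0
            sub(/delaycompress.*/, "", indent)
            print indent "compresscmd /usr/local/bin/zstd"
            print indent "uncompresscmd /usr/local/bin/unzstd"
            print indent "compressoptions -9 --long -T0"
            print indent "compressext .zst"
        }
    ' "$1"
}

# Preview the patched nginx profile without touching the original.
if [ -r /etc/logrotate.d/nginx ]; then
    add_zstd_settings /etc/logrotate.d/nginx
fi
```

    Profiles without a delaycompress line are printed unchanged, so those still need a manual edit.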

    You can then run logrotate in debug mode to see what it would do with the nginx profile. This is a dry run only; no logs are actually rotated.
    Code (Text):
    logrotate -df /etc/logrotate.d/nginx
    

    The first few lines of output show the compression program being picked up: zstd at level 9, with the .zst extension for compressed log files.
    Code (Text):
    logrotate -df /etc/logrotate.d/nginx
    reading config file /etc/logrotate.d/nginx
    compress_prog is now /usr/local/bin/zstd
    uncompress_prog is now /usr/local/bin/unzstd
    compress_options is now  -9 --long -T0
    compress_ext is now .zst
    Allocating hash table for state file, size 15360 B
    
    Handling 1 logs
    
    rotating pattern: /var/log/nginx/*.log /usr/local/nginx/logs/*.log /home/nginx/domains/*/log/*.log  forced from command line (10 rotations)
    empty log files are not rotated, log files >= 524288000 are rotated earlier, old logs are removed
    considering log /var/log/nginx/cfcomp-access.log
      log needs rotating
    considering log /var/log/nginx/localhost.access.log
      log needs rotating
    considering log /var/log/nginx/localhost.error.log
      log needs rotating
    considering log /usr/local/nginx/logs/access.log
      log does not need rotating (log is empty)considering log /usr/local/nginx/logs/error.log
      log does not need rotating (log is empty)considering log /home/nginx/domains/cpdomain.com/log/access.log
      log does not need rotating (log is empty)considering log /home/nginx/domains/cpdomain.com/log/error.log
      log does not need rotating (log is empty)considering log /home/nginx/domains/demodomain.com/log/access.log
      log does not need rotating (log is empty)considering log /home/nginx/domains/demodomain.com/log/error.log
      log does not need rotating (log is empty)considering log /home/nginx/domains/domain9.com/log/access.log
      log does not need rotating (log is empty)considering log /home/nginx/domains/domain9.com/log/error.log
      log does not need rotating (log is empty)considering log
    
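    On a server with many domains the debug output gets long, so it can be handy to filter it down to just the compression settings. A minimal sketch; show_compress_settings is an illustrative name, and the 2>&1 is there because logrotate typically writes its debug messages to stderr.

```shell
#!/usr/bin/env bash
# Keep only the lines of logrotate debug output that report the compressor,
# its options and the file extension.
show_compress_settings() {
    grep -E '^compress_(prog|options|ext) is now|^uncompress_prog is now'
}

if command -v logrotate >/dev/null 2>&1 && [ -r /etc/logrotate.d/nginx ]; then
    # -d is a dry run: nothing is rotated
    logrotate -d /etc/logrotate.d/nginx 2>&1 | show_compress_settings ||
        echo "no custom compressor configured"
fi
```

    Four lines matching the zstd settings mean the profile was parsed as intended.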

    To actually force a rotation of the logs, run the command without the -d debug flag:
    Code (Text):
    logrotate -fv /etc/logrotate.d/nginx
    

    Checking /var/log/nginx for the Nginx main hostname logs, you'll see that the most recent rotation has the .zst extension for zstd compressed files:
    Code (Text):
    ls -lah /var/log/nginx
    total 1.8M
    drwxr-xr-x.  2 root  root 4.0K Jan  1 07:09 .
    drwxr-xr-x. 16 root  root 4.0K Jan  1 03:29 ..
    -rw-r--r--   1 nginx root 179K Jan  1 07:09 cfcomp-access.log
    -rw-r--r--   1 nginx root  74K Dec 23 03:31 cfcomp-access.log-20181223.gz
    -rw-r--r--   1 nginx root 135K Dec 24 03:36 cfcomp-access.log-20181224.gz
    -rw-r--r--   1 nginx root  75K Dec 25 03:47 cfcomp-access.log-20181225.gz
    -rw-r--r--   1 nginx root  85K Dec 26 03:20 cfcomp-access.log-20181226.gz
    -rw-r--r--   1 nginx root  59K Dec 27 03:40 cfcomp-access.log-20181227.gz
    -rw-r--r--   1 nginx root  74K Dec 28 03:18 cfcomp-access.log-20181228.gz
    -rw-r--r--   1 nginx root  60K Dec 29 03:31 cfcomp-access.log-20181229.gz
    -rw-r--r--   1 nginx root  76K Dec 30 03:42 cfcomp-access.log-20181230.gz
    -rw-r--r--   1 nginx root  83K Dec 31 03:48 cfcomp-access.log-20181231.gz
    -rw-r--r--   1 nginx root  72K Jan  1 03:29 cfcomp-access.log-20190101.zst
    -rw-rw----   1 nginx root 180K Jan  1 07:09 localhost.access.log
    -rw-rw----   1 nginx root  47K Dec 23 03:31 localhost.access.log-20181223.gz
    -rw-rw----   1 nginx root  83K Dec 24 03:36 localhost.access.log-20181224.gz
    -rw-rw----   1 nginx root  47K Dec 25 03:47 localhost.access.log-20181225.gz
    -rw-rw----   1 nginx root  54K Dec 26 03:20 localhost.access.log-20181226.gz
    -rw-rw----   1 nginx root  38K Dec 27 03:40 localhost.access.log-20181227.gz
    -rw-rw----   1 nginx root  47K Dec 28 03:18 localhost.access.log-20181228.gz
    -rw-rw----   1 nginx root  39K Dec 29 03:31 localhost.access.log-20181229.gz
    -rw-rw----   1 nginx root  47K Dec 30 03:42 localhost.access.log-20181230.gz
    -rw-rw----   1 nginx root  52K Dec 31 03:48 localhost.access.log-20181231.gz
    -rw-rw----   1 nginx root  46K Jan  1 03:29 localhost.access.log-20190101.zst
    -rw-rw----   1 nginx root  962 Jan  1 07:08 localhost.error.log
    -rw-rw----   1 nginx root 1.5K Dec 23 02:27 localhost.error.log-20181223.gz
    -rw-rw----   1 nginx root 5.8K Dec 24 03:34 localhost.error.log-20181224.gz
    -rw-rw----   1 nginx root 1.8K Dec 25 01:33 localhost.error.log-20181225.gz
    -rw-rw----   1 nginx root 1.2K Dec 26 02:21 localhost.error.log-20181226.gz
    -rw-rw----   1 nginx root 1.6K Dec 26 22:20 localhost.error.log-20181227.gz
    -rw-rw----   1 nginx root 1.3K Dec 28 02:34 localhost.error.log-20181228.gz
    -rw-rw----   1 nginx root 1.8K Dec 28 23:33 localhost.error.log-20181229.gz
    -rw-rw----   1 nginx root 1.8K Dec 30 02:52 localhost.error.log-20181230.gz
    -rw-rw----   1 nginx root 1.7K Dec 31 01:53 localhost.error.log-20181231.gz
    -rw-rw----   1 nginx root 2.5K Jan  1 03:19 localhost.error.log-20190101.zst
    

    The same applies to domain9.com's logs:
    Code (Text):
    ls -lah /home/nginx/domains/domain9.com/log
    total 160K
    drwxr-s--- 2 nginx nginx 4.0K Jan  1 07:09 .
    drwxr-s--- 6 nginx nginx 4.0K May 29  2017 ..
    -rw-r--r-- 1 nginx nginx  60K Jan  1 06:41 access.log
    -rw-r--r-- 1 nginx nginx 6.5K Dec 23 03:31 access.log-20181223.gz
    -rw-r--r-- 1 nginx nginx 2.3K Dec 24 03:36 access.log-20181224.gz
    -rw-r--r-- 1 nginx nginx 2.3K Dec 25 03:47 access.log-20181225.gz
    -rw-r--r-- 1 nginx nginx 1.5K Dec 26 03:16 access.log-20181226.gz
    -rw-r--r-- 1 nginx nginx 3.7K Dec 27 03:40 access.log-20181227.gz
    -rw-r--r-- 1 nginx nginx 3.5K Dec 28 03:18 access.log-20181228.gz
    -rw-r--r-- 1 nginx nginx 3.8K Dec 29 02:46 access.log-20181229.gz
    -rw-r--r-- 1 nginx nginx 1.6K Dec 30 03:42 access.log-20181230.gz
    -rw-r--r-- 1 nginx nginx  990 Dec 31 02:11 access.log-20181231.gz
    -rw-r--r-- 1 nginx nginx 4.9K Jan  1 03:29 access.log-20190101.zst
    -rw-r--r-- 1 nginx nginx  673 Jan  1 04:39 error.log
    -rw-r--r-- 1 nginx nginx  258 Dec 21 21:52 error.log-20181222.gz
    -rw-r--r-- 1 nginx nginx  797 Dec 22 13:59 error.log-20181223.gz
    -rw-r--r-- 1 nginx nginx 1000 Dec 23 21:24 error.log-20181224.gz
    -rw-r--r-- 1 nginx nginx  304 Dec 24 12:25 error.log-20181225.gz
    -rw-r--r-- 1 nginx nginx  434 Dec 27 02:56 error.log-20181227.gz
    -rw-r--r-- 1 nginx nginx  382 Dec 28 02:22 error.log-20181228.gz
    -rw-r--r-- 1 nginx nginx  382 Dec 28 21:06 error.log-20181229.gz
    -rw-r--r-- 1 nginx nginx  376 Dec 29 12:38 error.log-20181230.gz
    -rw-r--r-- 1 nginx nginx  416 Dec 30 22:58 error.log-20181231.gz
    -rw-r--r-- 1 nginx nginx  637 Dec 31 15:00 error.log-20190101.zst
    
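    After the first forced rotation, you may want to confirm that the new .zst files are readable. The sketch below uses zstd's test mode (-t), which checks a file without writing any output; check_zst_logs is a hypothetical helper, and --long=31 is passed on the assumption that some files may have been made with the 2GB window.

```shell
#!/usr/bin/env bash
# Verify every .zst file in a directory with "zstd -t" (integrity test only).
# Returns non-zero if any file fails.
check_zst_logs() {
    local dir="$1" f status=0
    for f in "$dir"/*.zst; do
        [ -f "$f" ] || continue
        if zstd -t -q --long=31 "$f"; then
            echo "OK   $f"
        else
            echo "FAIL $f"
            status=1
        fi
    done
    return "$status"
}

if command -v zstd >/dev/null 2>&1; then
    check_zst_logs /var/log/nginx || true
fi
```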


    Inspecting Zstd & Gzip Compressed Log Files



    Inspecting compressed .zst logs works the same way as inspecting .gz logs. Instead of uncompressing the entire log, you can use compression-aware equivalents of cat and grep on the command line.
    • For .gz use the zcat and zgrep commands. Centmin Mod 123.09beta01 users also have pzcat and pzgrep, which are multi-threaded versions of zcat and zgrep that use pigz instead of gzip.
    • For .zst use the zstdcat and zstdgrep commands.
    Example for /home/nginx/domains/domain9.com/log compressed logs for:
    • /home/nginx/domains/domain9.com/log/access.log-20181231.gz
    • /home/nginx/domains/domain9.com/log/access.log-20190101.zst
    Inspect them as follows. For example, get the last line of each log by piping zcat or pzcat through the tail command. To view the last 100 lines, use tail -100 instead.
    Code (Text):
    zcat /home/nginx/domains/domain9.com/log/access.log-20181231.gz | tail -1
    
    pzcat /home/nginx/domains/domain9.com/log/access.log-20181231.gz | tail -1
    

    output for zcat
    Code (Text):
    zcat /home/nginx/domains/domain9.com/log/access.log-20181231.gz | tail -1  
    169.xxx.xxx.xxx - - [31/Dec/2018:01:25:26 +0000] "GET / HTTP/1.0" 200 6040 "-" "Mozilla/5.0(WindowsNT6.1;rv:31.0)Gecko/20100101Firefox/31.0"
    

    output for pzcat
    Code (Text):
    pzcat /home/nginx/domains/domain9.com/log/access.log-20181231.gz | tail -1
    169.xxx.xxx.xxx - - [31/Dec/2018:01:25:26 +0000] "GET / HTTP/1.0" 200 6040 "-" "Mozilla/5.0(WindowsNT6.1;rv:31.0)Gecko/20100101Firefox/31.0"
    

    For .zst compressed files, use zstdcat piped through the tail command:
    Code (Text):
    zstdcat /home/nginx/domains/domain9.com/log/access.log-20190101.zst | tail -1
    

    output
    Code (Text):
    zstdcat /home/nginx/domains/domain9.com/log/access.log-20190101.zst | tail -1
    54.xxx.xxx.xxx - - [01/Jan/2019:03:08:18 +0000] "OPTIONS / HTTP/1.1" 405 552 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36"
    
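    zstdgrep works along the same lines, and zstdcat output can be piped into any text tool. The sketch below assumes the access log format shown above, where the HTTP status code is the 9th whitespace-separated field; count_status_codes is an illustrative helper, not a zstd command.

```shell
#!/usr/bin/env bash
# Count requests per HTTP status code in an access log read from stdin.
# Assumes the status code is the 9th whitespace-separated field.
count_status_codes() {
    awk '{ count[$9]++ } END { for (s in count) print s, count[s] }' | sort
}

log=/home/nginx/domains/domain9.com/log/access.log-20190101.zst
if command -v zstdgrep >/dev/null 2>&1 && [ -r "$log" ]; then
    # search the compressed log directly for 405 responses
    zstdgrep '" 405 ' "$log" | tail -1
    # or stream the whole log into a summary
    zstdcat "$log" | count_status_codes
fi
```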

    If the logs were compressed in long mode with a 2GB window (compressoptions -9 --long=31 -T0), reading them back with default settings may fail with a memory error about the window size:
    Code (Text):
    zstdcat cfcomp-access.log-20190101.zst | tail -1
    ess.log-20190101.zst : Decoding error (36) : Frame requires too much memory for decoding
    ess.log-20190101.zst : Window size larger than maximum : 2147483648 > 134217728
    ess.log-20190101.zst : Use --long=31 or --memory=2048MB
    

    In that case, pass the matching flag, i.e. --long=31, when decompressing:
    Code (Text):
    zstdcat --long=31 /home/nginx/domains/domain9.com/log/access.log-20190101.zst | tail -1
    54.xxx.xxx.xxx - - [01/Jan/2019:03:08:18 +0000] "OPTIONS / HTTP/1.1" 405 552 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36"
    
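    If a directory mixes files compressed with and without --long=31, a small wrapper can pick the right invocation for you. This is only a sketch: zst_cat is a made-up name, and it tests the file first with zstd -t, which means each file is read twice.

```shell
#!/usr/bin/env bash
# Print a .zst file to stdout. If the default settings cannot decode it
# (for example because of the window size error), retry with --long=31.
zst_cat() {
    local f="$1"
    if zstd -t -q "$f" 2>/dev/null; then
        zstdcat "$f"
    else
        zstdcat --long=31 "$f"
    fi
}

log=/home/nginx/domains/domain9.com/log/access.log-20190101.zst
if command -v zstdcat >/dev/null 2>&1 && [ -r "$log" ]; then
    zst_cat "$log" | tail -1
fi
```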


    To compare compressed sizes, take the Nginx custom access log cfcomp-access.log-20190101, which is 1.8MB uncompressed. The .gz file was compressed with gzip's default level 6, while the .zst file was compressed with zstd level 9.
    Code (Text):
    -rw-r--r--   1 nginx root 1.8M Jan  1 03:29 cfcomp-access.log-20190101
    -rw-r--r--   1 nginx root 101K Jan  1 03:29 cfcomp-access.log-20190101.gz
    -rw-r--r--   1 nginx root  72K Jan  1 03:29 cfcomp-access.log-20190101.zst
    

    At zstd level 9, the 1.8MB cfcomp-access.log-20190101 Nginx access log compressed to 72KB, compared to 101KB for gzip level 6. Switching from gzip to zstd compression for log rotation resulted in a ~28.7% smaller compressed file!
     
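    The ~28.7% figure is simply (101 - 72) / 101. The same calculation can be sketched as a small helper for your own files; savings_pct is an illustrative name.

```shell
#!/usr/bin/env bash
# Percentage by which the second size is smaller than the first.
savings_pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

# gzip 101KB vs zstd 72KB from the listing above
savings_pct 101 72
```

    For real files, byte counts can be passed in instead, e.g. savings_pct "$(wc -c < file.gz)" "$(wc -c < file.zst)".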