
Nginx Benchmarks After CentOS Linux Kernel KPTI Meltdown & Spectre Fixes

Discussion in 'CentOS, Redhat & Oracle Linux News' started by eva2000, Jan 10, 2018.

Thread Status:
Not open for further replies.
  1. eva2000

    eva2000 Administrator Staff Member

    Most recent Linux Kernel updates address the Meltdown & Spectre vulnerabilities. The first Linux Kernel patch update addressed Meltdown and reportedly can add between 5-30% performance overhead, while others report 5-20%. Phoronix ran benchmarks comparing KPTI + Retpoline Kernel updates on Ubuntu, using Apachebench tests of Nginx and Apache web server requests. So I decided to run my own tests on a CentOS 7.4 64bit OVH dedicated server running the Centmin Mod 123.09beta01 branch LEMP stack with Nginx 1.13.8, to see how the updated CentOS 7.4 3.10.0-693.11.6 Linux Kernel impacts Nginx static file serving performance with PTI enabled vs PTI disabled (no Retpoline Kernel fixes are available in the CentOS Linux Kernel as yet). I used the Centmin Mod installed Siege benchmark v4.0.4 for the tests below.

    CentOS Linux Kernel KPTI Related Tunables



    Redhat/CentOS Kernel updates added a new feature to disable/enable KPTI related fixes so you can test before and after performance etc. Full details here.

    To check the existing status
    Code (Text):
    cat /sys/kernel/debug/x86/pti_enabled
    cat /sys/kernel/debug/x86/ibpb_enabled
    cat /sys/kernel/debug/x86/ibrs_enabled
    

    Only pti_enabled would usually be enabled right now, as the Intel CPU microcode updates needed to enable the other 2 settings aren't available yet.
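
The three checks above can be wrapped in one helper that also copes with a tunable file being absent (for example when debugfs is not mounted or the kernel predates these updates). This is only a sketch: the function name `show_kpti_tunables` and the optional directory argument are assumptions for illustration, not part of the Red Hat/CentOS tooling.

```shell
#!/usr/bin/env bash
# Sketch: print each KPTI-related tunable, or "n/a" when its file is missing.
# The optional first argument (a directory) is a hypothetical convenience;
# it defaults to the debugfs path used in this post.
show_kpti_tunables() {
  local dir="${1:-/sys/kernel/debug/x86}"
  local name value
  for name in pti_enabled ibpb_enabled ibrs_enabled; do
    if [ -r "$dir/$name" ]; then
      value="$(cat "$dir/$name")"
    else
      value="n/a"
    fi
    printf '%s=%s\n' "$name" "$value"
  done
}
```

Run as root, `show_kpti_tunables` prints one `name=value` line per tunable.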

    To disable PTI, IBPB and IBRS
    Code (Text):
    echo 0 > /sys/kernel/debug/x86/pti_enabled
    echo 0 > /sys/kernel/debug/x86/ibpb_enabled
    echo 0 > /sys/kernel/debug/x86/ibrs_enabled
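
When toggling these values repeatedly during testing, a small guard against typos can help. The sketch below assumes the value ranges described in this post (0/1 for pti_enabled, 0/1/2 for ibpb_enabled and ibrs_enabled); the function name `set_kpti_tunable` and its optional directory argument are hypothetical. These are runtime writes to debugfs and should not be expected to survive a reboot.

```shell
#!/usr/bin/env bash
# Sketch: validate the tunable name and value before writing it.
# Usage (as root): set_kpti_tunable pti_enabled 0
set_kpti_tunable() {
  local name="$1" value="$2"
  local dir="${3:-/sys/kernel/debug/x86}"   # hypothetical override for the path
  case "$name" in
    pti_enabled)
      case "$value" in
        0|1) ;;
        *) echo "invalid value for $name: $value" >&2; return 2 ;;
      esac ;;
    ibpb_enabled|ibrs_enabled)
      case "$value" in
        0|1|2) ;;
        *) echo "invalid value for $name: $value" >&2; return 2 ;;
      esac ;;
    *)
      echo "unknown tunable: $name" >&2; return 2 ;;
  esac
  if [ ! -w "$dir/$name" ]; then
    echo "cannot write $dir/$name (not root, or debugfs not mounted?)" >&2
    return 1
  fi
  echo "$value" > "$dir/$name"
}
```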
    


    Test System


    • OVH MC-32 Intel Core i7 4790K
    • 32GB Memory
    • 2x240GB SSD
    • 250Mbit Network Bandwidth
    • CentOS 7.4 64bit
    • Centmin Mod 123.09beta01 LEMP stack - Nginx 1.13.8, MariaDB 10.1.30 MySQL, PHP 7.2.1
    • BHS, Canada
    Nginx Version
    • Nginx 1.13.8 compiled with GCC 7.2.1 against OpenSSL 1.1.0g

    Test Parameters



    Siege version
    Code (Text):
    siege -V
    SIEGE 4.0.4
    
    Copyright (C) 2017 by Jeffrey Fulmer, et al.
    This is free software; see the source for copying conditions.
    There is NO warranty; not even for MERCHANTABILITY or FITNESS
    FOR A PARTICULAR PURPOSE.
    

    Test is against http://localhost which is the default Centmin Mod Nginx HTML page at /usr/local/nginx/html/
    • restart nginx
    • clear caches
    • run the Siege 4.0.4 benchmark
    Code (Text):
    ngxrestart
    sync && echo 3 > /proc/sys/vm/drop_caches
    siege -b -d1s -c200 -t30s http://localhost
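
Since the results in the summary are averaged over 5 runs, the restart / clear caches / benchmark sequence can be looped. The following is a minimal sketch of that idea, not the exact script used for this post: the function names `log_name`, `one_run` and `run_series` and the `-runN` log naming are assumptions, and `ngxrestart` is the Centmin Mod shortcut used above.

```shell
#!/usr/bin/env bash
# Sketch: repeat the benchmark sequence N times, one log file per run.

# Build a log file name from a label and run number (hypothetical naming).
log_name() {
  printf 'siege-%s-run%s.log\n' "$1" "$2"
}

# One full iteration: restart nginx, clear caches, run siege.
one_run() {
  local label="$1" run="$2"
  ngxrestart
  sync && echo 3 > /proc/sys/vm/drop_caches
  siege -b -d1s -c200 -t30s -m "$label" http://localhost \
    | tee "$(log_name "$label" "$run")"
}

# Run the sequence the requested number of times (default 5).
run_series() {
  local label="$1" runs="${2:-5}" i=1
  while [ "$i" -le "$runs" ]; do
    one_run "$label" "$i"
    i=$((i + 1))
  done
}
```

For example, `run_series pti-enabled 5` would perform five consecutive runs as root.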
    


    Redhat/CentOS Kernel Page Table Isolation Tunables



    Default CentOS 7.4 PTI, IBPB and IBRS Linux Kernel Tunables as outlined here.
    • Page Table Isolation (pti) - “nopti”/pti_enabled controls the Kernel Page Table Isolation feature, which isolates kernel pagetables when running in userland. This feature addresses CVE-2017-5754, also called variant #3, or Meltdown.
    • Indirect Branch Restricted Speculation (ibrs) - “noibrs”/ibrs_enabled controls the IBRS feature in the SPEC_CTRL model-specific register (MSR) when SPEC_CTRL is present in cpuid (post microcode update). When ibrs_enabled is set to 1, the kernel runs with indirect branch restricted speculation, which protects the kernel space from attacks (even from hyperthreading/simultaneous multi-threading attacks). When ibrs_enabled is set to 2, both userland and kernel run with indirect branch restricted speculation. This protects userspace from hyperthreading/simultaneous multi-threading attacks as well, and is also the default on AMD processors (family 10h, 12h and 16h). This feature addresses CVE-2017-5715, variant #2.
    • Indirect Branch Prediction Barriers (ibpb) - “noibpb”/ibpb_enabled controls the IBPB feature in the PRED_CMD model-specific register (MSR) if either IBPB_SUPPORT or SPEC_CTRL is present in cpuid (post microcode update). When ibpb_enabled is set to 1, an IBPB barrier that flushes the contents of the indirect branch prediction is run across user mode or guest mode context switches to prevent user and guest mode from attacking other applications or virtual machines on the same host. In order to protect virtual machines from other virtual machines, ibpb_enabled=1 is needed even if ibrs_enabled is set to 2. If ibpb_enabled is set to 2, indirect branch prediction barriers are used instead of IBRS at all kernel and hypervisor entry points (in fact, this setting also forces ibrs_enabled to 0). ibpb_enabled=2 is the default on CPUs that don’t have the SPEC_CTRL feature but only IBPB_SUPPORT. ibpb_enabled=2 doesn’t protect the kernel against attacks based on simultaneous multi-threading (SMT, also known as hyperthreading); therefore, ibpb_enabled=2 provides less complete protection unless SMT is also disabled. This feature addresses CVE-2017-5715, variant #2.
    Code (Text):
    cat /sys/kernel/debug/x86/pti_enabled
    1
    
    cat /sys/kernel/debug/x86/ibpb_enabled
    0
    
    cat /sys/kernel/debug/x86/ibrs_enabled
    0
    

    Only PTI is enabled, as the Intel i7 4790K does not yet have microcode updates from Intel for IBPB and IBRS
    Code (Text):
    journalctl -b --no-pager | grep microcode | sed -e "s|$(hostname)|hostname|g"
    Jan 05 14:44:46 hostname kernel: microcode: microcode updated early to revision 0x22, date = 2017-01-27
    Jan 05 14:44:46 hostname kernel: microcode: CPU0 sig=0x306c3, pf=0x2, revision=0x22
    Jan 05 14:44:46 hostname kernel: microcode: CPU1 sig=0x306c3, pf=0x2, revision=0x22
    Jan 05 14:44:46 hostname kernel: microcode: CPU2 sig=0x306c3, pf=0x2, revision=0x22
    Jan 05 14:44:46 hostname kernel: microcode: CPU3 sig=0x306c3, pf=0x2, revision=0x22
    Jan 05 14:44:46 hostname kernel: microcode: CPU4 sig=0x306c3, pf=0x2, revision=0x22
    Jan 05 14:44:46 hostname kernel: microcode: CPU5 sig=0x306c3, pf=0x2, revision=0x22
    Jan 05 14:44:46 hostname kernel: microcode: CPU6 sig=0x306c3, pf=0x2, revision=0x22
    Jan 05 14:44:46 hostname kernel: microcode: CPU7 sig=0x306c3, pf=0x2, revision=0x22
    Jan 05 14:44:46 hostname kernel: microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
    Jan 05 14:44:47 hostname systemd[1]: Starting Load CPU microcode update...
    Jan 05 14:44:48 hostname systemd[1]: Started Load CPU microcode update.
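
To confirm at a glance that every core reports the same microcode revision, the per-CPU lines can be collapsed into a count per revision. This is a sketch that assumes the `revision=0x..` format shown in the log above; the function name `microcode_revisions` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read kernel log lines on stdin and print each distinct
# per-CPU microcode revision with the number of cores reporting it.
microcode_revisions() {
  grep -o 'revision=0x[0-9a-fA-F]*' \
    | sort \
    | uniq -c \
    | awk '{ print $2 " cores=" $1 }'
}
```

Usage would be `journalctl -b --no-pager | microcode_revisions`. More than one output line would mean the cores disagree on the loaded revision.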
    


    Tests



    Siege Benchmark with PTI Disabled

    Code (Text):
    echo 0 > /sys/kernel/debug/x86/pti_enabled
    ngxrestart
    sync && echo 3 > /proc/sys/vm/drop_caches
    comment='d1s-c200-t30s-pti-disabled'
    siege -b -d1s -c200 -t30s -m $comment http://localhost | tee "siege-${comment}-$(date +"%d%m%y-%H%M%S").log"
    


    Siege Benchmark with PTI Enabled

    Code (Text):
    echo 1 > /sys/kernel/debug/x86/pti_enabled
    ngxrestart
    sync && echo 3 > /proc/sys/vm/drop_caches
    comment='d1s-c200-t30s-pti-enabled'
    siege -b -d1s -c200 -t30s -m $comment  http://localhost | tee "siege-${comment}-$(date +"%d%m%y-%H%M%S").log"
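
With several log files per setting, the request rate can be averaged straight from the logs. This sketch assumes each log contains siege's `Transaction rate: N trans/sec` summary line; if that line is missing from a log, the file is simply skipped. The function name `avg_transaction_rate` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: average the "Transaction rate" value across one or more siege logs.
avg_transaction_rate() {
  if [ "$#" -eq 0 ]; then
    echo "usage: avg_transaction_rate LOGFILE..." >&2
    return 2
  fi
  # The numeric value is the second-to-last field on the matching line.
  awk '/Transaction rate:/ { sum += $(NF - 1); n++ }
       END { if (n) printf "%.2f\n", sum / n; else print "n/a" }' "$@"
}
```

For example, `avg_transaction_rate siege-d1s-c200-t30s-pti-enabled-*.log` prints a single averaged figure for the PTI enabled runs.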
    


    ovh-4790k-nginx1138-test1.png

    chart-ovh-4790k-nginx1138-test1.png

    Summary


    • The negative performance impact for KPTI + Retpoline Kernel that Phoronix reported for Nginx web serving on Ubuntu OS was between 21-26% on Intel i9 7980XE, Intel E3-1280v5 and Intel i7 6800K.
    • Performance impact for my Siege benchmark results on CentOS with Intel i7 4790K was around 5.5% (average of 5 runs) for request rate and around 4.08% for longest transaction time. However, Centmin Mod Nginx users can regain that lost performance and still obtain better than default out of the box Nginx and LEMP stack performance (up to 40-900% better) by implementing the configuration tips outlined in Insight Guide - How to boost Centmin Mod LEMP stack performance.
    • I suspect one of the reasons why the Centmin Mod Nginx performance impact was lower is that Centmin Mod Nginx doesn't use the default system glibc memory allocator but is configured to use jemalloc. MariaDB folks benchmarked the KPTI overhead for MariaDB MySQL and found the largest performance impact for them was with default glibc. Using tcmalloc or jemalloc reduced the KPTI overhead and performance impact for MySQL read only workloads from a peak of around 12.6% to 2.5%, and for MySQL read/write workloads from a peak of around 27.65% to 2.69%.
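
The percentage figures above are relative differences between the PTI disabled and PTI enabled averages. A minimal sketch of that arithmetic for a "higher is better" metric such as request rate follows; the function name `pct_overhead` is hypothetical and the numbers in the usage note are made up for illustration, not taken from my results.

```shell
#!/usr/bin/env bash
# Sketch: percentage drop of a "higher is better" metric.
#   $1 = baseline value (e.g. average request rate with PTI disabled)
#   $2 = test value     (e.g. average request rate with PTI enabled)
pct_overhead() {
  awk -v base="$1" -v test="$2" 'BEGIN {
    if (base == 0) { print "n/a"; exit }
    printf "%.2f\n", (base - test) / base * 100
  }'
}
```

With illustrative values, `pct_overhead 1000 945` prints `5.50`. For a "lower is better" metric such as longest transaction time, the two arguments would need to be swapped or the sign read accordingly.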
     