
Benchmarks Optimizing AMD EPYC governor setting to reduce TTFB

Discussion in 'Dedicated server hosting' started by deltahf, Oct 5, 2021.

  1. deltahf

    deltahf Premium Member Premium Member

    586
    264
    63
    Jun 8, 2014
    Ratings:
    +487
    Local Time:
    11:00 AM
    After taking delivery of my AMD EPYC 7001-series server, I mentioned I was disappointed in the Centmin Mod install time, and @eva2000 noticed I had not tuned the CPU's clock/power management profile.

    I was not familiar with that or how to do it, so I did some research. I will share what I learned here along with some interesting benchmarks.

    This AMD developer document is a tuning guide for EPYC CPUs. See sections 6.5-6.6 regarding the governors:

    http://developer.amd.com/wp-content/resources/56420.pdf

    I confirmed that my server was running under the "conservative" governor at only ~1.2 GHz.

    Code (Text):
    $ cpupower frequency-info
    analyzing CPU 0:
      driver: acpi-cpufreq
      CPUs which run at the same hardware frequency: 0
      CPUs which need to have their frequency coordinated by software: 0
      maximum transition latency:  Cannot determine or is not supported.
      hardware limits: 1.20 GHz - 2.10 GHz
      available frequency steps:  2.10 GHz, 1.70 GHz, 1.20 GHz
      available cpufreq governors: conservative userspace powersave ondemand performance
      current policy: frequency should be within 1.20 GHz and 2.10 GHz.
                      The governor "conservative" may decide which speed to use
                      within this range.
      current CPU frequency: 1.20 GHz (asserted by call to hardware)
      boost state support:
        Supported: yes
        Active: yes
        Boost States: 0
        Total States: 3
        Pstate-P0:  2100MHz
        Pstate-P1:  1700MHz
        Pstate-P2:  1200MHz
    


    Code (Text):
    $ cpupower monitor
        |Mperf               || Idle_Stats
    CPU | C0   | Cx   | Freq || POLL | C1   | C2
       0|  0.12| 99.88|  1198||  0.00|  0.44| 99.45
       8|  0.08| 99.92|  1198||  0.00|  0.28| 99.65
       1|  0.11| 99.89|  1199||  0.00|  0.43| 99.47
       9|  0.05| 99.95|  1199||  0.00|  0.26| 99.70
       2|  0.13| 99.87|  1199||  0.00|  0.52| 99.37
      10|  0.15| 99.85|  1198||  0.00|  0.17| 99.69
       3|  0.96| 99.04|  1199||  0.00|  0.51| 98.55
      11|  0.14| 99.86|  1197||  0.00|  0.26| 99.61
       4|  0.11| 99.89|  1200||  0.00|  0.21| 99.69
      12|  0.09| 99.91|  1197||  0.00|  0.38| 99.57
       5|  0.08| 99.92|  1200||  0.00| 10.54| 89.36
      13|  0.04| 99.96|  1195||  0.00|  7.37| 92.52
       6|  0.13| 99.87|  1199||  0.00|  0.10| 99.78
      14|  0.14| 99.86|  1198||  0.00|  0.27| 99.54
       7|  0.19| 99.81|  1198||  0.00|  1.46| 98.36
      15|  0.21| 99.79|  1199||  0.00|  1.03| 98.77
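
To check every core at once rather than only CPU 0, the governor can also be read from sysfs. This is a minimal sketch, assuming the standard Linux cpufreq sysfs layout; the function name `governor_summary` and its optional root argument are made up for illustration.

```shell
#!/bin/sh
# Summarize which cpufreq governor each CPU is using.
# Optional $1: sysfs CPU root (defaults to the standard location);
# the parameter exists only so the function can be pointed at a test tree.
governor_summary() {
    root="${1:-/sys/devices/system/cpu}"
    # Each cpuN/cpufreq/scaling_governor file holds one governor name.
    cat "$root"/cpu[0-9]*/cpufreq/scaling_governor 2>/dev/null | sort | uniq -c
}
```

Running `governor_summary` prints one line per governor name with the number of CPUs using it, so a mixed state is easy to spot.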
    


    It's very easy to change to performance mode!

    Code (Text):
    $ cpupower frequency-set -g performance
    Setting cpu: 0
    Setting cpu: 1
    Setting cpu: 2
    Setting cpu: 3
    Setting cpu: 4
    Setting cpu: 5
    Setting cpu: 6
    Setting cpu: 7
    Setting cpu: 8
    Setting cpu: 9
    Setting cpu: 10
    Setting cpu: 11
    Setting cpu: 12
    Setting cpu: 13
    Setting cpu: 14
    Setting cpu: 15
    



    Code (Text):
    $ cpupower frequency-info
    analyzing CPU 0:
      driver: acpi-cpufreq
      CPUs which run at the same hardware frequency: 0
      CPUs which need to have their frequency coordinated by software: 0
      maximum transition latency:  Cannot determine or is not supported.
      hardware limits: 1.20 GHz - 2.10 GHz
      available frequency steps:  2.10 GHz, 1.70 GHz, 1.20 GHz
      available cpufreq governors: conservative userspace powersave ondemand performance
      current policy: frequency should be within 1.20 GHz and 2.10 GHz.
                      The governor "performance" may decide which speed to use
                      within this range.
      current CPU frequency: 2.10 GHz (asserted by call to hardware)
      boost state support:
        Supported: yes
        Active: yes
        Boost States: 0
        Total States: 3
        Pstate-P0:  2100MHz
    


    Code (Text):
     $ cpupower monitor
        |Mperf               || Idle_Stats
    CPU | C0   | Cx   | Freq || POLL | C1   | C2
       0|  0.05| 99.95|  2618||  0.00|  0.50| 99.46
       8|  0.04| 99.96|  2657||  0.00|  0.29| 99.68
       1|  0.04| 99.96|  2614||  0.00|  0.32| 99.65
       9|  0.03| 99.97|  2647||  0.00|  0.68| 99.29
       2|  0.17| 99.83|  2783||  0.00|  0.55| 99.29
      10|  0.08| 99.92|  2555||  0.00|  0.28| 99.66
       3|  0.06| 99.94|  2619||  0.00|  0.71| 99.24
      11|  0.03| 99.97|  2641||  0.00|  0.78| 99.19
       4|  0.05| 99.95|  2633||  0.00|  0.59| 99.36
      12|  0.04| 99.96|  2632||  0.00|  0.42| 99.55
       5|  0.04| 99.96|  2633||  0.00| 10.59| 89.38
      13|  0.02| 99.98|  2806||  0.00|  0.37| 99.61
       6|  0.06| 99.94|  2622||  0.00|  0.20| 99.75
      14|  0.08| 99.92|  2559||  0.00|  0.27| 99.66
       7|  0.07| 99.93|  2720||  0.00|  0.62| 99.31
      15|  0.08| 99.92|  2813||  0.00|  0.83| 99.10
    


    Changing this setting had a significant impact on Time-To-First-Byte (TTFB) page generation times for both WordPress and XenForo 2.2.

    TTFB time in milliseconds by page type (PHP 7.4, MariaDB 10.4):

    [Attached image: Screen Shot 2021-10-04 at 3.40.15 PM.png]

    Enabling performance mode reduced TTFB by ~26% on XF pages and ~43% on WordPress pages. It also brought TTFB times below those of my older Intel Xeon E3-1230v5 CPU, even though the Xeon runs at a much higher clock speed than the EPYC (3.4 GHz vs 2.1 GHz). This shows the expected improvement from the more modern EPYC architecture, and of course the EPYC also has double the cores and threads of the E3-1230.
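
For anyone who wants to reproduce this kind of before/after comparison, here is a rough sketch. It assumes `curl` is installed and uses its `time_starttransfer` write-out variable as the TTFB figure. The function names, the URL, and the sample numbers in the usage note are placeholders, not the measurements from the screenshot.

```shell
#!/bin/sh
# Print TTFB in milliseconds for one request to URL $1.
# curl's %{time_starttransfer} is seconds until the first response byte.
ttfb_ms() {
    curl -o /dev/null -s -w '%{time_starttransfer}\n' "$1" |
        awk '{ printf "%.0f\n", $1 * 1000 }'
}

# Percent reduction from a "before" value ($1) to an "after" value ($2).
pct_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}
```

`ttfb_ms https://example.com/` measures a single request; averaging several runs per governor setting gives steadier numbers. `pct_reduction 200 150` prints `25.0`.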

    Huge thanks to @eva2000 for mentioning the governor tuning. I know it is common knowledge for you professional sysadmins, but this is my first AMD server and I wasn't aware of it!

    Now I need to start tuning WordPress to get those TTFB times down even more...
     
  2. rdan

    rdan Well-Known Member

    5,444
    1,408
    113
    May 25, 2014
    Ratings:
    +2,201
    Local Time:
    12:00 AM
    Mainline
    10.2
    What if you try tuned?
    Code:
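    # Install tuned, then enable and start the service
    yum install tuned -y
    systemctl enable --now tuned
    
    # Show the currently active profile, then switch to latency-performance
    tuned-adm active
    tuned-adm profile latency-performance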
     
  3. eva2000

    eva2000 Administrator Staff Member

    54,368
    12,198
    113
    May 24, 2014
    Brisbane, Australia
    Ratings:
    +18,763
    Local Time:
    2:00 AM
    Nginx 1.27.x
    MariaDB 10.x/11.4+
    Love it when folks share their own journeys of discovery and back them up with benchmarks of the before and after :D (y) :cool:
     
  4. eva2000

    eva2000 Administrator Staff Member

    Be careful with that; it sometimes doesn't do what you expect, depending on the Linux kernel and CPU model used. Also, test and benchmark before and after :D
     
  5. eva2000

    eva2000 Administrator Staff Member

  6. rdan

    rdan Well-Known Member

    Any more info about this?
    An article or forum discussion?
    Thanks.
     
  7. eva2000

    eva2000 Administrator Staff Member

    From personal experience ;) :D CPU clock speeds might not behave as expected, resulting in lower performance when testing/benchmarking real loads.
     
  8. Matt Williams

    Matt Williams WordPress Fanatic

    537
    104
    43
    Nov 22, 2014
    Virginia, USA
    Ratings:
    +157
    Local Time:
    11:00 AM
    latest
    10

    WOW! Holy Shyt!! You have helped me with this! Damn! My server was only running at 1.2 GHz but now... Shew! The speed is unreal! Thank you! Thank you!
     
  9. pamamolf

    pamamolf Premium Member Premium Member

    4,077
    427
    83
    May 31, 2014
    Ratings:
    +833
    Local Time:
    6:00 PM
    Nginx-1.25.x
    MariaDB 10.3.x
    Is it only for AMD CPUs?

    I think the performance profile is also gone after rebooting...

    Code:
    cpupower frequency-set -g performance
    If I am not wrong, it is not persistent.
     
    Last edited: Apr 18, 2022
  10. deltahf

    deltahf Premium Member Premium Member

    Awesome, I'm so glad it helped you! I'm glad I made the thread now. :D

    You are correct, cpupower is not persistent.

    I'm using tuned-adm with the Intel processor on my newer server, and it is persistent.

    Use tuned-adm list to see the power profiles available on your system and show which one is active, then run tuned-adm profile profile-name-here to choose one of those profiles. I'm using "throughput-performance" and documented the improvements here.
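
Those steps, as a small sketch. It assumes tuned is installed and its service is enabled; `set_tuned_profile` is just an illustrative wrapper name, and `throughput-performance` is the profile mentioned in this post.

```shell
#!/bin/sh
# Switch to a tuned profile ($1) and show what is active afterwards.
set_tuned_profile() {
    profile="$1"
    tuned-adm list                  # show available profiles and the current one
    tuned-adm profile "$profile"    # apply the chosen profile
    tuned-adm active                # confirm which profile is now active
}
```

Run it as root, for example `set_tuned_profile throughput-performance`, and benchmark before and after, since the best profile can differ by kernel and CPU.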
     
  11. pamamolf

    pamamolf Premium Member Premium Member

    The opposite question now :)

    Does tuned-adm work with AMD?
     
  12. eva2000

    eva2000 Administrator Staff Member

    The hacky way is to add the command to /etc/rc.local, which runs whatever it contains on server reboot. Centmin Mod already adds a few entries to /etc/rc.local on initial installation :)
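
A rough sketch of that approach, written so it does not add the line twice. The helper name `persist_governor` and its optional file argument are made up for this example. On CentOS 7 and later, /etc/rc.d/rc.local also has to be executable (`chmod +x /etc/rc.d/rc.local`) for it to run at boot.

```shell
#!/bin/sh
# Append the governor command to an rc.local-style file unless already present.
# Optional $1: target file (defaults to /etc/rc.local).
persist_governor() {
    file="${1:-/etc/rc.local}"
    line='cpupower frequency-set -g performance'
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Calling `persist_governor` as root targets /etc/rc.local; passing a path, such as a scratch copy, lets you check the result first.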
     
  13. rdan

    rdan Well-Known Member

    This works for me, if I'm not mistaken.
    Code:
    systemctl enable cpupower
    systemctl start cpupower
     
  14. pamamolf

    pamamolf Premium Member Premium Member

    @rdan

    If you can restart and verify that it is persistent after using:

    Code:
    systemctl enable cpupower
    systemctl start cpupower
    please let us know....
     
  15. rdan

    rdan Well-Known Member

    Yes, I already tried it two weeks ago and again a few days ago.
    It retains the settings.

    But I cannot test it again today, maybe after several months :).
     
  16. rdan

    rdan Well-Known Member

    Yes, tested with 3 dedicated servers now.
    It retains the performance governor.
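
A likely reason this persists: on RHEL/CentOS, the cpupower.service unit shipped with the kernel-tools package runs cpupower at boot with the options from /etc/sysconfig/cpupower. That packaging detail is an assumption about your distro, so check the file on your own system. It typically looks something like this:

```shell
# /etc/sysconfig/cpupower (read by cpupower.service)
# Options passed to cpupower when the service starts
CPUPOWER_START_OPTS="frequency-set -g performance"
# Options passed to cpupower when the service stops
CPUPOWER_STOP_OPTS="frequency-set -g ondemand"
```

If the start options name a different governor, enabling the service would apply that one instead, so it is worth confirming with `cpupower frequency-info` after a reboot.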