Matomo on Centminmod

Discussion in 'Other Web Apps usage' started by deltahf, Jul 2, 2023.

  1. deltahf

    deltahf Premium Member

    576
    259
    63
    Jun 8, 2014
    Ratings:
    +475
    Local Time:
    7:31 AM
    With the official sunsetting of Google Analytics UA today, I have been exploring other options to use in addition to GA4.

    It looks like the best option for me will be the self-hosted version of Matomo running on Centminmod. I see there has been some discussion about Matomo here over the years, but now that UA is officially dead and gone, I thought there might be more people looking to run Matomo on Centminmod servers and wanted to create a thread here.

    Why Matomo?



    Matomo seems to be the most similar to UA, but it's also different. Specifically, it supports Custom Dimensions which I found to be extremely powerful in GA and are an essential feature for my next analytics software.

    I use them to sort traffic by page type, forums and sub-forums (you can use it to see which threads in which sub-forums get the most traffic), WordPress page categories, and WordPress authors. You can still do this in GA4, but the reporting system is so clunky it's not as useful.

    Another huge advantage for Matomo is the ability to import your old data from Google Analytics. I have been using GA from day one and have almost 18 years' worth of data stored in it that I do not want to lose. This would be a good way to save it for future reference.

    Server Requirements



    I would prefer to use Matomo Cloud so I don't have to worry about scalability or hosting. However, my site serves ~3M page views monthly, and with Matomo Cloud that would cost me nearly $1,000 USD per month, which I cannot afford.

    Fortunately, they offer a self-hosted version, with paid plugins available through annual subscriptions, which is more affordable.

    Matomo's recommended server specifications for the self-hosted option appear to be rather excessive.

    Of course, everyone here will be spoiled with Centminmod's high performance, but unless Matomo is doing a ton of data processing behind the scenes I cannot believe that it needs so much processing power for analytics software. Am I underestimating its demands here?

    For those of you already hosting Matomo, what are your server specs? How is performance?

    I have plenty of overhead and horsepower on my site's dedicated server to run Matomo, but I would prefer to run it on a separate machine. A lot of my site's pages are enhanced/protected by various caches and can sometimes receive flash crowds of traffic. If every one of those page views hits my analytics software, I would not want it to put any load on the primary site. I would rather the analytics server get overwhelmed and go down instead of the site itself.

    Your Thoughts?



    So, who here is running Matomo on Centminmod? Anyone have any issues or thoughts?

     
  2. eva2000

    eva2000 Administrator Staff Member

    51,742
    11,946
    113
    May 24, 2014
    Brisbane, Australia
    Ratings:
    +18,438
    Local Time:
    9:31 PM
    Nginx 1.25.x
    MariaDB 10.x
    Thanks @deltahf for reminding us! Matomo was originally called Piwik Analytics and I did have a Piwik instance on Centmin Mod LEMP stack once upon a time but no longer. See my write up at https://community.centminmod.com/threads/piwik-analytics-centmin-mod-nginx-vhost-configuration.4455/.

    Might be worth revisiting Matomo - like the idea of importing old GA data :)

    No idea about Matomo cloud though - sounds very expensive!

    From what I remember for Piwik, yes more traffic = more data = more cpu/mem/disk resources needed to process that data and create reports and stuff.
     
  3. eva2000

    @deltahf OVH currently have a lot of discount deals on dedicated servers - a cheap Matomo server maybe? The OVH Eco range seems nice: Eco dedicated servers for Aussies, and Eco dedicated servers for USA folks.

     
  4. eva2000

    Some info for you

    Different features of Matomo use different system resources (CPU, memory, and disk). Here is a general outline:

    1. Data Collection (CPU, Network, Disk): When visitors come to your site, Matomo collects data by executing JavaScript code in the visitor's browser, which sends data back to the Matomo server. On the server side, these requests are relatively light, but they can add up with high traffic, increasing CPU and network load. The collected data is then written to the database, using disk I/O.

    2. Archiving (CPU, Memory, Disk): Matomo processes the collected data in a step called archiving. This is where it calculates all the metrics and reports. This process is CPU and memory intensive, and can also generate significant disk I/O as data is read from and written to the database. The frequency of the archiving process can be configured. More frequent archiving allows more up-to-date reports, but uses more resources.

    3. Real-Time Reporting (CPU, Memory): Matomo's real-time reports show what's happening on your site right now. These reports are more demanding because they require Matomo to constantly process recent data. The more visitors your site has, the harder your server will have to work to keep these reports up-to-date.

    4. Segmentation (CPU, Memory): Segments are a way to view your reports based on specific criteria. For example, you could create a segment for visitors from a particular country, or visitors who used a particular keyword. When you view a segmented report, Matomo has to process the data on-the-fly, which can be CPU and memory intensive, especially for large data sets.

    5. Data Retention (Disk): Matomo stores all its data in a MySQL/MariaDB database. The more data you retain, the more disk space you'll need. Matomo has settings to automatically delete old data, which can help manage disk usage.
     
  5. deltahf

    Oh, GREAT idea! :D I didn't realize their prices were so low. I was a little worried about a VPS being underpowered, but one of these configurations would give me a lot more bang for the buck. I'm not used to shopping in the lower end of the dedicated server market, but something like this should be fine.

    Good stuff. I'm expecting it to be fairly memory and CPU intensive.

    I need to research whether there is any indication of how big the imported Google Analytics data would actually be. It could easily be hundreds of gigabytes or even terabytes of data...

    Here's what Phind (GPT-4) had to say. Looks like it could take a long time to import as you can only export roughly 4 months of data from Google at a time...

     
  6. eva2000

    Yeah, for you the important part might be disk size for the MySQL database if you want to keep data for a long time. The OVH servers' 450GB NVMe RAID 1 might not be enough? You can always just test it on dedicated servers that are billed hourly. CherryServers (affiliate link) offer hourly billed dedicated servers you could use to test disk usage for imports. Hourly billing costs more than monthly billing over a full month, but you wouldn't use the full month.

    i.e.
    • AMD EPYC 7313P 16C/32T
    • 64GB ECC Registered memory
    • 2x1TB NVMe raid 0 for 2TB storage (for testing only)
    • 100GB backup storage
    • 3Gbps uplink
    • 100TB bandwidth
    • Chicago
    • Spot Server 0.294 USD/hr
    • Hourly 0.612 USD/hr = US$14.688/day
    • Monthly 357.48 USD/mo
    other cheaper servers from CherryServers too
     
  7. deltahf

    Do you have an affiliate code for OVH too?
     
  8. eva2000

    OVH does not have an affiliate code as their prices sell themselves lol
     
  9. deltahf

    Found a lot of good info on this page about tuning Matomo performance:

    Configure Matomo for speed - Analytics Platform - Matomo

    I have around 1 billion page views in GA, so if those get imported that would be a MySQL database size of ~200GB with 10-12GB growth per year at current traffic levels.
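
    As a rough cross-check of those figures, here is a back-of-the-envelope sketch in shell arithmetic. It assumes storage scales linearly with page views, which is a simplification. The per-page-view cost is derived only from the 1 billion page views / ~200GB estimate above, the ~3M monthly page views is the figure from the first post, and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Back-of-the-envelope Matomo database sizing (assumes linear scaling).

imported_pageviews=1000000000   # ~1 billion page views stored in GA
imported_size_gb=200            # estimated MySQL size after import (~200GB)
monthly_pageviews=3000000       # ~3M page views per month (from the first post)

# Implied storage cost per page view, in bytes (decimal GB).
bytes_per_pageview=$(( imported_size_gb * 1000000000 / imported_pageviews ))

# Projected yearly growth in MB at current traffic.
yearly_growth_mb=$(( monthly_pageviews * 12 * bytes_per_pageview / 1000000 ))

echo "bytes per page view: ${bytes_per_pageview}"
echo "yearly growth: ${yearly_growth_mb} MB"
```

    That works out to about 200 bytes per page view and roughly 7GB per year, which is in the same ballpark as the 10-12GB estimate. Actual growth will also depend on how much is tracked per visit and on the size of the archived reports.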

    Now for the bad news, importing from GA can be tricky:

    It is also very slow due to Google Analytics API rate limiting, which cannot be bypassed, as discussed by Phind above:

    Running the Google Analytics import - Analytics Platform - Matomo

    In my case this means it could take several months to import data, and unless I'm willing to wait until that import completes, I can't start tracking new/current data in Matomo unless I start tracking it in a completely separate "profile"/"site". Wish I had started looking into this a while ago!
     
  10. eva2000

    I'd keep the imported GA analytics data separate from your live site data, it just makes things easier.
     
  11. wmtech

    wmtech Active Member

    165
    44
    28
    Jul 22, 2017
    Ratings:
    +125
    Local Time:
    1:31 PM
    We have been using Matomo for several years on 15 of our sites.

    It is installed using CMM on a medium-sized VPS (which is also several years old now) and runs without any problems.
     
  12. duderuud

    duderuud Premium Member

    196
    71
    28
    Dec 5, 2020
    The Netherlands
    Ratings:
    +150
    Local Time:
    1:31 PM
    Nginx 1.25.x
    MariaDB 10.6
    I have been running Matomo on a cheap Hetzner CPX21 for quite a while now for our big board.

    No real issues whatsoever. Only thing that happens from time to time is a directory permission issue after an upgrade but that is easily fixed. Very happy with Matomo!
     
  13. deltahf

    I have finally completed setup of my own dedicated Matomo server, configured with Centminmod 130.00beta01, running PHP 8.2 and MariaDB 10.6.

    Server details:

    OVH (BHS2 Datacenter / Canada)
    Intel Xeon E3-1245v2 - 4c/8t - 3.4 GHz/3.8 GHz
    32 GB 1333 MHz
    Dual 480GB SSDs in RAID1

    Setup was a bit of a hassle, but nothing too bad and I'm happy to have it up and running and have been pleased with Matomo so far. It is on par with my initial estimates of ~1GB new data created per month for my site, which gets around 600,000 monthly uniques. I have started the advanced Google UA import process as a separate "website" in Matomo using the Google Analytics Reporting API, but it will likely take a few months to download all of that data and I am still waiting for it to show up.

    I will share a few more details about my installation and configuration for anyone else who might be thinking of doing this.

    Grant File Privileges



    You need to set up the ability for Matomo’s database user to load files directly into MariaDB. In MariaDB as the root user, first run:

    Code (Text):
    mysql> grant file on *.* to 'matomo'@'%';
    mysql> FLUSH PRIVILEGES;
    


    Next, you need to edit /etc/my.cnf and add the following line to both the [mysql] and [mysqld] sections of the file (restart MariaDB afterwards so the [mysqld] change takes effect):

    Code (Text):
    local-infile=1
    


    Auto Archiving with Cron Jobs



    Even for moderately sized sites, you need to configure a cron job to run the archiver, for example at five minutes past every hour, as described in Matomo's auto-archiving documentation.
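
    A minimal sketch of such a cron entry, built around the console core:archive command from Matomo's documentation. The paths are assumptions based on a typical Centmin Mod layout (vhost under /home/nginx/domains/, PHP CLI at /usr/local/bin/php, PHP running as the nginx user), and the log file location is made up, so check them against your own server first.

```shell
#!/usr/bin/env bash
# Build an /etc/cron.d style entry that runs the Matomo archiver at
# five minutes past every hour. All paths below are assumptions.

MATOMO_DOMAIN="stats.mydomain.com"
MATOMO_ROOT="/home/nginx/domains/${MATOMO_DOMAIN}/public"
PHP_BIN="/usr/local/bin/php"
ARCHIVE_LOG="/home/nginx/domains/${MATOMO_DOMAIN}/log/matomo-archive.log"

# Fields: minute hour day-of-month month day-of-week user command
CRON_LINE="5 * * * * nginx ${PHP_BIN} ${MATOMO_ROOT}/console core:archive --url=https://${MATOMO_DOMAIN}/ > ${ARCHIVE_LOG} 2>&1"

# Print the entry for review; as root, redirect it into a file such as
# /etc/cron.d/matomo-archive once the paths have been checked.
echo "${CRON_LINE}"
```

    Running the same command once by hand as the nginx user is a quick way to confirm the paths before leaving it to cron.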

    Nginx Configuration



    I integrated the suggested Nginx configuration directives from this file into the existing file generated by Centminmod:

    https://github.com/matomo-org/matomo-nginx/blob/master/sites-available/matomo.conf

    This caused broken images in the Matomo admin panel. They were returning a 403 error, and the error logs showed they were being blocked by access rules but it was not immediately obvious where they were. After a long time of searching, I found it was actually related to this include, which was inserted/generated automatically by Centminmod:

    Code (Text):
    include /usr/local/nginx/conf/autoprotect/stats.mydomain.com/autoprotect-stats.mydomain.com.conf;
    


    It was blocking everything inside of a /plugins directory (a directory which Matomo uses and serves files directly out of) with a 403 error. Once I removed this line from the stats.mydomain.com vhost file, they started working as expected.

    Additional PHP-FPM Configuration



    Matomo needs to use shell_exec(). It is disabled by default in Centminmod and can be enabled by commenting out the following line in /usr/local/etc/php-fpm.conf:

    Code (Text):
    ;php_admin_value[disable_functions] = shell_exec
    


    Then restart Nginx and PHP-FPM.

    Additional Security Hardening with Cloudflare Zero Trust



    I am concerned about security with Matomo, so I went an extra step and secured my install with Cloudflare Zero Trust. This requires you to authenticate with an email (or any other authentication provider you configure in Cloudflare) before Cloudflare will allow any access. In my case, Matomo is running on a subdomain, stats.mydomain.com, so I put that entire subdomain behind Zero Trust.

    However, the public still needs to access matomo.js and matomo.php. This can be accomplished by creating a secondary Application in Zero Trust and assigning specific paths to those files for it:

    [Screenshot: Zero Trust Application with paths for matomo.js and matomo.php]

    Next, create a policy with a "Bypass" action and create rules to include "Everyone". These specific URL matches will override the more broad policy which applies to all of "stats.mydomain.com".

    Of course, the Matomo installation is also secured with Cloudflare Authenticated Origin Pulls but I don't think this is necessary for Zero Trust to work.

    Hope this helps someone.
     
  14. eva2000

    Very nice addition to the setup (y)

    Is that 1GB of data per month for live Matomo only, not including the Google Analytics import?

    Thanks for sharing your adventures for Matomo on Centmin Mod :cool:
     
  15. duderuud

    Nice writeup! I tried CMM for my Matomo setup but I gave up and installed Ubuntu. Will give this a try when I have some time left :)
     
  16. deltahf

    Correct, that's 1GB/month for new (live) data coming in each month.

    Matomo's performance is excellent with Centminmod. :)
     
  17. eva2000

  18. deltahf

    Cheers, yes, I am already using my own key. :)
     
  19. duderuud

    Still running my good old Ubuntu 20.04 server because I couldn't get Matomo to run correctly on CMM.

    But I wanted to try again. Still running into issues with the Matomo-specific details in the Nginx configuration.

    Can you maybe share your conf files for Matomo? Things break once I try to integrate this part:
    Code (Text):
     ## only allow accessing the following php files
        location ~ ^/(index|matomo|piwik|js/index|plugins/HeatmapSessionRecording/configs)\.php$ {
            include snippets/fastcgi-php.conf; # if your Nginx setup doesn't come with a default fastcgi-php config, you can fetch it from https://github.com/nginx/nginx/blob/master/conf/fastcgi.conf
            try_files $fastcgi_script_name =404; # protects against CVE-2019-11043. If this line is already included in your snippets/fastcgi-php.conf you can comment it here.
            fastcgi_param HTTP_PROXY ""; # prohibit httpoxy: https://httpoxy.org/
            fastcgi_pass unix:/var/run/php/php7.2-fpm.sock; #replace with the path to your PHP socket file
            #fastcgi_pass 127.0.0.1:9000; # uncomment if you are using PHP via TCP sockets (e.g. Docker container)
        }
    


    and all following snippets in the demo conf file.

    I know I can get the same results by using CF Zero Trust (like you mentioned), but the Matomo system check keeps nagging about access to certain PHP files that should be restricted.
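
    To see what that location block does, the regex can be tried against a few request paths outside Nginx. This is only a sketch: it uses grep -E as a stand-in for Nginx's PCRE matching (this particular pattern behaves the same in both), and is_allowed plus the sample paths are made up for illustration. Paths that do not match simply fall through to whichever other location blocks are in the vhost.

```shell
#!/usr/bin/env bash
# Allow-list regex copied from the Matomo Nginx location block.
allowed_php='^/(index|matomo|piwik|js/index|plugins/HeatmapSessionRecording/configs)\.php$'

# Return success when the request path would be handled by that location.
is_allowed() {
  printf '%s\n' "$1" | grep -Eq "$allowed_php"
}

for path in /index.php /matomo.php /js/index.php /console.php /config/config.ini.php; do
  if is_allowed "$path"; then
    echo "allowed:     $path"
  else
    echo "not matched: $path"
  fi
done
```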
     
  20. deltahf

    Sorry it took me a while to reply, December has been crazy busy but I didn't forget about your post here. :)

    Here is the relevant section of my Nginx conf file:

    Code (Text):
    # Matomo Stuff
      # https://github.com/matomo-org/matomo-nginx/blob/master/sites-available/matomo.conf
      add_header Referrer-Policy origin always; # make sure outgoing links don't show the URL to the Matomo instance
    
      ## only allow accessing the following php files
      location ~ ^/(index|matomo|piwik|js/index|plugins/HeatmapSessionRecording/configs)\.php$ {
        include /usr/local/nginx/conf/php.conf; # we have to include this again here inside of this location block
        try_files $fastcgi_script_name =404; # protects against CVE-2019-11043. If this line is already included in your snippets/fastcgi-php.conf you can comment it here.
        fastcgi_param HTTP_PROXY ""; # prohibit httpoxy: https://httpoxy.org/
      }
    
      # LOCATION DIRECTIVES
      location / {
        try_files $uri $uri/ =404;
    
        include /usr/local/nginx/conf/503include-only.conf;
    
        # Wordpress Permalinks example
        #try_files $uri $uri/ /index.php?q=$uri&$args;
    
        # Matomo $_SERVER variables
        fastcgi_param MM_COUNTRY_CODE $geoip2_data_country_code;
        fastcgi_param MM_CONTINENT_NAME $geoip2_data_continent_name;
        fastcgi_param MM_COUNTRY_NAME $geoip2_data_country_name;
        fastcgi_param MM_REGION_CODE $geoip2_data_region_iso;
        fastcgi_param MM_REGION_NAME $geoip2_data_region_name;
        fastcgi_param MM_LATITUDE $geoip2_data_location_latitude;
        fastcgi_param MM_LONGITUDE $geoip2_data_location_longitude;
        fastcgi_param MM_POSTAL_CODE $geoip2_data_postal_code;
        fastcgi_param MM_CITY_NAME $geoip2_data_city_name;
        fastcgi_param MM_ISP $geoip2_data_autonomous_system_organization;
        fastcgi_param MM_ORG $geoip2_data_autonomous_system_number;
      }
    
      ## disable all access to the following directories
      location ~ ^/(config|tmp|core|lang) {
        deny all;
        return 403; # replace with 404 to not show these directories exist
      }
    
      location ~ /\.ht {
        deny  all;
        return 403;
      }
    
      location ~ js/container_.*_preview\.js$ {
        expires off;
        add_header Cache-Control 'private, no-cache, no-store';
      }
    
      location ~ ^/(libs|vendor|misc|node_modules) {
        deny all;
        return 403;
      }
    
      location ~ \.(gif|ico|jpg|png|svg|js|css|htm|html|mp3|mp4|wav|ogg|avi|ttf|eot|woff|woff2)$ {
        allow all;
        ## Cache images,CSS,JS and webfonts for an hour
        ## Increasing the duration may improve the load-time, but may cause old files to show after a Matomo upgrade
        expires 1h;
        add_header Pragma public;
        add_header Cache-Control "public";
      }