
Sysadmin Is there a downtime monitoring service that calls you?

Discussion in 'System Administration' started by deltahf, Jan 6, 2017.

  1. deltahf

    deltahf Premium Member Premium Member

    587
    265
    63
    Jun 8, 2014
    Ratings:
    +489
    Local Time:
    4:57 AM
    I just had a bad morning... woke up to find that my server had been down for five and a half hours while I was sleeping (not sure why: Nginx process at 100% usage, not responding, no clues in the logs). :banghead: I use New Relic for monitoring, and sure enough I got the text message alert and app push notifications from them, but that was not sufficient to wake me up.


    I need a service which will actually do more to seriously alert me if my server is down for an extended period of time. The ideal solution would be something which keeps calling me every few minutes until I answer and acknowledge the alert.
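
    The "keep calling until I acknowledge" behaviour described above can be sketched as a small retry loop. This is only an illustration, not how any particular service works: `place_call` and `is_acknowledged` are hypothetical callbacks you would wire to a real calling service and a real acknowledgement check, and the five-minute interval and attempt cap are assumed defaults.

```python
import time
from typing import Callable


def escalate_until_acknowledged(
    place_call: Callable[[], None],        # hypothetical: triggers one phone call
    is_acknowledged: Callable[[], bool],   # hypothetical: True once someone has responded
    interval_seconds: float = 300.0,       # assumed: retry every five minutes
    max_attempts: int = 12,                # assumed: give up after an hour
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Place calls until the alert is acknowledged; return how many calls were made."""
    for attempt in range(1, max_attempts + 1):
        if is_acknowledged():
            return attempt - 1
        place_call()
        sleep(interval_seconds)
    return max_attempts
```

    Passing `sleep` in as a parameter keeps the loop easy to try out without actually waiting between attempts.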

    Does anything like that exist? I know there are a million different monitoring services out there, but they all seem to have the standard email + text message alerts. That's ok, but it doesn't do you much good if you're the only sysadmin and you're asleep.

    How do you guys deal with this issue?
     
  2. eva2000

    eva2000 Administrator Staff Member

    55,237
    12,253
    113
    May 24, 2014
    Brisbane, Australia
    Ratings:
    +18,833
    Local Time:
    6:57 PM
    Nginx 1.27.x
    MariaDB 10.x/11.4+
    Ouch, I wonder what caused Nginx to hit 100%. First check whether any disk partitions ran out of space. That's very common, e.g. when Nginx or PHP-FPM errors generate enough log data to fill a partition.

    But it's probably best to use a phone app, ringtone or alert app that is louder or repeats its alerts for longer. Think about it: what's the difference between a standard mobile app/email/SMS alert and a voice call? The duration of the alert, the frequency of repeats and the ringtone used :)

    I don't use these myself, but there's PagerDuty (see their FAQ).
    The Basic plan is $29-34/month for 25 global SMS/phone notifications per month.

    I believe Site24x7 also does voice calls.
    Its Standard and Business plans are $9/month and $35/month respectively, and it has a 30-day free trial.

    Another is SiteUptime, where SMS/call alerts are only available on the Standard and Pro plans ($10-20/month).

    Another is UptimePal: cheap at $1 per month per URL, with a 30-day free trial!
     
    Last edited: Jan 6, 2017
  3. deltahf

    deltahf Premium Member Premium Member

    I'm not sure what's going on, but this is the fourth time it's happened in the last month or so. I'm still running 1.11.5 though; I am going to upgrade to 1.11.8 tonight and hope that solves it. Partitions look good:
    Code (Text):
    $ df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/md2        219G   80G  129G  39% /
    tmpfs            16G     0   16G   0% /dev/shm
    /dev/md0        477M  143M  309M  32% /boot
    /dev/md4        902G  528G  328G  62% /drive2
    tmpfs            16G  104K   16G   1% /tmp

    Is there any other log I'm not aware of that I could check for clues?
    You're right, that would be ideal! Unfortunately, though, I don't think it's that simple, at least not on iOS. You can't customize the sounds or the lengths of sounds played by an app's notification, unless the app developer specifically implements the ability to do so. All of the downtime alerts that I have seen just send a standard notification using the default sounds.
    Thanks for these! UptimePal is exactly what I'm looking for. :)

    I am also testing my own solution with New Relic and IFTTT. New Relic can send a webhook request to any URL you want when your server goes down, and IFTTT can call your phone. So, using the IFTTT Maker service, if a request comes in to my IFTTT Maker trigger URL, it calls my phone with a custom message that I set. I have set a custom ringtone to sound like a warning alarm for the IFTTT number. I wish it would keep calling me until I acknowledge the problem in some way, but this is a good alternative for now. :ROFLMAO:
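
    For anyone wanting to test such a trigger by hand before pointing a monitoring webhook at it, here is a minimal sketch of the request involved. It assumes the usual IFTTT Maker/Webhooks URL shape (`/trigger/<event>/with/key/<key>`) and the optional `value1` JSON field; the event name `server_down` and the key are placeholders, not values from this thread.

```python
import json
import urllib.request

IFTTT_EVENT = "server_down"     # hypothetical event name chosen in the IFTTT applet
IFTTT_KEY = "YOUR_MAKER_KEY"    # placeholder: your personal Maker/Webhooks key


def build_ifttt_request(event: str, key: str, message: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that fires an IFTTT Maker trigger."""
    url = f"https://maker.ifttt.com/trigger/{event}/with/key/{key}"
    body = json.dumps({"value1": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_ifttt_request(IFTTT_EVENT, IFTTT_KEY, "Server is not responding")
    print(req.full_url)
    # To actually fire the trigger, uncomment the next line:
    # urllib.request.urlopen(req, timeout=10)
```

    The monitoring service sends the equivalent request itself; the script is only useful for checking that the applet and ringtone behave as expected.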
     
  4. eva2000

    eva2000 Administrator Staff Member

    To troubleshoot Nginx and PHP-FPM issues, you'd want to check the domain's vhost access.log and error.log, located in /home/nginx/domains/yourdomain.com/logs. You can see a full overview at centminmod.com/configfiles.html

    FAQ item 19 has more info on all relevant Centmin Mod log file locations and how to use the tail command to view a sample of the entries.
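
    As a rough scripted equivalent of running tail on those logs, the sketch below prints the last few lines of each file. The directory is the one mentioned above with the `yourdomain.com` placeholder left in place; the `tail` helper and the line count are illustrative assumptions, not part of Centmin Mod.

```python
from collections import deque
from pathlib import Path
from typing import List

# Placeholder path from the post above; replace yourdomain.com with the real vhost.
LOG_DIR = Path("/home/nginx/domains/yourdomain.com/logs")


def tail(path: Path, n: int = 20) -> List[str]:
    """Return the last n lines of a text file, similar to `tail -n`."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        # deque with maxlen keeps only the most recent n lines while reading.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]


if __name__ == "__main__":
    for name in ("access.log", "error.log"):
        log_file = LOG_DIR / name
        if log_file.exists():
            print(f"==> {log_file} <==")
            for line in tail(log_file, 20):
                print(line)
```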

    Ah Android user myself :D

    Haha creative :)
     
  5. deltahf

    deltahf Premium Member Premium Member

    I found another service — this one is free! — that will send requests to a web hook URL: https://uptimerobot.com

    Using this I should get at least two calls, one from New Relic and the other from Uptime Robot, if my server goes down.
     
  6. eva2000

    eva2000 Administrator Staff Member

    Ah, I use Uptime Robot too. Didn't know it had that feature heh
     
  7. deltahf

    deltahf Premium Member Premium Member

    Have you configured any special alerts to notify you of downtime while you are sleeping?

    Wait a minute... You just never sleep, do you? :eek::eek::eek:
     
  8. eva2000

    eva2000 Administrator Staff Member

    For my own sites, not really. I set up DNS failover, so if the forum is down, the site automatically switches to a backup maintenance server with an appropriate message, as in the Linode Fremont datacenter outage. So you'll know what's up when you see that maintenance site heh :)

    For centminmod.com rarely will it be down as it's operated on a geo latency based dns cluster of servers so there's numerous fallback to other region mirrors which eventually will expand to load balancing in each geo region too :)
     
  9. Matt

    Matt Well-Known Member

    932
    415
    63
    May 25, 2014
    Rotherham, UK
    Ratings:
    +671
    Local Time:
    9:57 AM
    1.5.15
    MariaDB 10.2
    I have all my external monitoring configured to push the alerts to Pushover, which is configured to bypass the "do not disturb" feature on my phone while I'm sleeping, and makes one hell of a racket. It's never failed to wake me up, as the High Priority alert tone is bloody loud.
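
    For reference, a minimal sketch of what such a Pushover alert looks like at the API level. It assumes Pushover's message endpoint and its `priority` field, where 1 is high priority and 2 is emergency priority (which repeats until acknowledged and requires `retry` and `expire`). The token and user key are placeholders, and the function name is made up for this example.

```python
import urllib.parse
import urllib.request

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_pushover_request(app_token: str, user_key: str, message: str,
                           priority: int = 1, retry: int = 60,
                           expire: int = 3600) -> urllib.request.Request:
    """Build (but do not send) a Pushover message request.

    priority=1 is high priority; priority=2 is emergency priority, which
    re-alerts every `retry` seconds until acknowledged or `expire` seconds pass.
    """
    fields = {
        "token": app_token,
        "user": user_key,
        "message": message,
        "priority": str(priority),
    }
    if priority == 2:
        # Pushover enforces a minimum retry of 30 s and a maximum expire of 10800 s.
        fields["retry"] = str(max(retry, 30))
        fields["expire"] = str(min(expire, 10800))
    data = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(PUSHOVER_URL, data=data, method="POST")


if __name__ == "__main__":
    req = build_pushover_request("APP_TOKEN", "USER_KEY", "Server down", priority=2)
    print(req.data.decode("ascii"))
    # To actually send the alert, uncomment the next line:
    # urllib.request.urlopen(req, timeout=10)
```

    Emergency priority is the closest match to the "keep alerting until I acknowledge" request at the top of the thread.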
     
  10. eva2000

    eva2000 Administrator Staff Member

    Indeed it does

    Android user as well. Should be the same for iPhone?
     
  11. Kintaro

    Kintaro Member

    106
    11
    18
    Dec 2, 2016
    Italy
    Ratings:
    +30
    Local Time:
    10:57 AM
    1.15.x
    MariaDB 10
    maybe you can try with IFTTT.
     
  12. RB1

    RB1 Active Member

    292
    75
    28
    Nov 11, 2016
    California
    Ratings:
    +122
    Local Time:
    1:57 AM
    Nginx 1.21.x
    MariaDB 10.1.x
    How many monitoring "agents" do you guys have installed on your servers?
    I have New Relic, NodeQuery, and Linode Longview but I can't help but think what this will do to performance. Is it not really something to worry about due to minimal system resources or is it not a good idea to have so many running?
     
  13. eva2000

    eva2000 Administrator Staff Member

    i have nodequery, nixstats, newrelic and linode longview :)

    no problems here :)

    each of those can report process resource usage so you can see for yourself
     
  14. BoostN

    BoostN Active Member

    134
    27
    28
    Aug 19, 2014
    Ratings:
    +42
    Local Time:
    3:57 AM
    1.13.6
    10.0.34
    I have Uptime Robot post into my Slack channel, which then notifies my phone. I will admit, it wouldn't wake me up. But I guess that's a chance I'm taking.

    I try to keep my slack channel my central location for alerts, stats, etc.

    I might try to set something up with Tasker to ring the phone if this happens. I'd guess ifttt.com probably has a solution as well.
     
  15. elargento

    elargento Member

    352
    17
    18
    Jan 4, 2016
    Ratings:
    +44
    Local Time:
    5:57 AM
    10
    Nice thread, love this community!

    How did you set up the other server to display a unique page for all websites?
     
  16. eva2000

    eva2000 Administrator Staff Member

    I edited the index.html file located in /usr/local/nginx/html, which belongs to the main hostname's virtual host (/usr/local/nginx/conf/conf.d/virtual.conf). That file usually just displays the main hostname's Centmin Mod placeholder index.html page.

     
  17. BamaStangGuy

    BamaStangGuy Active Member

    668
    192
    43
    May 25, 2014
    Ratings:
    +272
    Local Time:
    3:57 AM
    So I setup Pushover yesterday using Uptime Robot and I can attest to the notification sound being enough to wake up the dead. I almost rolled out of bed it scared the shit out of me. I used the Alien sound lol.
     
  18. eva2000

    eva2000 Administrator Staff Member

    :LOL: at least you weren't in public with folks around :)
     
  19. Sunka

    Sunka Well-Known Member

    1,150
    325
    83
    Oct 31, 2015
    Pula, Croatia
    Ratings:
    +525
    Local Time:
    10:57 AM
    Nginx 1.17.9
    MariaDB 10.3.22
    I use the nixstats Pushover alert too, besides Matt's script with its Pushover alert.
     
  20. elargento

    elargento Member

    A few days ago I got a WHM email saying the server was out of memory (which I think caused downtime). However, Uptime Robot never notified me of this. Is it possible I was never notified because PING was set instead of HTTP monitoring?
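
    On the PING versus HTTP question: a ping check only shows that the machine answers on the network, so a host whose web server has hung or been killed can plausibly still pass it. An HTTP check exercises the web server itself. A minimal sketch of such a check follows, where the function name and the 10-second timeout are assumptions for illustration and say nothing about how Uptime Robot implements its monitors.

```python
import http.client
import urllib.error
import urllib.request


def http_is_up(url: str, timeout: float = 10.0) -> bool:
    """Return True only if the URL answers with a successful HTTP response in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except urllib.error.HTTPError:
        # The server answered, but with a 4xx/5xx status: treat as down.
        return False
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, DNS failure, timeout, or a broken response.
        return False


# Example (hypothetical URL): http_is_up("https://yourdomain.com/")
```

    A hung Nginx process would typically show up here as a timeout, even while the host still replies to pings.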