
Amazon AWS Problems restoring from image in AWS

Discussion in 'Virtual Private Server (VPS) hosting' started by fly, Aug 14, 2020.

  1. fly

    fly Premium Member Premium Member

    30
    5
    8
    Jul 27, 2019
    Ratings:
    +9
    Local Time:
    5:41 AM
    Please fill in any relevant information that applies to you:
    • CentOS Version: CentOS 7 64bit
    • Centmin Mod Version Installed: 123.09beta01
    So I have an odd issue doing an AWS image restore from a snapshot. I am unable to connect to any ports on the server, even though it is clearly up and running. I've tried SSH/HTTP from over the internet, as well as another instance on the same local network. The system log from the EC2 console looks good (I see all the services coming up), and I can see the login prompt when I grab a screenshot of the instance.

    Thoughts?
     
  2. eva2000

    eva2000 Administrator Staff Member

    45,151
    10,272
    113
    May 24, 2014
    Brisbane, Australia
    Ratings:
    +15,919
    Local Time:
    7:41 PM
    Nginx 1.17.x
    MariaDB 5.5/10.x
    What type of image are you restoring from, and how was the image created? Did you check whether CSF Firewall is running? CSF Firewall is responsible for whitelisting ports for access, provided of course that your AWS EC2 firewall has the appropriate ports open too.
     
  3. fly

    Restoring from an AWS snapshot. Firewall is running, and working, on the live instance. EC2 security groups are also correct (however I even tried to SSH from another instance on the same network).

    Also, after seeing this didn't work on a client site, I checked from my own infrastructure and got the same result.
     
  4. eva2000

    Shame AWS EC2 doesn't have out of band console/kvm access.

    Has AWS image restore ever worked for Centmin Mod servers? Or is this the first time it hasn't worked?

    I see you tried instance console access. What about the troubleshooting docs? As I stated before, Centmin Mod hasn't really been tested on AWS EC2 or on Google Cloud, so there are probably a few gotchas somewhere.
    Also, in your SSH client try enabling verbose/debug logging to see what happens when you try to connect.
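
    As a rough sketch of that check: the helper below tests whether a TCP port accepts connections at all before you bother with SSH debug output. It assumes bash and coreutils `timeout` exist on the machine you connect from. The name `probe_port` and the address in the comments are placeholders for illustration only.

```shell
# Hypothetical helper: report whether a TCP port accepts connections.
# Uses bash's /dev/tcp redirection, so neither telnet nor nc is needed.
probe_port() (
    host="$1"
    port="$2"
    wait_secs="${3:-5}"
    if timeout "$wait_secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "open"
    else
        echo "closed"   # refused, filtered, or timed out
    fi
)

# Example usage (203.0.113.10 is a placeholder address):
#   probe_port 203.0.113.10 22
#   ssh -vvv -o ConnectTimeout=10 root@203.0.113.10
```

    If the port reports closed, `ssh -vvv` will most likely just hang at the connect step, which points at the network or firewall rather than at sshd.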
     
    Last edited: Aug 14, 2020
  5. fly

    I've done restores to countless servers before, but this is my first time with Centmin Mod. As you first suggested, it seems likely that CSF doesn't like something. Generally, how does CSF respond when a new NIC is added (since that is sorta what is happening)?

    And nothing is happening in SSH. Telnet doesn't even work.
     
  6. eva2000

    Should work out of the box on most network setups unless it's a NAT-based, private-IP-only VPS with no public IP address. CSF Firewall's NIC/network configuration is done at CSF Firewall install time.

    From the /etc/csf/csf.conf config file:
    Code (Text):
    grep -C5 -i eth /etc/csf/csf.conf 
    
    ###############################################################################
    # SECTION:General Settings
    ###############################################################################
    # By default, csf will auto-configure iptables to filter all traffic except on
    # the loopback device. If you only want iptables rules applied to a specific
    # NIC, then list it here (e.g. eth1, or eth+)
    ETH_DEVICE = ""
    # By adding a device to this option, ip6tables can be configured only on the
    # specified device. Otherwise, ETH_DEVICE and then the default setting will be
    # used
    ETH6_DEVICE = ""
    # If you don't want iptables rules applied to specific NICs, then list them in
    # a comma separated list (e.g "eth1,eth2")
    ETH_DEVICE_SKIP = ""
    

    Generally you don't need to configure these.
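
    If you want to confirm those three options are still at their defaults on the server being imaged, something like the following might do. The function name `csf_eth_settings` is made up for this example; the default path is the csf.conf location quoted above.

```shell
# Hypothetical helper: print the active (uncommented) ETH_DEVICE,
# ETH6_DEVICE and ETH_DEVICE_SKIP lines from a csf.conf file.
csf_eth_settings() (
    conf="${1:-/etc/csf/csf.conf}"
    grep -E '^ETH6?_DEVICE(_SKIP)? *=' "$conf"
)

# Example usage, alongside the interface names the kernel currently sees:
#   csf_eth_settings
#   ls /sys/class/net
```

    Empty quotes on all three lines mean CSF is not pinned to a particular NIC name.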
     
  7. fly

    Well then, that can't be it! Can you think of anything else that might prevent even an instance on the same network from telnetting to any port?

    edit: Just as a test, I created a base CentOS instance > snapshot > image > restore and it worked.

    edit2: As another test, I updated the base CentOS image and rebooted. Then did the above steps. It still works.
     
    Last edited: Aug 15, 2020
  8. fly

    To further the test, I took a brand new base CentOS, updated, and installed Centmin Mod. Then did the same snapshot > image > instance test. I was unable to connect to the restored instance.

    So to recap, I can restore a CentOS instance, but as soon as I install Centmin Mod, the restore fails (unable to telnet any port).
     
  9. eva2000

    Unfortunately, as I said AWS EC2 hasn't been tested by me.

    Maybe first figure out if you have outbound network connectivity on image restore by setting up the Centmin Mod installed image to send you an email via postfix from within the AWS EC2 CentOS image on server reboot.

    If the test confirms that you do have outbound network connectivity on image restore, then you can write a script to collect system info, network, iptables, csf firewall status and save to a text file on reboot and send all that info to you via postfix with the saved text file as a mail attachment.
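
    A minimal sketch of such a script follows. The report path, the recipient address and the exact list of commands are illustrative assumptions, not something Centmin Mod ships. It also assumes the CentOS 7 `mailx` package, where `mail -a` attaches a file.

```shell
#!/bin/sh
# Sketch of a boot-time report; REPORT and the command list are assumptions.
REPORT="${REPORT:-/root/boot-report.txt}"

# Append a heading plus the output of a command, if that command exists.
section() {
    title="$1"
    shift
    {
        echo "===== $title ====="
        if command -v "$1" >/dev/null 2>&1; then
            "$@" 2>&1
        else
            echo "($1 not found)"
        fi
        echo
    } >> "$REPORT"
}

# Start a fresh report and collect network and firewall state.
collect_report() {
    : > "$REPORT"
    section "date" date
    section "interfaces" ip addr show
    section "routes" ip route show
    section "network service" systemctl status network.service --no-pager
    section "iptables rules" iptables -S
    section "csf rules" csf -l
}

# Example usage: save as /root/boot-report.sh, end the file with
#   collect_report
#   echo "see attached" | mail -s "boot report" -a "$REPORT" you@example.com
# and add a root crontab entry such as
#   @reboot sleep 60 && sh /root/boot-report.sh
```

    The `sleep 60` is only a guess at how long postfix and the network need to come up; adjust to taste.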
     
  10. fly

    Good idea on the email! The bad news is that I set it up and got no email, so the instance appears totally cut off. Any ideas on what could cause that?
     
  11. eva2000

    A few common causes would be services that are installed/enabled but fail to start on image restore and/or block CSF Firewall from starting up. Or, if you installed Centmin Mod 123.09beta01 on EC2 as a sudo user, it may not have properly installed all Centmin Mod software, because it has limited sudo support and expects a full root user, not a sudo user. So on Amazon EC2 instances that use sudo, you may need to install like this:
    Code (Text):
    sudo yum -y update; sudo curl -O https://centminmod.com/betainstaller73.sh && sudo chmod 0700 betainstaller73.sh && sudo bash betainstaller73.sh
    

    Even then, there are no guarantees, as I do not test on Amazon EC2 or Google Cloud.

    Does network connectivity on a Centmin Mod install on Amazon EC2 survive a server reboot? What about a shutdown and then a start-up?
     
  12. fly

    Centmin Mod has worked flawlessly for me for at least a couple of years on AWS, until I noticed this restore issue.

    Also, I installed it via root user, so no sudo issues. I don't see any obvious errors about services not starting in the system log either.
     
  13. fly

    As another note, I enabled flow logs for the network interface and as expected I can see my SSH requests flowing into the NIC, but the instance just isn't replying.

    Also, I just noticed a grammatical issue. When you log in it says
    Technically, that should say 'may be' instead of maybe. I hope you don't mind me mentioning this.
     
  14. eva2000

    Cheers on grammar issue :)

    Is the network interface's name on the restored image the same as on the original VPS server?
     
  15. fly

    Although I can't log into the instance, I assume so. I checked a few of my CentOS instances and they all were ens5.

    Later today, I can build then restore a default CentOS instance to confirm it's the same.
     
  16. fly

    I was able to restore a default CentOS image and the NIC was still ens5, so consider it confirmed. It does not change when restoring an image.
     
  17. eva2000

    Not sure why, but I did some EC2 tests of AWS Backup and Restore with Centmin Mod 123.09beta01, and the issue might be unique to CentOS 7.8 AMI images on Amazon EC2 instances: the restored image needs cloud-init installed, and it needs its network adapter's hardware MAC address reset.

    So you need to do the following BEFORE making an AWS Backup initiated EC2 instance backup of a server with Centmin Mod 123.09beta01 installed. It removes the reference to the original image's network adapter hardware MAC address, so that a restored AWS Backup image sets up networking again on the new EC2 instance, which has a different hardware MAC address.
    Code (Text):
    # Remove MAC address and UUID pins so the restored instance can use its new NIC.
    # (Adjust the glob if your interface is not named ethN.)
    sudo sed -i '/HWADDR/d' /etc/sysconfig/network-scripts/ifcfg-eth*
    sudo sed -i '/UUID/d' /etc/sysconfig/network-scripts/ifcfg-eth*
    sudo rm -f /etc/udev/rules.d/70-persistent-net.rules
    # Install cloud-init, bypassing the versionlock/priorities plugins and yum excludes.
    sudo yum -y install cloud-init --disableplugin=versionlock,priorities --disableexcludes=main
    # Reinstall urllib3; sudo must apply to pip itself, not to `yes`.
    sudo pip uninstall -y urllib3
    sudo pip install --upgrade urllib3
    # Disable then re-enable the cloud-init units to recreate their boot-time symlinks.
    sudo systemctl disable cloud-{init-local,init,config,final}.service
    sudo systemctl enable cloud-{init-local,init,config,final}.service
    

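    Before taking the backup, a quick check along these lines can show whether any interface config file still pins a MAC address or UUID. This is only a sketch: `find_pinned_ifcfg` is a made-up name, and the default directory is the standard CentOS 7 network-scripts location.

```shell
# Hypothetical check: list ifcfg files that still contain a HWADDR or UUID line.
# The directory argument defaults to the CentOS 7 location.
find_pinned_ifcfg() (
    cfg_dir="${1:-/etc/sysconfig/network-scripts}"
    # -l prints only the names of files with at least one matching line.
    grep -lE '^(HWADDR|UUID)=' "$cfg_dir"/ifcfg-* 2>/dev/null
)

# Example usage; no output means nothing is pinned:
#   find_pinned_ifcfg
# The live MAC addresses can be compared with:
#   cat /sys/class/net/*/address
```
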
    The clue is in the EC2 instance's system log, where the network service fails to start. That's why the instance doesn't respond: the network hasn't started.
    Code (Text):
    [FAILED] Failed to start LSB: Bring up/down networking.
    See 'systemctl status network.service' for details.
    [  OK  ] Reached target Network.
    

    On a working AWS Backup restored EC2 instance image, the system log shows the network service and cloud-init services started:
    Code (Text):
    [   14.054816] cloud-init[585]: Cloud-init v. 18.5 running 'init-local' at Mon, 24 Aug 2020 16:57:19 +0000. Up 14.02 seconds.
            Starting Hostname Service...
    [  OK  ] Started Hostname Service.
    [  OK  ] Started Initial cloud-init job (pre-networking).
    [  OK  ] Reached target Network (Pre).
            Starting LSB: Bring up/down networking...
    [  OK  ] Started LSB: Bring up/down networking.
            Starting Initial cloud-init job (metadata service crawler)...
    [  OK  ] Reached target Network.
    

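    Since you can't log in to a broken restore, the system log can also be pulled with the AWS CLI and filtered for the relevant lines. A rough sketch, assuming the AWS CLI is configured; `scan_console_log` is a made-up helper name and the instance ID is a placeholder.

```shell
# Hypothetical filter: strip ANSI colour codes from a console log on stdin
# and keep only lines mentioning failures, cloud-init, or networking.
scan_console_log() (
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;]*m//g" | grep -E 'FAILED|cloud-init|networking'
)

# Example usage (i-0123456789abcdef0 is a placeholder instance ID):
#   aws ec2 get-console-output --instance-id i-0123456789abcdef0 --output text | scan_console_log
```
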
    If you like Ways To Support Centmin Mod ;) :)
     
  18. fly

    Interesting!

    I saw the LSB error in the system log, but figured it was a red herring since everything else worked in the standard CentOS AMI (I see that error in several working CentOS instances). Also, I saw that cloud-init didn't seem to work on the restored Centmin image, but it seemed to work fine in the restored CentOS images.

    Either way, I'll try this! Is it a safe assumption that these commands only need to be run once?
     
  19. eva2000

    That could be related to Python urllib3, hence the extra commands.
    Should be, for the original Centmin Mod installed image, but you can test it to see.
     
  20. fly

    This didn't initially work, but then I noticed that there were Centmin Mod updates as well. Once I ran those, everything worked! And I created some images of images as well, and all those seemed to work.

    That said, I still see the 'failed to start LSB' message in the system log. Anything else I might be missing?