Workaround for the vCenter Server appliance 5.1U1 update delay

The update process from 5.1.x to 5.1 Update 1 contains a serious flaw: the update may take more than 45 minutes, and some users report more than one hour. VMware even mentions this in its release notes:

Update of vCenter Server Appliance 5.1.x to vCenter Server Appliance 5.1 Update 1 halts at web UI while showing update status as installing updates*
When you attempt to upgrade vCenter Server Appliance 5.1.x to vCenter Server Appliance 5.1 Update 1, the update process halts for nearly an hour and the update status at Web UI shows as installing updates. However, eventually, the update completes successfully after an hour.

Workaround: None.
(http://www.vmware.com/support/vsphere5/doc/vsphere-vcenter-server-51u1-release-notes.html)

The generic update documentation, KB article 2031331 “Updating vCenter Server Appliance 5.x”, mentions even longer durations:

The update process can take approximately 90 to 120 minutes. Do not reboot until the update is complete.
(http://kb.vmware.com/kb/2031331)

Well, there is a workaround, even a very simple one:

  • log in to the appliance via SSH as root
  • execute “rm /usr/lib64/.lib*.hmac”
  • perform the update using the web UI

The update will take only a few minutes, in my case less than 10. The appliance needs to be rebooted and runs fine afterwards. Don’t worry about these files, they will be deleted during the update anyway.
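
A slightly more cautious variant of the first two steps is sketched below: it lists each file before removing it and says so if nothing was found. The function name remove_hmac_files and its optional directory argument are made up for this example; on the appliance the directory is /usr/lib64, as in the steps above.

```shell
#!/bin/sh
# Sketch only: list and remove the FIPS .hmac files before starting the update.
# remove_hmac_files is a hypothetical helper name; the directory defaults to
# /usr/lib64, the location used in the workaround.
remove_hmac_files() {
    dir="${1:-/usr/lib64}"
    found=0
    for f in "$dir"/.lib*.hmac; do
        # If the glob matched nothing, the literal pattern is returned; skip it.
        [ -e "$f" ] || continue
        ls -l "$f"
        rm -- "$f" || return 1
        found=1
    done
    [ "$found" -eq 1 ] || echo "no .hmac files found in $dir"
}

# Usage on the appliance (as root), before starting the update in the web UI:
#   remove_hmac_files
```

Passing a different directory is mainly useful for trying the function out on a scratch directory first.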

The .hmac files contain hashes of /usr/lib64/libcrypto.so.0.9.8 and /usr/lib64/libssl.so.0.9.8 used for FIPS compliance. When the corresponding packages are updated, these files are not deleted immediately:

-r-xr-xr-x 1 root root 1685176 Jul 10  2012 /usr/lib64/libcrypto.so.0.9.8
-r-xr-xr-x 1 root root  343040 Jul 10  2012 /usr/lib64/libssl.so.0.9.8
-rw-r--r-- 1 root root      65 Jan 11  2012 /usr/lib64/.libcrypto.so.0.9.8.hmac
-rw-r--r-- 1 root root      65 Jan 11  2012 /usr/lib64/.libssl.so.0.9.8.hmac

The mismatch between libraries (binaries) and hashes causes all applications using OpenSSL to fail with messages like

fips.c(154): OpenSSL internal error, assertion failed: FATAL FIPS SELFTEST FAILURE
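
A quick way to spot this situation is sketched below. It is only a timestamp heuristic and an assumption on my part: a .hmac file is flagged when the library next to it is newer, which is what the listing above shows. It does not recompute any hash, and the function name stale_hmac_files is made up for this example.

```shell
#!/bin/sh
# Sketch only: report .hmac files that are older than the library they belong to.
# This is a timestamp heuristic (an assumption), not the real FIPS self-test.
# stale_hmac_files is a hypothetical helper name.
stale_hmac_files() {
    dir="${1:-/usr/lib64}"
    for hmac in "$dir"/.lib*.hmac; do
        [ -e "$hmac" ] || continue
        # .libssl.so.0.9.8.hmac -> libssl.so.0.9.8
        name=$(basename "$hmac" .hmac)
        lib="$dir/${name#.}"
        [ -e "$lib" ] || continue
        # find prints the library only if it was modified after the hash file.
        if [ -n "$(find "$lib" -newer "$hmac")" ]; then
            echo "$hmac"
        fi
    done
}

# Usage on the appliance:
#   stale_hmac_files
```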

During the appliance update, vami-sfcb fails to start, which delays the whole update process until the maximum retry limit for this service is reached. If the appliance is rebooted before this timeout, the postinstall phase is never executed and vCenter will not start anymore, either because of the OpenSSL error above or because vpxd refuses to start with the error message

Database version id ‘510’ is incompatible with this release of VirtualCenter.

I was able to revive the appliance in my lab, but this is of course neither supported nor recommended. It runs fine again, but the state is not consistent, and I would recommend booting it just one more time to save the configuration and data and then migrating to a fresh installation. Depending on when the update was interrupted, your results may vary.

If the appliance itself does not start properly anymore, boot it from a Linux live CD (GParted or Parted Magic are sufficient), mount the filesystem and delete the .hmac files. Perform a normal boot afterwards.

If the web UI allows a normal update, do so, and you should be fine.

Otherwise try it manually (the following steps assume you’re familiar with Linux and you should check the prerequisites):

  • Log in to the appliance via SSH as root
  • cd /opt/vmware/var/lib/vami/update/data/job
  • cd to the latest subdirectory, which should have the highest number
  • Check if the update belongs to 5.1U1
    head manifest.xml
    You should see build 5.1.0.10000.
  • Attach the updaterepo ISO to the VM
  • mount /dev/sr0 /media/cdrom   (create the mount point if necessary)
  • cd /opt/vmware/var/lib/vami/update/data/package-pool
  • ln -s /media/cdrom/update/package-pool package-pool
  • cd back to the job subdirectory
  • ./pre_install '5.1.0.5300' '5.1.0.10000'
  • ./test_command   (may report “failed dependencies”)
  • cp -p run_command run_repair
  • vi run_repair and change the first command from “rpm -Uv” to “rpm -Uv --nodeps --replacepkgs”
  • ./run_repair   (ignore “insserv: script jexec is broken” etc)
  • Check if a duplicate vfabric-tc-server-standard package exists
    rpm -q vfabric-tc-server-standard
  • If yes (more than one line of output), delete the older version, otherwise /usr/lib/vmware-vpx/rpmpatches.sh will fail
    rpm -e vfabric-tc-server-standard-2.6.4-1   (in my case)
  • ./post_install '5.1.0.5300' '5.1.0.10000' 0
  • ./manifest_update
  • That’s it basically, now just the cleanup
    cd /opt/vmware/var/lib/vami/update/data
    rm -r job/*
    rm cache/*
    rm package-pool/package-pool
    umount /media/cdrom
  • reboot
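
Two of the steps above are easy to get wrong by hand: picking the latest job subdirectory and editing run_repair. The following sketch shows one way to script them. Both function names are made up for this example, and it assumes that run_command contains the literal text “rpm -Uv” as described. Note that the option rpm accepts is spelled --nodeps.

```shell
#!/bin/sh
# Sketch only: two helpers for the manual update steps. Both function names
# are hypothetical; the default path is the one used in the list of steps.

# Print the job subdirectory with the highest number.
latest_job_dir() {
    base="${1:-/opt/vmware/var/lib/vami/update/data/job}"
    latest=$(ls "$base" | grep -E '^[0-9]+$' | sort -n | tail -n 1)
    [ -n "$latest" ] || return 1
    echo "$base/$latest"
}

# Copy run_command to run_repair and extend only the first "rpm -Uv" call.
make_run_repair() {
    src="${1:-run_command}"
    dst="${2:-run_repair}"
    cp -p "$src" "$dst" || return 1   # keeps the file mode (executable bit)
    # Rewrite the first matching line only, then print every line unchanged.
    awk '!done && sub(/rpm -Uv/, "rpm -Uv --nodeps --replacepkgs") { done = 1 }
         { print }' "$src" > "$dst"
}

# Usage on the appliance:
#   cd "$(latest_job_dir)" && head manifest.xml
#   make_run_repair && ./run_repair
```

Check the resulting run_repair with a pager before executing it; the substitution is purely textual.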

Be aware that most likely old versions of some packages will still be installed. Again: this is not a stable state, just (hopefully) enough to save your data… good luck!
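
To get an idea of which packages are installed in more than one version, something like the following sketch may help. The function name duplicate_package_names is made up for this example; it simply prints every input line that occurs more than once. Including the architecture in the query keeps 32-bit and 64-bit variants of the same package apart.

```shell
#!/bin/sh
# Sketch only: print entries that appear more than once in a list of installed
# package names read from standard input.
# duplicate_package_names is a hypothetical helper name.
duplicate_package_names() {
    sort | uniq -d
}

# Usage on the appliance:
#   rpm -qa --qf '%{NAME}.%{ARCH}\n' | duplicate_package_names
```

Not every hit is a leftover from the interrupted update, so treat the output as a list of candidates to inspect with “rpm -q”, not as a list of packages to delete.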

16 thoughts on “Workaround for the vCenter Server appliance 5.1U1 update delay”

  1. So this works great, but for some reason the main web pane for administration (The vCenter Server tab) is just blank. It no longer has the ability to see or change any of the settings. The other panes seem to work fine, but this update still seems broken even with your fix.

    • Hi Jason! I assume you’re talking about the admin UI (port 5480), right? Haven’t experienced that issue. Are the buttons (Summary, Database, SSO etc) still there, or is the tab completely blank?

      • Totally blank, the other tabs work correctly but the main tab with all the subsections is 100% empty. As this is where most of the admin functionality is located it seems a serious issue. Another user had the same results on the VMware thread where you outlined this. I’m disappointed with VMware in how badly they implemented this, I hope they have a better update soon.

  2. You just saved me from my trouble. The SAN in my lab crashed and rebooted during the upgrade of my vCenter Appliance, which caused the update process to fail and left the appliance locked up in the state you described. Using your method to delete the hmac files at least allowed me to boot past the vami-sfcb service, get to the normal admin UI and navigate around without problems, so I can restart the update.

    I should have taken a snapshot before performing the update, but you saved me from trouble. Thank you!

  3. Thanks for the advice on VMware Communities; it saved my vCenter after the failed update, and everything has been running fine since.

    However, I have now tried to install the new update and get the error below. My vCenter is still running, but I cannot install updates. What is my best option now?

    Here is what I see in the VMware vCenter Server Appliance Update tab:

    Appliance Name: VMware vCenter Server Appliance
    Appliance Version: 5.1.0.10000 Build 1065184 ( Details… )

    VMware vCenter Server Appliance
    Update to version 5.1.0 Update 1

    Available Updates

    Appliance Version: 5.1.0.10200 Build 1235310 ( Details… )

    VMware vCenter Server Appliance
    Update to version 5.1.0 Update 1b

    Last Check: Thursday, August 22, 2013 5:00:04 AM GMT+02:00
    Last Install: “Failed to install updates(Error while running installation tests) on Thursday, August 22, 2013 2:12:29 AM GMT+02:00

    • Hi!
      Well, unless it is a lab or test environment, I would strongly recommend backing up (exporting) the config and deploying a new VCA instance. Fortunately your VCA is running again, so this would be the easiest and safest way.
      If it’s rather for the fun of it or to gain experience: check /opt/vmware/var/log/vami/updatecli.log for errors. You could also follow the manual steps I posted and check the output of the “pre_install” and “test_command” scripts. This should give you some hints, which you could post here.
      If you did not perform a full cleanup, which means removing older package versions, you may need to do that as well. This would require the analysis of an “rpm -qa” output and some scripting. Worked fine for me, was fun 😉 and interesting. But as I said above: the easiest, safest & fastest way is to deploy a new VCA instance.
      Cheers,
      Martin

      • Well, it is never bad to learn how to fix errors instead of reinstalling 🙂

        I have a snapshot and would take a backup of the database before trying anything new on my vcenter.

        I do see a few errors in the updatecli.log and will post them later here.

      • 21/08/2013 22:10:49 [INFO] Starting Install
        21/08/2013 22:10:49 [INFO] Update status: Starting Install
        21/08/2013 22:10:49 [INFO] Update status: Running pre-install scripts
        21/08/2013 22:10:49 [INFO] Running /opt/vmware/var/lib/vami/update/data/job/188/pre_install ‘5.1.0.10000’ ‘5.1.0.10200’
        Installing update from version 5.1.0.10000 to version 5.1.0.10200
        Shutting down vami-sfcbd: done.
        Starting vami-sfcbd: done.
        Checking vami-sfcbd status:done.
        Testing vami-sfcbd: done.
        Initializing vami-sfcbd: done.
        Stopping VMware vSphere Profile-Driven Storage Service…
        Stopped VMware vSphere Profile-Driven Storage Service.
        Stopping tomcat: success
        Stopping vmware-vpxd: success
        Shutting down ldap-server..done
        21/08/2013 22:11:43 [INFO] Update status: Done pre-install scripts
        21/08/2013 22:11:43 [INFO] Update status: Running installation tests
        21/08/2013 22:11:43 [INFO] Running /opt/vmware/var/lib/vami/update/data/job/188/test_command
        warning: /opt/vmware/var/lib/vami/update/data/package-pool/package-pool/vmware-tools-esx-kmods-default-9.0.5-1.sles11.x86_64.rpm: Header V3 RSA/SHA1 signature: NOKEY, key ID 66fd4949
        Preparing packages for installation…
        package coreutils-8.12-6.23.1.x86_64 is already installed
        package libopenssl0_9_8-0.9.8j-0.44.1.x86_64 is already installed
        package libxml2-2.7.6-0.19.1.x86_64 is already installed
        package vmware-tools-foundation-9.0.5-1.sles11.x86_64 is already installed
        package libpython2_6-1_0-2.6.8-0.15.1.x86_64 is already installed
        package device-mapper-1.02.63-18.27.1.x86_64 is already installed
        package libcurl4-7.19.7-1.20.21.1.x86_64 is already installed
        package bind-libs-9.6ESVR7P4-0.8.1.x86_64 is already installed
        package klogd-1.4.1-708.39.1.x86_64 is already installed
        package sudo-1.7.6p2-0.2.8.1.x86_64 is already installed
        package ipxe-1.0.0-1.927114.vmw.i686 is already installed
        package bind-utils-9.6ESVR7P4-0.8.1.x86_64 is already installed
        package lvm2-2.02.84-3.33.1.x86_64 is already installed
        package syslog-ng-2.0.9-27.34.36.1.x86_64 is already installed
        package vmware-tools-esx-kmods-default-9.0.5-1.sles11.x86_64 is already installed
        package curl-7.19.7-1.20.21.1.x86_64 is already installed
        package kpartx-0.4.9-0.66.1.x86_64 is already installed
        package python-xml-2.6.8-0.15.1.x86_64 is already installed
        package vmware-tools-vmxnet-common-9.0.5-1.sles11.x86_64 is already installed
        package limal-ca-mgm-1.5.23-0.3.2.x86_64 is already installed
        package ntp-4.2.4p8-1.18.1.x86_64 is already installed
        package openldap2-2.4.26-0.16.1.x86_64 is already installed
        package coreutils-lang-8.12-6.23.1.x86_64 is already installed
        package vmware-studio-vami-service-administration-5.1.0-996361.x86_64 is already installed
        package perl-doc-5.10.0-64.57.1.x86_64 is already installed
        file /usr/lib/vmware-tools/sbin/vmware-checkvm from install of vmware-tools-foundation-9.0.5-1.sles11.x86_64 conflicts with file from package vmware-tools-foundation-9.0.5-1.sles11.x86_64
        file /usr/lib/vmware-tools/moduleScripts/vmxnet/vmware-config.pl from install of vmware-tools-vmxnet-common-9.0.5-1.sles11.x86_64 conflicts with file from package vmware-tools-vmxnet-common-9.0.5-1.sles11.x86_64
        file /usr/lib/vmware-tools/moduleScripts/vmxnet/vmware-db.pl from install of vmware-tools-vmxnet-common-9.0.5-1.sles11.x86_64 conflicts with file from package vmware-tools-vmxnet-common-9.0.5-1.sles11.x86_64
        file /usr/lib/vmware-tools/moduleScripts/vmxnet/vmware-deconfig.pl from install of vmware-tools-vmxnet-common-9.0.5-1.sles11.x86_64 conflicts with file from package vmware-tools-vmxnet-common-9.0.5-1.sles11.x86_64
        file /usr/lib/vmware-tools/moduleScripts/vmxnet/vmware-pmwv.pl from install of vmware-tools-vmxnet-common-9.0.5-1.sles11.x86_64 conflicts with file from package vmware-tools-vmxnet-common-9.0.5-1.sles11.x86_64
        21/08/2013 22:11:43 [ERROR] Failed with exit code 40960
        21/08/2013 22:11:43 [INFO] Update status: Running post-install scripts
        21/08/2013 22:11:43 [INFO] Running /opt/vmware/var/lib/vami/update/data/job/188/post_install ‘5.1.0.10000’ ‘5.1.0.10200’ 2
        Update web ui help URL
        Configuring lighthttpd
        Trying to patch /etc/openldap/schema/core.schema. Ignore the errors if the openldap2 rpm was not updated
        patching file /etc/openldap/schema/core.schema
        Hunk #1 FAILED at 119.
        Hunk #2 FAILED at 398.
        2 out of 2 hunks FAILED — saving rejects to file /etc/openldap/schema/core.schema.rej
        Ignore the patch errors above if the openldap2 rpm was not updated
        Customize grub menu.lst
        Setting global java options
        Fixing likewise configuration
        SUCCESS
        SUCCESS
        rm: cannot remove `/etc/pam.d/*lwidentity*’: No such file or directory
        rm: cannot remove `/etc/nsswitch*lwidentity*’: No such file or directory
        Patch slow vami cd mounting
        patching file vami_ovf_process
        Hunk #1 FAILED at 249.
        1 out of 1 hunk FAILED — saving rejects to file –
        Waiting for the embedded database to start up: .success
        Verifying EULA acceptance: success
        Executing pre-startup scripts…
        Updating VPXD record in the LS.
        Intializing registration provider…
        Getting SSL certificates for https://xxxxx:7444/lookupservice/sdk
        Getting SSL certificates for https://xxxxx:7444/sso-adminserver/sdk
        Getting SSL certificates for https://xxxxx:7444/ims/STSService
        Service with name ‘vpxd-vcenter.xxxxx-4be6bdd2-f945-47db-8f9e-9eb5487995e9’ and ID ‘local:5’ was updated.
        Return code is: Success
        Starting ldap-server..done
        Starting vmware-vpxd: success
        Waiting for vpxd to initialize: .success
        Starting tomcat: success
        Executing startup scripts…
        Autodeploy service is disabled, skipping registration.
        Starting VMware vSphere Profile-Driven Storage Service…Waiting for VMware vSphere Profile-Driven Storage Service……
        VMware vSphere Profile-Driven Storage Service started.

        Failed with status of 2 while installing version 5.1.0.10200
        VM version is still 5.1.0.10000
        21/08/2013 22:12:20 [INFO] Update status: Done post-install scripts
        21/08/2013 22:12:20 [INFO] Update status: Running VMware tools reconfiguration
        21/08/2013 22:12:20 [INFO] Running /opt/vmware/share/vami/vami_reconfigure_tools
        VMware tools is not installed on this VM.
        21/08/2013 22:12:20 [INFO] Update status: Done VMware tools reconfiguration
        21/08/2013 22:12:20 [INFO] Update status: Error while running installation tests
        21/08/2013 22:12:20 [ERROR] Failure: updatecli exiting abnormally
        21/08/2013 22:12:20 [INFO] Install Finished
        Version – 5.1.0.10000 Build 1065184
        Description –
        VMware vCenter Server Appliance
        Update to version 5.1.0 Update 1

  4. Yeah, just as expected, there’s a package conflict left. Check /opt/vmware/var/lib/vami/update/data/job/188/test_command and run_command for any calls to rpm, and add the parameters “--replacepkgs --replacefiles”. This means you have to do the update manually.
    A second way, which I never tested myself, would be to manually install the packages that cause the “conflict” errors beforehand. Mount the updaterepo ISO, locate the RPMs, and run “rpm -Uvh --nodeps --replacepkgs --replacefiles [rpm file]”. The next time you try the update you should see messages like “already installed” (which is fine), but no conflicts anymore. Hopefully. 😉
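
    To see which packages are affected, the names can be pulled out of the update log. The sketch below assumes the conflict lines look exactly like the ones in the log posted above; the function name conflicting_packages is made up for this example.

```shell
#!/bin/sh
# Sketch only: list the packages named in rpm file-conflict lines of the
# update log. conflicting_packages is a hypothetical helper name; the default
# log path is the updatecli.log mentioned earlier in this thread.
conflicting_packages() {
    log="${1:-/opt/vmware/var/log/vami/updatecli.log}"
    # Keep only the package name that follows "from install of".
    sed -n 's/.* from install of \([^ ]*\) conflicts with file from package .*/\1/p' "$log" |
        sort -u
}

# Usage on the appliance:
#   conflicting_packages
```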

  5. So would your suggestion be something like this ?

    1. backup the database
    2. change test_command and run_command to add the parameters “--replacepkgs --replacefiles”
    3. execute test_command
    4. execute run_command
    5. reboot
    6. try the normal update again

    • Sorry for the long delay, was busy…
      Yep, that would be the idea. If everything goes well, you just keep the VCA, otherwise you deploy a new one and restore the backup.
      I’ve seen the RPM list, but I don’t think it would be useful as a comment here for other readers.
      Let me know if the repair process worked for you meanwhile.

      • I wanted to reply and say that this repair process, while a bit different, did work for me.

        I’d been trying to upgrade a VCSA 5.5.0.20100 install to 2e (20500) with no success. (I’d been receiving the “Error while running installation tests” message.)

        I followed the instructions here, adapting the build numbers for the version of 5.5 I was running. One other minor change: the option is spelled “--nodeps”, not “--no-deps”, in the version of RPM included.

        I proceeded with the package upgrade, and rebooted the appliance, after which, I was able to apply the upgrade with no issues.

        I’ll be monitoring it over the next week while I keep a snapshot on file. So far though, everything looks green. Thanks for the share!

  6. Hi,
    I know this is an old post. Your workaround worked for the update itself, but it left me with this error:
    Error: Incompatible DB schema version.
    But even after your fix I still get the same error.
    vCenter server is now down.
    Any ideas?
    Thanks

    5.1 to 5.1 update 3
    Currently shows:VMware vCenter Server Appliance
    Update to version 5.1.0 Update 3b
