vSphere 6.0 has been around for about a year now, but VMware's largest customers are usually one or even two versions behind. With the recent release of Update 2 it looks like the 6.0 version has gained the stability and maturity that enterprise customers are waiting for. This is the reason why I just did extensive testing of vSphere 5.5 to 6.0 upgrades in the lab.
The main challenge of such an upgrade is to transform your vCenter Single Sign-On (SSO) setup into a topology that is fully supported and not deprecated. With vSphere 6.0 the SSO component is part of the new Platform Services Controller (PSC) role, which can be separated from the remaining vCenter services. In fact, VMware recommends doing this whenever you want two or more vCenter servers in the same SSO domain, which is a prerequisite for the new Enhanced Linked Mode. Separating SSO was already possible with vSphere 5.5 (although only with the Windows version of vCenter, not with the VCSA 5.5), but I think most people wanted to keep it simple and installed all vCenter services on the same server. If they have multiple vCenter servers installed this way, they now need to switch to one or more external PSCs.
There are many ways, and many possible orders, in which you can both upgrade all components to 6.0 and switch to an external PSC/SSO model. But only a few of them are documented and supported by VMware. Their general recommendation is to transform into a supported topology first and then upgrade the PSCs and vCenter servers. KB2130433, for example, describes how to upgrade/migrate two vCenter 5.5 servers with embedded SSO into the same SSO domain. This and other migration scenarios involve re-pointing your vCenter 5.5 server to a newly installed external SSO 5.5 instance.
So when preparing the upgrade of a complex vSphere 5.5 environment to 6.0 you will sooner or later stumble over KB2033620 which describes how to do this re-pointing. Unfortunately this KB article and the tools that it refers you to are very poorly written and full of issues. Some of them are mentioned in the KB article itself with workarounds to follow, but a lot are not ... Here is a list of the most annoying issues with KB2033620 and how to fix them.
Snapshots to the rescue ...
In the beginning of the KB article you are advised to take a snapshot (or backup) of all involved vCenter server VMs. Do yourself a favor and follow this advice! Some of the re-pointing steps can go wrong in ways that make it very hard to return to a consistent state - unless you have a snapshot that you can just revert to!
Step 1: Remove the Inventory Service account
As a note in the KB article states, this is only necessary if you re-point the Inventory Service to the same SSO instance that it was registered to before (for example, because it was restored from a backup). In scenarios that migrate to 6.0 you usually do not need to do this, because you re-point to a newly installed external SSO instance.
Nevertheless this step works as described. No issues here.
Step 2: Re-register vCenter Inventory Service with vCenter Single Sign-On
This step uses the is-change-sso.bat script, which has only one minor issue: it is supposed to restart the Inventory Service, but fails to do so. As advised in the KB article, do this manually by running net stop vimQueryService followed by net start vimQueryService.
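If you script the re-pointing, the manual restart can be wrapped in a small helper. The following Python sketch only issues the two net commands quoted above. The function names and the injectable run parameter are illustrative and not part of the KB tools, and the sketch assumes it is run from an elevated prompt on the vCenter server.

```python
import subprocess


def restart_commands(service):
    """Build the stop/start command pair for a Windows service name."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service, run=subprocess.check_call):
    """Run 'net stop' and then 'net start' for the given service.

    'run' is injectable so the sequence can be checked without touching
    a real service; the default raises CalledProcessError if a command
    exits with a non-zero status.
    """
    for command in restart_commands(service):
        run(command)
```

On the vCenter server, restart_service("vimQueryService") would then run both commands in order and stop at the first one that fails.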
Step 3: Register vCenter Server with a different vCenter Single Sign-On instance
This is where most of the issues occur. Some of them are already mentioned in the KB article, but not all: First, the article tells you that you need to add the --openssl-path parameter to the repoint.cmd command to specify where the openssl executable and DLLs are located. This just does not work, no matter what path you choose and how you specify it!
To work around this, copy the openssl.exe file and the three DLLs that it needs from the directory "C:\Program Files\VMware\Infrastructure\Inventory Service\bin" to the directory where you unpacked the sso_svccfg.zip file. It should then look like this:
[Screenshot: sso_svccfg.zip and openssl binaries in C:\TEMP]
If you are re-pointing to a freshly installed vCenter 5.5 SSO server that uses the default self-signed certificates then you will most likely run into an "InternalError / 254":
[Screenshot: InternalError / 254 when re-pointing the vCenter service]
The utility pulls the CA certificate used by the SSO server and tries to save it into a local file that it names after the subject field of the certificate. Unfortunately the file name that it comes up with includes backslash characters (\). But in Windows the backslash cannot be used in file names, because it is used as the path separator. Basically this means that the tool tries to create a file named "=local" in a directory that just does not exist, and in our example this directory is named "C:\ProgramData\VMware\SSL\C=US,CN=CA\, CN\=SSO01\, dc\=vsphere\,dc"
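Now that we know this, it is easy to fix. You just need to create the expected directory structure by running mkdir with the path shown in the example above (from C:\ up to, but not including, the last backslash). Put the path in double quotes because it contains spaces. For our example this is: mkdir "C:\ProgramData\VMware\SSL\C=US,CN=CA\, CN\=SSO01\, dc\=vsphere\,dc"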
You can then just re-run the repoint.cmd script exactly as before, and it will succeed. You do not need to return to a saved snapshot before the retry!
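The directory that has to exist can also be derived mechanically from the certificate subject. The Python sketch below uses the standard ntpath module to show how Windows splits such a name at its last backslash. The subject string is an assumption reconstructed from the example path above (that path plus the "=local" file name), and the helper names are hypothetical.

```python
import ntpath

# Base directory taken from the example path in the text above.
SSL_DIR = r"C:\ProgramData\VMware\SSL"

# Assumption: subject reconstructed from the example directory plus "\=local".
EXAMPLE_SUBJECT = r"C=US,CN=CA\, CN\=SSO01\, dc\=vsphere\,dc\=local"


def split_cert_path(subject, ssl_dir=SSL_DIR):
    """Return (directory, file_name) the way Windows reads the joined path."""
    return ntpath.split(ntpath.join(ssl_dir, subject))


def mkdir_command(subject, ssl_dir=SSL_DIR):
    """Build the quoted mkdir command for the directory that is missing."""
    directory, _ = split_cert_path(subject, ssl_dir)
    return 'mkdir "{}"'.format(directory)


directory, file_name = split_cert_path(EXAMPLE_SUBJECT)
print(file_name)                       # =local
print(mkdir_command(EXAMPLE_SUBJECT))
```

Every backslash in the subject becomes a directory level, so the tool ends up writing a file called "=local" several levels below the SSL directory. For a different SSO host or domain, take the actual path from the error output rather than guessing the subject.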
However, just like the is-change-sso.bat script, the repoint.cmd script also fails to properly restart the involved vCenter services. So you need to manually restart the "VMware VirtualCenter Server" and "VMware VirtualCenter Management Webservices" services.
The next issue is that the vCenter service might fail to restart after this step because of the issue described in KB2048753 (invalid certificate and privateKey entries in vpxd.cfg). This most likely occurs if you are using custom certificates for the vCenter service. Follow KB2048753 for a fix.
Step 4: Re-register vCenter Server with the Inventory Service
For our upgrade/migration scenario this step is not needed! Please carefully read the relevant section of the KB article to learn when it is really needed.
The good news is that it always ran without any issues during my tests, and that it does no harm to run it even if it is not needed.
Step 5: Register the vSphere Web Client with a different vCenter Single Sign-On instance
This step always ran without any issues during my tests. It even correctly restarts the Log Browser and the Web Client services.
I hope that this little guide will help a lot of people when re-pointing their vCenter 5.5 servers. In particular, a proper fix for the "InternalError / 254" was impossible to find in publicly available sources at the time I wrote this post. I actually had to open a support request with VMware to find out ... Please post a comment here if you have something to add or ask!
This post first appeared on the VMware Front Experience Blog and was written by Andreas Peetz.
Very good post! We ran into the same issues during our upgrade. I'd like to add that, when using two vCenter servers, you need to remove linked mode before re-registering to the new external SSO instance. We had to roll back to the previous snapshot because we ran into an issue during the re-registration process. This is mentioned in KB2033620, but only as a side note.
A second issue was that certificate warning messages appeared after the vCenter 6.0 U2 upgrade even though we installed custom CA signed certificates on the external SSO 5.5 instances. Pointing Certificate Manager again to the same CA signed certificates for SSO fixed this.
If you stand up your new SSO environment using 5.1 first, then upgrade it to 5.5, and then repoint, it should work. 5.1 will use certificates with subjects that will work with the repointing script, whereas 5.5 apparently does not. It's also important to note that if you have a multi-site or HA SSO environment, then SSO 5.1 must be installed on all nodes prior to upgrading them to 5.5, otherwise the fresh 5.5 install on the additional nodes will still use the bad certs.
Oh and thanks for the informative post :-)