VMware Front Experience: August 2011

How ESXi-Customizer supports ESXi 5.0 - FAQ

by Andreas Peetz at Sunday, August 28, 2011

I got a lot of feedback after posting the new ESXi-Customizer (with support for ESXi 5.0) and the "anatomy"-article explaining its technical background. It looks like I haven't been clear enough on some points and need to provide some additional information. So here is a list of frequently asked questions (FAQ), I might update it from time to time, so stay tuned.

1. Can I use existing drivers (made for ESXi 4.x) for customizing ESXi 5.0?

No, you can't. Driver binaries compiled for ESXi 4.x are not compatible with ESXi 5.0. They just won't be loaded. Instead vmkload_mod will throw the error message "Module does not provide a license tag".

2. What input does ESXi-Customizer expect for customizing ESXi 5.0?

It expects a gzip-compessed tar-file (with extension .tgz) that includes exactly three files:

/usr/lib/vmware/vmkmod/<driver-module> (the binary driver module)
/etc/vmware/driver.map.d/<driver-name>.map (maps PCI device IDs to the binary module)
/usr/share/hwdata/driver.pciids.d/<driver-name>.ids (maps PCI device IDs to display names)

Nothing more is needed. All other steps outlined in the "anatomy"-post will be done by ESXi-Customizer.

3. When and where will ESXi 5.0 compatible community drivers be available?

ESXi device drivers are derived from device drivers written for the Linux kernel. However, it is necessary to make specific changes to the source code of a stock Linux driver to turn it into an ESXi driver. An experienced Linux developer can find out what changes are necessary by studying the complete source of the existing ESXi drivers that are shipped with ESXi 5.0.

The source code of these drivers has not yet been published by VMware. However, they are obliged to do this (sooner or later), because most of the original Linux drivers are licensed under the GNU GPL requiring that the source code of derived works also needs to be publicly available.
So, we need to wait for VMware to publish the OpenSource code of its drivers (we can expect it here), and for some knowledgeable people to compile new ESXi 5.0 compatible drivers then.

I am confident that this will happen in the near future. And I expect the new drivers to become available at Dave Mishchenko's vm-help.com, the home of the ESXi Whitebox HCL.

4. Does ESX-Customizer support creating a bootable USB-key with ESXi 5.0?

No, it does not. If the machine that you want to install ESXi on does not have a CD-ROM drive, you can help yourself by installing ESXi 5.0 using any other machine (that has a CD-ROM drive) onto a USB key drive. Once you have a bootable USB key you can use that to also boot any other machine!
The easiest and safest method is to use a virtual machine provided by VMware Workstation or VMware player to do the initial install. Yes, ESXi 5.0 can be installed in a VMware Workstation VM - just select "ESX Server 4" as the guest OS type.

The anatomy of the ESXi 5.0 installation CD - and how to customize it

by Andreas Peetz at Thursday, August 25, 2011

1. Introduction

With vSphere 5 VMware introduced the Auto Deploy Server and the Image Builder that allow to customize the ESXi installation ISO with partner supplied driver and tools packages.
The Image Builder is a Powershell snapin that comes with the latest version of the PowerCLI package. It allows to add software packages to a pre-defined set of packages (a so-called ImageProfile) and even lets you create an installation ISO from such a baseline making it easier than ever to customize the ESXi installation.

However, doing this is not a straight-forward task. It requires a working installation of the Powershell, plus the PowerCLI software, access to the offline-bundle that makes up the base installation (which is not included with the free version of ESXi!), a custom driver in VIB format, and some guidance on what Powershell-cmdlets you need to use to add the custom driver package and build an ISO from it.
For the developers of custom drivers it requires to supply their packages in VIB format, and it's not trivial and costs extra effort to build such a package (compared to a simple OEM.TGZ file).

I wondered if it is still possible to customize the ESXi 5.0 install ISO with a simple OEM.TGZ file like you can do with ESXi 4.1, e.g. with my ESXi-Customizer script. And yes, it is possible - but it's very different now! I want to provide some background information here on how this works:

2. The contents of the ESXi 5.0 installation ISO

First let's have a look at the root directory of the ESXi 5.0 install ISO:

Contents of the ESXi 4.0 install CD root directory

Unlike the ESXi 4.1 ISO you can see lots of ISO9660-compatible file names here (all capitals and 8.3-format). You can guess from their names that the files with the V00 (and V01, V02, etc.) extensions are device driver archives. The original type of these files is VGZ, the short form of VMTAR.GZ. That means that they are gzip'ed vmtar-files.

vmtar is a VMware proprietary variant of tar, and you need the vmtar-tool to pack and unpack vmtar archives. It is part of ESXi 5.0 and also ESXi 4.x. Other files have the extensions TGZ and T00 (like TOOLS.T00). These files are gzip'ed standard tar files that the boot loader can also handle. Good.

Comparing with the ESXi 4.1 media you will notice that there is no ddimage.bz2 file any more. In earlier versions of ESXi this is a compressed image that is written to the installation target disk and contains the whole installed ESXi system. Actually you can write this image to a USB key drive to produce a bootable ESXi system without ever booting the install CD. You cannot do this with ESXi 5.0 any more. However, customizing the install CD has become easier this way, because you do not need to add a second copy of your oem.tgz file to this system image.

There are also files named ISOLINUX.BIN and ISOLINUX.CFG in the ISO root. That means that ESXi 5.0 still uses the isolinux boot loader to make the installation CD bootable. If you look into ISOLINUX.CFG it includes a reference to the file BOOT.CFG, and in BOOT.CFG you find references to all the VGZ and TGZ files:

Contents of the BOOT.CFG file

A second copy of the BOOT.CFG file is in the directory \EFI\BOOT. The ESXi 5.0 install ISO (and ESXi 5.0 itself) was built to boot not only on a standard x86 BIOS, but also on new (U)EFI enabled BIOS versions. Just one thing to remember: If you change the one BOOT.CFG you better make the same change to the other.

Now let's have a closer look at a driver VGZ package.

3. What's in a driver's vgz-file?

As mentioned before you need the vmtar-tool to look into a VGZ-file. Since it is only part of ESXi itself you need to have access to an installed copy of ESXi (either 4.1 or 5.0). Luckily you are able to install ESXi 4.1 (and also 5.0!) inside a VMware Workstation 7 VM.

I did this by creating a VM of type "ESX Server 4" with typical settings except for the size of the virtual disk (2GB is enough for ESXi) and installing ESXi 5.0 in it. During installation the driver files from the CD root are uncompressed and copied to the directory /tardisks, so here is where you can find them again. After enabling the local shell (luckily still available with 5.0) I logged in and was finally able to look inside and unpack such a driver archive using the vmtar tool:

Unpacking NET-E100.V00 with the vmtar tool

So there are basically three files in the archive:

1. The driver binary module (with no file name extension, e1000 in this example) that will be unpacked to the well known location /usr/lib/vmware/vmkmod.

2. A text file that maps PCI device IDs to the included driver:

Contents of /etc/vmware/driver.map.d/e1000.map

3. Another text file that maps PCI IDs to vendor and device descriptive names:

Contents of /usr/share/hwdata/driver.pciids.d/e1000.ids

It is good to know that the PCI ID mapping files are now separated by driver. In ESXi 4.1 there is a single pci.ids file and a single simple.map file for all drivers which raised the potential of having conflicting copies of these files in case you merged multiple OEM drivers into the image.

It looks easy now to add a custom driver to the install CD: Just create a tgz-file containing the three files mentioned above, copy it to the ISO root directory and add its name to the two BOOT.CFG files. And yes, this will indeed work for the CD boot! The custom driver will be loaded and you will be able to install ESXi, ... but the installation routine will not copy the tgz-file to the install media, and if you boot the installed system the first time it will behave like a regular install without the custom driver.

So, there is more to it...

4. The image database IMGDB.TGZ

There is a file named IMGDB.TGZ in the root directory of the CD that is also listed in the BOOT.CFG files and has the following contents:

Unpacking the IMGDB.TGZ file

It contains files that will be unpacked to the directory /var/db/esximg. For each driver (or other software package) an XML-file is created under the vibs sub directory. There are a lot more of these files than shown here (I fiddled the output with "..."), one example is net-e1000--925314997.xml for the e1000 driver. Let's look into this file:

The contents of net-e1000--925314997.xml

The xml-file contains information about the package including possible dependencies on other packages and a list of all included files. Its file name ("net-e1000--925314997.xml") consists of the name element plus a (probably) unique number with 9 or 10 digits. The list of payloads is the list of included archive files (either of type vgz or tgz), in most cases it's just one. The name of the payload is limited to 8 characters ("net-e100" in this case) and is the name of the corresponding file in the CD's root directory. The extension of this file is expected to be ".v00" if the file is of type vgz and ".t00" if the file is of type tgz. If there are name conflicts with other packages the number in the extension is counted up. E.g. the payload file for the e1000e driver is "net-e100.v01".

Then there is the host image profile XML file in the directory /var/db/esximg/profiles. In our example this is the file ESXi-5.0.0-381646-standard1293795055. Let's look into this one:

... ... ... (lot more <vib></vib> entries cutted) ... ... ...

Contents of the host image profile XML file

Here we find a list of all vib-packages that make up the currently installed system. Please note that the vib-id of a package strictly corresponds to the element values that are in the associated vib xml file (see picture before), it is composed the following way:

<vendor>_<type>_<name>_<version>

So the vib-id element of the net-e1000 driver e.g. is

VMware_bootbank_net-e1000_8.0.3.1-2vmw.0.0.383646

The payload names that are listed in the image profile file are the same as in the distinct vib xml files with the exception that here the exact file names (e.g. "net-e100.v00") are listed rather than just the file type (vgz or tgz).

Conclusion: If we want to add a custom driver to the install CD we need to do the following (in addition to the steps described in section 3.): modify the contents of IMGDB.TGZ, add a vib xml file for the driver (similar to net-e1000...xml) to it and update the contained image profile file to include the driver as an additional <vib>-entry.

There is another particular XML element in both the vib files and image profile file that we need to take care of: the <acceptancelevel>. VMware distinguishes four different acceptance levels: VMwareCertified, VMwareAccepted, PartnerSupported and CommunitySupported, in the XML files they are coded as certified, vmware, partner and community. The names are pretty self-explanatory, and one can easily guess that certified is stricter than vmware that is stricter than partner that in turn is stricter than community. In other words: If the host image profile is of acceptance level certified only packages of the same acceptance level can be part of it. If it is of acceptance level vmware only VMware certified and VMware accepted packages can be installed. If it is of acceptance level partner (and this is the default!) partner supported packages can be installed in addition to that. The least restrictive level is community that would accept all four types of packages.

My expectation is that custom drivers for whitebox hardware are community supported (unless they are published by a hardware vendor company). However, if the driver's vib file contains the acceptance level community the image profile's acceptance level must also be changed to community. Otherwise the installation of the package will fail.

5. Can we automate it?

Yes, we can! The latest version of ESXi-Customizer does automate all the steps described here to add custom drivers in tgz-format to an ESXi 5.0 install ISO. You only need to feed it with a tgz-file that contains the three files listed in section 3 of this post.

Please note: Packages made for earlier ESXi versions will not work with ESXi 5.0, not only because the directory structure has changed, but also because the earlier versions' driver modules won't be loaded by the new version! And - at the time of this writing - there are probably no oem.tgz-style driver packages available that are compatible with ESXi 5.0!
Hopefully, this will soon change. If you are looking for a driver of a device that does not work out-of-the-box with ESXi 5.0 check the Unofficial Whitebox HCL at vm-help.com.

How to throttle that disk I/O hog

by Andreas Peetz at Tuesday, August 23, 2011

We are in the middle of a large server virtualization project and are utilizing two Clariion CX-400 arrays as target storage systems. The load on these arrays is increasing while we are putting more and more VMs on them. This is somewhat expected, but recently we noticed an unusual and unexpected drop of performance on one of the CX-400s. The load on its storage processors went way up and its cache was quickly and repeatedly filled up to 100% causing so-called forced flushes: That means the array needs to shortly stop any I/O coming in while it is staging the cache contents down to the hard disks in order to free the cache up again. As a result overall latency went up and throughput went down, and this affected every VM on every LUN of this array!

As the root cause of this we identified a single VM that fired up to 50.000(!) write I/Os per second. It was a MS SQL server machine that we recently virtualized. When it was on physical hardware it used locally attached hard disks that were never able to provide this amount of I/O capacity, but now - being a VM on high-performance SAN storage - it took every I/O it could have, monopolizing the storage array's cache and bringing it to its knees.

We found that we urgently needed to throttle that disk I/O hog, or it would severely impact the whole environment's performance. There are several means to prioritize disk I/O in a vSphere environment: You can use disk shares to distribute available I/Os among VMs running on the same host. This did not not help here: the host that ran the VM had no reason to throttle it, because the other VMs it was running did not require lots of I/Os at the same time. So, for the host there was no real need to fairly distribute the available resources.
Storage I/O Control (SIOC) is a rather new feature that allows for I/O prioritization at the datastore level. It utilizes the vCenter server's view on datastore performance (rather than a single host's view) and kicks in when a datastore's latency raises over a defined threshold (30ms by default). It will then adapt the I/O queue depth's of all VMs that are on this datastore according to the shares you have defined for them. Nice feature, but it did not help here either, because the I/O hog had a datastore on its own and was not competing with other VMs from a SIOC perspective ...

We needed a way to throttle the VM's I/O absolutely, not relatively to other VMs. Luckily there really is a way to do exactly this: It is documented in KB1038241 "Limiting disk I/O from a specific virtual machine". There are VM advanced-configuration parameters described here that allow to set absolute throughput caps and bandwidth caps on a VM's virtual disks. We did this and it really helped to throttle the VM and restore overall system performance!

By the way, the KB article describes how to change the VM's advanced configuration by using the vSphere client which requires that the VM is powered off. However, there is a way to do this without powering the VM off. Since this can be handy in a lot of situations I added a description of how to do this on the HowTo page.

Update (2011-08-30): In the comments of this post Didier Pironet pointed out that there are some oddities with using this feature and refers to his blog post Limiting Disk I/O From A Specific Virtual Machine. It features a nice video demonstrating the effect of disk throttling. Let me summarize his findings and add another interesting information that was clarified and confirmed by VMware Support:

Unlike stated in KB1038241 you can also specify IOps or Bps values (not only K, M or GIOps resp. K, M or GBps) for the caps (e.g. "500IOps"). If you do not specify a unit at all IOps resp. Bps is assumed, not KIOps/KBps like stated in the article.
The throughput cap can also be specified through the vSphere client (see VM properties / Resources / Disk), but not the bandwidth cap. This can even be done while the machine is powered on, and the change will become immediately effective.
And now the part that is the least intuitive: Although you specify the limits per virtual disk the scheduler will manage and enforce the limits on a per datastore(!) basis. That means:

If the VM has multiple virtual disks on the same datastore, and you want to limit one of them, then you must specify limits (of the same type, throughput or bandwidth) for all the virtual disks that are on the same datastore. If you don't do this, no limit will be enforced.
The scheduler will add up the limits of all virtual disks that are on the same datastore and will then limit them altogether by this sum of their limits. This explains Didier's finding that a single disk is limited to 150IOps although he defined a limit of 100IOps for this disk, but another limit of 50IOps for a second disk that was on the same datastore.
So, if you want to enforce a specific limit to only a single virtual disk then you need to put that disk on a datastore where no other disks of the VM are stored.

[Update] ESXi-Customizer 1.2 - another bugfix release

by Andreas Peetz at Wednesday, August 10, 2011

If you used the Advanced edit mode with ESXi-Customizer 1.0 or 1.1 and got a "Corrupt boot image" message in ESXi (either when booting the customized ISO or after having installed with it) ... this was caused by a corruption of the OEM.tgz file while re-packaging it.

It was very hard to find a Windows version of tar that produces tar archives which are fully compatible with ESXi. But (I hope) I have finally found one: a Windows port of busybox. Since ESXi uses busybox, too, this should guarantee maximum compatibility. If you ever wondered what a Windows port of busybox could be good for ... now you know ;-)

Please update to version 1.2 that incorporates this fix, and let me know if you are still struck by this bug! Please download it from the project page!

vSphere 5: release date rumors and licensing changes

by Andreas Peetz at Saturday, August 06, 2011

From what I have heard the originally targeted release date for VMware's vSphere 5 was August 5th. Now this has passed and it did not happen. There are now rumors ongoing that it will be released on August 22nd (see source)...
I don't know why it is being delayed. One possible reason is the change in licensing that was announced on August 3rd (see VMware's Power of Partnership Blog). With the revelation of vSphere 5 on July 12th VMware introduced a new licensing method based on vRAM (the amount of RAM allocated to running VMs) which lead to a storm of protest among customers and partners, especially because of the low amount of vRAM per physical CPU that was originally communicated. With the announcement above VMware has doubled this entitlement for most vSphere editions and they also capped the accountable vRAM for a single VM to 96GB (even if it has more RAM than that).
This will definitely help to speed up the adoption of vSphere 5 ... once it is released.

Update (2011-08-23): Okay, nothing again ... So it will probably happen on Friday (August 26th), just before VMworld 2011 (starting on Monday 29th).

Update (2011-08-25): It is out now, the official release date was August 24th. Customers with subscription go here to download. The free ESXi version is available here.