May, 2007
by Michael A. Salsburg
On January 31, 2007, VMware published a paper called "A Performance Comparison of Hypervisors" [1]. This good doctor could not help but grin. VMware's EULA (End User License Agreement) prohibits any licensee from publishing benchmark results without VMware's approval. And here they were, slamming Xen with benchmarks. It was practically rusty with irony. Naturally, the zealots in the Xen camp, XenSource, fired a return salvo on March 26, 2007, called "A Performance Comparison of Commercial Hypervisors" [2]. Who knew that virtualization technologies could be so entertaining?
The obvious beneficiaries of this major cat fight are the technicians, especially the performance management specialists. These benchmarks give us a lot of insight into the performance issues of server virtualization, along with a discussion of the technology that focuses on the attributes affecting performance. I highly suggest that you read both papers, but if you only read one, make it the XenSource one. It has the advantage of providing updated results and represents a more real-world implementation of Xen. But the primary reason to pick this paper is that it presents a brief but useful architectural discussion of how the two technologies differ.
VMware ESX uses an "Emulation and Binary Translation" approach, whereas Xen uses a "Paravirtualized" approach. This is probably due to the main OS that each technology hosts. VMware has mostly focused on hosting Microsoft-based operating systems, such as Windows and NT. Xen has mostly focused on hosting Linux guests. For VMware, changing the Windows operating system was out of the question. For Xen, minor tweaks to the Linux kernel are acceptable. This flexibility is indeed one of the benefits of using "Open Source".
The XenSource paper also points out a significant problem with the VMware testing of Xen. As with most open source software, there is a standard software stream from which vendors create their own packages. The open source community that maintains this stream operates on a not-for-profit basis. XenSource, on the other hand, is a vendor that adds packaging and value-added, optimized components around the Xen software stream and provides a virtualization solution that it claims is enterprise-ready. The version of Xen that VMware benchmarked was taken from the open source code stream, while the version XenSource used was its own released product, which includes modifications for enterprise-class computing and optimized performance. Thanks to XenSource's alliance with Microsoft, the XenSource Xen Tools for Windows provide additional performance improvements that are focused on Windows. Specifically, they provide optimized paravirtualization for Windows, including unique device drivers. This is the main reason the results VMware published differ from the results XenSource published.
Regarding server virtualization in general, the I/O path is a critical area where differences can have a big impact on performance. The papers do not discuss this technical area in sufficient detail, so it will be mentioned here before discussing the benchmark results.
When a call is made to do a physical I/O, the hypervisor emulates the device. This includes the following types of interactions:
The two diagrams below come from an excellent article by Intel on this subject [3].
With pure emulation, the device drivers can be supported without modification. This allows the use of legacy device drivers. But the trade-off is performance. Each interaction between the OS and a device causes a number of transitions into the hypervisor. Another drawback of this approach is that the emulation must be very accurate. This could include reproducing known bugs for a specific driver release.
Although VMware totally emulates the platform for the guest OS and supports guests unmodified, both Xen and VMware use "paravirtualized" device drivers. Essentially, the device drivers within the guests are modified to provide a better performing interface between the guest OS and the hypervisor. The disadvantage is that there is no support for legacy device drivers.
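To make the trade-off concrete, here is a toy model of my own, not something taken from either paper. It simply counts how many times control passes into the hypervisor under each approach. The function names, the number of register accesses per I/O, and the batch size are all hypothetical, and the batching model is a deliberate simplification of how a paravirtualized driver hands requests to the hypervisor.

```python
# Toy model: count guest-to-hypervisor transitions for a stream of I/O requests.
# All constants below are illustrative assumptions, not measurements.

def emulated_transitions(ios: int, register_accesses_per_io: int) -> int:
    """Pure emulation: every device register access traps into the hypervisor."""
    return ios * register_accesses_per_io


def paravirtual_transitions(ios: int, batch_size: int) -> int:
    """Paravirtualized driver (simplified): requests are queued and the
    hypervisor is notified once per batch of queued requests."""
    return -(-ios // batch_size)  # ceiling division


if __name__ == "__main__":
    ios = 1000
    print("emulated:       ", emulated_transitions(ios, register_accesses_per_io=8))
    print("paravirtualized:", paravirtual_transitions(ios, batch_size=16))
```

The absolute numbers mean nothing. The point is only that the emulated count grows with every register touched, while the paravirtualized count grows with the number of notifications.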
The benchmarks described in these two articles are:
This article will not go through the gory details. Instead, it will focus on a few unique performance issues that cropped up in the midst of battle.
Each of the benchmark results is expressed in terms of how it behaves relative to running on the "bare metal" (without virtualization). In the SPECcpu2000 integer tests, the average overhead relative to bare metal was 2% for VMware ESX and 3% for Xen. The XenSource folks attribute this to the fact that ESX caches pages. While this may improve performance, it is a security issue. Xen does not cache pages, in order to provide more robust security. In my opinion, the security aspects of server virtualization are just starting to emerge. Virtual machines are touted as being more secure than running a bunch of applications together in a single OS. But the real security issues crop up when virtual machine isolation and containment are compared to the isolation of physical machines. Note that IBM has published a paper on the development of a security partition for Xen called sHype [4]. IBM is pointing to the future in that, if server virtualization is to be enterprise-ready, there needs to be more work in the security area. Otherwise, you might find that your virtual machines have unforeseen paths for communication that ultimately compromise the desired security model of the enterprise.
In the SPECcpu2000 compilation tests, XenSource again points out that VMware did not use a released product, but used the vanilla Xen 3.0.3 release, which has only partial support for hardware-assisted virtualization. VMware published benchmark results that showed VMware performing at 90% of "bare metal", while Xen performed at about 68%. This is a very significant difference. XenSource published their own results, for which they used the device drivers from their own release of Xen, which ship as part of the XenSource Tools for Windows. In their benchmarks, the difference between VMware and Xen is not as significant, but VMware still performs a bit better - about 6% better.
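Since every result in both papers is normalized this way, it is worth spelling out the arithmetic. The sketch below is mine; it assumes a higher-is-better score, and the function names are hypothetical. The only figures used are the 90% and 68% quoted above from the VMware paper.

```python
def relative_performance(virtualized_score: float, bare_metal_score: float) -> float:
    """Fraction of bare-metal performance achieved (assumes higher score is better)."""
    if bare_metal_score <= 0:
        raise ValueError("bare_metal_score must be positive")
    return virtualized_score / bare_metal_score


def overhead_percent(relative: float) -> float:
    """Virtualization overhead, as a percentage of bare-metal performance."""
    return (1.0 - relative) * 100.0


if __name__ == "__main__":
    # Relative figures as published by VMware for the compilation test.
    vmware_published = {"VMware ESX": 0.90, "Xen 3.0.3": 0.68}
    for name, rel in vmware_published.items():
        print(f"{name}: {rel:.0%} of bare metal, {overhead_percent(rel):.0f}% overhead")
```

Keep in mind that a "percent of bare metal" hides the raw scores, so a gap between two hypervisors is only comparable when both were measured against the same bare-metal run.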
The most dramatic departures from reality occur with the netperf benchmark. VMware shows VMware at 95% of "bare metal", while Xen performs at 3-6% of "bare metal". XenSource's benchmarks are very different. The graphical display of these results by the VMware benchmarking team is shown in Figure #3.
Again, this probably has to do with the fact that XenSource has developed paravirtualized device drivers that augment the standard Xen release. The XenSource team found that they only lagged behind ESX by 4-8%. The results are shown graphically in Figure #4. Note that each team used maroon to represent their technology, so compare these graphs carefully.
It's at this moment that I feel they ALL need a reality check. I am disappointed that they would publish these results solely in terms of Mb/s throughput. It's fair enough to show these results, but, behind the scenes, how much CPU power is being consumed? Our own benchmarks at Unisys have indicated that the CPU time per I/O can be 4 to 6 times greater with virtualization than on the bare metal. So, you can sustain near-bare-metal throughput, but you will run out of CPU cycles much earlier than expected. Neither of the benchmark "camps" commented on this.
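A little back-of-the-envelope arithmetic shows why this matters. The sketch below is my own illustration. Only the 4x and 6x multipliers come from the observation above; the 20 microseconds of CPU per I/O and the 10,000 I/Os per second are hypothetical numbers picked to make the math easy, and the function names are made up for the example.

```python
def cpu_utilization(io_per_second: float, cpu_seconds_per_io: float, cores: int = 1) -> float:
    """Fraction of available CPU consumed by I/O processing (above 1.0 means saturated)."""
    return io_per_second * cpu_seconds_per_io / cores


def max_io_rate(cpu_seconds_per_io: float, cores: int = 1) -> float:
    """Upper bound on I/Os per second before the CPU runs out."""
    return cores / cpu_seconds_per_io


BARE_METAL_CPU_PER_IO = 20e-6   # hypothetical: 20 microseconds of CPU per I/O
TARGET_RATE = 10_000            # hypothetical: I/Os per second the workload needs

if __name__ == "__main__":
    for multiplier in (1, 4, 6):
        cost = BARE_METAL_CPU_PER_IO * multiplier
        print(
            f"{multiplier}x CPU per I/O: "
            f"{cpu_utilization(TARGET_RATE, cost):.0%} of one core at {TARGET_RATE} I/O/s, "
            f"ceiling about {max_io_rate(cost):,.0f} I/O/s"
        )
```

Under these assumptions, a workload that uses a fifth of a core on bare metal needs more than a whole core at the 6x multiplier, and the throughput graphs alone would never tell you.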
Finally, there was one more interesting wrinkle that the benchmarks brought to light. For the SPECjbb2005 test, the XenSource authors noticed that bare metal results were actually WORSE than the virtualized results. In this benchmark, they created a virtual SMP consisting of two "logical CPUs". They compared that to using two physical CPU cores in the "bare metal" benchmark. When they dug into the phenomenon, they noticed that the bare metal benchmark used 2 CPU cores within the same socket. When they ran the virtualized benchmarks, the two logical CPUs used a core from each of the sockets. Evidently, multiple cores sharing a socket do not perform as well as cores in separate sockets. Although there was no explanation, I can only assume that two data-intensive workloads sharing the same socket are each polluting a limited amount of CPU cache. So, file that in your mental Rolodex of issues to consider in the future.
The conclusion is that you cannot totally trust either camp. On the other hand, they are writing about what they consider to be the main performance issues and letting us know which benchmarks they feel are important. VMware would love to see Xen go away, but it probably will not. Meanwhile, there is a new technical direction for Linux-based virtualization called "KVM". This is a subject for a future discussion. Interestingly enough, all three virtualization technologies rely on some form of a modified Linux base. Which brings up the specter of Microsoft's Viridian.... The fun just doesn't want to let up.
Well, until next month....
Feel free to send me e-mails with your comments and suggestions at:
And remember - I'm not a Real Doctor, I'm a Virtual one....
1. http://www.vmware.com/...
2. http://blogs.xensource.com/...
3. http://www.intel.com/technology/...
4. http://domino.research.ibm.com/...