Load Average Differences Between Solaris and Linux

March, 2008
by Adrian Cockcroft

About the Author
Adrian Cockcroft

Adrian is best known as the author of four books including Sun Performance and Tuning (2 editions); Resource Management; and Capacity Planning for Internet Services. In his 16 years at Sun he worked in technical sales and marketing, led creation of the BluePrints best practice publishing program, tested very complex integrated systems, was a leader of Sun's Six Sigma program and was the Chief Architect and Product Boss for Sun's High Performance Technical Computing business unit. In this time he gave many training classes and consulted with a wide range of customers, most notably as the on-site capacity planning consultant for the Salt Lake 2002 and Athens 2004 Olympic Games.

Joining eBay in 2004, he initially worked for Operations Architecture, investigating new platforms and providing guidance to the capacity planning groups at eBay and PayPal. As a founding member of eBay Research Labs in 2005, Adrian helped define the initial strategy for the Labs and an Innovation Forum. He researched operations-related platforms and processes, led research into advanced Skype plugin applications, contributed to development of the Skype4Java API and prototyped advanced wireless/mobile applications. During 2006 he published an IEEE paper on simulating large scale peer to peer networks, and a CMG paper on utilization measurement problems.

Adrian has consulted on architecture, scalability and performance for the Bebo.com social network, and is an advisory board member for Infovell and Holocosmos.

In 2007 Adrian joined Netflix as a Director of Web Engineering, directing a team responsible for research and development of scalable personalized web architectures.

Adrian filed two patents on capacity planning techniques while at Sun, and four patents related to peer to peer marketplaces while at eBay.

Adrian has a blog at http://perfcap.blogspot.com where he discusses capacity planning techniques, new computer technology, and how markets and innovation interact. He is also a member of the Homebrew Mobile Phone Club, and several local classic car clubs.

A lot of people monitor their servers using load average as the primary metric. Tools such as Ganglia colorize all the nodes in a cluster view using load average. However, a few things about the calculation, and how it varies between Solaris and Linux, aren't well understood.

For a detailed explanation of the algorithm behind the metric, Neil Gunther has posted a series of articles that show how Load Average is a time-decayed metric that reports the number of active processes on the system with one-, five- and fifteen-minute decay periods.
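To make the idea of a time-decayed metric concrete, here is a minimal Python sketch of an exponentially damped average. It is an illustration only, not the kernel code: the function name decayed_average and the 5-second sampling interval are assumptions chosen for the example, and real kernels use fixed-point arithmetic.

```python
import math

def decayed_average(samples, period_s, interval_s=5.0, start=0.0):
    """Exponentially damped average of active-process counts.

    samples    -- active-process counts, one per sampling interval
    period_s   -- decay period in seconds (60, 300 or 900)
    interval_s -- assumed time between samples, in seconds
    """
    decay = math.exp(-interval_s / period_s)
    load = start
    for n in samples:
        # Old value fades by `decay`; the new sample fills in the rest.
        load = load * decay + n * (1.0 - decay)
    return load

# Two active processes held constant for one minute (12 samples of 5 s),
# starting from an idle system.
one_minute = decayed_average([2] * 12, period_s=60)
```

After one full decay period the average has covered only about 63% of the distance to the steady value of 2, which is why a load average lags behind sudden changes in activity.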

The source of the number of active processes can be seen in vmstat as the first few columns, and this is where Solaris and Linux differ. For example, some Linux vmstat from a busy file server is shown below.

procs ------------memory----------- ---swap-- -----io---- --system--- ----cpu----
 r  b   swpd   free    buff   cache   si   so    bi    bo    in    cs us sy id wa
 4 43      0  32384 2993312 3157696    0    0  6662  3579 11345  7445  7 65  0 27

The first two columns show the number of processes in the run queue waiting for CPU time ("r") and in the blocked queue waiting for disk I/O to complete ("b"). These metrics are calculated in a similar manner in both Linux and Solaris. The difference is that the load average calculation is fed by just the "r" column on Solaris, and by the "r" plus the "b" column on Linux. This means that a Linux-based file server with many disks could be running quite happily from a CPU perspective but still show a large load average.
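The difference can be sketched in a few lines of Python. The function name active_count and the "linux"/"solaris" labels are made up for this illustration, and the sketch simplifies what each kernel really counts down to the two vmstat columns discussed above.

```python
def active_count(r, b, os_name):
    """Instantaneous process count that feeds the load average.

    r -- vmstat "r" column (waiting for CPU)
    b -- vmstat "b" column (blocked on disk I/O)
    """
    if os_name == "linux":
        return r + b      # run queue plus blocked queue
    if os_name == "solaris":
        return r          # run queue only
    raise ValueError("unknown os_name: %r" % (os_name,))

# Using the r and b values from the vmstat sample above:
linux_input = active_count(4, 43, "linux")
solaris_input = active_count(4, 43, "solaris")
```

With the sample values, the Linux calculation is fed 47 while the Solaris calculation would be fed 4 for the same instant.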

The logic behind the load average metric is that it should be a kind of proxy for responsiveness on a system. To get a more scalable measure of responsiveness, it is common to divide the load average by the number of CPUs in the system, since more CPUs will take jobs off the run queue faster. For disk intensive workloads on Linux, it may also make sense to divide the load average by the number of active disks, but this is an awkward calculation to make.
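As a rough sketch of the per-CPU normalization, the Python below divides a load average by a CPU count. The helper names per_cpu_load and current_per_cpu_load are hypothetical. os.getloadavg() and os.cpu_count() are standard library calls, but os.getloadavg() is only available on Unix-like systems, so the second helper is defined here without being called.

```python
import os

def per_cpu_load(load_average, cpu_count):
    """Load average divided by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count

def current_per_cpu_load():
    """One-minute load average per CPU for the local machine (Unix only)."""
    one_minute, _five, _fifteen = os.getloadavg()
    cpus = os.cpu_count() or 1   # cpu_count() may return None
    return per_cpu_load(one_minute, cpus)

# A hypothetical 8-CPU server reporting a load average of 47:
example = per_cpu_load(47.0, 8)
```

On Linux a value well above 1.0 per CPU does not by itself mean the CPUs are saturated, since blocked processes are included in the load average.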

It would be best to take r/CPU count and b/active disk count, then average this combination with a time decay and give it a new name. Maybe the "slowed average" would be good?
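One possible reading of this proposal, sketched in Python. Everything here is an assumption made for illustration: the name slowed_average, the choice to add the two normalized terms together, the 5-second sampling interval, and the CPU and disk counts used in the example. No existing tool is known to report this metric.

```python
import math

def slowed_average(samples, cpus, disks, period_s=60.0, interval_s=5.0):
    """Time-decayed average of r/cpus + b/disks.

    samples -- iterable of (r, b) pairs, one per sampling interval
    cpus    -- number of CPUs (must be at least 1)
    disks   -- number of active disks (treated as 1 if zero)
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    decay = math.exp(-interval_s / period_s)
    avg = 0.0
    for r, b in samples:
        # Assumed combination: CPU queue per CPU plus I/O queue per disk.
        instant = r / cpus + b / max(disks, 1)
        avg = avg * decay + instant * (1.0 - decay)
    return avg

# Hypothetical file server: 4 CPUs, 43 active disks, steady r=4 and b=43
# for one minute of 5-second samples.
value = slowed_average([(4, 43)] * 12, cpus=4, disks=43)
```

In this example each CPU and each disk has about one process waiting, so the instantaneous value is 2 and the one-minute average climbs toward it, whereas the raw Linux input for the same samples would be 47.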