Millennium Performance Problem 1: Performance Visualization
May, 2005
by Neil J. Gunther
Please check that you have your browser characters set to Western (mac) or
Western (windows) in order to see the mathematical notations correctly.
1 Introduction
Recently, I gave a presentation entitled "The Millennium Performance Problems" at the New England regional CMG meetings, and Rick Ralston (who was in the audience) asked me if I could expand on one of those problems for MeasureIT. Since I already had the slides, and Rick is such an enthusiastic Production Editor, I could not turn him down.
The Millennium Performance Problems (MPPs) refers to a keynote address I gave at the TeamQuest User Group meeting at CMG 2004 in Las Vegas, and several months later as a Webinar. Many of you may have been exposed to the entire presentation before. For those of you who haven't, my list of MPPs is:
- Performance Visualization
- Self-instrumenting Applications
- Breaking the von Neumann Bottleneck
- Performance Analysis of the Internet
- Performance Analysis of Quantum Computers
These are areas of performance analysis that I view as very difficult problems, that could have profound implications for performance analysis if we could solve them, and that currently seem to be receiving very little attention, especially from commercial performance tool vendors. I got the idea from the mathematicians, who were challenged in 1900 by something called the Hilbert Problems.
Most (but not all) of the Hilbert Problems were solved during the twentieth century and led to some very important new developments in mathematics as a consequence. One of the more unexpected consequences was the development of the modern computer itself (which I discussed in my MPP talk). This century has been kicked off with $1,000,000 prizes for solving problems of similar rank to the Hilbert Problems. Perhaps some wealthy corporations would like to throw a million dollars at the MPPs? While we're waiting, I'll adopt Rick's suggestion to expand on MPP 1 in this article (hence, the title) and perhaps discuss the other MPPs in later editions of MeasureIT.
The term performance visualization is not new (see, e.g., [PAM 2000]). Similar terms and concepts have also been kicking around in the parallel processing and high performance computing community for a long time. See, for example, HPVC, Georgia Tech, ParaGraph, VROOM.
I'm sure you can google oodles more. A lot of the emphasis there, however, has been on creating tools to help identify where higher degrees of parallelism might be attained in scientific and engineering codes. Here, I am referring to the development and application of any type of visual aids for general purpose performance analysis and capacity planning; which might also include some of the ideas that have arisen out of the HPC community. Since nobody has paid me recently to work on this subject, what follows amounts to little more than a collection of random jottings about performance visualization that I've made over the years. Why should you care about my jottings?
One way you might find my ideas useful is when it comes to the purchase of new performance tools or you are about to renew the license fees on existing performance tools. That is a good time to talk to your vendor about what you'd like to see in future releases of their tools. It's at that time (and that time only) when a vendor is most receptive to ideas like those discussed here. You can make a difference for all of us because you have the power of the checkbook. That's why I'm fulfilling Rick's request. Even one of the ideas I discuss here may be of sufficient interest for you to press your favorite tool vendor to provide it in the future. In that way you help to make performance analysis and capacity planning more efficient and interesting for all of us.
2 The Difference is in the Differential
At the outset I should emphasize that there are many forms of visualization but achieving good visualization is actually a very deep problem (which is why I've listed it as an MPP). The main reason it is a difficult problem is that we know what we want but we're not sure how to get it. The goal is easy to state:
We seek the best impedance match between the digital computer we are measuring and the cognitive computer (i.e., our brain) which we use to interpret those measurements.
A good impedance match is one that lets our brain (and other brains) expend more energy on solving the performance problem than trying to decipher what the raw data (or a bad visualization) is trying to tell us. The central problem is that we don't know how our brain works, so we don't know how to guarantee a good impedance match; let alone the best impedance match. Thus, we tend to flounder around applying trial and error and subjective opinion to decide what is good or bad visualization. We know it when we see it (or think we do).
Actually, it's not quite that bad. Certainly we do not understand all the neural circuitry of the brain (which is a very novel kind of non-von Neumann parallel computer; see MPP 3), but we do know quite a lot about pieces of the brain's neural circuitry and in particular the visual system [Crick 1994]. One dominant feature of the brain in general, and the visual cortex in particular, is that it is an excellent differential analyzer.
For example, you may not realize it, but your brain actually computes things like size and color. Size computations you already know something about. Now that the weather is improving, go outside some night with your camera and wait for moonrise. The moon looks huge near the horizon, but smaller at its zenith. Take a photo of the moon in the same angular positions and you'll see that its size remains invariant in the photograph. How can that be? The usual answer is that it is an optical illusion, but that doesn't explain why.
The "illusion" occurs because your brain mis-computes the size of the moon based on the size of objects within your visual field. Your brain does comparisons (or differential analysis) using the size of known objects near the horizon together with the knowledge that the moon is not nearby (you never touched the moon), and overestimates its size. Yes, your brain gets it wrong! At its zenith, however, there are no familiar objects to bias the sense of dimension, so your brain calculates the size of the moon correctly (i.e., consistent with the photograph).
The same thing applies to color. We don't see color, we compute it! Even if our eyes are receiving the same 650 nanometer red photons from a red flower, we don't see the same color red. The perceived color is determined by the color of the flowers surrounding the red flower we're looking at. The shade of red we see will be different if the surrounding flowers are blue rather than yellow. Moreover, things like contrast and edges are actually computed in the retina of your eye prior to being received by the visual cortex at the back of your head.
Wait a minute! I just said that when it comes to the size of the moon or the color of a flower, our cognitive computer may get it wrong! And what about performance analysis? That's a very serious subject that is supposed to be rational and accurate. How can we live with cognitive miscalculations? Well, you live with them every waking moment, and you seem to survive. The possibility of getting it "wrong" stems from our cognitive computer calculating differences rather than absolutes [Perception 1993]. Since we can't change that, such cognitive relativity is a very important aspect of perception to keep in mind when considering any visualization methods and tools (see Sect. 5). Just because a particular visualization looks fancy doesn't make it good. The brain is prone to illusion and error. So, when I said earlier that the brain is an excellent differential analyzer, I meant that it is much more efficient at relative computations than an electronic digital computer, which operates on absolutes, viz., numbers.
What we need are performance visualization cues that present relative values to our cognitive computer without error or illusion.
There is a role model for how we can best employ our cognitive computer: scientific visualization.
3 Scientific Visualization: A Role Model
In the same way the physicists invented the Web to solve the very real problem of exchanging data collected from "atom smashing" machines (and which you now use to present performance and capacity data to your manager), they also recognized that all those desktop MIPS and GUI software could be applied to visualizing complex physical systems like the development of a tornado.
In the days before the advent of scientific visualization, one was left to understand the complex physics of tornado formation by calculating horrible differential equations like the Navier-Stokes equation. Without visual aids this equation generates a mess of almost meaningless numbers representing the convective currents due to thermal effects, the formation of the meso-cloud, the gradual increase in angular momentum, and so on. In other words, the cognitive computer has to work overtime to understand the implications of this morass of data. Today, it is done with decorated animation. The physicist looks at the shape evolution of the cloud rather than a morass of numbers. The big leagues version of all this is climate modeling (e.g., global warming) using computational tools like the Earth Simulator. But why should the physicists have all the fun? As a performance analyst:
I would like to have the option of viewing computer system performance as an evolving shape.
In case you think I'm nuts, here is somebody's attempt to produce a static shape for their I/O subsystem using EXCEL graphics.
One reason this would be interesting is because our brains are specifically tuned to efficiently interpreting shapes (see Sect. 2). Unfortunately, there is a deeper problem. All non-relativistic physics takes place in (3 + 1)-dimensions: three spatial dimensions and one time-dimension. On the other hand, almost all performance analysis takes place in N-dimensions, where N is the number of performance metrics and is therefore usually very large; on the order of hundreds or possibly thousands. Even if you use statistical techniques [GData 2005] to find the significant subset of these performance metrics, N is still likely to be much bigger than (3 + 1). In this sense, the physicists have lucked out. We performance analysts are stuck with trying to represent a large number of performance metrics on a 2-dimensional computer screen.
4 Some Novel Visualizations
These N-dimensional performance metrics can be associated with degrees of freedom in the visual representation. The question becomes, how to project them onto a 2-dimensional surface (your computer screen). Naturally, when you try to squash many dimensions into just two, you can lose information. Nonetheless, some quite clever techniques have been devised. I'll just mention a few of them here.
4.1 Chernoff Faces
One approach makes use of another feature that is known about our brains. We have specialized circuits in our visual system (Sect. 2) just for recognizing human faces. This is a big help when you're at CMG National. You recognize people you know and want to talk to (or not, as the case may be) very quickly, and often from afar. Try getting your laptop computer to recognize people in photos you've taken with your digital camera. So much for high tech!
Anyway, we can process facial information very efficiently, and this observation has been incorporated in something called Chernoff faces like those shown here. Chernoff faces can be used for any data having a large number of degrees of freedom. Each parameter or variable is mapped to a facial attribute such as the slope of an eyebrow, a smile or frown, and so on. Although Chernoff faces are obviously applicable to performance data, I haven't seen them used in any performance analysis tools.
This visual representation has a good impedance match with our facial processing system and therefore helps us to spot visual anomalies very easily. One limitation, however, is that the semantics of the facial expressions is quite arbitrary and depends heavily on the context. This is not necessarily a bad thing, but you do need to have the context to know what the faces mean.
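As a rough illustration of the mapping step only (no drawing), the following Python sketch scales a handful of performance metrics onto facial attributes. The metric names, their ranges, the facial feature names and the feature ranges are all hypothetical choices made for this example; as noted above, the assignment of metrics to features is arbitrary and depends on context.

```python
# Hypothetical mapping: metric -> (facial feature, metric range, feature range).
# None of these names or ranges come from a real tool; they are assumptions.
MAPPING = {
    "cpu_util":  ("mouth_curvature", (0.0, 100.0), (1.0, -1.0)),  # smile -> frown
    "run_queue": ("eyebrow_slope",   (0.0, 20.0),  (0.0, 30.0)),  # degrees
    "disk_busy": ("eye_size",        (0.0, 100.0), (0.2, 1.0)),
    "mem_used":  ("face_width",      (0.0, 100.0), (0.6, 1.0)),
}


def scale(value, lo, hi, out_lo, out_hi):
    """Linearly map value from [lo, hi] to [out_lo, out_hi], clamping at the ends."""
    frac = (value - lo) / (hi - lo)
    frac = min(1.0, max(0.0, frac))
    return out_lo + frac * (out_hi - out_lo)


def chernoff_face(sample):
    """Return the facial-attribute values for one sample of metrics."""
    face = {}
    for metric, (feature, (lo, hi), (out_lo, out_hi)) in MAPPING.items():
        face[feature] = scale(sample[metric], lo, hi, out_lo, out_hi)
    return face


idle_server = chernoff_face(
    {"cpu_util": 0, "run_queue": 0, "disk_busy": 0, "mem_used": 0})
busy_server = chernoff_face(
    {"cpu_util": 100, "run_queue": 20, "disk_busy": 100, "mem_used": 100})
half_busy = chernoff_face(
    {"cpu_util": 50, "run_queue": 10, "disk_busy": 50, "mem_used": 50})
```

A plotting layer would then draw one face per server from these attribute values; an idle server smiles and a saturated one frowns with raised eyebrows, but only because this particular mapping says so.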
4.2 Parallel Coordinates
A completely different approach to compressing dimensions uses something called Parallel Coordinates due to [Inselberg 1985]. Consider a straight line y = m x + c with slope m = -3 and intercept c = 20, represented using standard orthogonal X-Y coordinates in Fig. 1.
Figure 1: A straight line represented in standard X,Y coordinates (left) and parallel X1, X2 coordinates (right).
In order to represent y = -3 x + 20 in parallel coordinates, we first treat the line as a set of points (Fig. 1). The parallel coordinates labeled X1 and X2 in Fig. 1 are usually spaced equidistant from each other at positions 0, 1, 2, …. We see that each point on the original line now becomes a separate line in parallel coordinates. With X1 at position 0 and X2 at position 1, these lines intersect one another at
(1/(1 - m), c/(1 - m))
which is (0.25, 5.0), in this case. Clearly, your usual geometric intuition goes out the window in parallel coordinates.
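The point-to-line mapping can be checked numerically. The sketch below assumes the X1 axis sits at horizontal position 0 and the X2 axis at position 1, so that a Cartesian point (x, y) becomes the segment joining (0, x) to (1, y); under that assumption the segments for a line y = m x + c all pass through (1/(1 - m), c/(1 - m)). The function names are illustrative only.

```python
def to_parallel(x, y):
    """Map the Cartesian point (x, y) to a segment from (0, x) on X1 to (1, y) on X2."""
    return (0.0, x), (1.0, y)


def segment_height(segment, t):
    """Height of the (extended) segment at horizontal position t."""
    (t0, v0), (t1, v1) = segment
    return v0 + (t - t0) * (v1 - v0) / (t1 - t0)


def dual_point(m, c):
    """Common intersection point of all segments generated by y = m*x + c."""
    if m == 1:
        raise ValueError("slope 1 gives parallel segments with no finite intersection")
    return 1.0 / (1.0 - m), c / (1.0 - m)


m, c = -3.0, 20.0
t_star, v_star = dual_point(m, c)          # (0.25, 5.0) for this line

# Sample a few points on the line and map each to a parallel-coordinate segment.
segments = [to_parallel(x, m * x + c) for x in range(0, 7)]

# Every segment has the same height at t_star, i.e. they all cross there.
heights = [segment_height(s, t_star) for s in segments]
```

Each entry of `heights` equals `v_star`, which confirms that the whole line collapses to a single point between the two axes.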
Figure 2: A 3-sphere represented in parallel coordinates.
Fig. 2 shows the points lying on a 3-dimensional sphere mapped onto parallel coordinates. As you can see, this rat's nest of lines has too much visual noise. This probably explains why parallel coordinates do not seem to have found widespread application (see also Sect. 5).
4.3 Barycentric Coordinates
A desirable goal for a visual performance tool is that it not consume more screen real-estate than absolutely necessary. Another desirable goal is that the tool not be too visually demanding. If possible, it should use your peripheral vision rather than your direct vision. An example of such a visual tool uses barycentric or triangular coordinates [Gunther 1992].
The idea is based on the concept of a centroid (the center of gravity) of an equilateral triangle. The centroid is formed by drawing perpendicular bisectors from each side of an equilateral triangle (labeled ABC). The legs from the centroid to the sides all have equal length (shown in red). This much you learnt in high school. What you may not have been taught is that if you take the centroid and move it around inside the triangle, it is no longer the centroid but the sum of the red-leg lengths remains the same and is equal to the height of the triangle. If I make the height of the triangle equal to 100 units, then the sum of the red legs must also equal 100 units.
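The constant-sum property is a standard result of plane geometry (Viviani's theorem) and follows from a short area argument. In the sketch below the symbols are assumptions introduced for illustration: s is the side length of the triangle, h its height, P the moved point, and d_1, d_2, d_3 the lengths of the three red legs from P to the sides.

```latex
% Split triangle ABC into three triangles that share the apex P.
% Each has base s (one side of ABC) and height d_i (the red leg to that side).
\begin{align*}
\mathrm{Area}(ABC) &= \mathrm{Area}(PBC) + \mathrm{Area}(PCA) + \mathrm{Area}(PAB) \\
\tfrac{1}{2}\, s\, h &= \tfrac{1}{2}\, s\, d_1 + \tfrac{1}{2}\, s\, d_2 + \tfrac{1}{2}\, s\, d_3 \\
h &= d_1 + d_2 + d_3
\end{align*}
```

With h = 100 units, the three legs therefore always sum to 100, wherever P sits inside the triangle.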
Now, if I can find a set of performance metrics that sum to 100 units (or 100%), I can always represent it as a point inside the triangle. Naturally, CPU utilization springs to mind. In UNIX, for example, CPU busy time is the sum of the system time and the user time. The remaining time in any sample period, when the CPU is inactive, is the idle time. The length of each of the three legs can be rendered in those same proportions and where they intersect inside the triangle corresponds to the 'busy-ness' of the CPU. Since the state of a single CPU is represented by a single point, multiple CPUs can be represented by multiple points. As shown on the right, this can be done for a large number of CPUs such as would be found in any modern symmetric multiprocessor. Note that adding more CPUs does not require a larger triangle.
Moreover, this kind of display has been animated to show dynamic clustering effects. This clustering behavior tends to be repetitive or periodic for a given workload. In that sense, the performance analyst doesn't need to stare directly at the triangle. What is important is not expected motion, but sudden or erratic motion. Such visual cues [Keller 1993] are better detected by your peripheral vision rather than your direct vision. This approach makes use of yet another set of specialized cognitive circuitry that has evolved to keep us alive. If something (e.g., an animal or a vehicle) approaches us from the side, our peripheral vision detects it (without seeing clearly what it is) and causes us to flinch and prepare to take flight.
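A minimal sketch of the barycentric mapping follows. It assumes a triangle of height 100 with vertices A, B, C placed as in the code, and an arbitrary assignment of user time to the leg on side BC, system time to the leg on side CA, and idle time to the leg on side AB; neither the vertex placement nor that assignment is taken from [Gunther 1992].

```python
import math

HEIGHT = 100.0                                # triangle height = 100%
SIDE = 2.0 * HEIGHT / math.sqrt(3.0)          # side of an equilateral triangle
A = (0.0, 0.0)
B = (SIDE, 0.0)
C = (SIDE / 2.0, HEIGHT)


def cpu_point(user, system, idle):
    """Screen position of one CPU whose user/system/idle times are in these proportions."""
    total = user + system + idle
    if total <= 0:
        raise ValueError("at least one component must be positive")
    wa, wb, wc = user / total, system / total, idle / total
    x = wa * A[0] + wb * B[0] + wc * C[0]
    y = wa * A[1] + wb * B[1] + wc * C[1]
    return x, y


def distance_to_side(p, q, r):
    """Perpendicular distance from point p to the line through q and r."""
    (px, py), (qx, qy), (rx, ry) = p, q, r
    cross = (rx - qx) * (py - qy) - (ry - qy) * (px - qx)
    return abs(cross) / math.hypot(rx - qx, ry - qy)


def legs(p):
    """Leg lengths from p to sides BC, CA and AB (user, system, idle)."""
    return (distance_to_side(p, B, C),
            distance_to_side(p, C, A),
            distance_to_side(p, A, B))


# Hypothetical samples for three CPUs: (user%, system%, idle%).
samples = [(60, 25, 15), (10, 5, 85), (33, 33, 34)]
points = [cpu_point(*s) for s in samples]
leg_sums = [sum(legs(p)) for p in points]     # each sum equals HEIGHT
```

Because every CPU is a single point, plotting `points` for hundreds of processors uses the same triangle; only the density of the cluster changes.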
4.4 Treemaps
While I was giving the Millennium presentation, a number of people asked me about using Treemaps. Treemaps are advertised as a space-constrained visualization of hierarchies, which several people have recognized as possibly being useful for visually representing the state of a large server farm or an enterprise-wide collection of servers.
I was first exposed to the treemap concept in a CMG talk by Lin Merritt [Merritt 2004]. Of course, after hearing about it, I promptly forgot about it because nobody pays me to work on performance visualization. Lin pointed out that treemaps can be used to display a large number of servers or workloads as color-coded rectangles, where the size of each rectangle represents the capacity rating of the server and a traffic-light color (e.g., red, yellow, green) can represent its status or CPU utilization. Potentially, it can be used to group and display all the servers in your enterprise.
In reviewing treemaps for this article, I would say the newer regularized ("squarified", ugh!) version of treemaps has a better impedance match than the earlier arbitrary rectangle version used by Lin. Personally, I find the earlier treemaps a bit too visually noisy for my tastes. I'm left with the feeling that maybe I missed seeing an odd shaped rectangle or two. The more uniform the representation the more likely my peripheral vision will pick out discrepancies. Moreover, treemaps meet some of my criteria for performance visualization tools.
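The size and color encoding described above can be sketched with the simplest treemap layout, usually called slice-and-dice: groups are split side by side and the servers within each group are stacked. This is not the squarified algorithm, and the server names, capacity ratings, utilizations and traffic-light thresholds below are all made up for illustration.

```python
def slice_layout(items, x, y, w, h, horizontal):
    """Split rectangle (x, y, w, h) among (name, weight) items in proportion to weight."""
    total = sum(weight for _, weight in items)
    rects = {}
    offset = 0.0                      # fraction of the rectangle already used
    for name, weight in items:
        frac = weight / total
        if horizontal:
            rects[name] = (x + offset * w, y, frac * w, h)
        else:
            rects[name] = (x, y + offset * h, w, frac * h)
        offset += frac
    return rects


def status_color(utilization):
    """Traffic-light color for a utilization in [0, 1]; thresholds are assumptions."""
    if utilization >= 0.85:
        return "red"
    if utilization >= 0.60:
        return "yellow"
    return "green"


# Hypothetical server farm: group -> server -> (capacity rating, CPU utilization).
farm = {
    "web": {"web1": (4, 0.35), "web2": (4, 0.90)},
    "db":  {"db1": (16, 0.70), "db2": (8, 0.20)},
}


def treemap(farm, width=100.0, height=100.0):
    """Return {server: ((x, y, w, h), color)} for a two-level slice-and-dice treemap."""
    group_weights = [(group, sum(cap for cap, _ in servers.values()))
                     for group, servers in farm.items()]
    group_rects = slice_layout(group_weights, 0.0, 0.0, width, height, horizontal=True)
    tiles = {}
    for group, servers in farm.items():
        gx, gy, gw, gh = group_rects[group]
        server_weights = [(name, cap) for name, (cap, _) in servers.items()]
        server_rects = slice_layout(server_weights, gx, gy, gw, gh, horizontal=False)
        for name, rect in server_rects.items():
            tiles[name] = (rect, status_color(servers[name][1]))
    return tiles


tiles = treemap(farm)
```

Each tile's area is proportional to the server's capacity rating, so db1 (half the total capacity here) covers half the display. A squarified layout would keep the same areas but choose aspect ratios closer to 1.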
5 Visualization Criteria
In 1992 I wrote down a set of criteria which I felt should be met by any good performance visualization tool. The main attributes are:
- Localized. Consumes the smallest possible area of 2-d screen real estate.
- Dynamic. The ability to see temporal development e.g., use of animation.
- Multi-parameter. Allow multiple performance parameters to be viewed simultaneously.
- Patterns. Facilitates the easy recognition of patterns and clustering in the data.
- Cognition. Low cognitive overhead through the use of visual cues.
- Semantics. Universally understood semantics (semiotics?) for the visual representation.
- Generality. The ability to apply the same paradigm to different sets of performance parameters.
- Journaling. Enables recording and redisplay of filtered performance effects.
- Personalization. Perception is subjective so I need to customize the data visualization to suit my cognitive computer.
I think most of these still hold today. Even if you believe there should be some additions, this could be a minimal set. Clearly, approaches like parallel coordinates (Sect. 4.2) fail to meet this minimal set. In contrast, barycentric coordinates (Sect. 4.3) and treemaps (Sect. 4.4) fulfill more of them.
Personalization is very important because everyone's brain is wired slightly differently. For example, I'm color blind to certain shades of green. So, that color could be misleading for me in a performance tool. On the other hand it could be very helpful for someone else.
Finally, we need to distinguish clearly between what is stylistic and what is principle. A lot of people are fond of Edward Tufte's books and presentations. For me, however, Tufte is like the Julia Child of graphics: he can show you what good graphics look like (e.g., Napoleon's advance and retreat on Moscow in 1812) and he can tell you what bad graphical habits are fostered by PowerPoint, but that doesn't make you a chef. To become a chef you need to understand why recipes work, learn a little food chemistry, as well as the basic principles of cooking. In general, Tufte offers guidelines, not principles.
For my money, the mathematical statisticians have spent more time and been more successful in developing principles of operation for the visual presentation of data, and we as performance analysts could do worse than take note of their efforts. In that vein, I recommend to you the works of [Cleveland 1985], [Harris 1999], [Keller 1993], [Tukey 2003].
6 Epilogue
If you would like to participate in discussions about Performance Visualization in particular, or the Millennium Performance Problems in general, please visit my CMG web page where I will collect your comments and provide updates.
References
- [PAM 2000] J. A. Brown, A. J. McGregor, and H-W. Braun, "Network Performance Visualization: Insight Through Animation," Passive and Active Measurement Workshop, Hamilton, New Zealand, April 3-4, 2000
- [Cleveland 1985] W. Cleveland, The Elements of Graphing Data, Wadsworth, 1985
- [Crick 1994] F. Crick, The Astonishing Hypothesis, Scribners, 1994
- [GData 2005] N. Gunther and D. Lilja, Guerrilla Data Analysis class, March 2005
- [Gunther 1992] N. Gunther, "On the Application of Barycentric Coordinates to the Prompt and Visually Efficient Display of Multiprocessor Performance Data," in Proceedings of Sixth International Conference on Modelling Techniques and Tools for Computer Performance Evaluation, eds. R. Pooley and J. Hillston, 67-80, September, Edinburgh, Scotland, Antony Rowe Ltd., Wiltshire, U.K., 1992
- [Harris 1999] R. Harris, Information Graphics: A Comprehensive Illustrated Reference, Oxford, 1999
- [Inselberg 1985] A. Inselberg, "The plane with parallel coordinates," The Visual Computer, 1:69-91, 1985
- [Keller 1993] P. Keller and M. Keller, Visual Cues: Practical Data Visualization, IEEE Press, 1993
- [Merritt 2004] L. Merritt, "Seeing the Forest and the Trees: Capacity Planning for a Large Number of Servers," Proc. CMG, Las Vegas, 2004
- [Perception 1993] W. Hendee and P. Wells (Eds.), The Perception of Visual Information, Springer-Verlag, 1993
- [Tukey 2003] W. Cleveland (ed.), The Collected Works of John W. Tukey: Vol. V, Graphics 1965-1985, Wadsworth, 1988
Copyright © 2005 Performance Dynamics Company. All
Rights Reserved. Permission has been granted to CMG Inc., to publish
this version in the CMG MeasureIT online magazine.
Last Updated 06/05/09