by Neil J. Gunther
The main purpose of this article is to explain the animation in Figure 1, which is a certain kind of visualization of the industry specification known as the Application Performance Index proposed by the Apdex Alliance. The Apdex Index is a single number intended as a high-level indicator of application responsiveness. Though it is promoted by the Apdex Alliance as a single number, simple enough to be easily understood by executive managers, others of us see that there is more to it than meets the eye. As if to affirm this view, several authors at CMG 2008 [Aker08,DinA08,DinB08] raised questions about or otherwise tried to explain the Apdex Index. This is my attempt to do the same thing. Indeed, there are so many interesting aspects to the Apdex Index, that I have included a Table of Contents to help you to navigate them.
Figure 1: Animation of multiple Apdex response-time profiles for a web-based application measured from five different geographical locations (colored dots) over a period of 30 days (2 seconds == 1 day). The colored bands are defined in Table 1. See Example 1 and Section 4 for a complete explanation of this performance visualization. [Data supplied by the Apdex Alliance.]
Before going any further, I want to emphasize that the purpose of this article is not to improve upon or redesign the Apdex Index in any way. That's possibly a subject for another discussion. Here, I merely want to present and explain the Apdex Index as it is currently defined. Having understood what the Apdex Index is, I would then like to consider other ways that it might be applied; especially from a performance visualization standpoint.
As you may have guessed from Figure 1, I consider triangles to be very helpful for understanding the full significance of the Apdex Index [GunJ07]. If you just can't wait for the punch lines, in Section 3.2 I introduce an entirely new interpretation of the Apdex Index as an areal ratio. Based on this insight, I defend the Apdex Index against some common criticisms in Section 3. In Section 4, I discuss the construction of Figure 1 in detail and explain how it contains all the appropriate Apdex information in one visual place.
I'm a visual kinda guy [Gun92,Gun96,Gun05,Gun07,GunJ07,PodG08] and, together with my colleague Mario Jauvin, we started the PerfViz (Performance Visualization) Google group in response to the huge interest shown in our presentation at CMG 2007. In our CMG07 presentation, we discussed the application of 3- and 4-dimensional barycentric coordinate systems to the visualization of various kinds of performance data. In particular, we discussed:
In that 2007 paper, however, we only had space enough to merely touch on the application of barycentric coordinates to the Apdex Index. In part, I would like to redress that shortcoming here.
Since it was 2007, we couldn't resist the reference to James Bond and called our entire suite of barycentric visualizations Barry007: "Barry's the name, visualization's the game." Jokes aside, our general goal has been to promote more modern, and hopefully more useful, visualization techniques that could be adopted by tool vendors for the purpose of facilitating performance exploration rather than the standard performance reporting paradigm.
As we pointed out in our 2007 presentation, there is a huge amount of visualization activity going on elsewhere; especially in the physics (SciViz) and web analytics (InfoViz) communities. In the meantime, performance management tools lag further and further behind. At the end of the day, it's all data, so the lack of progress probably comes down to a lack of investment dollars. Neither Mario nor I are getting paid to do this, so we were only able to produce prototype displays (here implemented in Mathematica). Nonetheless, this can be an important step because one of the problems perceived by developers and vendors is a lack of market demand for new visualization tools. And there is no demand from the users of performance management tools because they don't realize what they're missing. Prototypes can help to cut that Gordian knot.
In my view, the Apdex metric has a natural fit with the 3-dimensional barycentric representation (which we call Barry3). Getting the industry to adopt any new ideas is always difficult (especially during an economic recession), so I'd like to revisit the Barry3 representation of Apdex and explain how it works in more detail than time permitted in the original CMG07 presentation. Along the way, I'd also like to clarify some of the myths and misunderstandings surrounding the definition of the Apdex index.
Example 1 Web Application Figure 1 shows time-series data as an animation. The data are Apdex response-time profiles for a web application measured from 5 different geographical locations, viz., California (CA), Colorado (CO), Florida (FL), Minnesota (MN), New York (NY) (shown as colored dots), measured over a period of approximately one month (30 days). Daily measurements correspond to a particular constellation of colored dots inside the triangular region. The service level rating of any particular response-time profile can be seen immediately from which colored band a dot falls into. For example, the black dot (CA measurement location) falls into the red ("Poor") band on day 26. The three vertices correspond to the official Apdex categories: s (Satisfied), t (Tolerating) and f (Frustrated). Ideally, one would like all measurements to cluster inside the blue region at the apex of the triangle, indicating that application response times are predominantly of the Satisfied type.
Of course, if you didn't see our CMG presentations, you can be forgiven for wondering:
The answers to these questions, and more, will be revealed in the subsequent discussion.
In this section, I simply want to summarize the concepts that underpin the Apdex Index, as they are currently defined in the Apdex Alliance technical specification. I purposely don't want to get distracted by any commentary on how their definitions might be improved. That's a different topic, which others have considered. See, e.g., Section 5.1. Here, I am merely going to take the Apdex definitions as read and apply them to some actual response-time data as the discussion unfolds.
These data come from an injection benchmark that I created for a corporate client, some years ago. The reason for measuring their application (called 'ATTDISP' in Fig. 2) was to quantify its rumored poor response times. Naturally, these times needed to be quantified before any decisions could be made about how to improve its performance. At that time, the Apdex Index had not been invented, so it's interesting to determine its numerical value now to see what it might have told us back then.
The data we shall consider are shown in Figure 2 because they are representative of a typical response-time distribution.
Figure 2: Histogram of response times measured on a local application (called ATTDISP). The lower response-time threshold is set to T = 6 seconds (left dashed line) which automatically puts the upper threshold from Definition 1 at F = 4T = 24 seconds (right dashed line).
Example 2 Local Application The histogram in Figure 2 represents 728 response-time samples grouped into 21 × 3-second bins (columns). The sample mean response time is 17.529 seconds and the peak frequency (sample mode) is 291 events in the range 12-15 seconds (5th bin). The maximum response-time sample is 1 event in the range 60-63 seconds.
The data in Example 2 are local measurements performed on a large cluster of servers running many application instances. This is in contrast to the data in Figure 1, which are remote measurements obtained from monitoring services like Keynote.com or Gomez.com. The Apdex Index is applicable to both remote and local measurements.
Now, let's examine these data from the standpoint of the Apdex Index; something that was not done during the original consulting gig.
All of the definitions and metrics summarized in this section can be found in the Apdex Alliance technical specification. Unfortunately, that document tends to be rather verbose about what are otherwise quite elementary concepts from a performance analysis standpoint. Therefore, I shall apply a little bit of mathematical notation to condense the important concepts for our subsequent discussion.
The Apdex spec defines two threshold times:
Definition 1 [Threshold Times] T is the target response-time threshold: samples at or below T correspond to satisfied users. F = 4T is the frustration threshold: samples above F correspond to frustrated users.
These times are typically expressed in seconds, but other time units could be used, depending on the application. We'll see shortly that the numerical value of the Apdex index (AT) is determined implicitly by the choice of the threshold T; the reason for writing T as a subscript. This is as it should be, because the choice of T is logically equivalent to setting the SLO (service level objective) for the application response time.
The choice of 4T for F is purely arbitrary, but is based loosely on empirical CHI (Computer-Human Interface) studies originating at Xerox PARC [CaRM91]. It has nothing to do with any particular response time distributions. Rather, it tries to capture the psychology of user-perceived responsiveness. Put simply, we know that users who wait more than a few seconds tend to perceive their own waiting time as being about 3 or 4 times longer than the actual measured times. I believe that is the intent behind the Apdex Alliance's choice of F.
Example 3 In Figure 2, I have arbitrarily chosen T = 6 seconds (left vertical line). This was not done at the time of the original performance analysis because nobody had tried to quantify "good" performance; they just knew that it "sucked!" If the Apdex Index had existed at that time, it would have forced this quantification to occur much earlier. Anyhow, T = 6 is not an unreasonable choice; try saying "1 cat-and-dog, 2 cat-and-dog, ..." six times while doing nothing else. Choosing T = 6, in this case, automatically sets F = 24 seconds as the other Apdex boundary (right vertical line) in Figure 2.
Each measured response-time Ri, i = 1, 2, 3, ... is sorted into one of three statistical categories or Apdex Performance Zones (Figure 3) according to how it compares with the two thresholds in Definition 1.
Definition 2 [Apdex Performance Zones] Satisfied Zone: Ri ≤ T; Tolerating Zone: T < Ri ≤ F; Frustrated Zone: Ri > F.
The labels chosen for these categories are unimportant. You could just as easily call them Good, Bad and Ugly.
The next important question is, how many response-time samples do we need to compute a meaningful value of the Apdex index? The Apdex spec tells us we need at least 100 samples. That's good because I have 728 samples. After sorting my samples into the corresponding Apdex Zones, I count how many samples reside in each Zone. I will notate these counts or events as follows:

Definition 3 [Apdex Counts] Cs: the number of samples in the Satisfied Zone; Ct: the number of samples in the Tolerating Zone; Cf: the number of samples in the Frustrated Zone.

Finally, the Apdex Index (the single number) can now be defined using this notation.

Definition 4 [Apdex Index]

AT = (Cs + Ct/2) / (Cs + Ct + Cf)    (1)
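The whole procedure can be condensed into a few lines of code. This is a minimal Python sketch (the article's own prototypes were written in Mathematica); the function name `apdex` and the sample values are mine, chosen purely for illustration:

```python
def apdex(samples, T):
    """Apdex index A_T computed from raw response-time samples (seconds)."""
    F = 4 * T                                   # upper threshold (F = 4T)
    Cs = sum(1 for r in samples if r <= T)      # Satisfied counts
    Ct = sum(1 for r in samples if T < r <= F)  # Tolerating counts
    Cf = sum(1 for r in samples if r > F)       # Frustrated counts
    # Tolerating samples carry half weight in the numerator.
    return (Cs + 0.5 * Ct) / (Cs + Ct + Cf)

# Hypothetical samples: two Satisfied, one Tolerating, one Frustrated at T = 6.
print(apdex([2, 5, 10, 30], T=6))   # (2 + 0.5) / 4 = 0.625
```

Note that the index collapses the samples to three counts; the individual response times play no further role once they have been binned.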
The Apdex Alliance has developed a simple rating scale for AT that provides a quick assessment of application performance. The ratings, along with Apdex values that define them are shown in Table 1.
|Index Range|Apdex Rating|Symbol|Color|
|---|---|---|---|
|0.94 to 1.00|Excellent|E| |
|0.85 to 0.93|Good|G| |
|0.70 to 0.84|Fair|F| |
|0.50 to 0.69|Poor|P| |
|0.00 to 0.49|Unacceptable|U| |
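The rating bands of Table 1 amount to a simple lookup. The sketch below assumes the index is first rounded to two decimal places so that every value falls inside some band (the band boundaries 0.93/0.94, 0.84/0.85, etc. suggest such a convention, but it is my assumption, not something stated here):

```python
def apdex_rating(a):
    """Map an Apdex index to the rating bands of Table 1."""
    a = round(a * 100) / 100   # assume two-decimal reporting convention
    if a >= 0.94:
        return "Excellent"
    if a >= 0.85:
        return "Good"
    if a >= 0.70:
        return "Fair"
    if a >= 0.50:
        return "Poor"
    return "Unacceptable"

print(apdex_rating(0.47))   # Unacceptable (the ATTDISP result, A6 = 0.47)
print(apdex_rating(0.56))   # Poor (the T = 12 result discussed later)
```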
According to the Apdex spec, the choice of colors is optional and can be vendor-specific. For example, one might prefer "traffic light" colors with green for E, amber for G, and so on.
Having applied the Apdex definitions and procedures to our data in Example 2, we find that the vernacular "sucks" is transformed into the more decorous Unacceptable, in Apdex parlance (Table 1). That's encouraging because it's also consistent with our expectations, as well as those of the original application users. It would be disturbing if we had ended up with an Apdex Index in the Good or Excellent range.
An Index of A6 = 0.47 also suggests what needs to be done to improve performance: shift the histogram in Figure 2 to the left. Easy to say, not necessarily easy to do, but it does provide a quantitative objective for application programmers to strive towards.
Remark 1 Of course, the numerical value of AT depends on the chosen threshold T in Definition 1. It is equivalent to setting the SLO. The size of T determines the size of the Satisfied bucket and therefore how many Cs counts it contains. Similarly, F = 4T determines the size of the Tolerating bucket and therefore how many Ct counts it contains.
Of course, we could move the goal-posts, as it were.
Example 6 We would get a higher Index of A12 = 0.56 by increasing the threshold to T = 12. The improved Index results from capturing more events in the Satisfied bucket, with a corresponding reduction in the number of Tolerating and Frustrated counts. But that's like saying: "1 cat-and-dog, 2 cat-and-dog, ..." twelve times now while doing nothing else, and that is not likely to be a generally agreed upon measure of Satisfied.
Example 7 Conversely, setting the T-threshold to a "faster" value, e.g., T = 3 seconds, would result in an exceptionally low Index: A3 = 0.06. In all likelihood, the application would be regarded as pathologically broken, and that view might have unintended consequences.
So, T = 6 seems to be a reasonable choice based on the performance of the current application and it generates a meaningful Apdex Index of A6 = 0.47. We could optionally reduce the T-threshold later on, if response-time performance is improved in future releases.
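The goal-post effect in Examples 6 and 7 is easy to reproduce numerically. Since I no longer have the raw ATTDISP samples, the data below are synthetic (a rough stand-in with its peak near the 12-15 second range, like Figure 2); the specific numbers are illustrative, but the direction of the effect is guaranteed by eqn.(1):

```python
import random

random.seed(1)
# Synthetic response times, NOT the ATTDISP measurements.
samples = [max(0.1, random.gauss(15, 6)) for _ in range(728)]

def apdex(rs, T):
    """Eqn.(1) applied to raw samples with thresholds T and F = 4T."""
    Cs = sum(r <= T for r in rs)
    Ct = sum(T < r <= 4 * T for r in rs)
    return (Cs + 0.5 * Ct) / len(rs)

# Relaxing T can only raise the index: samples migrate from Frustrated
# (weight 0) to Tolerating (weight 1/2) to Satisfied (weight 1).
for T in (3, 6, 12):
    print(T, round(apdex(samples, T), 2))
```

The printed index is monotone nondecreasing in T, which is exactly why "moving the goal-posts" always flatters the result.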
The steps shown in Figure 4 for calculating the Apdex Index are:
1. Choose the target response-time threshold T for the application; this automatically sets F = 4T (Definition 1).
2. Collect at least 100 response-time samples.
3. Sort each sample into one of the three Apdex Zones (Definition 2) and tally the counts Cs, Ct and Cf.
4. Compute AT from eqn.(1).
5. Report the two-decimal value of AT together with its rating from Table 1.
This completes our review of the essential definitions contained in the Apdex Alliance technical specification. We now turn to an examination of some perceived limitations in these definitions.
So there you have it. Measure the application response times and follow the steps in Figure 4. Simple! Certainly simple enough for executive presentations. But for those of us who know how difficult it is to construct benchmarks, metrics and performance models, there are a number of unexplained points which cry out for further clarification. Some remarks I have heard include:
Unfortunately, reading the Apdex spec offers little enlightenment concerning these points, but sensible answers can be given. Let's examine each of them more carefully.
As I already mentioned in the Introduction (Section 1), the Apdex Index is very deliberately defined as a single number, intended as a simple indicator of application responsiveness. The Apdex Alliance wanted it to be understood easily by managers who don't otherwise have time to contemplate details.
The more technical amongst us have a tendency to jump all over the idea of presenting performance as a single number. This is a bit unfair when you consider that we do live with such performance encapsulations in other domains, e.g., the SPEC CPU2006 ratings are all single numbers. Not only that, but vendors especially tend to quote the peak SPEC numbers (the maximally optimized values) without also referencing the more realistic SPEC base numbers. However, in the case of SPEC, the opportunity always exists to visit the SPEC web site and peruse the individual execution times for each benchmark code on a given processor.
For my part, I don't have any problem with the Apdex index being a single number, as long as I can resolve it into its component values without doing a lot of work, when a diagnostic evaluation is required. As you will see in Section 4, and Remark 9 in particular, a Barry3 representation (Figures 1 and 10) addresses this issue by construction.
Take a look back at eqn.(1) and note the appearance of 1/2 with the Ct term in the numerator. Where did that come from? Who ordered that!?
This question has plagued me since I first became aware of the Apdex Index. Moreover, the technical spec is silent on this point. In the meantime, I've discussed it with several people, including members of the Apdex Alliance, and no one has been able to provide a satisfactory explanation for why 1/2 and not 1, ... or 1/3 ... or 1/e ... or Euler's constant. Apparently, I am not alone. Several presentations at CMG 2008 [Aker08,DinA08,DinB08] considered this question more deeply than anything I have seen previously. The main thrust of those presentations is summarized briefly in Section 5. In response to those efforts, I believe I can now provide a different, and strikingly simple answer, and guess what? It involves triangles ... Tolerating triangles.
Figure 5: Apdex Performance Index represented as an areal ratio. The dark area corresponds to the numerator in eqn.(1) while the denominator corresponds to the total area of all the columns.
Consider the artificial histogram in Figure 5. Each column has width T seconds. The respective height of each column: 15, 25, 25, 25, 10, corresponds to the sample response-time counts. For simplicity, I've arbitrarily chosen the counts so that they add up to 100. In reality, the value of T determines the actual counts I would see in each column. See Remark 1. For the purpose of this part of the discussion, we can ignore that point. In other words, whatever the value of T happens to be, the counts are those shown in Figure 5.
The first column corresponds to the Satisfied Zone, the next three columns taken together correspond to the Tolerating Zone between T and 4T, and the last column corresponds to the Frustrated Zone. As in Figure 3, the Apdex Alliance tends to show the Frustrated Zone as a bucket of arbitrary width, but what matters is the count of events in that bucket. Therefore, I can simply replace it with a bucket of width T and arbitrary height. In practice, the actual count will be some finite number due to the measurement process.
The total area of all five columns is 100T event-seconds. The total dark area is the sum of the first column (15T) plus the area of the dark triangle. The area of the triangle is half the area of the 3 middle columns, viz., (1/2)(25 + 25 + 25)T event-seconds. So, the ratio of the dark area to total area is given by:

(15T + (1/2)(75)T) / 100T = 52.5/100 = 0.525    (3)

where the T's have cancelled each other. But this is identical to the result I get by just applying eqn.(1):

AT = (15 + 75/2) / (15 + 75 + 10) = 0.525

without any regard to areas or triangles.
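The equivalence between the areal ratio and eqn.(1) can be checked mechanically. A short sketch using the column counts read off Figure 5:

```python
# Column counts of the artificial histogram in Figure 5 (each column width T).
Cs, Ct, Cf = 15, 25 + 25 + 25, 10   # Satisfied, Tolerating, Frustrated

# Areal ratio of eqn.(3): dark area over total area; the factor T cancels.
dark = Cs + 0.5 * Ct    # first column plus the Tolerating triangle
total = Cs + Ct + Cf    # all five columns: 100 counts
areal_ratio = dark / total

# Identical to applying eqn.(1) directly, with no reference to areas:
index = (Cs + 0.5 * Ct) / (Cs + Ct + Cf)

print(areal_ratio, index)   # 0.525 0.525
```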
This Tolerating triangle does provide an intriguing interpretation of AT. The areal ratio in eqn.(3) says the following. Ideally you'd like all your samples to fall into the first (Satisfied) bucket, but failing that, you'd like the remainder of the samples in the second (Tolerating) bucket to be monotonically decreasing toward the third (Frustrated) bucket. That's exactly what we see in Figure 2. Since the majority of those data lie in the triangular area of the Tolerating bucket, we get a rather low value for the areal ratio and therefore AT, also. See the discussion in Section 3.8 for another example.
The 1/2 factor is now easily understood as coming from the Tolerating triangle in Figure 5. Even if you don't like the Apdex Index definition or think that it can be improved upon, this is what the current definition represents. An entirely different explanation of the 1/2 factor in terms of 3D-triangles is provided in Section 4.2.
Remark 2 The areal shape of the numerator in AT is determined by the choice of T, as we pointed out in Remark 1. Since the value of T determines the counts in each of the 3 Apdex Zones/buckets, it also determines the shape of the dark area in Figure 5. See also Theorem 1.
Remark 3 This triangular area in Figure 5 does not conform to the exponential response-time distribution associated with an M/M/1 queueing model [DinB08]. Therefore, any association with an M/M/1 model is likely to be purely coincidental. See Section 5.2 for details.
In my view, this statement is misleading. If anything, the Apdex Zones in Definition 2 might be regarded as a poor man's distribution because they sample the empirical distribution of response-time measurements with only 3 buckets: Satisfied, Tolerating and Frustrated. Compare this with the discussion in Section 5.1.
The purpose of the Apdex Index is really to provide a simple figure of merit, not to render the actual response time distribution. In that sense, 3 buckets are a reasonable choice. Moreover, from Section 3.2 we now know that the Apdex Index corresponds to a figure of merit based on areal ratios, not the actual distribution. In this sense, computing the Apdex Index is akin to a Monte Carlo calculation.
As stated in Section 3.3, the Apdex Index is a figure of merit, not a method for characterizing all possible response time distributions. We have statistical moments for that. The purpose of AT is to express how well a given set of response time samples conform to the numerator in the areal ratio (see Section 3.2). By definition, a heavy-tailed distribution will be non-conforming due to a preponderance of Frustrated counts, and thereby produce a low (Unacceptable?) value of AT.
And this is as it should be. At the application level, heavy-tailed distributions are generally to be regarded as undesirable. An attendant low Apdex Index requires that its performance be remedied, not characterized with a higher precision AT value. What might occur at the packet-level with respect to so-called self-similar traffic [GCaP07] lies outside the implied scope of the Apdex definitions (Section 2).
The same Apdex value can belong to different combinations of Cs, Ct, and Cf. Put differently, the value of AT can be ambiguous with respect to the Zone counts of which it is composed. Some special cases are shown in Figure 6.
Figure 6: Examples of various areal ratios. In some cases: (B) and (C), as well as (E), (F) and (G), the same Apdex Index obtains for different areal configurations due to different counts Cs, Ct, and Cf.
Figure 6(A) corresponds to AT = 1.00 because there are no sample counts outside the Satisfied bucket.
Figure 6(B) corresponds to AT = 0.99 because, as well as most sample counts being in the Satisfied bucket, there is a single event in the Tolerating bucket. Similarly with Figure 6(C), there is a single event in the Frustrated bucket and none in the Tolerating bucket. Hence, we get the same Apdex Index value with different areal configurations. Notice that the areal ratios are slightly different when expressed as rational numbers but look the same when expressed as a decimal fraction approximation, and that's the number reported by the Apdex Alliance.
If we add a Frustrated column of height equal to the Satisfied column in Figure 6(D), the result is Figure 6(E) and the Apdex Index falls to AT = 0.50. Visually, this is very clear because the dark and light areas are equivalent, therefore the dark area must be half the total area. Put another way, the dark and light regions are symmetric about the central x-axis. Similarly, it is clear that Figures 6(F) and (G) are also symmetric and thus AT = 0.50 also.
Finally, in Figure 6(H), the Apdex Index falls to AT = 0.25 as counts are removed from the Satisfied bucket and added to the Frustrated bucket.
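The multiplicity can be verified numerically. The count triples below are illustrative choices of mine that are symmetric in the sense of panels (E)-(G); the actual counts behind those panels are not quoted in the text:

```python
def apdex_from_counts(Cs, Ct, Cf):
    """Eqn.(1) expressed directly in terms of the Zone counts."""
    return (Cs + 0.5 * Ct) / (Cs + Ct + Cf)

# Three very different count profiles, all yielding the same index A_T = 0.50,
# because each profile is symmetric about the central x-axis of Figure 6.
for counts in [(50, 0, 50), (25, 50, 25), (0, 100, 0)]:
    print(counts, apdex_from_counts(*counts))
```

This is the numerical face of the ambiguity: the single number AT cannot be inverted to recover the Zone counts.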
In a typical commercial enterprise, one may have to manage hundreds or perhaps thousands of servers running multiple applications per server. Even if viewing all of them simultaneously is unlikely, it may be necessary to look at a large subset of them. How do you digest hundreds of Apdex Indexes? How do you digest changes across hundreds of Apdex Indexes? Other than simple tabulation, the Apdex Alliance does not seem to have addressed this issue.
Figure 7: A list of 30 applications (1 per row) using the Apdex Rating colors in Table 1. The length of an arrow represents the Apdex Index for an application. A list of 100 applications would be more than 3 times as long.
One possibility is shown in Figure 7 where the 30 rows correspond to 30 applications encoded using the Apdex Rating colors in Table 1. The length of the black arrows indicate the Apdex Index for each application. A list of 100 applications would be more than 3 times as long. The downside for performance visualization is that the size of the viewing area grows linearly with the number of entries [GunJ07]. However, the viewing area of the barycentric representation, discussed in Section 4, remains essentially invariant with the number of entries.
Although the Apdex Index is intended for easy comprehension by management, any manager seeing a Poor or Unacceptable Apdex value is bound to ask, "Why?" Such questions beg for some kind of diagnostic information, which is not available in a single number. Dick Hamming is often quoted as saying: "The purpose of computing is insight, not numbers." I would add that sometimes the numbers contain the insight, so they had better be within easy reach.
It turns out that the first level diagnostic numbers are just underneath the Apdex Index number by virtue of the way the Index is defined (see Definitions 3 and 4). As we shall discuss in Section 4, the barycentric coordinate representation brings all the numbers together in one place and thereby overcomes this objection.
The Apdex Index, 0 ≤ AT ≤ 1, is best understood to be a figure of merit for application response times. It has a simple visual interpretation as the ratio of areas defined by eqn.(3). The Tolerating Zone is assumed to conform to a triangular area and this explains the appearance of the 1/2 in the Apdex Alliance definition of AT given by eqn.(1).
Theorem 1 [Gunther 2009] The Apdex Index AT is an adaptive figure of merit that compares the total number of response-time samples with the number of samples associated with an areal profile which is defined by the choice of threshold T.
Proof 1 See Figure 5 and related discussion.
In Theorem 1, adaptive means that the area corresponding to the numerator in AT is not fixed. It is determined by the choice of T which, in turn, determines the Satisfied area (counts in that column), as well as the shape of the Tolerating triangle. Let's check this conclusion one more time.
Figure 8: Areal representation of the data in Fig. 2.
Returning to the response time data used in Section 2, the corresponding areal ratio is shown in Figure 8. Note that the triangle is slightly offset from the left to better match the data in Figure 2. Visually, this asymmetry makes it easier to see the match with the original data, but as far as calculating the areal ratio is concerned it's irrelevant; only the area matters, not its shape. The corresponding figure of merit is given by:

(3 + (1/2)(678)) / 728 = 342/728 = 0.46978

which indeed is identical to the value of AT = 0.46978 calculated with eqn.(2). Clearly, the preponderance of the dark area lies in the Tolerating triangle, which produces an Apdex Index with an Unacceptable rating (see Table 1).
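As a cross-check, the ATTDISP figure of merit follows directly from the Zone counts at T = 6 (the counts Cs = 3, Ct = 678, Cf = 47 quoted elsewhere in this article for that data set):

```python
# Zone counts for the ATTDISP data of Figure 2 at T = 6 seconds.
Cs, Ct, Cf = 3, 678, 47

# Eqn.(1): the Tolerating triangle contributes half its counts.
A6 = (Cs + 0.5 * Ct) / (Cs + Ct + Cf)   # 342/728

print(round(A6, 5))   # 0.46978
```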
This is a kind of Monte Carlo calculation. Keeping in mind that the goal of the Apdex Alliance is to keep the Apdex Index simple, gussying it up with a more refined characterization of the areal ratio (such as proposed in [Aker08]) is likely to be more confusing than productive.
In the next section, we try to bring all these aspects together in a single visual representation using barycentric coordinates.
Whenever a quantity can be expressed in terms of 3 related parts, it is a candidate for being represented in 2-dimensional barycentric coordinates; a kind of triangular coordinate system shown in Table 2(b).
Table 2: (a) Cartesian coordinates; (b) Barycentric coordinates.
Table 2 shows a comparison between this triangular coordinate system and the more usual Cartesian coordinate system that is used for performance analysis plots, e.g., histograms. Since we are always referring to the positive quadrant, the Cartesian (x,y) location away from the origin in (a) can be expressed using 2 directions: north and east. Similarly, the barycentric location in (b) can be expressed using 3 directions: north, southeast and southwest. Notice how the 3 arrows point to each of the 3 vertices in the (equilateral) triangle. This is how we squash 3 performance metrics into 2 dimensions, e.g., the barycentric representation of CPU utilization comprised of user, system and idle time. Instead of saying barycentric coordinate system with 3 axes every time, we have adopted the whimsical name Barry3 [Gun92,GunJ07].
The Apdex Alliance defines 3 buckets and the associated counts in Definition 3. The whole discussion of the Apdex Index can be simplified by redefining these 3 types of sample counts as normalized counts, which we shall label as s (normalized Satisfaction), t (normalized Tolerating), and f (normalized Frustration):

s = Cs/N, t = Ct/N, f = Cf/N    (5)

where N = Cs + Ct + Cf is the total number of samples.
The earlier definition of AT in eqn.(1) can be replaced by the following simple expression:

AT = s + t/2    (6)
Note the 1/2 is still there. Naturally, the total of these normalized counts must add up to 100% of all the counts. In other words:

Cs + Ct + Cf = N

Substituting the notation in eqn.(5), we can rewrite this equation more simply:

s + t + f = 1    (7)

No matter what the actual sample Apdex Counts may be numerically, when they are normalized by the total sample size, they must sum up to one.
Remark 5 You may be asking yourself where the normalized f-metric went in eqn.(6). It's there alright. Just as in the standard Definition 4, it's sitting in the denominator. But since the denominator is just the number '1' (from eqn.(7)), it's implied.
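The normalized form is easy to verify numerically. A sketch using the CA day-26 counts quoted in Example 9 (Cs = 89, Ct = 66, Cf = 57):

```python
# CA day-26 Zone counts (T = 4).
Cs, Ct, Cf = 89, 66, 57
N = Cs + Ct + Cf

# Eqn.(5): normalize each count by the total sample size.
s, t, f = Cs / N, Ct / N, Cf / N
assert abs((s + t + f) - 1.0) < 1e-12   # eqn.(7): they sum to one

# Eqn.(6): f never appears explicitly; it is absorbed by the constraint.
A4 = s + 0.5 * t
print(round(A4, 6))   # 0.575472
```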
When we have 3 numbers that are constrained in this way, they can be represented using a barycentric coordinate system. The word barycentric comes from the Greek and means "center of mass" coordinates. Once again, the fact that there are 3 numbers means that the barycentric coordinates are associated with a triangle. Rather than saying this mouthful all the time, we prefer to give it a whimsical name: Barry3 [GunJ07].
In Table 3, Figure (a) shows the center of mass position at the center of the triangle. Technically, this point is known as the centroid. You can think of the centroid as the point where the triangle would balance if it was a uniform metal plate with equal weights attached to each vertex at s, t and f.
Table 3: (a) Centroid: AT = 0.50; (b) CA profile on Day 26: A4 = 0.575472; (c) ATTDISP profile: A6 = 0.46978.
Furthermore, notice that all the triangles in Table 3 are unit height (as indicated on the y-axis). Let's associate the triangle height with the sum in eqn.(7). In Figure (a) each of the blue arrows has the same length. Each is one-third of the maximum length that any arrow can have. The maximum length that any arrow can have, e.g., the north-pointing b1-arrow, is the height of the triangle. Therefore, if we place each of the centroid arrows end to end, their total length will also be equal to the height of the triangle.
Example 8 Refer to Figure (a). For the centroid shown in Figure (a) of Table 3, we have Cs = 100/3, Ct = 100/3, Cf = 100/3. From eqn.(5), the normalized values are: s = 0.3333, t = 0.3333, f = 0.3333 (to 4 significant digits). Notice that these numbers are the same as the barycentric coordinates: b1 = 0.33, b2 = 0.33, b3 = 0.33 (to 2 significant digits), shown as the blue arrows in the diagram. The value of the corresponding Apdex Index is AT = (1/3)(1 + 1/2) = 0.50 (to 2 significant digits).
Remark 6 The 3 blue arrows in Figure (a) can be regarded as representing a `Y' shape. Notice that the tails of the two arrows in the top part of the `Y' intersect the left and right sides of the triangle at exactly 1/2. The vertical arrow (the stem of the `Y') does not intersect at 1/2 the base length because an equilateral triangle of unit height has a base of length 2/√3.
Remark 7 If we project the blue arrow that points south-west past the centroid, it will intersect the lower left vertex labeled t. This projection is shown as a red dashed line. Since Figure (a) is an equilateral triangle, the dashed line bisects (cuts in two) the angle at t because it bisects the opposite side. As we explain in Section 4.2, this particular bisector corresponds to the Apdex Index value: AT = 1/2 = 0.5.
If the blue dot, currently located at the centroid, were to move diagonally along the red dashed-line, the value of the Apdex Index would remain AT = 0.5. Of course, the corresponding barycentric coordinates (lengths of the blue arrows) would be different in each position along that line. This is the Barry3 visualization of the Apdex Index multiplicity discussed earlier in Section 3.5; same Apdex Index, different position inside the Barry3 triangle.
Probably, you already knew about the centroid from high-school math. If not, it should now be visually obvious from diagram (a). What may not be so obvious is that if we move the dot away from the centroid (in any direction), the combined length of the blue arrows still adds up to the triangle height (unit height in this case). Pursuant to Remark 7, if the blue dot were to be repositioned down at the t-vertex (lower left), then the lengths of the b1 and b3 arrows would be reduced to zero while the Tolerating arrow (b2) would become unit length (the height of the triangle). In terms of a center-of-mass description, the new balance point corresponds to the correct position if the weight at the t-vertex were increased dramatically relative to the weights at the other two vertices.
Example 9 Refer to Figure (b). This figure shows one data point from Example 1; the Apdex response-time profile for the remote California (CA) location on day 26. With T = 4: Cs = 89, Ct = 66, Cf = 57. Applying eqn.(5), the normalized values are: s = 0.419811, t = 0.311321, f = 0.268868. The barycentric coordinates are: b1 = 0.42, b2 = 0.31, b3 = 0.27 (to 2 significant digits). The corresponding Apdex Index is A4 = 0.419811 + (1/2) 0.311321 = 0.58 (to 2 significant digits). The dot is `north' and slightly `west' of the centroid; a result of the b1 and b2 arrows being longer than the b3 arrow. That particular California response-time profile is more Satisfied than it would be if it were located at the centroid in Figure (a).
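The arithmetic in Examples 8 and 9 is easy to check mechanically. Here is a minimal Python sketch of the normalization in eqn.(5) and the index in eqn.(6); the function name apdex_index is my own illustrative choice, not part of any Apdex tooling.

```python
from fractions import Fraction

def apdex_index(cs, ct, cf):
    """Normalize the raw Apdex counts per eqn.(5) and return (s, t, f)
    together with the Apdex Index A_T = s + t/2 of eqn.(6)."""
    total = cs + ct + cf
    s, t, f = (Fraction(c, total) for c in (cs, ct, cf))
    assert s + t + f == 1            # the sum rule of eqn.(7)
    return s, t, f, s + t / 2

# Example 9: the California (CA) profile on day 26, with T = 4.
s, t, f, a4 = apdex_index(89, 66, 57)
print(float(s), float(t), float(f))   # 0.419811..., 0.311320..., 0.268867...
print(round(float(a4), 2))            # 0.58
```

Using exact Fractions means the sum rule of eqn.(7) holds identically rather than only to floating-point accuracy.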
The dashed red line running diagonally through the centroid in each of the triangles corresponds to the Apdex Index value AT = 0.50, viz., the bottom of the Poor band in Figure 1.
Example 10 Refer to Figure (c). Figure (c) of Table 3 is the Apdex response-time profile for the locally measured application (ATTDISP) in Figure 2. From Example 4 with T = 6: Cs = 3, Ct = 678, Cf = 47. From eqn.(5), the normalized values are: s = 0.00412088, t = 0.931319, f = 0.0645604 (same as the blue arrows). The corresponding Apdex Index is A6 = 0.00412088 + (1/2) 0.931319 = 0.47 (to 2 significant digits). Consequently, the dot lies below the dashed line corresponding to AT = 0.50. The reason that the dot is pushed to the extreme lower left of the triangle is that the b2 arrow is by far the largest of the 3 arrows. That, in turn, is a result of the majority of the response-time distribution residing in the normalized Tolerating t-bucket in Figure 2.
The virtue of the Barry3 coordinates (arrows) is that I can see immediately how the s, t, f profile is composed, simply from the location of the dot inside the triangle. The only thing missing is the direct representation of the Apdex Index (the single number) itself. We address that in the next section.
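Although the figures here were produced in Mathematica, the mapping from a profile (s, t, f) to the position of its dot can be sketched in a few lines of Python. The vertex assignment below (t at lower left, f at lower right, s at the apex) follows the figures in Table 3; the function name barry3_xy is mine.

```python
import math

# Vertices of an equilateral Barry3 triangle of unit height.
# A unit-height equilateral triangle has a base of 2/sqrt(3) (Remark 6).
BASE = 2 / math.sqrt(3)
V_S = (BASE / 2, 1.0)    # apex:               Satisfied  (s)
V_T = (0.0, 0.0)         # lower-left vertex:  Tolerating (t)
V_F = (BASE, 0.0)        # lower-right vertex: Frustrated (f)

def barry3_xy(s, t, f):
    """Map a normalized Apdex profile (s, t, f) to the Cartesian
    position of its dot inside the Barry3 triangle."""
    assert abs(s + t + f - 1) < 1e-9     # sum rule, eqn.(7)
    x = s * V_S[0] + t * V_T[0] + f * V_F[0]
    y = s * V_S[1] + t * V_T[1] + f * V_F[1]
    return x, y

# The centroid of Figure (a): equal thirds sit at one-third height.
print(barry3_xy(1/3, 1/3, 1/3))
```

Notice that with this vertex placement the y-coordinate of the dot always equals s, which is why the constant-s contours discussed in Section 4.2 come out horizontal.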
Consider the Barry3 triangle located in a 3-dimensional xyz-Cartesian coordinate system; the "wire boxes" depicted in the top row of Figure 9. If the x−y plane were your computer screen, the z-axis would be pointing out of the screen, at you. Further suppose we set the base of the Barry3 triangle to be at zero height on the z-axis and set the triangle apex at unit height z = 1. The result is shown as the leftmost diagram in the top row of Figure 9. The mesh on the triangle is just to help it stand out and has no other significance. The height on the z-axis has no physical meaning. It simply represents a kind of weighting scheme in the sense that z = 1 is to be interpreted as being "better" or "cooler" than z = 0.
Figure 9: Contour plots of three alternative definitions of AT in the context of a Barry3 triangle. AT = 1 × s + 0 × t + 0 × f (left), AT = 1 × s + 1 × t + 0 × f (middle), and AT = 1 × s + (1/2) × t + 0 × f (right). Only the rightmost weights correspond to the bands in Fig. 1. Since no Apdex Ratings are being imposed here, the color scheme is consistent with Fig. 1 but not identical to it.
Next, I can ask Mathematica to display the gradient of this sloping triangle by a set of contour lines similar to those one finds in a topo map or a weather map. Moreover, we can color-code them with red at the lowest ("hot") end and blue at the highest ("cool") end. The result is the set of horizontal bands shown in the lower-left diagram of Figure 9.
Remarkably, we can also associate these bands with the barycentric Apdex coordinates s, t, and f from Definition 6. The horizontal bands correspond to lines of constant-s. Like contours of constant barometric pressure, or isobars, on a weather map, these parallel geometrical lines are called isoclines. Compared with eqn.(1), this particular choice of weights is equivalent to an alternative definition of the Apdex Index:
AT(s) = 1 × s + 0 × t + 0 × f = s        (8)
Although this is a plausible figure of merit, it is not a very satisfactory one because it simply says "maximize your s-value" without any regard to the associated t and f values. A Barry3 dot can be located anywhere on a given horizontal line, toward either the right side (more Frustrated) or the left side (more Tolerating) of the triangle, and your AT(s) value would be the same. Expressed in geographical terms, optimizing AT(s) simply requires heading "due north" from your current location, even if you happen to be on the "eastern" side of the Barry3 triangle, i.e., more Frustrated than Tolerating response-time counts.
To correct this shortcoming, we could add in the Tolerating count to bias the figure of merit in the Tolerating direction. Now, the definition of the Apdex Index becomes:
AT(s, t) = 1 × s + 1 × t + 0 × f = s + t        (9)
From a visual point of view, this choice of weights has the effect shown in the top-middle diagram in Figure 9. In addition to the Barry3 apex being lifted to z = 1, the lower-left Barry3 vertex is also lifted to z = 1. The corresponding contour plot is shown underneath it. We see that the isoclines become parallel to the left edge of the triangle and they correspond to lines of constant-f. A dot that moves on any of these isoclines will have the same f-value, i.e., the length of the blue arrow pointing south-east in Table 3 will remain constant.
Although this is also a plausible figure of merit, it is not satisfactory either. It is simply tantamount to saying "minimize your f-value" without any regard to the associated s and t values. Expressed in geographical terms and referring to the lower-middle diagram in Figure 9, it says: head "west-north-west" toward the blue band to optimize your AT(s, t) score. Getting into the blue band is a necessary but not a sufficient goal in this case. For example, the dot could be in the blue band but also be near the bottom of the Barry3 triangle, which corresponds to a low s-value.
Clearly, something intermediate is needed between these two cases and that is shown in the third column of Figure 9. The left vertex of the Barry3 triangle is lowered to the half-way point z = 1/2. Voila! As if by magic, the isoclines become diagonal and intermediate between the other two cases. The directive now is to take a "north-west" passage to reach the blue band inside the Barry3 triangle. Of course, this choice of weights corresponds exactly to the definition of the normalized Apdex Index in eqn.(6):
AT = 1 × s + (1/2) × t + 0 × f = s + (1/2) t        (10)
A point on any of these isoclines will have the same AT value, but the component s, t and f values will be different from one another.
Remark 8 The choice of z = 1/2 is, of course, entirely arbitrary, but it gets visual support in 2 ways:
The colored contour bands of Section 4.2 look very similar to the Apdex Rating bands seen in Figure 1, but they are not identical because I simply let Mathematica color the contours of the sloping triangle in xyz arbitrarily, without any reference to the official designations in Table 1. Making the results of Sections 4.1 and 4.2 consistent produces the Barry3 representation of Example 1 seen in Figure 10. The animated version for all 30 days of time series data is shown in Figure 1.
Figure 10: Static Barry3 representation of the combined Apdex data from Example 1 for day 26. Each triangle vertex is labeled by the corresponding normalized Apdex Counts: s, t and f (Definition 6). The Apdex profile measured from each of the 5 cities is shown as 5 colored dots corresponding to their respective s, t and f values. The Apdex Ratings (Definition 5) appear as colored diagonal bands and the Apdex Index (Definition 7) ranges are shown on the right edge of the Barry3 triangle.
It is important to keep in mind the following distinction between the representations of the Apdex Counts and the Apdex Index in Barry3.
Apdex Counts in Barry3:
The barycentric coordinates (b1, b2, b3) in Table 3 are identical to the normalized Apdex Counts: (s, t, f). Their respective numerical values determine exactly where the point representing a given application is positioned inside the Barry3 triangle. For example, the black dot in Figure 10 represents the response-time profile of the web-based application measured from the California (CA) location on day 26.
Apdex Index in Barry3:
The Apdex Index, however, is defined independently of these barycentric coordinates. We discussed three possible definitions of AT in Section 4.2, viz., eqns.(8), (9) and (10). The official Apdex Index is given by eqn.(10) and its possible numerical values are shown on the right-hand edge of the Barry3 triangle in Figure 10. For example, the black dot representing the California (CA) profile lies almost midway inside the red ("Poor") band: 0.50 < A4 < 0.70, suggesting an approximate value of A4 ≈ 0.60. The exact value, from Example 9, is A4 = 0.58.
Remark 9 [Multiplicity] Recalling the discussion in Section 3.1, Figure 10 provides more insight into the limitations of reporting a single number. Imagine that the black CA dot in Figure 10 were actually sitting down on the lower edge of the red ("Poor") band. That edge also corresponds to the dashed red isocline in Table 3. Reading off from the right-hand side of the Barry3 triangle, we see that the corresponding Apdex Index would be AT = 0.5. That same Apdex value would also apply to the CA dot if it were on that same isocline but positioned at the lower-left Barry3 vertex. There: s = 0, t = 1 and from Definition 7, AT = 0 + 1/2. Similarly, if the CA dot were on that same isocline but positioned at the right-hand edge of the Barry3 triangle: s = 1/2, t = 0 and AT = 1/2 + 0. Moreover, there are an infinite number of positions in between where AT = 0.5 on that same isocline. Just knowing that AT = 0.5 doesn't tell you where on that isocline the dot is located. On the other hand, if you're looking at the Barry3 representation, both the Apdex value and its composition are immediately clear. You also have the option of paying attention to only one or the other aspect.
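The multiplicity described in Remark 9 is easy to verify numerically. The profiles below are illustrative points I have chosen on the AT = 0.5 isocline; any convex combination of the two endpoint profiles would serve equally well.

```python
# Several different Apdex profiles that all lie on the A_T = 0.5 isocline:
# the single-number index cannot distinguish between them.
profiles = [
    (0.0, 1.0, 0.0),     # all Tolerating: the lower-left vertex
    (1/3, 1/3, 1/3),     # the centroid of the triangle
    (0.5, 0.0, 0.5),     # half Satisfied, half Frustrated: right-hand edge
    (0.25, 0.5, 0.25),   # a point midway along the isocline
]
for s, t, f in profiles:
    a = s + t / 2                      # the official index, eqn.(10)
    assert abs(a - 0.5) < 1e-12
print("all profiles give A_T = 0.5")
```

The (s, t, f) triples differ completely, yet every one of them reports the identical index value.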
As with all visual representations, Barry3 has both strengths and weaknesses. No one size fits all and this is especially true when it comes to performance visualization tools. For example, many males are color-blind to certain shades of the color green.
The ability to view the dots moving inside Figure 1 is very useful for gaining an understanding of trends, transients or periodicity in the data under observation. One sees, even in Figure 10, that the city dots can sometimes overlap one another. A real performance management tool would need to allow a user to disambiguate such visual overlaps.
There are also many possible extensions of the Barry3 approach described here. Without going too far afield, I'll just hint at one possibility. Rather than displaying Apdex time-series data as a 2-D animation, a user might prefer to view it as an explicit time-series in 3-D, still using the barycentric coordinates.
Figure 11: Toblerone view of Barry3 data for 3 Apdex Profiles shown as red, green and blue trajectories moving left to right inside the transparent triangular prism. The length of the prism represents the time-window of interest.
Now, each 2-D Barry3 frame (like those in Figure 1) is joined like the segments of a Toblerone chocolate bar or, more technically, a triangular prism. Each of the dots could be connected by a line segment between the frames to create a set of visible trajectories within the transparent volume of the prism (Figure 11). The benefit would be that the user could interactively swivel the transparent prism to see the complete history of each Apdex profile during some time-window of interest.
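As a sketch of how such Toblerone data might be assembled, the Python fragment below converts a time series of (s, t, f) profiles into 3-D points (x, y, day) along the prism's long axis. The three-day profiles are hypothetical, and the actual rendering (swiveling the prism, connecting the dots) is left to whatever graphics front end is available.

```python
import math

BASE = 2 / math.sqrt(3)                  # base of a unit-height equilateral triangle
VERTS = {"s": (BASE / 2, 1.0), "t": (0.0, 0.0), "f": (BASE, 0.0)}

def prism_trajectory(daily_profiles):
    """Turn a time series of normalized (s, t, f) profiles into 3-D
    points (x, y, day): each day is one Barry3 frame stacked along
    the long axis of the 'Toblerone' prism."""
    points = []
    for day, (s, t, f) in enumerate(daily_profiles, start=1):
        x = s * VERTS["s"][0] + t * VERTS["t"][0] + f * VERTS["f"][0]
        y = s * VERTS["s"][1] + t * VERTS["t"][1] + f * VERTS["f"][1]
        points.append((x, y, day))
    return points

# A hypothetical 3-day history for a single measurement location.
traj = prism_trajectory([(0.42, 0.31, 0.27), (0.50, 0.30, 0.20), (0.60, 0.25, 0.15)])
print(traj)
```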
All the examples shown here are prototypes developed in Mathematica. Any performance management tools based on these ideas would need certain enhancements to allow a user to customize the interface to provide them with the best cognitive impedance match [GunJ07].
Normalizing the three Apdex buckets according to eqn.(5) simplifies the expression for the Apdex Index in eqn.(6). Moreover, since the triple of numbers (s, t, f) obeys the sum rule in eqn.(7), it is a natural fit for plotting in a barycentric coordinate system; the Barry3 triangle. The (s, t, f) values determine the respective lengths of the barycentric arrows and thereby set the location of the Apdex response-time profile inside the triangle. The significance of the Apdex Index can be compared with the Apdex Zones shown as diagonal colored bands in the Barry3 triangle. The numerical Apdex Rating can be read off from the right-hand edge of the Barry3 triangle as shown in Figure 1.
Based on the preceding discussion, we can now make contact with the related work of other authors.
Akers [Aker08] has proposed that the standard Apdex Index be refined to be more representative of the distribution of counts within the Tolerating Apdex Zone by partitioning it into n sub-zones. An example with the Tolerating Zone of Figure 5 partitioned into n = 4 sub-zones is shown in Figure 12. The corresponding counts can be labeled with subscripts: Ct1, Ct2, Ct3, Ct4. In terms of the normalized sub-zone counts ti (defined analogously to eqn.(5)), the refined index for n = 4 is:
A*T = s + (4/5) t1 + (3/5) t2 + (2/5) t3 + (1/5) t4        (11)
Figure 12: Tolerating Zone of Figure 5 partitioned into four sub-zones (pink, blue, green and red).
Example 11 Assume that the Apdex Count distribution is such that s = f = 0 and t = 1, when T = 4. The standard Apdex Index will produce:
A4 = s + (1/2) t = 0 + (1/2)(1) = 0.50        (12)
Example 12 Now suppose that the normalized Tolerating counts (t) are distributed in such a way that they all fall entirely into the t1 sub-zone; the pink column in Figure 12. Then t1 = 1. Since s = f = 0 and t2 = t3 = t4 = 0, eqn.(11) produces:
A*4 = 0 + (4/5)(1) + (3/5)(0) + (2/5)(0) + (1/5)(0) = 0.80        (13)
Since the contributing counts in t1 are close to the Satisfied Zone, A*4 produces a higher index value than the standard Apdex Index in eqn.(12).
Example 13 Contrast example 12 with the case where all the Tolerating counts fall into the t4 sub-zone; the red column in Figure 12. Now, t4 = 1 and everything else is zero, so eqn.(11) produces a different value:
A*4 = (1/5)(1) = 0.20        (14)
The lower value of A*4, relative to A4, arises from the Tolerating counts now being on the right hand side of the standard Tolerating Zone and closer to the (less desirable) Frustrated Zone.
This approach works with any number of sub-zones. Here's why. The denominator 5 is common to each fraction in eqn.(11). Therefore, when added together, the numerators become: 4 + 3 + 2 + 1, which is simply an arithmetic series. In general, if there are n sub-zones, the sum Sn of an arithmetic series with n terms (starting at 1) is given by
Sn = n(n + 1)/2        (15)
With n = 4, we see that S4 = 10.
That's the case for uneven counts in each sub-zone, and it works because the sub-zones act like an arithmetic series. What happens in the special case, where the count distribution is uniform, i.e., equally apportioned across each sub-zone? For a uniform distribution in the Tolerating Zone (let's call it, A*U), eqn.(11) becomes identical to the standard Apdex Index. Let's check that out.
Example 14 For T = 4 with s = f = 0, if t1 = t2 = t3 = t4 = 1/4 then eqn.(11) reduces to:
A*U = (1/4)(4/5 + 3/5 + 2/5 + 1/5) = S4/20 = 0.50        (16)
where we have used the arithmetic sum S4 = 10. Clearly, it is identical to the value we would get from eqn.(12).
More generally, for a uniform distribution of counts across n Tolerating sub-zones, eqn.(16) becomes:
A*U = s + [Sn / (n(n + 1))] t = s + (1/2) t = AT        (17)
In other words, when the n partitions in the Akers index are uniformly distributed, it is identical to the standard Apdex Index.
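Under the weight assignment implied by eqn.(11), i.e., the common denominator n + 1 noted above, the whole argument can be checked mechanically. The function name akers_index below is my own illustrative choice.

```python
from fractions import Fraction

def akers_index(s, t_subs):
    """A sketch of Akers' refined index, eqn.(11): the n Tolerating
    sub-zone counts get linearly decreasing weights n/(n+1), ..., 1/(n+1);
    Satisfied counts keep weight 1 and Frustrated counts weight 0."""
    n = len(t_subs)
    return s + sum(Fraction(n - i, n + 1) * ti for i, ti in enumerate(t_subs))

# Example 12: all Tolerating counts in sub-zone t1 (n = 4).
print(akers_index(0, [1, 0, 0, 0]))     # 4/5, i.e., 0.80
# Example 13: all Tolerating counts in sub-zone t4.
print(akers_index(0, [0, 0, 0, 1]))     # 1/5, i.e., 0.20
# A uniform distribution reduces to the standard index s + t/2 for any n.
for n in range(1, 10):
    assert akers_index(0, [Fraction(1, n)] * n) == Fraction(1, 2)
```

Exact rational arithmetic via Fraction makes the equality with the standard index an identity rather than a numerical coincidence.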
We can also see why this works visually, with ... guess what? Triangles. Figure 13 shows the Tolerating Zone with s = f = 0 and t = 1. Superimposed on this diagram are both the standard Apdex triangle discussed in Section 3.2 and the proportioned sub-zones defined in eqn.(11).
Figure 13: Modified Tolerating profile when the Apdex Counts (Cs = 0, Ct = 100, Cf = 0) are equally distributed across each of the 4 Tolerating sub-zones. Their combined area is equivalent to the standard Apdex triangle presented in Section 3.2.
The result in eqn.(16) states that the combined area of the colored sub-zones is equal to the area of the dark triangle. At first glance this does not appear to be true because there are colored portions ("overhangs") that extend outside the dark triangle. However, on closer inspection we see that the colored excess portions are little mini-triangles of slightly different size, each of which can be precisely slotted into a corresponding dark mini-triangle in another column symmetric about the vertical centerline. For example, the red excess mini-triangle in the extreme lower right of the diagram slots exactly into the dark mini-triangle at the upper left.
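The slotting argument can be cross-checked arithmetically. Assuming each of the n uniform sub-zone columns has width 1/n of the normalized Tolerating Zone and height equal to its eqn.(11) weight, the column areas sum to exactly the area of the dark unit triangle:

```python
from fractions import Fraction

n = 4                                   # number of Tolerating sub-zones
width = Fraction(1, n)                  # each column spans 1/n of the zone
heights = [Fraction(n + 1 - i, n + 1) for i in range(1, n + 1)]   # 4/5, 3/5, 2/5, 1/5

columns_area = sum(width * h for h in heights)
triangle_area = Fraction(1, 2)          # (1/2) * base * height with base 1, height 1

assert columns_area == triangle_area    # the overhangs slot exactly into the gaps
print(columns_area)                     # 1/2
```

The same identity holds for any n, since the column areas sum to Sn/(n(n + 1)) = 1/2.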
Finally, I note in passing that although these additional sub-zones also satisfy a sum rule like eqn.(7) (except with 6 metrics now), they cannot be rendered visually in Barry3. Rather, it would require a Barry6 geometry and it's not clear how to display such a 5D object.
In my experience, the data in Figure 2 are typical of application response times and, quite probably, this is the kind of profile that the Apdex Alliance had in mind when constructing the Apdex Index. This empirical fact unfortunately diminishes Ding's analysis [DinA08,DinB08], which assumes an Exponential response-time distribution belonging to an M/M/1 queue.
That simple distribution only holds for a single resource/server, whereas any real application (especially web applications) involves visits to multiple resources, each of which could have its own exponential residence-time distribution. A more appropriate analytic distribution, therefore, would be a sum of exponentials, which is equivalent to a Gamma distribution. The 1-parameter Exponential distribution is a special case of the 2-parameter Gamma distribution. A Gamma distribution could be shaped to data like that in Figure 2 (see e.g., Section 1.5 of [GCaP07]), but unfortunately, that would also make the corresponding mathematical analysis [DinB08] less tractable. See also Remark 3.
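To see how such a profile might arise, here is an illustrative simulation: response times drawn from a Gamma distribution (a sum of exponential service stages) and bucketed into the three Apdex zones. The thresholds T = 4 and F = 4T follow the usual Apdex convention; the shape and scale values are arbitrary choices of mine, not fitted to any data in this article.

```python
import random

random.seed(2009)
T, F = 4.0, 16.0    # seconds; F = 4T per the Apdex convention

# A response time accumulated across 3 exponential service stages
# is Gamma-distributed with shape 3 (the scale 1.2 is arbitrary).
samples = [random.gammavariate(3, 1.2) for _ in range(10_000)]

cs = sum(x <= T for x in samples)         # Satisfied counts
ct = sum(T < x <= F for x in samples)     # Tolerating counts
cf = sum(x > F for x in samples)          # Frustrated counts

a_t = (cs + ct / 2) / (cs + ct + cf)      # the Apdex Index, eqn.(1)
print(cs, ct, cf, round(a_t, 2))
```

With these parameters most samples land in the Satisfied Zone and almost none in the Frustrated Zone, which is qualitatively the kind of profile seen in Figure 2.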
The Apdex Alliance has focused on defining and promoting the Apdex Index as a single performance indicator for executive management. The steps for producing and ranking the Apdex Index are shown in Figure 4. Although there is nothing new or wrong about that objective, it has not stopped others [GunJ07,Aker08,DinA08,DinB08] from pulling the Apdex definitions apart to see how they work, as well as thinking about other representations and applications. This article constitutes yet another example, and let me emphasize again that it has not been my goal here to try to improve upon the Apdex definitions in any way. That's a different topic.
The situation reminds me of what happened when Apple brought out the Macintosh computer in 1984. Steve Jobs insisted that the Mac was so simple to use (compared with the then IBM PC), that it should be regarded as an appliance. Like a kitchen toaster, you buy it, plug it in and it just works. You never need to think about how it works or modifying it. Despite the fact that Steve Jobs is still a marketing genius, he was utterly wrong on this point. As soon as the Mac hit the streets, tech-heads immediately pulled it apart and discovered new applications that Jobs never even dreamed of.
As I've tried to demonstrate in this article, the Apdex Index has many more interesting aspects than one might imagine based on its simple definitions and all of them can be easily understood in terms of triangles:
In Section 3.2, we showed that the Apdex Index (AT) is best regarded as a figure of merit defined by a ratio of areas in the sense of Monte Carlo sampling. The numerator of the Apdex Index represents a kind of idealized profile that response time samples are expected to possess. In the best case, they should all fall in the Satisfied Zone. See Figure 6(A), for example. Any samples in the Tolerating Zone should, ideally, decrease monotonically to zero at F; the start of the Frustrated Zone. See Figure 6(D), for example. The total number of measured response-time samples is compared with this idealized profile to produce the AT value.
Theorem 1 allows us to summarize all this as:
The Apdex Index is a figure of merit which scores the percentage of application response-times that conform to its adaptive areal profile.
Given that the current economic climate has largely been brought about by executive management decisions, it's hard to see how they deserve much respect, but that doesn't mean they're stupid. Even an executive manager looking at a single Apdex Index that is Poor or Unacceptable is likely to ask "Why?" This question may take on added significance if the Apdex Index was Good yesterday. You can't address that question with a single number. However, the triple of numbers (s, t, f), defined in Section 4, is also available. After all, they are used to construct the Apdex Index in the first place, and those data can be displayed together in a small visual area like Figure 1. You can see it all at once! [GunJ07]
The alternative figure of merit [Aker08], reviewed in Section 5.1, is certainly worthy of consideration by the Apdex Alliance. As with all performance metrics, however, it involves trade-offs. Its author notes that eqn.(11) is more complicated than eqn.(6). In my view, only moderately so, and I would expect any additional calculation overhead to be entirely negligible (if done right). The more important question is, does the additional complexity in the Akers extended definition provide better information regarding application response times?
Whether or not the Apdex Alliance ultimately incorporates any of the insights discussed in this article, by taking a more inclusive approach to the structural information already contained in the Apdex Index, the Alliance might reach a broader audience and thereby avoid the mistake that Steve Jobs made in 1984.
A. Akers, "Is Apdex Sharp Enough?," Session 358, Apdex Symposium, CMG 2008
(Available as http://www.apdex.org/documents/Session358.1Ackers.pdf)
Y. Ding, "An Analytical Model for Apdex," Session 358, Apdex Symposium, CMG 2008
(Available as http://www.apdex.org/documents/Session358.2Ding.pdf)
Y. Ding, "An Analytical Model for Application Performance Index," Proc. CMG Conference, 2008
N. J. Gunther and M. F. Jauvin, "Triangulating the Apdex Metric with Barry3," Session 45A, Apdex Symposium, CMG 2007
(Available as http://www.apdex.org/documents/45APerformanceDynamics.pdf)
N. J. Gunther, "On the Application of Barycentric Coordinates to the Prompt and Visually Efficient Display of Multiprocessor Performance Data," in Proceedings of Sixth International Conference on Modeling Techniques and Tools for Computer Performance Evaluation, eds. R. Pooley and J. Hillston, 67-80, September, Edinburgh, Scotland, Antony Rowe Ltd., Wiltshire, U.K., 1992
N. J. Gunther and M. F. Jauvin, "Seeing It All at Once with Barry," Proc. CMG Conference, 2007
N. J. Gunther, Analyzing Computer System Performance with Perl::PDQ, Springer-Verlag, 2005
T. Põder and N. J. Gunther, "Multidimensional Visualization of Oracle Performance Using Barry007," Proc. CMG Conference, 2008
S. K. Card, G. G. Robertson, and J. D. Mackinlay, "The Information Visualizer: An Information Workspace," Proc. ACM CHI'91 Conf. (New Orleans, LA, 28 April-2 May), 181-188, 1991