Barry3, Apdex and R:

April, 2009
by Stephen O'Connell

After reading Neil Gunther's article last month on the Apdex Index, which applies the principles of barycentric ("Barry3") plotting [1], I became interested in this approach for a couple of reasons:

  1. The Apdex index provides a way to normalize transaction response time data into a single number.  Neil points out the problems with this normalization, so I won't dwell on it here.
  2. The Barry3 approach to presenting the index provides a good visual for displaying the consolidated information, especially when animated over time.

I am working on a project where we are collecting response time data from scripts running on virtual desktops (VDI) hosted in a VMware infrastructure.  We are currently applying a pass/fail threshold to this data: either the script finished under the threshold and passed, or it failed.  In the near future we will be using these scripts in a load test of a new VDI infrastructure.  I have been considering different approaches to using the scripts and the resulting response time data to assess the new infrastructure.

I saw a possible application of the barry3 apdex approach in presenting the resulting performance data. 
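To make the contrast with a single pass/fail threshold concrete, here is a minimal sketch of how raw response times could be reduced to the three Apdex fractions.  It assumes the usual Apdex convention (satisfied at or below the target time, tolerating up to four times the target, frustrated beyond that); the function name to_stf and the sample timings are hypothetical and are not part of the source files for this article.

```r
# Hypothetical helper (not part of the article's source files):
# classify response times into the three Apdex zones and return the
# fractions s (satisfied), t (tolerating), f (frustrated).
# Assumes the usual Apdex convention: s <= threshold < t <= 4*threshold < f.
to_stf <- function(rt, threshold) {
  stopifnot(is.numeric(rt), length(rt) > 0, threshold > 0)
  zone <- cut(rt,
              breaks = c(-Inf, threshold, 4 * threshold, Inf),
              labels = c("s", "t", "f"))
  counts <- tabulate(as.integer(zone), nbins = 3)
  setNames(counts / length(rt), c("s", "t", "f"))
}

# Made-up script timings in seconds, with a 2 second target
rt <- c(0.8, 1.2, 2.5, 3.9, 4.0, 9.0, 1.9, 0.5)
stf <- to_stf(rt, threshold = 2)
print(stf)
```

The three fractions always sum to 1, which is what allows each measurement set to be drawn as a single point inside a triangle.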

The key was finding a way to create the visuals from my own data.  Neil has implemented his techniques in Mathematica, but I have neither access to that tool nor the time to learn it, and the suite of tools we already have does not include this kind of analysis.  So I decided to create my own implementation using R.

Since being introduced to R by Jim Holtman at Neil Gunther's Guerrilla Data Analysis Techniques class (http://www.perfdynamics.com/Classes/Outlines/gdata.html) this past summer, I have been using R almost exclusively to analyze and report performance and capacity information.  I have found it a very powerful language and tool for quickly analyzing and presenting data.  For CMG members, Jim Holtman has also written an excellent introduction to R and its use in the analysis of system performance data [2].

One of the very powerful things about R is that just about anything you can think of has already been done in R, or almost done.  In this case there was no "apdex" charting tool, but there were a couple of ternary plotting functions.  Another really nice thing about R is that when you find a function you like, the source is available to you.  In this case I found a ternary plotting function, ternaryplot, that is part of the vcd package [3].  This function was pretty close to what I wanted.

Over the course of a couple of evenings I was able to modify that function, adding the Apdex bands, annotations, and data structure, to produce the barry3.apdex function.  Neil was kind enough to share the data used in his demo, so I was able to recreate his results.

The barry3.apdex function produces a series of charts from input data that has been normalized to s, t, f (the satisfied, tolerating, and frustrated fractions).

I have included the data and source files in this article.  The source and data files are also available for download at:

[barry3_apdex_files.zip]

TESTING BARRY3.APDEX:

  1. If you have not already installed the vcd package, install it; the demo will not work without it.
  2. Create a test data file from the data listed under "TEST DATA FILE" and save it as c:\temp\apdex_data.csv.
  3. Open an R GUI session.
  4. Copy and paste all of SOURCE 1 into the GUI session.  This loads the barry3.apdex function and then executes a loop that generates an apdex chart for each day of data in the test data file.
  5. Copy and paste SOURCE 2 into the GUI session.  This script loops over the test data and produces a jpeg file for each chart, which is useful for creating a standalone time-series animation of the apdex charts.

TEST DATA FILE:

The following is some test data you can put in c:\temp\apdex_data.csv:

day,state,s,t,f
1,CA,0.641,0.297,0.062
1,CO,0.977,0.023,0.000
1,FL,0.933,0.059,0.007
1,MN,0.930,0.070,0.000
1,NY,0.953,0.047,0.000
2,CA,0.691,0.237,0.072
2,CO,0.954,0.046,0.000
2,FL,0.874,0.119,0.007
2,MN,0.962,0.038,0.000
2,NY,0.923,0.077,0.000
3,CA,0.777,0.191,0.032
3,CO,0.954,0.046,0.000
3,FL,0.949,0.045,0.006
3,MN,0.980,0.020,0.000
3,NY,0.902,0.092,0.007
4,CA,0.717,0.237,0.046
4,CO,0.905,0.095,0.000
4,FL,0.916,0.071,0.013
4,MN,0.916,0.065,0.019
4,NY,0.934,0.066,0.000
5,CA,0.703,0.234,0.063
5,CO,0.961,0.039,0.000
5,FL,0.955,0.045,0.000
5,MN,0.974,0.026,0.000
5,NY,0.911,0.089,0.000
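As a cross-check on data in this format, the Apdex score can be computed directly from the normalized columns as s + t/2 and mapped to a rating band.  The sketch below uses the Day 1 rows from the test data; the helper names apdex_score and apdex_rating are hypothetical, and the band edges (0.50, 0.70, 0.85, 0.94) are the ones drawn on the chart, with "Unacceptable" assumed as the name for the unlabeled region below 0.50.

```r
# Hypothetical helpers (not part of the article's source files).
# Apdex score from the satisfied and tolerating fractions.
apdex_score <- function(s, t) s + t / 2

# Map a score to a rating band; edges match the bands drawn on the chart.
apdex_rating <- function(score) {
  cut(score,
      breaks = c(0, 0.50, 0.70, 0.85, 0.94, 1),
      labels = c("Unacceptable", "Poor", "Fair", "Good", "Excellent"),
      right = FALSE, include.lowest = TRUE)
}

# Day 1 rows copied from the test data file
day1 <- data.frame(state = c("CA", "CO", "FL", "MN", "NY"),
                   s = c(0.641, 0.977, 0.933, 0.930, 0.953),
                   t = c(0.297, 0.023, 0.059, 0.070, 0.047))
day1$apdex  <- apdex_score(day1$s, day1$t)
day1$rating <- apdex_rating(day1$apdex)
print(day1)
```

On Day 1, CA scores 0.7895 and lands in the Fair band, while the other four states score above 0.94.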

SOURCE 1:

The following is the barry3.apdex function, together with code to read the test data file and generate the apdex charts from it:

#--------------------------- START SOURCE 1 ----------------------------------------
#-------------------------------------------------------------------------------------
# barry3.apdex:
#   This function was created by trimming down the functionality of
#   ternaryplot which is part of the vcd package.
#   The specific apdex bands and labeling were added, and the orientation
#   of columns was also changed to support s,t,f as the input
#
#-------------------------------------------------------------------------------------
require("vcd")

barry3.apdex <- function (x, scale = 1, dimnames = NULL,
    dimnames_position = "corner", dimnames_color = "black",
    border = "black", labels = "none", bg = "white", pch = 19, cex = 1,
    prop_size = FALSE, col = "red", main = "Barry3 Apdex Plot",
    newpage = TRUE, pop = TRUE, center_label = "none", ...)
{
    dimnames_position <- match.arg(dimnames_position)
    if (is.null(dimnames) && dimnames_position != "none")
        dimnames <- colnames(x)
    if (is.logical(prop_size) && prop_size)
        prop_size <- 3

    # Validate the input and normalize each row so that s + t + f = 1
    if (ncol(x) != 3)
        stop("Need a matrix with 3 columns")
    if (any(x < 0))
        stop("X must be non-negative")
    s <- rowSums(x)
    if (any(s <= 0))
        stop("each row of X must have a positive sum")
    x <- x/s

    # Set up the plotting region and draw the outer triangle
    top <- sqrt(3)/2
    if (newpage)
        grid.newpage()
    xlim <- c(-0.03, 1.03)
    ylim <- c(-1, top)
    pushViewport(viewport(width = unit(1, "snpc")))
    if (!is.null(main))
        grid.text(main, y = 0.9, gp = gpar(fontsize = 18, fontstyle = 1))
    pushViewport(viewport(width = 0.8, height = 0.8, xscale = xlim,
        yscale = ylim))
    eps <- 0.01
    grid.polygon(c(0, 0.5, 1), c(0, top, 0), gp = gpar(fill = bg,
        col = border), ...)

    # Corner labels: bottom-left, bottom-right, top
    if (dimnames_position == "corner") {
        grid.text(x = c(0, 1, 0.5), y = c(-0.02, -0.02, top + 0.02),
            label = dimnames, gp = gpar(col = dimnames_color, fontsize=15,
            fontface='bold'))
    }

    #TEXT gpar()
    gpT <- gpar(col = 'black', fontsize=10, fontface="bold")

    #p0: region below an Apdex score of .50
    grid.polygon(x = c(0,1,.75,0), y = c(0,0,.4330127,0), gp = gpar(fill = 'grey'))
    grid.text(".50", x=.78,y=.4330127, just="centre", gp = gpT)
    grid.text("Poor", x=.78,y=.5195, just="right", gp = gpT)

    #p1: .50 to .70
    grid.polygon(x = c(0,.75,.65,.2), y = c(0,.4330127,.6062178,.3464102), gp = gpar(fill = 'red'))
    grid.text(".70", x=.68,y=.6062178, just="centre", gp = gpT)
    grid.text("Fair", x=.68,y=.671, just="right", gp = gpT)

    #p2: .70 to .85
    grid.polygon(x = c(.2,.65,.575,.35), y = c(.3464102,.6062178,.7361216,.6062178), gp = gpar(fill = 'yellow'))
    grid.text(".85", x=.6,y=.7361216, just="centre", gp = gpT)
    grid.text("Good", x=.63,y=.775, just="right", gp = gpT)

    #p3: .85 to .94
    grid.polygon(x = c(.35,.575,.53,.44), y = c(.6062178,.7361216,.8140639,.7621024), gp = gpar(fill = 'green'))
    grid.text(".94", x=.56,y=.8140639, just="centre", gp = gpT)
    grid.text("Excellent", x=.58,y=.8455, just="centre", gp = gpT)

    #p4: .94 to 1.00
    grid.polygon(x = c(.44,.53,.5,.44), y = c(.7621024,.8140639,.8660254,.7621024), gp = gpar(fill = 'blue'))

    # REFERENCE
    if (center_label != "none") {
        grid.text(center_label, x=.5, y=.15, just="centre", gp = gpar(col = 'white', fontsize=20))
    }

    # Map (s,t,f) to chart coordinates: s at the top, t bottom-left, f bottom-right
    xp <- x[, 3] + x[, 1]/2
    yp <- x[, 1] * top
    size = unit(if (prop_size)
        prop_size * (s/max(s))
    else cex, "lines")
    grid.points(xp, yp, pch = pch, gp = gpar(col = col), default.units = "snpc",
        size = size, ...)

    if (pop)
        popViewport(2)
    else upViewport(2)
}

##
# SAMPLE CODE TO READ SOME APDEX TEST DATA AND PASS IT THROUGH THE barry3.apdex FUNCTION
##
apdex <- read.csv("c:/Temp/apdex_data.csv")
# R 4.0 and later read strings as character, so convert explicitly;
# the color lookup and the legend below rely on factor levels
apdex$state <- factor(apdex$state)
colors <- c("black","magenta","yellowgreen","darkorange","purple")

for (d in 1:max(apdex$day))
{
    Index <- apdex$day == d
    x <- cbind(apdex$s[Index], apdex$t[Index], apdex$f[Index])

    barry3.apdex (
        x,
        pch = 20,
        col = colors[as.numeric(apdex$state[Index])],
        main = "Apdex Ratings - barry3.apdex",
        labels="none",
        center_label=paste("Day ", apdex$day[Index][1], sep=""),
        dimnames=c('t','f','s'),
        dimnames_position="corner",
        bg="lightgrey"
    )

    pch <- c(20,20,20,20,20)
    grid_legend(0.1, 0.70, pch, colors, levels(apdex$state[Index]), title = "State")
    Sys.sleep(1)
}
#--------------------------- END SOURCE 1 ------------------------------------------
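The hard-coded polygon coordinates in barry3.apdex can be derived rather than taken on faith.  The sketch below reuses the same point mapping as the function (x = f + s/2, y = s * sqrt(3)/2) and computes where a line of constant Apdex score A = s + t/2 meets the left edge (f = 0) and the right edge (t = 0) of the triangle.  The helper names stf_to_xy and band_ends are hypothetical, and the derivation only covers scores of 0.5 and above.

```r
top <- sqrt(3) / 2

# Same mapping barry3.apdex uses for its points:
# s at the apex, t at the bottom-left corner, f at the bottom-right corner.
stf_to_xy <- function(s, t, f) c(x = f + s / 2, y = s * top)

# Hypothetical helper: end points of the contour of constant Apdex
# score A (0.5 <= A <= 1) on the two sloping edges of the triangle.
band_ends <- function(A) {
  stopifnot(A >= 0.5, A <= 1)
  # Left edge: f = 0, s + t = 1 and s + t/2 = A  =>  s = 2A - 1, t = 2 - 2A
  left  <- stf_to_xy(s = 2 * A - 1, t = 2 - 2 * A, f = 0)
  # Right edge: t = 0, so s = A and f = 1 - A
  right <- stf_to_xy(s = A, t = 0, f = 1 - A)
  rbind(left = left, right = right)
}

ends85 <- band_ends(0.85)
print(round(ends85, 7))
```

For A = 0.85 this gives the points (.35, .6062178) and (.575, .7361216), which are the vertices shared by the yellow and green polygons in the function; the same check can be repeated for .70 and .94.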

SOURCE 2:

With a couple of small changes to the script, you can create a series of jpeg files.  I used a graphics tool to produce a Shockwave Flash file from the individual jpegs to animate the data.  The Shockwave file can be found in the downloadable source file.

#--------------------------- START SOURCE 2 ----------------------------------------
##
# PRODUCE INDIVIDUAL JPEGS FOR ANIMATION
# (run SOURCE 1 first: this script reuses barry3.apdex and the apdex data frame)
##
colors <- c("black","magenta","yellowgreen","darkorange","purple")

for (d in 1:max(apdex$day))
{
    Index <- apdex$day == d

    # open a jpeg device for this day's chart
    jpegOut <- paste('c:/Temp/Day', apdex$day[Index][1], '.jpg', sep="")
    jpeg(filename=jpegOut, width=8, height=8, units='in', res=72)

    x <- cbind(apdex$s[Index], apdex$t[Index], apdex$f[Index])

    barry3.apdex (
        x,
        pch = 20,
        col = colors[as.numeric(apdex$state[Index])],
        main = "Apdex Ratings - barry3.apdex",
        labels="none",
        center_label=paste("Day ", apdex$day[Index][1], sep=""),
        dimnames=c('t','f','s'),
        dimnames_position="corner",
        bg="lightgrey"
    )

    pch <- c(20,20,20,20,20)
    grid_legend(0.1, 0.70, pch, colors, levels(apdex$state[Index]), title = "State")

    # close the device so the file is written
    dev.off()
}
#--------------------------- END SOURCE 2 ------------------------------------------

References:

[1] Neil Gunther, The Apdex Index Revealed, CMG MeasureIT, February 2009
(http://www.cmg.org/measureit/issues/mit56/m_56_15.html)

[2] Jim Holtman, "The Use of R for System Performance Analysis",
Proc. CMG Conference December, 2004 (Download requires CMG registration)

[3] David Meyer, Achim Zeileis, Kurt Hornik, vcd: Visualizing Categorical Data
(http://cran.r-project.org/web/packages/vcd/index.html)