MeasureIT

Guerrilla Capacity Planning
PART II: Weapons of Mass Instruction¹
June 1, 2003
by Neil J. Gunther

About the Author
Neil J. Gunther, Performance Dynamics Consulting (SM)

Neil J. Gunther, M.Sc., Ph.D., SMIEEE, is an internationally recognized IT researcher and author who founded Performance Dynamics Company (www.perfdynamics.com) in 1994. He is well-known to CMG audiences for his conference presentations since 1993 and his popular articles in CMG MeasureIT. Dr. Gunther was awarded Best Technical Paper at CMG'96 and received the A.A. Michelson Award at CMG'08.

Prior to founding Performance Dynamics, Dr. Gunther held teaching, research and management positions at San Jose State University, JPL/NASA, Xerox PARC and Pyramid/Siemens Technology. His "Guerrilla Capacity Planning" training classes have been presented worldwide at both corporate and academic institutions including: AOL, Boeing, FedEx, Motorola, Nokia, Stanford Univ. and UCLA. He is a member of AMS, APS, ACM and SPIE. More details are available on his Wiki page.


1  Introduction

This is the second part of my article on Guerrilla Capacity Planning for MeasureIT.

Figure 1: Typical "guerrilla" planner.

The key idea from Part I is tactical planning whereby capacity planning is executed in an opportunistic way such that management schedules are not inflated. It's that last constraint that many performance analysts and planners overlook (possibly at their peril).

Attribute    Traditional     Guerrilla
---------    ------------    -------------
Budget       Big             None
Rank         Titled          No badge
Tools        Big             Tiny
Range        Strategic       Tactical
Approach     Passive         Proactive
Impact       Inflationary    Stationary
Scope        Routine         Opportunistic
Reporting    Expected        Unexpected
Skill set    Narrow          Diversified
Focus        Hardware        Applications

Table 1: Comparison of Old and New.

Table 1 summarizes ten ways in which guerrilla capacity planning differs from traditional capacity planning. As we indicated in Part I, tools alone do not a guerrilla planner make. In fact, the tools can be quite easy on your manager's budget (see Table 2).

Weapon             Description
---------------    ------------------------------------------------------------
CSIM               Moderately priced simulation package.
EXCEL              It's not just a spreadsheet; it's a programming environment!
Mathematica        Commercial symbolic computation environment.
MatLab             Commercial computation package.
Minitab            Commercial statistical package; a big step above EXCEL.
Net-SNMP           Free-ware data acquisition using SNMP MIBs.
Octave             Free-ware numerical computation package.
PDQ                "Pretty Damn Quick" open source performance analyzer.
R                  Free-ware version of commercial S+ statistical product.
SPEED              Moderately priced SPE modeling tool.
TeamQuest Model    Commercial data acquisition and modeling tool.

Table 2: Example guerrilla weapons.

Whatever tools you choose, it is critical to your success as a guerrilla planner that their use not inflate management schedules. Elaborate tools tend to take more time to learn and use. Consequently, they run the risk of being inflationary rather than stationary (see Table 1). As well as investing in tools, management needs to invest in human infrastructure: YOU! People do performance analysis, not tools. But management will be disinclined to invest in you as an analyst or planner if you insist on inflating their schedules. Doing so may also shorten your career.

Here, in Part II, we present some examples of Hit-and-Run methods that have been successful for guerrilla capacity planning.

2  Scalability Analysis

As a first example we briefly describe a very simple and quick method for quantitatively determining application scalability. It's noteworthy that scalability, and particularly application scalability, is a perennial hot button that involves notions of performance and planning, yet few people are able to quantify the concept.

Scalability has to do with laws of diminishing returns [Gunther 2000] but it cannot be represented by a single number. Scalability is a function like that shown in Figure 2.

These are actual load test measurements (dots) where the throughput is plotted as the number of script scenarios executed per hour (S) on the y-axis as a function of the number of virtual users (N) generating the load on the x-axis.

Figure 2: Load test measurements and capacity model.

 

Superimposed on these data is the corresponding scaling function predicted by a simple capacity model that does not involve any queueing theory or simulations. This means that sizing the capacity of application servers can be accomplished very quickly using a spreadsheet tool.

The simple capacity model can be written as a formula:

$$S(a, b, N) = \frac{N}{1 + a\,[(N - 1) + b\,N(N - 1)]} \qquad (1)$$

involving just two parameters. It turns out that the parameter a is identified with contention delays, e.g., time spent waiting on a database lock, while the parameter b is associated with additional delays due to pairwise coherency mismatches, e.g., time spent fetching data after a cache miss. The origin of these delays can be in hardware, software or (most likely) a combination of both.
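As a minimal sketch, equation (1) can be transcribed directly into Python. The function name `relative_capacity` and the sample parameter values are illustrative assumptions; they are not taken from the measurements in Figure 2.

```python
def relative_capacity(n, a, b):
    """Equation (1): relative capacity S(a, b, N) at load N.

    a -- contention parameter (e.g., waiting on a database lock)
    b -- coherency parameter (e.g., pairwise cache-miss delays)
    """
    return n / (1.0 + a * ((n - 1) + b * n * (n - 1)))


# Hypothetical parameter values, chosen only for illustration.
A_EXAMPLE, B_EXAMPLE = 0.02, 0.001

for users in (1, 8, 32, 128):
    print(users, round(relative_capacity(users, A_EXAMPLE, B_EXAMPLE), 2))
```

Note that S = 1 at N = 1 for any a and b, so the function expresses capacity relative to a single user, and that a = b = 0 reduces it to ideal linear scaling, S = N.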

Moreover, this function can be entered easily into an EXCEL spreadsheet. If the virtual-user counts reside in column N and the regression parameters a and b reside in cells A1 and B1 respectively, equation (1) can be entered as the following cell formula:

= Nr/(1+A1*((Nr-1)+B1*Nr*(Nr-1)))
(2)

where Nr means the value in the cell at column N and row r. The scalability parameters a and b can now be determined using the linear regression tools built into EXCEL (see for example [Levine et al. 1999]). The basic procedural steps can be summarized as follows:

  1. Measure the throughput as a function of load (N) using tools like WebLoad or LoadRunner.
  2. A sparse data sample (of at least 4 load points) is usually sufficient.
  3. Perform a regression fit in EXCEL to calculate a and b.
  4. Use those values to predict the complete application scalability function using (2).
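The steps above can be sketched in Python as a stand-in for the EXCEL regression; this is one possible route to step 3, not the spreadsheet procedure itself. It relies on rearranging equation (1) as N/S - 1 = a(N - 1) + ab N(N - 1), which is linear in the coefficients a and ab, so ordinary least squares without an intercept applies. The sketch assumes the throughput has been normalized so that S = 1 at N = 1, and the "measurements" are synthetic values generated from known parameters, not real load-test data.

```python
def fit_scalability(loads, capacities):
    """Least-squares estimate of (a, b) in equation (1).

    Uses the rearrangement  N/S - 1 = a*(N-1) + (a*b)*N*(N-1),
    which is linear in the coefficients c1 = a and c2 = a*b.
    Assumes capacities are normalized so that S(1) = 1.
    """
    # Accumulate the 2x2 normal equations for a no-intercept fit.
    s11 = s12 = s22 = t1 = t2 = 0.0
    for n, s in zip(loads, capacities):
        x1, x2 = n - 1, n * (n - 1)
        y = n / s - 1.0
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        t1 += x1 * y
        t2 += x2 * y
    det = s11 * s22 - s12 * s12
    c1 = (t1 * s22 - t2 * s12) / det
    c2 = (t2 * s11 - t1 * s12) / det
    return c1, c2 / c1


def relative_capacity(n, a, b):
    """Equation (1)."""
    return n / (1.0 + a * ((n - 1) + b * n * (n - 1)))


# Synthetic "measurements" generated from known parameters (not real data).
TRUE_A, TRUE_B = 0.03, 0.002
loads = [1, 4, 8, 16, 32]
capacities = [relative_capacity(n, TRUE_A, TRUE_B) for n in loads]

a_fit, b_fit = fit_scalability(loads, capacities)
print(a_fit, b_fit)
```

With noise-free synthetic input the fit recovers the generating parameters. Real measurements carry error, so the fitted curve should always be plotted against the data before it is trusted for prediction.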

An essential feature of this simple approach is that it can predict retrograde throughputs like those measured in Figure 2. Retrograde throughput means that the amount of completed computation decreases as the load on the system increases. This effect cannot be modeled easily using conventional queueing tools (e.g., CSIM or PDQ) without the introduction of more complex load-dependent servers that have to be characterized separately.
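To see where the retrograde region begins, set the derivative of equation (1) with respect to N to zero. This gives a peak at N* = sqrt((1 - a) / (a b)), beyond which the predicted capacity falls. The following sketch checks that closed form against a brute-force scan over integer loads; the parameter values are hypothetical and the helper names are illustrative.

```python
import math


def relative_capacity(n, a, b):
    """Equation (1)."""
    return n / (1.0 + a * ((n - 1) + b * n * (n - 1)))


def peak_load(a, b):
    """Load at which equation (1) is maximal: dS/dN = 0 gives N*^2 = (1 - a) / (a * b)."""
    return math.sqrt((1.0 - a) / (a * b))


# Hypothetical parameters, for illustration only.
a, b = 0.02, 0.001
n_star = peak_load(a, b)

# Brute-force check over integer loads.
best_n = max(range(1, 1001), key=lambda n: relative_capacity(n, a, b))
print(n_star, best_n)
```

When b = 0 there is no peak: the capacity simply levels off toward 1/a, so the retrograde behavior comes entirely from the coherency term.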

2.1  Guerrilla Victory

What are the benefits of this guerrilla sizing methodology? Apart from the avoidance of queueing theory or simulation tools, perhaps the most significant benefit is the fact that it provides a framework against which the consistency of the load measurements can be assessed. The measured data does not come from God, so there is plenty of room for error [Gunther 2002]. If the data does not fit the model in equation (1), there is very likely a problem with the measurement process that may be worth more detailed investigation [Buch and Pent. 2001].

Moreover, because each of the terms in the model has a real physical interpretation, it has been my experience that engineers from disparate groups quickly recognize which parts of the application or platform need further tuning to improve scalability. In this way, scalability can be forecast without inflating your management's release schedule and it can be applied opportunistically when the need arises.

3  Procurement Projections

The same spreadsheet model (1) can also be applied to website traffic analysis where the rapid increase in traffic growth often demands a more tactical approach to capacity planning.

Since the infamous "tech bubble" popped a couple of years ago, the number of hyper-growth websites has been reduced significantly. However, websites such as Amazon.com, eBay.com, and Yahoo.com exhibit continued growth supported by robust business plans. These websites, and others like them, know they need capacity; it's the planning part that is culturally unfamiliar.

A useful metric that I have devised [Gunther 2001] for one of these high-growth websites is the capacity doubling period (see equation (4)). The doubling period is simply the time it takes to consume twice the server capacity that exists now. In some cases this period can be as short as six months! That's about ten times faster than typical data processing centers and four times faster than Moore's Law. Such exponential demand for server capacity can lead to yet another definition of bankruptcy. Even if you are planning to purchase a lot of cheap servers, pretty soon you're talking real money! This forces the need to plan capacity well in advance of the procurement cycle.

In typical guerrilla fashion, the capacity doubling period can be determined with the use of elementary tools like spreadsheets. If, for example, the processor utilization (U) is measured at regularly scheduled intervals, the long-term consumption can be estimated by assuming an exponential trend model:

$$U_{\text{future}} = U_{\text{now}}\, e^{\lambda W} \qquad (3)$$

where $\lambda$ is the growth rate determined by using the Add Trendline facility in EXCEL and $W$ is the number of weeks over which the data is being fitted. The doubling time is then given by

$$T_{\text{double}} = \frac{\ln 2}{\lambda} \qquad (4)$$

The details of this aspect would take us too far afield but the interested reader can learn more in [Dumke et al. 2001].

An exponential growth model was chosen because it is the simplest function that captures the notion of compounded growth. It is also reflective of supra-linear revenue growth models. If you decide to get more involved in statistical models, you might want to consider using more robust tools like Minitab or R (see Table 2).
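For readers without a spreadsheet to hand, the trend fit can be sketched in Python as a stand-in for the Add Trendline step. The exponential model (3) is a straight line when the logarithm of U is plotted against the week number, so the least-squares slope estimates the growth rate and equation (4) converts it to a doubling time. The weekly utilization samples are synthetic, generated to double every 26 weeks, and the function names are illustrative.

```python
import math


def fit_growth_rate(weeks, utilizations):
    """Least-squares slope of ln(U) against week number: the growth rate in equation (3)."""
    logs = [math.log(u) for u in utilizations]
    n = len(weeks)
    mean_w = sum(weeks) / n
    mean_l = sum(logs) / n
    sxy = sum((w - mean_w) * (l - mean_l) for w, l in zip(weeks, logs))
    sxx = sum((w - mean_w) ** 2 for w in weeks)
    return sxy / sxx


def doubling_time(growth_rate):
    """Equation (4): ln(2) divided by the growth rate, in the time units of the fit."""
    return math.log(2) / growth_rate


# Synthetic weekly CPU utilization that doubles every 26 weeks (not measured data).
TRUE_RATE = math.log(2) / 26
weeks = list(range(12))
utilizations = [0.20 * math.exp(TRUE_RATE * w) for w in weeks]

rate = fit_growth_rate(weeks, utilizations)
print(rate, doubling_time(rate))
```

A six-month doubling period therefore corresponds to a growth rate of roughly 0.027 per week.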

3.1  Guerrilla Victory

The final task is to translate this exponential growth model into procurement requirements.

Figure 3: Projected capacity consumption over a 32-week period.

Since the exponential model only pertains to the measurements on the current system, we need a way to extrapolate to other possible system configurations. For that purpose, we can use the scalability function in (1) to generate a number of matching trendlines (see Figure 3) corresponding roughly to the optimistic and pessimistic, errr ... realistic, sides of an envelope inside which the actual performance should be found to reside.

According to Figure 3, if nothing is done then the current capacity is projected to be out of gas (permanently 100% busy) around week eight (upper curves). By upgrading to faster processors, tuning the application, etc., the inevitable can be forestalled for about twenty-odd weeks or about two fiscal quarters (lower curves).
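The "out of gas" week follows from equation (3) by solving for the week at which the projected utilization reaches 100%: the number of weeks is ln(1 / U now) divided by the growth rate. The sketch below uses hypothetical starting utilizations and a hypothetical growth rate; the values are not read from Figure 3.

```python
import math


def weeks_to_saturation(u_now, growth_rate, u_max=1.0):
    """Weeks until utilization reaches u_max under the trend of equation (3)."""
    return math.log(u_max / u_now) / growth_rate


# Hypothetical input: capacity demand doubling every 26 weeks.
rate = math.log(2) / 26

# A server at 80% busy today versus one at 50% busy after an upgrade.
print(weeks_to_saturation(0.80, rate))
print(weeks_to_saturation(0.50, rate))
```

Under this model, halving the current utilization (by tuning or upgrading) buys exactly one doubling period before the servers saturate again.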

At the very least, generating plots like these (quickly and periodically) keeps management aware of where they are (or ought to be) in the procurement cycle.

4  Guerrilla Guidelines

As I hope these examples demonstrate, guerrilla capacity planning does provide an approach to assessing application scalability that matches management's requirement to keep a tight rein on project schedules.

In many shops where there is a tendency to avoid traditional capacity planning, the guerrilla approach can provide management with a simple framework whereby disparate groups can be brought together and unforeseen performance issues revealed. Once revealed, they can then be addressed within the context of current schedules. In this way, guerrilla capacity planning can help to keep projects on schedule and minimize revisions. Think of it as a way of managing hidden time sinks.

It's also a way of replacing risk perceptions with risk management. As I pointed out in Part I, if management is really only looking for a sense of direction then don't bother to provide them with the detailed compass bearing. That only takes more time and will be viewed as inflationary.

Without wishing to promote any adversarial connotations between analysts and management, it can be instructive to view some of the parallels between guerrilla capacity planning and the opportunistic aspects of guerrilla warfare.

Guerrilla Warfare        Guerrilla Planning
Enemy       Fighter      Management    Analyst
--------    --------     ----------    ----------
Advances    Retreats     Demands       Acquiesces
Camps       Harasses     Ponders       Pesters
Tires       Attacks      Dithers       Proposes
Retreats    Pursues      Acquiesces    Promotes

Table 3: Comparison between the tactics of guerrilla warfare and guerrilla capacity planning.

Aspects of capacity planning that have not been discussed here include: floor-space, power, cooling, disk storage, tape storage, etc. Many of these issues can be addressed with spreadsheet models similar to those presented here (see Table 2).

In many ways, capacity planning is equal parts analysis and instruction. You probably need to spend as much time on how you present the results of your analysis as you do generating the results in the first place. If your management (or whoever your target audience is) fails to comprehend your point, you may as well not have started the analysis at all (a notion you'd rather keep insulated from your manager).

With Figure 1 in mind, I encourage you to consider adopting guerrilla capacity planning in your organization so you can ...

Go forth and Kong-ka!

5  Acknowledgements

The author wishes to thank Mark Friedman for his usual prescience in suggesting this article for MeasureIT. It was also a pleasure to work with Rick Ralston and Cathy Nolan while transforming the original IEEE article into this online version.

References

[Buch and Pent. 2001]
Buch, D. K., and Pentkovski, V. M., "Experience of Characterization of Typical Multi-tier e-Business Systems Using Operational Analysis," Proceedings of CMG 2001, p. 671.
[Dumke et al. 2001]
"Performance and Scalability Models for a Hyper-growth e-Commerce Web Site" in Performance Engineering: State of the Art and Current Trends, (Eds.) Dumke, R., Rautenstrauch, C., Schmietendorf, A., Scholz, A., Springer Lecture Notes in Computer Science, # 2047. Heidelberg: Springer-Verlag (2001).
[Gunther 2000]
Gunther, N. J., The Practical Performance Analyst, iUniverse.com Inc. 2000.
[Gunther 2001]
Gunther, N. J., Lecture notes for Guerrilla Capacity Planning class.
[Gunther 2002]
Gunther, N. J., "Sins of Precision: Damaging Digits in Capacity Calculations," Proceedings of CMG 2002.
[Levine et al. 1999]
Levine, D., Berenson, M., Stephan, D., Statistics for Managers, New Jersey: Prentice-Hall (1999).


Footnotes:

1 Joint Copyright © 2002-2003 Performance Dynamics Company and IEEE. All Rights Reserved. Permission has been granted to CMG Inc., to publish this version in CMG MeasureIT.

Last Updated 06/05/09

