Experience in Using Microsoft’s Transaction Cost Analysis for Infrastructure Sizing
April, 2005
by Harneet Singh
Future business requirements for IT services and infrastructure must be understood and analyzed so that the infrastructure meets the desired quality-of-service parameters, such as performance and scalability. This calls for a comprehensive framework that turns business requirements into an infrastructure sizing recommendation. Various approaches have been developed for this purpose over the years, and Microsoft’s Transaction Cost Analysis (TCA) [1] [2] is one popular technique. In this article, we share our experiences in using this technique for infrastructure sizing of an enterprise application. We also discuss some real-life issues and requirements that need to be addressed independently of TCA.
Overview
Today, with competition a mere click away, the performance of business applications has assumed paramount importance. It is no longer sufficient just to have an online presence; the IT systems must meet (and possibly exceed) the twin requirements of business productivity and end-user needs. An appropriate infrastructure configuration is thus crucial. While an undersized system could buckle under peak business workloads, an oversized system with a surplus of unutilized resources would increase the Total Cost of Ownership (TCO).
Over the past few years, a couple of approaches have emerged for Infrastructure Sizing. These can be broadly categorized as follows:
(a) The first approach benchmarks the application under the operational workload and recommends infrastructure according to whether the required service quality is met. This requires an actual measurement cycle to determine adequacy on a particular hardware and software configuration.
(b) The second approach relies on industry-standard benchmarks such as TPC [3] and SPEC [4] to come up with indicative hardware for the commercial workload. Here, the system designer must identify a commercial benchmark that best represents the type of workload (OLTP, decision support, etc.), the platform on which it is implemented (e.g. Java benchmarks), and the hardware vendor on whose platform it is expected to be deployed. The results are used to identify the appropriate hardware and software configuration. However, this approach is often unreliable, because the application’s scalability characteristics usually do not match those disclosed in the commercial benchmark.
These two approaches fall short in their ability to provide the designer with an accurate way to perform a "what-if" analysis to identify the hardware required. This led to the emergence of a third category of predictive approaches which are based on the notion of "measuring" or "estimating" cost of each transaction. This approach quantifies, for each transaction, the demand for a particular system resource in a given underlying infrastructure. In other words, each transaction is associated with a cost, or service demand that represents how intensive it is on the system resources. We now give a quick background on TCA before moving on to the actual case-study.
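As a concrete illustration of such a cost, the Utilization Law [5] relates utilization U, throughput X and service demand D as U = X * D, so a measured utilization and throughput yield the demand per transaction. The Python sketch below is only an illustration: the function name, the clock speed, the CPU count and the measurements are assumed values, not figures from this study.

```python
def cost_per_transaction_mcycles(cpu_utilization, throughput_per_sec,
                                 cpu_mhz, cpu_count=1):
    """Service demand of one transaction in megacycles (U = X * D)."""
    if throughput_per_sec <= 0:
        raise ValueError("throughput must be positive")
    # Megacycles the CPUs actually spend per second of wall-clock time.
    busy_mcycles_per_sec = cpu_utilization * cpu_mhz * cpu_count
    # Dividing by the throughput gives the cost of a single transaction.
    return busy_mcycles_per_sec / throughput_per_sec


# Hypothetical measurement: 2 x 500 MHz CPUs, 40% busy, 20 transactions/s.
demand = cost_per_transaction_mcycles(0.40, 20.0, cpu_mhz=500, cpu_count=2)
print(f"{demand:.1f} megacycles per transaction")
```

Expressing the cost in megacycles rather than seconds is what allows the figure to be carried over, with caution, to a machine with a different clock speed.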
Introduction to TCA
Microsoft TCA begins by defining the usage profile. The profile represents the set of transactions each user would perform, with frequencies that match the operational workload mix. Using analytical techniques based on the Utilization Law [5], the cost of the usage profile is derived and used to recommend a configuration. When TCA is applied to a new application, for which costs cannot yet be measured, cost determination has to be replaced by the selection of a reference model. The broad steps in the overall TCA approach are given below:
Step 1: Get the usage profile for the target application
The starting point is to determine the operational workload profile, which is the mix of business transactions during a representative time frame. For an existing system undergoing changes, this can be obtained by examining historical user log information (such as web server access logs). However, it is always better to obtain or validate this information from the relevant business and technical stakeholders.
Step 2: Identify the reference model for the target application
TCA is a flexible framework, but it depends heavily on the availability of a reference model. By reference model, we mean a publicly available system with a set of transactions and their associated costs on a given hardware configuration. Organizations can also create their own reference models and reuse them for future needs. A reference model for the target application should be selected on the basis of the architecture and the business transactions used in the application. Some well-known reference implementations exist for Microsoft Commerce Server 2002 [6] and Commerce Server 2000 [7]. The key lies in finding a reference implementation that matches your application’s characteristics, preferably one with similar functionality built on the same product. Wherever the reference model and the application at hand differ, a risk factor is applied based on expert judgment.
Step 3: Estimate capacity for the target application
Once a reference model is found, the target application’s workload components are mapped to the reference application’s workload, or vice versa. The cost of each transaction in the target application is thereby approximated by the cost of a comparable transaction in the reference application. From these costs, the cost of the entire target usage profile is calculated, under the assumption that every user has the same usage profile.
Step 4: Capacity determination for the target application
Every reference model also has a baseline configuration on which the cost of each transaction was estimated. For the target application, we can now determine whether this baseline configuration can support the desired target workload. If the capacity is not sufficient, the recommended option is to scale out by adding machines. However, what happens if this baseline configuration is not desirable and alternative hardware needs to be used? TCA itself does not recommend a way to do this, so one needs to use industry benchmarks (e.g. TPC [3], SPEC [4]) to determine a suitable hardware configuration.
With this introduction to TCA, we now proceed to the case study and detail some practical issues we faced and how we mitigated them.
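Steps 3 and 4 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the transaction names, rates, costs and the server capacity are hypothetical placeholders, and the helper functions are ours rather than part of TCA.

```python
import math


def usage_profile_cost(ops_per_sec, cost_per_op):
    """Cost of one user's profile: sum of (rate * cost) over all transactions."""
    return sum(rate * cost_per_op[txn] for txn, rate in ops_per_sec.items())


def servers_needed(profile_cost, concurrent_users, server_capacity):
    """Return the total required cost and the number of baseline servers."""
    required = profile_cost * concurrent_users
    return required, math.ceil(required / server_capacity)


# Hypothetical per-user rates (operations/s) and reference costs (megacycles).
ops_per_sec = {"browse": 0.02, "search": 0.01, "checkout": 0.002}
cost_per_op = {"browse": 20.0, "search": 30.0, "checkout": 100.0}

profile_cost = usage_profile_cost(ops_per_sec, cost_per_op)
required, servers = servers_needed(profile_cost, concurrent_users=1000,
                                   server_capacity=500.0)
print(f"profile cost {profile_cost:.2f}, required {required:.0f}, servers {servers}")
```

Rounding up with `math.ceil` reflects the scale-out recommendation: a fractional server requirement means one more whole machine.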
Case study
The case study concerns a retail company that specializes in wedding apparel. The current application provides a Web-based front end through which customers order over the Internet. It was developed using Microsoft ASP with a Microsoft SQL Server backend database. In the current deployment, the application ran on two network load-balanced web servers, and this web farm also hosted seven other applications for the company. Several business and technology drivers were affecting the company. To improve its IT efficiency, one of the recommendations was to migrate to Microsoft’s Commerce Server 2002 framework, which provides pre-built components for managing an eCommerce site. To allow sufficient lead time, the hardware infrastructure had to be recommended well before the application was fully developed. The recommendation therefore had to be derived from information such as the business use cases, the expected service levels and the operational workload profile of the proposed system. As discussed in the earlier section, to do predictive infrastructure sizing we used TCA with the reference models for Microsoft Commerce Server 2002 [6]. From workload and requirement analyses we found that the current number of concurrent users was 500. With business projected to grow to 400% of its current level over the following two years, we concluded that the number of concurrent users would reach 2000. The steps involved in infrastructure sizing are discussed below:
Step 1: Get the usage profile from the existing system, and forecast the target workload taking into account business growth projection.
Web server access logs were sampled for a period of one month, by merging the access logs from the two web servers. Using tools such as WebLog Expert [8], we could estimate (a) the interval to sample, (b) the number of concurrent users during the interval and (c) the mix of transactions during the representative interval. Figure 1 below shows the arrival distribution of hits to the site for one peak day in the month. Once the peak day was identified, the next step was to identify the peak duration for which the analysis had to be carried out. As seen in Figure 1, the time frame between 14:00 and 18:00 was one peak usage period. This was recommended as the representative time frame by the business stakeholders. All our further analysis was carried out using this time frame.

Figure 1: Preview of hits to the Homepage
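Picking the peak period from hourly hit counts can be sketched as a sliding-window search. The Python below is an assumption-laden illustration: the hourly figures are invented so that the busiest four hours fall between 14:00 and 18:00, and they are not the values plotted in Figure 1.

```python
def peak_window(hourly_hits, window_hours=4):
    """Return (start_hour, total_hits) of the busiest contiguous window."""
    if not 0 < window_hours <= len(hourly_hits):
        raise ValueError("window must fit inside the data")
    total = sum(hourly_hits[:window_hours])
    best_start, best_total = 0, total
    for start in range(1, len(hourly_hits) - window_hours + 1):
        # Slide the window: add the hour entering, drop the hour leaving.
        total += hourly_hits[start + window_hours - 1] - hourly_hits[start - 1]
        if total > best_total:
            best_start, best_total = start, total
    return best_start, best_total


# Hypothetical hits per hour for one day (index 0 = 00:00-01:00).
hits = [120, 90, 60, 50, 40, 60, 150, 300, 520, 700, 820, 860,
        780, 900, 1250, 1400, 1380, 1300, 950, 800, 640, 480, 300, 180]

start, total = peak_window(hits)
print(f"peak window {start:02d}:00-{start + 4:02d}:00 with {total} hits")
```

A window found this way is only a candidate; as in the study, the business stakeholders should confirm that it is representative.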
Table 1 shows the derived number of successful operations per session for each business transaction, where a session is the duration for which a user was active. For example, a user performs the "Product Detail" transaction 5.40 times in a session. These values are averages over the user base for a given time frame; the larger the time frame, the more realistic the data will be.
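The averaging behind such a table can be sketched as follows, assuming the access log has already been reduced to (session, transaction) pairs. The record layout, the session identifiers and every transaction name other than "Product Detail" are hypothetical.

```python
from collections import Counter


def operations_per_session(records):
    """Average number of times each transaction occurs per session."""
    sessions = {session_id for session_id, _ in records}
    if not sessions:
        raise ValueError("no sessions in the sample")
    counts = Counter(txn for _, txn in records)
    return {txn: n / len(sessions) for txn, n in counts.items()}


# Hypothetical sample: two sessions, already filtered to successful requests.
records = [
    ("s1", "Home"), ("s1", "Product Detail"), ("s1", "Product Detail"),
    ("s2", "Home"), ("s2", "Product Detail"), ("s2", "Search"),
]

profile = operations_per_session(records)
print(profile)
```

Sessions that never perform a given transaction still count in the denominator, which is why an average such as 5.40 need not be a whole number.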

Step 2: Map the business workload of the application with reference model
We started with the reference model for Microsoft’s Commerce Server 2002 [6], whose reference application is an e-Commerce site for a book store. Table 2 shows the mapping between the transactions and the cost of each transaction for the reference application on the baseline configuration [6]. Since the reference model was tested on a Pentium III machine, we use the notation P3MC (number of megacycles on a Pentium III).

Step 3: Estimate Capacity for the target application
In this step of TCA, the cost of the entire target usage profile is calculated using the inputs available in Table 1 and Table 2 and the average session length for each user. For this case study, an analysis of the log files showed that the average session time was 5 minutes. Table 3 shows the number of operations performed by users on a per-second basis.
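The conversion from per-session to per-second figures is a division by the session length. The short Python sketch below uses only the "Product Detail" value quoted earlier and the 5-minute session; the function name is our own.

```python
SESSION_SECONDS = 5 * 60  # average session length from the log analysis


def ops_per_second(ops_per_session, session_seconds=SESSION_SECONDS):
    """Convert operations per session into operations per second per user."""
    return {txn: n / session_seconds for txn, n in ops_per_session.items()}


rates = ops_per_second({"Product Detail": 5.40})
print(f"{rates['Product Detail']:.3f} operations per second per user")
```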

Using the data from Table 3 and Table 2, we derive the total cost for one usage profile to be approximately 0.907 P3MC at the Web server; similar values are displayed in Table 4. Given that the current number of concurrent users is 500, the cost of the workload for these users at the Web server is 453.92 P3MC.

Note: Concurrent users means the number of users using the system with the usage profile defined in Table 1.
Step 4: Determine suitable hardware configuration and its layout for the target application
This step begins by estimating the number of concurrent users the baseline configuration can support, assuming the usage profile of the target application. If the system cannot support the target workload, the required cost for the target application is computed. This cost is the product of the cost of the usage profile and the number of concurrent users the system needs to support. In this case study, we determined that the baseline configuration was insufficient. For the projected 2000 concurrent users, the cost was 1815.69 P3MC, which meant that the Web server had to be scaled out by a factor of 2.26, as indicated in Figure 2. For the database server, by contrast, we estimated that the baseline infrastructure was sufficient.
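The arithmetic of this step can be retraced in Python from the figures quoted in the text. One caveat: the capacity of the baseline Web server is not stated here, so the value below is back-calculated from the reported scale-out factor and should be read as an inference, not a published number.

```python
import math

COST_AT_500_USERS = 453.92       # P3MC at the Web server, from the text
CURRENT_USERS, TARGET_USERS = 500, 2000
REPORTED_PROJECTED_COST = 1815.69  # P3MC, from the text
REPORTED_SCALE_OUT = 2.26          # from the text

# Cost scales linearly with users because every user has the same profile.
projected = COST_AT_500_USERS * TARGET_USERS / CURRENT_USERS

# Inferred (not quoted) capacity of one baseline Web server.
implied_capacity = REPORTED_PROJECTED_COST / REPORTED_SCALE_OUT

# A fractional scale-out factor means rounding up to whole machines.
web_servers = math.ceil(REPORTED_SCALE_OUT)

print(f"projected cost  {projected:.2f} P3MC")
print(f"implied baseline capacity about {implied_capacity:.0f} P3MC")
print(f"baseline-class web servers needed: {web_servers}")
```

The recomputed cost differs from the reported 1815.69 P3MC only in the last digit, which is consistent with rounding in the per-profile cost.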

It was also important for us to look at newer hardware configurations. Our preference here was to use Xeon processors. To find an equivalent server, we used TPC benchmark results [3] and assumed a target server utilization of 50%. The recommended infrastructure is mentioned below:

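The translation to newer hardware described above can be sketched as follows. Only the 1815.69 P3MC requirement and the 50% utilization target come from the text; the baseline capacity and both benchmark scores are placeholders, and scaling capacity linearly with a benchmark score is itself a simplifying assumption.

```python
import math


def servers_for_new_hardware(required_cost, baseline_capacity,
                             baseline_score, new_score,
                             target_utilization=0.5):
    """Estimate how many newer servers cover the required cost."""
    # Capacity of one new server in baseline units, scaled by benchmark ratio.
    new_capacity = baseline_capacity * (new_score / baseline_score)
    # Only a fraction of that capacity is planned for, leaving headroom.
    usable = new_capacity * target_utilization
    return math.ceil(required_cost / usable)


# Hypothetical inputs: baseline capacity and benchmark scores are invented.
count = servers_for_new_hardware(1815.69, baseline_capacity=800.0,
                                 baseline_score=10_000, new_score=40_000)
print(f"newer servers needed at 50% utilization: {count}")
```

Planning for 50% utilization doubles the capacity that must be provisioned, which trades hardware cost for headroom during peaks.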
Conclusions
This paper presents our experience of using TCA for infrastructure sizing of a new system under development. The approach identifies the computing power required of the servers from the costs of individual transactions. It builds on the TCA concepts of transaction costs and reference models. The concepts described in this paper have been illustrated with a case study of infrastructure sizing for a retail company.
References
[1] Using Transaction Cost Analysis for Site Capacity Planning: http://www.microsoft.com/technet/prodtechnol/comm/comm2002/plan/cs02tcas.mspx
[2] Capacity Model for Internet Transactions: http://www.microsoft.com/technet/prodtechnol/comm/comm2002/plan/capmodit.mspx
[3] Transaction Processing Performance Council: http://www.tpc.org/
[4] Standard Performance Evaluation Corporation: http://www.spec.org/
[5] Edward D. Lazowska, John Zahorjan, G. Scott Graham, Kenneth C. Sevcik. Quantitative System Performance: Computer System Analysis Using Queueing Network Models.
[6] Commerce Server 2002 SVT site performance characteristics: Transaction Cost Analysis: http://www.microsoft.com/technet/prodtechnol/comm/comm2002/maintain/perform/svt_tca.mspx
[7] Commerce Server 2000 SVT site performance characteristics: Transaction Cost Analysis: http://www.microsoft.com/technet/prodtechnol/comm/comm2000/maintain/perform/svtsite.mspx
[8] WebLog Expert: http://www.weblogexpert.com/
Last Updated 04/15/05