July, 2009
by Michael Kok
Transaction Aware Performance Modelling (TAPM) allows us to analyse the performance and capacity needs of each transaction type of an information system individually. This requires special techniques for modelling and measuring. The result is a more powerful form of model-driven performance analysis that can support System Performance Engineering (SPE). This paper demonstrates the power of TAPM through an example analysis conducted on a newly created business application deployment.
The analysis process included:
As a result of the performance analysis, the following benefits were realized:
Taken together, and considering the strategic and tactical advantages made available to the application project management and development teams early in the development life cycle, the benefits mentioned above represent a significant saving in time and materials costs.
The system under analysis was a newly developed n-tiered workflow management application, comprising a Web Server, an Application Server and a Mainframe Server.
Application workflow was provided by an external application (Application STP), which processes business cases automatically (Straight Through Processing). A percentage of these business cases would require "manual" processing and were transferred to the application under analysis. The business purpose of the application under analysis was to accept these cases and allow application users to interact with them in a controlled way.
Application users would interact with the application using the following main transactions:
The following transaction types were selected for performance analysis:

List. Transaction List
Transaction 2 is of particular interest and has 8 sub-types (2A to 2H). The sub-types belong to the same transaction type and differ only in their preconditions. These preconditions are determined by three parameters:
Because of the heavy use and expected impact of Transaction 2, a number of its sub-types were analysed. This allowed an in-depth study of the impact of the various preconditions.
Transactions 5, 6 and 8 have variants that are genuinely different transaction types, each serving a different external application, depending on the workflow application.
Fig. Model Overview
The analysis was conducted using the mBrace model depicted schematically above. The model requires three inputs and provides two outputs. Usage volumes (transaction workloads) and capacities (available hardware capacity) were collected from the organization. Resource usage metrics were obtained through measurement.
The inputs are entered into the model, which calculates the outputs and presents the time behaviour and resource behaviour graphically. The impact of changes in transaction volumes and available server capacity is studied through comparative analysis of the outputs.
The displayed outputs are shown in greater detail in the next sections.
The first step in the use of the mBrace model was to define the system workload to be analysed.

Fig. Use Case Configuration
The above snapshot from the model GUI shows the use case with the transaction types that were selected for further analysis.
A use case is a clustering of transaction types. When defining a usage volume for the system, we may compose it of one or more use cases. A use case is usually recognisable to the business. For example, we may have a use case "Sell a policy". The business knows how many policies it sells (or intends to sell), so we can easily determine how many times the use case is executed. At the bottom right we may fill in the execution rate of the entire use case, in executions per second.
For simplicity's sake, all transactions were grouped in one use case.
Each transaction type may be executed once each time the whole use case is executed, but it may also be executed more or less often (its frequency may be smaller than, equal to or greater than 1). Accurate process analysis is required to determine how frequently transactions are executed within a use case. For each transaction type we may fill in the Freq column at the right. The transaction volume for each transaction type is then determined by multiplying its Freq value by the use case frequency. Together, these yield the overall transaction volume in transactions per second, as well as the transaction mix. Both the transaction volume and the transaction mix significantly influence the performance and capacity needs of the system.
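The multiplication described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the mBrace implementation; the use case rate and the Freq values are hypothetical and are not taken from the analysis.

```python
# Hypothetical inputs: the rate and Freq values are illustrative only.
use_case_rate = 0.5  # use case executions per second

# Freq column: executions of each transaction type per use case execution.
freq = {
    "Transaction 1": 1.0,
    "Transaction 2A": 2.0,
    "Transaction 3": 0.25,
}


def transaction_volumes(use_case_rate, freq):
    """Volume per transaction type = Freq x use case frequency (tx/s)."""
    return {name: f * use_case_rate for name, f in freq.items()}


def transaction_mix(volumes):
    """Share of each transaction type in the overall transaction volume."""
    total = sum(volumes.values())
    return {name: v / total for name, v in volumes.items()}


volumes = transaction_volumes(use_case_rate, freq)
total_volume = sum(volumes.values())
mix = transaction_mix(volumes)

for name in volumes:
    print(f"{name}: {volumes[name]:.3f} tx/s ({mix[name]:.1%} of the mix)")
print(f"Total: {total_volume:.3f} tx/s")
```

Changing a single Freq value changes both the total volume and the mix, which is why both are treated as inputs to the capacity question.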
As can be seen, not all measured transactions were included in the use case. The data entered are the result of careful analysis of the expected usage. However, this is a forecast and therefore speculative to some extent.
The graphic below provides a schematic overview of the application infrastructure. The Web and Application (AS) servers were UNIX servers; the Mainframe was a z/OS server.
Fig. Application Infrastructure
The test environment provided was not representative and differed extensively from the planned production environment, but this did not prevent effective analysis. The differences included:
The configuration for the Mainframe server was defined as shown below.
Fig. Server Resource Configuration - Mainframe Server
The Configuration window shown has a tab for each server in the infrastructure chain.
The screenshot above shows the tab for the mainframe server. The window is composed of several sections:
The upper grey part shows the characteristics that are known from the test environment.
Below that are the sections for CPU and Disk. They display the capacities of the Test environment, the Production Baseline and the Scaled Production environment. Here we can enter the numbers of CPUs and disks as well as their speeds. We can scale horizontally in the model by changing the number of devices and vertically by changing their speeds.
Below that is the Noise section. Here we can enter the load imposed on the devices by other applications.
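To make the effect of these settings concrete, the sketch below applies the Utilisation Law (utilisation = throughput x service demand / capacity) to a group of identical devices. It is a simplified illustration, not the mBrace calculation. It assumes that service demand scales inversely with device speed, that load is spread evenly over the devices, and that noise is expressed as a %utilisation; the function name and all numbers are hypothetical.

```python
def device_utilisation(rate_per_s, demand_s, count, speed=1.0, noise_pct=0.0):
    """Estimated %utilisation of a group of identical devices.

    rate_per_s: transactions per second arriving at the devices
    demand_s:   device seconds per transaction, measured at speed 1.0
    count:      number of devices (horizontal scaling)
    speed:      relative device speed (vertical scaling)
    noise_pct:  %utilisation caused by other applications (assumed unit)
    """
    own_fraction = rate_per_s * (demand_s / speed) / count
    return noise_pct + 100.0 * own_fraction


# Hypothetical workload: 10 tx/s, 0.05 CPU seconds per transaction.
baseline = device_utilisation(10, 0.05, count=2)
with_noise = device_utilisation(10, 0.05, count=2, noise_pct=10.0)
scaled_out = device_utilisation(10, 0.05, count=4)           # horizontal
scaled_up = device_utilisation(10, 0.05, count=2, speed=4)   # vertical

print(baseline, with_noise, scaled_out, scaled_up)
```

Under these assumptions, doubling the number of devices halves the utilisation caused by the application, and a device that is four times faster reduces it to a quarter.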
The Utilisation section holds parameters that can be used to determine how to scale up or down. If the model shows a %utilisation above the High % value, we have to scale up so that the %utilisation of the device approaches the target value as closely as possible.
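One way to picture this scale-up rule for horizontal scaling is sketched below. It is an assumption-laden illustration, not the rule built into mBrace: it assumes utilisation is inversely proportional to the number of devices, that the target does not exceed the High % value, and the function name and figures are hypothetical.

```python
import math


def devices_to_approach_target(count, util_pct, high_pct, target_pct):
    """Return a device count that brings utilisation closest to the target.

    No change is proposed while utilisation is at or below high_pct.
    Assumes target_pct <= high_pct and an even spread of load.
    """
    if util_pct <= high_pct:
        return count
    demand = count * util_pct          # total load in "device-percent"
    ideal = demand / target_pct        # fractional number of devices
    candidates = sorted({max(1, math.floor(ideal)), math.ceil(ideal)})
    # Discard counts that would still leave utilisation above High %.
    valid = [c for c in candidates if demand / c <= high_pct]
    return min(valid, key=lambda c: abs(demand / c - target_pct))


# Hypothetical case: 2 CPUs at 85% with High % = 70 and target = 50.
proposed = devices_to_approach_target(2, 85.0, high_pct=70.0, target_pct=50.0)
unchanged = devices_to_approach_target(2, 60.0, high_pct=70.0, target_pct=50.0)
print(proposed, unchanged)
```

In the hypothetical case, three CPUs give about 57% and four give 42.5%, so three is the count closest to the 50% target while staying below High %.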
The Disk cache section shows the cache hit rate (%) and the times, in seconds, to fulfil a hit or a miss.
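These three parameters combine naturally into an expected time per disk access. The formula below is the standard weighted average; whether mBrace applies exactly this form is an assumption, and the symbols $h$ (hit rate as a fraction), $t_{\text{hit}}$ and $t_{\text{miss}}$ are notation introduced here for illustration.

```latex
t_{\text{access}} = h \cdot t_{\text{hit}} + (1 - h) \cdot t_{\text{miss}}
```

For example, with a hypothetical hit rate of 90%, a hit time of 0.001 s and a miss time of 0.010 s, the expected access time is $0.9 \times 0.001 + 0.1 \times 0.010 = 0.0019$ s.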
The section Network Connections holds the data for all network connections of the server.
The Software resources section at the right shows the software resources that are modelled. Their parameters are shown in a separate window, which is covered later in this document.
Each time a parameter changes, the model recalculates the results and produces new outputs.
Once the model input parameters were complete, the model outputs were calculated and displayed.
Both outputs of the model, time behaviour and resource behaviour, are displayed in one chart, of which a simple example is shown below.

What does this chart tell us? It shows the capacity needs and performance of a system at a glance. Once someone is familiar with the graph, reading an mBrace performance report becomes as easy as reading a comic book.
The chart has three main sections:
Section 1 shows the %utilisations of the CPUs of the servers. For each server, there is one vertical bar showing the utilisation as a percentage of the total capacity of the CPUs of that server.
Section 2 shows the %utilisations of the access paths of the disks of each of the servers.
Section 3 shows the time behaviour of one transaction type. The total length of the coloured bar indicates the end-to-end response time of this transaction type. The length of each coloured component of the bar shows the time spent either active on a resource or waiting for it.
Text balloons are added to the chart in the next picture. Note that the colours of the bar are also explained by the legend at the right.
To simplify the chart, only one bar for one transaction type is shown in this example. Commonly a larger number of transaction types is displayed. When the number of transaction types in the analysis is greater than can be displayed in one chart, the transaction types can be scrolled up and down. In many cases the transaction types are also sorted in decreasing order of response time, so that the most interesting ones appear on one page.
Transactions were executed in the test environment a number of times and their resource usage metrics were collected. These metrics were then parsed and "scrubbed" to remove outliers and safeguard the quality of the model outputs.
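The scrubbing method itself is not described here. As one plausible illustration only, the sketch below removes outliers with the common interquartile range (IQR) rule; the choice of rule, the factor of 1.5, the function name and the sample values are all assumptions, not the procedure used in the analysis.

```python
import statistics


def scrub_outliers(samples, k=1.5):
    """Drop samples outside [Q1 - k*IQR, Q3 + k*IQR] (IQR rule, assumed)."""
    q1, _median, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [s for s in samples if low <= s <= high]


# Hypothetical CPU seconds measured over repeated runs of one transaction.
cpu_seconds = [0.40, 0.42, 0.41, 0.43, 0.39, 2.50]
scrubbed = scrub_outliers(cpu_seconds)

print(scrubbed)
print(statistics.mean(scrubbed))
```

The single run that took 2.50 s would otherwise dominate the average resource usage fed into the model.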
Next, the transaction volumes and the test and production capacities of each server were entered into the model.
The mBrace model outputs reported the resource usage and response time behaviour of the application at the transaction level according to the server configurations provided in the previous step.
The response time of each transaction type was broken down to display the amount of time contributed by each separate server resource component.
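The idea of such a breakdown can be sketched as follows: the end-to-end response time is the sum of the time contributed by each resource, and each contribution can be expressed as a share of the total. The component names and times below are hypothetical and do not come from the measured results.

```python
# Hypothetical time per resource for one transaction type, in seconds.
components = {
    "Web CPU": 0.05,
    "AS CPU": 0.60,
    "Mainframe CPU": 0.10,
    "Mainframe disk": 0.20,
    "Network": 0.05,
}

# The end-to-end response time is the sum of all components.
response_time = sum(components.values())

# Share of the response time contributed by each resource.
shares = {name: t / response_time for name, t in components.items()}

# The resource contributing the most time is the first tuning candidate.
dominant = max(components, key=components.get)

for name, t in sorted(components.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:15s} {t:.2f} s  {shares[name]:.0%}")
print(f"Response time: {response_time:.2f} s, dominated by {dominant}")
```

Sorting the components by size mirrors what the coloured bar shows visually: which resource accounts for most of the response time.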
The two graphs below show the outputs displayed for the test and the production environments.
The results shown below demonstrate the transaction response time breakdown for the transactions as they were captured in the test environment. The response times indicated are for single-user executions of the transactions only. Multi-user impact is considered later in this paper.

Fig. Transaction Response Time Breakdown - Test Environment
Observations:
The next graphic depicts the transaction response times as they would be if the transactions were executed on the proposed production environment. The response times indicated are for single-user executions of the transactions only.
Note that a major difference between the test and production environments was the 4x increase in the speed of the AS CPUs.

Fig. Transaction Response Time Breakdown - Production Environment
Observations:
The output for the production environment also allows comparative analysis of varying production environment configurations and varying user loads.
Now that we've set up the initial model, we can explore the predicted production results in more detail with the goal of optimizing the performance and capacity of the production environment. We'll explore that in next month's continuation of this article.