Conceptual Models of Enterprise Applications as Instruments of Performance Analysis
The article introduces conceptual models of enterprise applications that expose performance-related cause-effect relationships, distilled from the innumerable application particulars that conceal the roots of performance issues. The value of conceptual models for performance analysis is demonstrated on two examples: conceptual models of a virtualized and a non-virtualized application.
The biggest challenge to overcome in performance analysis of enterprise applications is their tremendous complexity. Today’s distributed applications are composed of tanglewoods of physical and logical objects interacting in very intricate ways. The applications implement multiple rules, algorithms, and protocols while serving communities of concurrent users generating fluctuating workloads. That makes application performance analysis extremely cumbersome and time consuming, particularly considering that each enterprise application features hundreds of tuning parameters on the system and application levels.
System level tuning parameters shape the management policies of the operating systems. Microsoft offers the 112-page document “Performance Tuning Guidelines for Windows Server 2008 R2” (http://tinyurl.com/qx4v4gy). IBM’s AIX operating system tuning guide is even heftier at 744 pages (http://tinyurl.com/o3b66o8).
Application tuning parameters control the allocation of system resources as well as the configuration of application internal logical objects such as software threads, connection pools, etc. Application vendors publish various tuning documents to help optimize their products. A few examples: “Oracle® Fusion Middleware Performance and Tuning Guide” (http://tinyurl.com/kurmd9p), “Oracle® JRockit Performance Tuning Guide” (http://tinyurl.com/mggv55j), and “Oracle® Fusion Middleware Performance and Tuning for Oracle WebLogic Server” (http://tinyurl.com/panje7g). All these publications expose broad assortments of tuning variables, but none provides methodological guidance on efficient and targeted execution of performance tuning activities.
Is it possible to perceive an application in a way that scales down its complexity (similar to taking an aerial view of the Earth)? In other words, can we conceptualize an application to abstract from numerous details and concentrate only on its objects and their relationships that reveal performance related cause-effect connections?
This article provides a positive answer by advocating enterprise application conceptual modeling as an instrument of application performance analysis.
An application conceptual model exposes the relations between the demand for services generated by transactional workload and the supply of system resources intended to satisfy that demand. By devising a conceptual model we mentally leap into a specific perspective on the application, one that highlights system resources, their capacities, and the demand for system resources from the transactional workload.
The representations of conceptual models can differ (charts, tables, matrixes, text descriptions, pictures), but they all include the constructs described below.
We need the following constructs to build the conceptual models that enable application performance analysis:
1. Transactions – characterize user requests for application services. If we visualize a transaction as a physical object (for example, a car), we can create in our mind an image of a car-transaction visiting different system servers and spending some time in each one while being processed. Examples of transactions:
- Financial Report
- Activity Report
- Data transformation
- Business rule execution
2. Resources – each system hardware and software component allocates its resources to transaction processing. In general, the following component assets can be allocated:
- CPU time
- I/O time
- Network time
- Software connections (for example, Web server connections, database connections)
- Software threads
- Storage space
- Memory space
- Software locks
Active resources carry out the actual transaction processing and data transfer. Passive resources provide access to active resources. The consumption of an active resource by a single transaction is measured by the time interval during which it was serving the transaction. The usage metric for a passive resource by a single transaction depends on the resource type: for software connections and threads it is the number of connections and threads; for memory and storage it is the size of the allocated space. A transaction waits for a resource when other transactions consume its entire capacity.
3. Physical nodes – symbolize physical servers, appliances, and networks. One physical node comprises the active and passive resources of one physical server. Each resource has a queue where transactions wait while that resource is unavailable.
4. Virtual nodes – represent virtual machines (VMs). One virtual node represents one VM and consists of passive resources and a block indicating the waiting period for a time slice to be allocated to the VM by the host’s hypervisor. A VM is not running while it is waiting for a time slice.
5. Nodes with shared hardware resources – depict the active and passive resources of a physical host shared among its VMs.
6. Transaction-resource matrix (TR matrix) – defines the active and passive resources from which each transaction receives services while being processed by the application. As an example, in the matrix below the character “X” identifies the resources of a three-tiered application consumed by each transaction.
7. Users – the source of transactional workload. An application has to have sufficient resource capacity to process the transactional workload according to the required service level. We can picture the interaction between users and the application as below (the cars are transaction metaphors):
The user-generated transactions are processed by the application; after completion the results are sent back to the user. Between two subsequent transactions generated by the same user there is a time interval (called “user think time”) that defines the intensity of the transactional flow.
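Two of the constructs above lend themselves to a compact sketch: the TR matrix (construct 6) and user think time (construct 7). The transaction names, resource names, and figures below are hypothetical illustrations, not values from the article’s figures.

```python
# Hypothetical TR matrix: each "X" cell becomes a set membership.
TR_MATRIX = {
    "Financial Report":    {"web.cpu", "web.threads", "app.cpu",
                            "app.connections", "db.cpu", "db.io"},
    "Data Transformation": {"app.cpu", "app.threads",
                            "db.cpu", "db.io", "db.memory"},
}

def transactions_using(resource: str) -> set:
    """Transactions competing for a resource (one matrix column)."""
    return {t for t, res in TR_MATRIX.items() if resource in res}

def transaction_rate(users: int, think_time_s: float,
                     response_time_s: float) -> float:
    """Each user cycles between thinking and waiting for a response,
    so N users generate N / (think time + response time) trans/s."""
    return users / (think_time_s + response_time_s)

competitors = transactions_using("db.io")   # both transactions compete
rate = transaction_rate(100, 30.0, 2.0)     # 3.125 transactions/s
```

Reading the matrix column-wise, as `transactions_using` does, is what reveals which transactions contend for the same resource.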
Using the described constructs, we can model a non-virtualized three-tiered application as shown in Figure 1. The application consists of Web, Application, and Database servers, each hosted on dedicated physical hardware.
A transaction starts its journey when a user clicks on a menu item or a link that implicitly initiates it. In the model that means the transaction leaves the block “Users”. After that it gets processed in the nodes “Web server”, “Application server”, and “Database server”. Each transaction requests the passive and active resources described in the TR matrix. If any needed resource is unavailable, the transaction waits in a queue. At the end of its journey the transaction returns to the user. The total time a transaction has spent in all nodes is its response time. The model makes it obvious that this time has two components: the processing time by the active resources and the waiting time for the passive and active resources.
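The decomposition of response time into waiting and processing components can be written down directly. The per-node timings below are hypothetical figures, chosen only to make the arithmetic concrete:

```python
# Each visit along the transaction's journey:
# (node, waiting time in queues, service time in active resources).
visits = [
    ("Web server",         0.01, 0.05),
    ("Application server", 0.20, 0.30),
    ("Database server",    0.15, 0.10),
]

waiting_time = sum(wait for _, wait, _ in visits)     # 0.36 s in queues
processing_time = sum(svc for _, _, svc in visits)    # 0.45 s of service
response_time = waiting_time + processing_time        # 0.81 s total
```

Note that here nearly half the response time is queuing, not processing; that is exactly the component resource utilization counters alone do not show.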
A conceptual model of the same three-tiered virtualized application deployed on one physical host can be seen in the figure below.
Our conceptual models immediately highlight two major differences between non-virtualized and virtualized deployments:
- A virtualized setup has resources shared among a number of VMs.
- Each VM can use its own passive resources and the host’s shared resources only when the hypervisor allocates a time slice to the VM.
Those particulars of virtual deployment have profound performance implications, discussed next.
Let’s consider what can cause a delay in transaction processing in a non-virtualized application deployment. As the conceptual model in Figure 1 suggests, a transaction delay can be caused by two circumstances: a shortage of active resource capacity and limited access to an active resource because of a shortage of passive resources.
For example, in order to process a transaction in a server’s CPUs, the application has to request and receive a particular memory space. But what happens if that memory is not available? Obviously, the transaction will wait in a queue. This means that transaction processing is delayed as a result of limited access to the CPUs even if they are not fully utilized. In general, a shortage of passive resources causes underutilization of active resources.
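The memory example can be put into numbers. The figures below are hypothetical, chosen only to show how a passive resource shortage caps active resource utilization:

```python
# Passive resource: memory. Each in-flight transaction holds 512 MB
# and the server has 2 GB, so at most 4 transactions can run
# concurrently no matter how many are waiting in the queue.
total_memory_mb = 2048
memory_per_transaction_mb = 512
max_concurrent = total_memory_mb // memory_per_transaction_mb   # 4

# Active resource: CPU. Assuming (for illustration) that each
# in-flight transaction keeps roughly one of 16 cores busy, the
# memory shortage caps CPU utilization at 25% even while
# transactions queue up for the idle CPUs.
cpu_cores = 16
max_cpu_utilization = max_concurrent / cpu_cores                # 0.25
```

The queue in front of an only 25%-utilized CPU is the telltale sign of a passive resource bottleneck.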
To identify where bottlenecks can potentially take place, we have to monitor not only the active resources but also the passive ones. Among the latter there are physical resources (memory and disk space) as well as logical programmatic objects (software threads, connection pools, locks, semaphores, etc.). Monitoring passive programmatic resources requires knowledge of the application architecture and software internal logistics.
The model in Figure 1 reveals that bottlenecks in our application might happen when there are insufficient CPU and I/O capacities in the Web, Application, and Database servers, as well as when the servers have limited memory, the application spawns insufficient software threads, or connection pools are poorly tuned. Low network throughput can also cause bottlenecks.
The conceptual models of virtualized deployment include time slice constructs as well as the resources shared among VMs hosted on the same physical host.
The time slice constructs point to the very important fact that, in order for a VM to process transactions, it has to receive from the host’s hypervisor a particular time interval of access to the host’s physical hardware. Time slice allocation algorithms vary among virtual platforms from different vendors, but the commonality among them all is that a time slice gets allocated to a VM only when the VM’s demand for a specific shared resource is met by resource availability. If that is not the case, the VM will wait and will not process transactions.
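That commonality can be sketched as a simple predicate. This is a hypothetical abstraction of the condition, not any vendor’s actual scheduling algorithm:

```python
def can_receive_time_slice(vm_demand: set, host_free: set) -> bool:
    """A VM gets a time slice only when every shared resource it
    currently demands is available on the host."""
    return vm_demand.issubset(host_free)

# The host has CPU and NIC free, but the disk is busy:
host_free = {"cpu", "nic"}
runs = can_receive_time_slice({"cpu", "nic"}, host_free)    # True: VM runs
waits = can_receive_time_slice({"cpu", "disk"}, host_free)  # False: VM waits
```

While `can_receive_time_slice` is False, the VM sits in a queue and every transaction inside it is delayed.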
One significant implication of a host’s hardware time sharing is flawed timekeeping by VMs. A detailed discussion can be found in [1, 2]; here we just point at its fundamental reason: in a virtual environment the hypervisor treats a guest VM as a process that can be stopped and resumed at any time. When a guest VM is stopped, it cannot accept and count time interrupts from the hardware clock. That means the guest VM misses time intervals, which makes the time-dependent performance metrics reported by the VM unrepresentative. That affects the most popular performance counter groups:
- Counters reporting utilization of system resources during some time interval (for example, percent of total CPU utilization, percent of CPU utilization by a particular process, percent of physical disk utilization).
- Resource throughput measured as a number of operations executed by a resource during a particular time interval (for example, network throughput in bytes/second, number of I/O reads/second).
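The skew can be illustrated with a hypothetical calculation: a guest that was descheduled for part of an interval effectively divides its busy time by too short an interval.

```python
# Illustrative figures: over 10 s of wall-clock time the hypervisor
# kept the guest VM stopped for 4 s, and the guest consumed 3 s of CPU.
wall_clock_s = 10.0
descheduled_s = 4.0
cpu_busy_s = 3.0

# True utilization over the real interval:
true_utilization = cpu_busy_s / wall_clock_s                        # 0.30

# The guest counted only the clock ticks delivered while it was
# running, so it measures over a 6 s interval instead of 10 s:
reported_utilization = cpu_busy_s / (wall_clock_s - descheduled_s)  # 0.50
```

The guest believes it is 50% busy while the host-side truth is 30%; throughput counters are distorted by the same missing-ticks mechanism.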
Taking this fact into consideration, what performance counters report non-skewed data in virtualized environments? To find out, let’s refer to our conceptual model in Figure 2. Each resource depicted in the model has a waiting queue; acquiring a time slice also requires a VM to wait in a queue. The queues are the major phenomenon defining system performance, because waiting time in the queues adds to the time a transaction is processed by system resources. A queue is an indicator of imbalance between the demand generated by fluctuating user workload and the availability of system resources to satisfy that demand. As such, while analyzing a performance bottleneck, it is necessary to find out where in the system the queues are building up and exceeding acceptable thresholds. Because instantaneous counts of queue lengths do not depend on the implementation of a system’s timekeeping mechanism, monitoring resource queues delivers representative performance metrics for virtualized environments.
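A minimal sketch of that monitoring approach, with hypothetical resource names and samples: take instantaneous queue-length snapshots and flag resources whose average queue exceeds an acceptable threshold.

```python
# Instantaneous queue-length samples per resource (illustrative).
# Counting queued transactions needs no guest timekeeping, so these
# samples are not skewed by missed clock interrupts.
queue_samples = {
    "db.connections": [0, 2, 9, 11, 8],
    "app.threads":    [0, 0, 1, 0, 0],
}

def bottleneck_candidates(samples: dict, threshold: float) -> list:
    """Resources whose average instantaneous queue length exceeds
    the acceptable threshold."""
    return [resource for resource, queue in samples.items()
            if sum(queue) / len(queue) > threshold]

hot = bottleneck_candidates(queue_samples, threshold=5)
# → ['db.connections']: average queue of 6 exceeds the threshold
```

Whether the threshold is 5 or some other value is a service-level decision; the point is that queue counts remain trustworthy inside a VM.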
A performance impact of resource sharing depends on the specifics of application and VMs deployments on a physical server.
A physical server can host a single application installed on a few VMs, as shown in Figure 2. A transaction that has to be processed on all tiers (Web, Application, and Database) will travel among different VMs. Transaction transitions between VMs increase transaction response time because of delays in time slice allocations to each VM. In such a deployment the application’s software components residing on different VMs compete for shared host resources.
Another deployment schema features a few applications, each one fully installed on a dedicated VM.
The conceptual model implies that, in this case, a full cycle of transaction processing can occur during a single time slice, minimizing transaction waits for time slices. There is still competition for shared resources; however, in this case it is among different applications rather than among the software components of the same application.
The conceptual models expose application fundamentals distilled from the innumerable application particulars that conceal the roots of performance issues. The value of conceptual models for performance analysis is confirmed by the following findings based on examination of the article’s models:
- A transaction delay can be caused by two circumstances: a shortage of active resource capacity and limited access to an active resource because of a shortage of passive resources.
- A shortage of the passive resources causes underutilization of the active resources.
- To identify where the bottlenecks can take place, we have to monitor not only active resources, but also passive resources.
- In a virtual environment a time slice gets allocated to a VM only when the VM’s demand for a specific shared resource is met by resource availability. If that is not the case, the VM will wait and will not process transactions.
- Time-dependent performance metrics reported by a VM are not representative, since a guest VM misses time interval counts.
- Because instantaneous counts of the queue lengths do not depend on implementation of a system timekeeping mechanism, monitoring resource queues delivers representative performance metrics for virtualized environments.
The application conceptual models are indispensable instruments streamlining performance analysis. They point to the fact that performance degradation happens when a resource does not have sufficient capacity or access to the resource is limited. In both cases transaction processing will be delayed as the transaction is placed into a waiting queue.
Queuing is the major phenomenon defining application performance, but conceptual models cannot quantitatively assess its impact. If we want numerical estimates of the system capacity that delivers application performance according to a service level agreement, we have to build queuing models on top of the application conceptual models. The book [3] can be a guide in that journey.
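As a taste of that quantitative step, here is the simplest queuing model (a single M/M/1 queue) applied to one resource. The service time and arrival rate below are hypothetical figures:

```python
def mm1_response_time(service_time_s: float,
                      arrival_rate_per_s: float) -> float:
    """Average response time of an M/M/1 queue: R = S / (1 - U),
    where utilization U = arrival rate * service time. The queue is
    stable only while U < 1."""
    utilization = arrival_rate_per_s * service_time_s
    if utilization >= 1.0:
        raise ValueError("unstable: demand exceeds resource capacity")
    return service_time_s / (1.0 - utilization)

# A 0.1 s service time at 8 transactions/s gives U = 0.8, so queuing
# inflates 0.1 s of processing into a 0.5 s response time:
response = mm1_response_time(0.1, 8.0)
```

This is exactly the kind of numerical estimate a conceptual model cannot produce: the model shows where the queue is, while the queuing model says how much it costs.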
1. VMware document: “Timekeeping in VMware Virtual Machines” http://tinyurl.com/2e9tdnv
2. Bernd Harzog. “White Paper: Application Performance Management for Virtualized and Cloud based Environments” http://tinyurl.com/mphrjae
3. Leonid Grinshpan. Solving Enterprise Applications Performance Puzzles: Queuing Models to the Rescue, Wiley-IEEE Press, 1st edition, 2012 (http://tinyurl.com/7hbalv5)