Performance Modeling and Capacity Planning in Application Performance Management (APM)
By Madhu Tanikella
To ensure that an IT system's performance and scalability are in line with the Non-Functional Requirements (NFRs) defined during the requirement-gathering phase, multiple performance assessment and validation activities have to be carried out across the SDLC phases. This article highlights Performance Modeling and Capacity Planning in particular: their suitability to the Design and Test phases, implementation know-how, and their advantages.
Performance Modeling is one of the performance engineering activities that can be adopted during the Design phase. The main objective is to gain insights into the performance of a given IT system early on and fix any performance issues that arise from bad design or a wrong choice of architectural components. As we all know, it is very expensive and time consuming to fix a bad design/architecture once the issue is identified at the very end of development. Since the system is not fully built in the Design phase, it is suggested to carry out the performance modeling activity using a Prototype or Proof-of-Concept (PoC) built around the most business-critical processes, and to refine the model in later stages.
Performance Modeling can also be a cost-effective solution in the Test phase for the following scenarios, where the objective is to understand the performance behavior of a given IT system without actually performing load/performance tests at the final Concurrent Load – in such cases, performance metrics can be captured at lower concurrent loads and the behavior at the target load predicted using Performance Models.
• Licensing cost of Load Simulation tools shoots up if the application has to be certified for very high Concurrent Virtual User counts (such as 2000, 5000, etc.)
• Creating a Test Environment that is an exact replica of Production in terms of Hardware Configuration might not be possible for every application, due to long hardware procurement cycles or budget limitations.
Fig. 1 below shows a simple representation of Performance Models built using Queuing Network Models (QNM), which take the Service Demand of different Resources such as CPU, Memory, Disk, etc. as input and generate Response Time, Resource Utilization, and Throughput metrics as output.
As we all know, QNMs are time-tested and widely used in telecommunications, operations research, and traffic engineering, and they can be applied to any system that can be viewed as a network of Resources and Queues – which makes them an apt choice for software systems as well.
Following is the high level procedure for Performance Modeling exercise using QNM technique:
• Identify the processing layers of the IT system such as WebServer, Application Server, or Database Server and choose which one of them should be modeled.
• Identify compute-intensive resources such as CPU, Memory, Disk, and Network within each layer. For instance:
√ CPU and Disk are the most compute-intensive resources for Database Servers
√ CPU and Memory are the most compute-intensive for Application Servers, Middleware, and Web Servers
• Based on the application's nature and user activity, choose a Queuing Model – OPEN or CLOSED
√ An OPEN model typically suits ONLINE and MIDDLEWARE applications, for which the user population is not fixed and arrivals are continuous
√ A CLOSED model suits Batch Systems, for which the user population is fixed.
• Execute a moderate load test, capture the %Utilization of CPU, Disk, and Memory, and calculate the Service Demand of each Resource (per the Utilization Law, Service Demand = Utilization / Throughput).
• Use the Service Demand and Arrival Rate as inputs to the QNM and derive performance numbers – listed below are some of the key outputs of Performance Models.
√ CPU - %CPU Utilization, CPU Queue Length, CPU Residence Time, CPU Wait Time
√ DISK - %DISK Utilization, DISK Queue Length, DISK Residence Time, DISK WAIT Time
√ Performance Characteristics – Avg Response Time & Throughput
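The procedure above can be sketched as a small calculation. This is an illustrative sketch only, not the exact model behind Fig. 1: it treats each resource as an independent open M/M/1 queue, and the resource names, arrival rate, and Service Demand values are hypothetical.

```python
# Minimal open-QNM sketch: each resource is modeled as an M/M/1 queue.
# Inputs: arrival rate (requests/sec) and per-resource Service Demands
# (sec/request), e.g. derived via the Utilization Law D = U / X from a
# moderate load test. All names and numbers here are hypothetical.

def open_qnm(arrival_rate, service_demands):
    """Per-resource utilization, residence time, wait time, queue length."""
    results = {}
    for resource, demand in service_demands.items():
        utilization = arrival_rate * demand              # Utilization Law: U = X * D
        if utilization >= 1.0:
            # Resource is saturated: queues grow without bound.
            results[resource] = {"utilization": utilization,
                                 "residence_time": float("inf"),
                                 "queue_length": float("inf")}
            continue
        residence = demand / (1.0 - utilization)         # R = D / (1 - U)
        results[resource] = {
            "utilization": utilization,
            "residence_time": residence,
            "wait_time": residence - demand,             # time spent queueing only
            "queue_length": utilization / (1.0 - utilization),  # N = U / (1 - U)
        }
    return results

# Hypothetical Service Demands (seconds/request) for one server layer.
demands = {"cpu": 0.020, "disk": 0.030}
metrics = open_qnm(10.0, demands)                        # 10 requests/sec
response_time = sum(m["residence_time"] for m in metrics.values())
print(metrics["cpu"]["utilization"])                     # 0.2 (20% busy)
print(round(response_time, 4))                           # avg response time (sec)
```

The average response time is the sum of per-resource residence times, and the throughput of a stable open system equals its arrival rate.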
Figures 2, 3, and 4 are snapshots of Performance Modeling results using QNM for a Messaging Middleware application implemented on a J2EE Application Server for one of the biggest Telecom service providers in the UK.
√ The QNM models are evaluated using data captured from two performance runs in a Test Environment, at 2726 Msg/Hr and 5408 Msg/Hr
√ When the results from the Performance Models are compared with actual test results, the observed deviation is around 3-13% for CPU Utilization, Disk Utilization, and Average Processing Time per Message.
√ Also, the performance predicted by the QNM Model for a 13628 Msg/Hr load (without actually executing a load test) shows that DISK UTILIZATION reaches 84% and 186% on the two disks respectively, which implies a serious DISK I/O bottleneck on the 2nd DISK.
√ To verify the model, a load test for 13628 Msg/Hr is executed; it does not succeed, due to 100% Disk Busy on the 2nd disk, which in turn causes heavy exceptions in message processing.
√ Close observation of the actual data indicates that the 2nd DISK already shows 75% DISK Utilization at the 5408 Msg/Hr load (well below the 13628 Msg/Hr peak load)
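A back-of-envelope check of the disk prediction above can be done with the Utilization Law alone: derive the disk's Service Demand from the measured 75% utilization at 5408 Msg/Hr, then project the utilization at the 13628 Msg/Hr peak. A projection above 100% means the disk cannot sustain that throughput. (The small gap versus the 186% figure reported above is expected, since the full model is fitted on both performance runs.)

```python
# Utilization Law (U = X * D) projection for the 2nd disk, using the
# figures quoted in the case study above.
measured_throughput = 5408.0        # Msg/Hr at which the disk was measured
measured_utilization = 0.75         # 75% busy on the 2nd disk
peak_throughput = 13628.0           # target load, never actually tested

service_demand = measured_utilization / measured_throughput   # hours per message
predicted_utilization = peak_throughput * service_demand

print(round(predicted_utilization, 2))   # 1.89, i.e. ~189%: a saturated disk
```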
On the other hand, Capacity Planning refers to the process of determining the hardware capacity required for a given IT system to meet current/future workloads without impacting the system's performance and scalability. Typically, this activity first takes place in the Test phase, before rolling out a system to Production, when hardware capacity must be decided for the future concurrent load. Thereafter, it is an ongoing activity, especially when an additional user base is expected due to mergers & acquisitions, new business features, server consolidation, or a major new marketing campaign, to name a few.
In either scenario, the procedure for Capacity Planning starts with:
√ Monitoring the Resource Utilization metrics of various Servers in the application landscape - such as CPU Utilization, Physical Memory Utilization, Disk Utilization, %Disk Busy, Network Bandwidth Usage
√ Using this monitoring data, create Trend Charts for different Concurrent Loads and identify the Hardware Resources that are either close to the Maximum limit or can become potential bottlenecks in future.
√ To forecast the future hardware infrastructure, Analytical Models using QNM or Extrapolations (linear or non-linear) can be used – since a completely built system is available in either the Test or PROD environment, the input data for these Models is more accurate (compared to the Design phase) and hence the predicted results will be more accurate too.
In my view, a Capacity Planning (CP) exercise should be done on a completely tuned system so that the predicted results are more accurate. That said, CP on a non-tuned system can still help in getting the tuning effort approved by the business.
Fig. 5 shows the results of a Capacity Planning exercise carried out for a Credit Origination system at one of the largest Automobile companies in the UK. The application provides functionality to dealer agents ranging from providing Quotes to Customers, to Selling Products, to Finalizing Credit Proposals. The goal is to find out whether any additional hardware is required for the 2000-concurrent-user load expected in the next 12 months. Performance tests are conducted in the Test Environment at 1/4th and 1/8th of the target load; CPU, Memory, and JVM Heap metrics are captured on all servers, and Linear Extrapolation is applied.
The predicted CPU Utilization for the 2000-user load on the Portal Server and Application Server is 220% and 250% respectively, indicating that the current CPU capacity (2 CPUs on each server) cannot support this load – hence 4 additional CPUs need to be added.
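The extrapolation mechanics can be sketched as follows. The two low-load measurements here are hypothetical (the article reports only the extrapolated figures); the point is how a line fitted through test results at 1/8th and 1/4th of the target load is projected out to 2000 users.

```python
# Linear-extrapolation sketch for CPU sizing. Measurements are
# hypothetical %CPU readings at 250 and 500 concurrent users.

def fit_line(p1, p2):
    """Slope and intercept of the line through two (load, %cpu) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

# Hypothetical measurements at 1/8th and 1/4th of the 2000-user target.
slope, intercept = fit_line((250, 28.0), (500, 55.0))

predicted_cpu = slope * 2000 + intercept   # projected %CPU at 2000 users
print(round(predicted_cpu))                # ~217% of current 2-CPU capacity
```

A projection well above 100% of current capacity, as here, is the signal that additional CPUs (or servers) are needed before the target load arrives.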
Here I would like to mention some best practices that can be followed to make Performance Modeling and Capacity Planning more effective.
• Performance Modeling
o Identify the objective at hand and weigh the trade-offs of Accuracy vs. Cost vs. required Time/Effort – making a Performance Model too complicated by including each and every 'Resource' and 'Server' turns it into a bulky Mathematical Model that is difficult to use for real-life performance problems. Also, not every application/system requires Performance Modeling, as some performance issues are localized to a specific component and can be addressed using activities such as code profiling, tuning, and continuous monitoring.
o Keep in mind that Performance Modeling is about finding performance bottlenecks early in the life cycle, not about predicting performance with 100% accuracy.
o Try to model server components in isolation first and then combine them – this helps quickly isolate the problematic component.
o Where vendor-published Performance Benchmarks exist for the chosen software, make use of them in combination with your own Performance Modeling exercise for effective decision making.
o For business processes that involve frequent 3rd-party interactions from your system, have stringent performance SLAs and contracts in place – Performance Modeling and prediction are of less help in such cases (as the performance of external systems is not in your control).
• Capacity Planning
o Ensure that the given application is tuned to the maximum possible extent so that any avoidable resource contention is eliminated – this ensures that the predicted capacity results are accurate.
o Always try to have a minimum of 3-4 data points from the Test Environment to understand the application's scalability pattern – this will help in choosing between Linear and Non-Linear models for capacity forecasting.
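One simple way to act on the 3-4 data-point advice: with utilization measured at equally spaced loads, near-zero second differences suggest linear scaling (so linear extrapolation is safe), while growing second differences suggest a non-linear model is needed. The %CPU series below are hypothetical.

```python
# Linearity check on equally spaced load points via second differences.

def second_differences(values):
    """Second differences of a series; ~0 everywhere means linear growth."""
    first = [b - a for a, b in zip(values, values[1:])]
    return [b - a for a, b in zip(first, first[1:])]

# Hypothetical %CPU at 250, 500, 750, and 1000 users.
linear_like = [28.0, 55.0, 82.0, 109.0]   # constant increments per step
curved = [28.0, 55.0, 95.0, 160.0]        # accelerating growth

print(second_differences(linear_like))    # [0.0, 0.0] -> linear model OK
print(second_differences(curved))         # [13.0, 25.0] -> use a non-linear fit
```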
Performance Modeling and Capacity Planning activities definitely help gain insights into an application's performance characteristics and help fix issues early in the application lifecycle. The illustrations shared in this article show that QNM and Extrapolation techniques can be used on live projects to predict performance within certain deviations, which vary for each application/system. Due diligence should be done to figure out the applicability of these techniques for each IT system in order to gain the benefits of performance prediction and capacity forecasting.