A Tiered Approach to Performance Engineering

April, 2009
by Robert Jahn

About the Author
Robert Jahn

Robert Jahn is a Principal Performance Architect at Collaborative Consulting, LLC, a business and technology consulting firm headquartered in Boston, Massachusetts. Collaborative Consulting supports business needs efficiently and effectively by providing companies with the flexibility to allow for clear choices regarding time, money and system constraints, and even less tangible pressures such as politics, personal preferences, vendor relationships and other factors.

Robert has 17 years of experience in project management, systems analysis and software development. He is also a certified Project Management Professional (PMP) and has led numerous application performance assessments and testing projects for Fortune 500 companies in the Northeast. He holds a BS in Computer Science and an MS in Engineering from Lehigh University.

Summary

This article outlines a proven technique for performance characterization within enterprise applications. This technique uses a project-based approach that applies increasing levels or tiers of analysis to meet business goals.  It advocates up-front scoping of deliverables and the clarification and setting of project expectations. Furthermore, the outlined approach can be used at any phase of the software development life cycle to provide factual data for systems and business planning.  The tiered approach outlined in this article expands upon the Performance Engineering methodology used by Collaborative Consulting using insights gained from the author’s real-world experience.

The Challenge

Many IT professionals face the challenge of answering business questions such as:

          What is the total capacity we can support on the existing infrastructure?

          What is causing the error rate and performance degradation trends we are seeing?

          What additional hardware, configuration changes, or application changes are required to keep the system stable and scalable?

          How do we address frequent and inexplicable application outages?

          Will performance be maintained following the upcoming upgrades, upcoming client implementation, and business acquisition?

Despite the clear need for performance management activities earlier in the software development life cycle, the reality is that many applications go into production with little to no performance testing or system characterization. This is due to a variety of reasons such as time-to-market pressures, short term budget constraints, unplanned issues with new technologies, or just lack of planning. 

Furthermore, once in production these same systems grow in complexity as new enhancements and new technologies are introduced, further complicating the path to answering the business questions outlined above.  As a result, many of the key inputs to answering these questions are incomplete or missing at the time business questions are asked.  These inputs include:

          Clearly defined service levels

          Technical design documents

          Monitoring tools and historical data

          Business requirements

          Subject Matter Experts

          Automation test engineers

          Performance test environment

So, what should we do and where do we begin?

Tiered Approach

To meet the project goals and create the deliverables, I propose an approach that employs increasing Tiers of Analysis.  This approach allows for proper planning and allocation of the right resources as more information is gathered and analyzed.  The word “tier” is used rather than “phase” for two reasons: it denotes each tier's relative level of effort and invasiveness, and it signals that project objectives could be met at any tier, depending on the goals.

Figure 1 below summarizes the three tiers detailed in the next sections of this article.

Figure 1

 

What is the goal?

It is important to define the project goals and ultimate deliverables before the effort begins.  This way, the risks and costs can be clearly understood and expectations can be set up front.  A Project Charter is a good vehicle to document scope, budget, and resources required.

Figure 2 below outlines typical deliverables grouped left to right in terms of tactical to strategic.

Figure 2

Typically, a Senior Performance Engineer (SPE) leads these activities and leverages a virtual team of technical experts from operations and development.  To ensure success, the SPE needs to be politically empowered to engage other teams and drive project tasks.  For larger efforts or in organizations with distributed functional teams, a project manager may also be needed to coordinate and engage the various functional teams.

Tier One – Observe & Analyze

The Tier One effort aims to observe and analyze the current state of the application and processes, and to gather information that may not be documented.  Tier One activities are the least invasive to the system because nothing other than observation and analysis is performed.

Most of the time and effort will be spent gathering and reviewing information from various sources.  Typically, the information spans departments and even organizations.  Figure 3 below groups the assessment inputs into business and system sources and is followed by examples for each.

Figure 3

Below are descriptions of the inputs within Figure 3.

Business Inputs

System Requirements
  • All information related to data volumes

Business Service Levels (Internal & External)
  • Availability
  • Response time
  • Batch cycle times

Business and System Constraints
  • Communications of system go-live date
  • Planned system outages
  • Business regulatory requirements
  • Technical support windows

Business Forecasts
  • Business commitments regarding future activities
  • New service offerings
  • Business activity growth

System Inputs

Historical Data
  • Number and type of transactions, CPU and memory usage, network traffic, disk space, trends, performance

System Characterization
  • Performance – throughput, run-time
  • Resource Utilization – amount of CPU/memory required to process work
  • Scalability – what happens to system resources under increasing load?

Architectural diagrams and process flows
  • Definition of current and proposed environments in terms of logical and physical perspectives

System constraints
  • Other IT activities underway or planned that have influence on the key input areas for the system under change

One way to deal with information overload is to break the information into categories and prioritize each by its benefit to the assessment and the level of effort needed to obtain it.  Any information or access to systems that is deemed critical to project success must be put into the risk management plan and brought to the attention of the project sponsors.  Figure 4 below depicts one classification scheme.

Figure 4
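
One way to make such a classification concrete is to score each information source on benefit and effort and then sort on those scores. The Python sketch below is illustrative only: the InfoItem fields, the 1-to-3 scale, and the sample items are assumptions made for this example and are not taken from the scheme in Figure 4.

```python
from dataclasses import dataclass


@dataclass
class InfoItem:
    """One piece of information the assessment could collect (hypothetical model)."""
    name: str
    benefit: int            # assumed scale: 1 (low) to 3 (high) value to the assessment
    effort: int             # assumed scale: 1 (low) to 3 (high) cost to obtain
    critical: bool = False  # deemed critical to project success


def prioritize(items):
    """Order items by highest benefit first, then by lowest effort."""
    return sorted(items, key=lambda item: (-item.benefit, item.effort))


def risk_register(items):
    """Critical items go into the risk management plan for sponsor attention."""
    return [item.name for item in items if item.critical]


# Illustrative sample data, not drawn from a real assessment.
items = [
    InfoItem("Vendor tuning guide", benefit=1, effort=1),
    InfoItem("Documented service levels", benefit=3, effort=3, critical=True),
    InfoItem("Architecture diagrams", benefit=2, effort=2),
    InfoItem("Production access logs", benefit=3, effort=1),
]

ordered_names = [item.name for item in prioritize(items)]
escalate = risk_register(items)
```

Sorting on the pair (negative benefit, effort) surfaces the quick wins first, while the separate risk register keeps critical items visible to sponsors no matter where they fall in the ordering.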

 

To recap, Table 1 below is a summary of the key tasks for the Tier One effort.

Table 1

Tier One Key Tasks

  1. Define scope
    • Get an overview of the issue and business goals
    • Define deliverables
    • Identify key personnel to work with
    • Define environments, processes and systems in scope

  2. Gather and review information
    • Business requirements, forecasts, service levels, process flows
    • System architecture, configuration, hardware, historical logs

  3. Review the current state of testing (i.e., does the organization have the following?)
    • Application testing infrastructure
    • Predictive models
    • Performance engineers, tools, testing methodology
    • Scripts and test data

  4. Document observations

  5. Define strategy for Tier Two effort (i.e. Resources, Time, Test Plan)

Tier Two: Observe and Locate

The Tier Two effort seeks to reproduce the problems in order to locate issues.  To do this, a workload model must first be defined, if one was not already created during the Tier One effort.  The model will serve as input for the development of performance test scenarios.  Table 2 below is a summary of the key tasks for the Tier Two effort.

Table 2

Tier Two Key Tasks

  1. Analysis
    • Develop initial workload model
    • Perform code analysis and code profiling

  2. Test Planning
    • Define test case scenarios (loads, data, functionality)
    • Define environment and application configuration requirements
    • Define verification steps and activities
    • Define measurement requirements
    • Define expected outcomes
    • Identify testing constraints

  3. Environment setup
    • Set up databases: refresh, data load
    • Set up infrastructure components
    • Set up monitoring and analysis tools

  4. Test data and test scripts
    • Identify user logons and reset passwords
    • Develop and smoke test automated scripts
    • Test each user logon and ensure they pass the use cases

  5. Testing
    • Smoke test environment and review that all monitoring and logging are enabled and that automated scripts function properly
    • Schedule time with other groups and run tests
    • Monitor tests and verify successful completion
    • Perform characterization testing
    • Aggregate test results, logs, traces, monitor output

  6. Testing analysis
    • Analysis of test results
    • Publish test results and findings

  7. Implement improvements

  8. Validate and document improvements
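
The characterization testing step above asks how the system behaves as load increases. As a minimal sketch, assuming stepped-load test results are available as (concurrent users, throughput) pairs, the following Python computes scaling efficiency relative to the lightest load step and reports the first step where efficiency falls below a chosen threshold. The function names, the 0.8 threshold, and the sample numbers are hypothetical and only illustrate the idea.

```python
def scaling_efficiency(results):
    """Return (users, efficiency) pairs relative to the first load step.

    Efficiency is 1.0 when throughput grows in proportion to load and
    drops toward 0 as the system saturates.
    """
    base_users, base_tps = results[0]
    return [
        (users, (tps / base_tps) / (users / base_users))
        for users, tps in results
    ]


def first_knee(results, threshold=0.8):
    """Return the first load level whose efficiency is below the threshold."""
    for users, efficiency in scaling_efficiency(results):
        if efficiency < threshold:
            return users
    return None  # no saturation observed within the tested range


# Made-up stepped-load results: (concurrent users, transactions per second).
results = [(50, 100.0), (100, 198.0), (200, 380.0), (400, 520.0), (800, 540.0)]

knee = first_knee(results)
```

With these sample numbers, throughput stops keeping pace with load at the 400-user step, which would be the place to focus monitoring and profiling in later test runs.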

As in the Tier One effort, it is critical to define goals, document the requirements and determine the resources required to complete the work.  Also, before a test is run, the testing complexities, constraints, and risks must be identified and reviewed with project sponsors.  For example:

  • Is suitable hardware available?
  • Is the network bandwidth appropriate?
  • Are there licensing requirements?
  • Where will tests be executed? At the client site? In a third-party lab?
  • When does the business need results? Are these realistic timelines?

A questionnaire can be an effective way to gather requirements from the business.  These operational needs or business drivers are the critical starting point for developing the overall performance and capacity goals and application workload model.   Table 3 outlines topics to include in a business questionnaire.

 

Table 3

Business Questionnaire Topics to Review

Application End User Populations

Understanding the number and type of end users is important since it relates to transactions they perform and how often they perform them.

  • Break out counts by role
  • Forecast Growth Rate or other anticipated changes

Participant Populations

Knowing these volumes and their breakout may be relevant to transactions such as extracts, reports, and on-line searches.

  • Break out population counts by role
  • Forecast Growth Rate or other anticipated changes

Business Data

Knowing these volumes and breakout may be relevant to transactions such as extracts, reports, and on-line searches.

  • Break out counts or ratios by Key Business Entity e.g. Number of Orders, Orders Per Customer
  • Forecast growth rate or other anticipated changes

Transaction Volumes by Role

Performance and capacity analysis is concerned with high-volume and business-critical functions. For example, some admin functions are very low volume, such as ten times a day, but search may run several thousand times per hour.

  • Break out by time of day
  • Break out transaction mix or use cases by role
  • Break out length and frequency of user sessions
  • Break out transaction counts by session (e.g. reports, searches, approvals)

Forecast Growth Rate or Other Anticipated Changes

Provide the calendar of seasonal or business cycle events (e.g. what will happen on the day of launch, what happens during a review cycle). Also, provide the rollout schedule for end users (date, role, counts).

Batch Jobs

Name, frequency, start and stop times, service levels

Interface Files

Name, frequency, start and stop times, service levels

Service Levels / Response Time Targets

List any internal or external service levels or performance requirements.

  • Break out by Business Hours versus non-Business Hours
  • Break out by page type such as login, homepage, report

Other Critical Business Functions with Performance Requirements

A function may not be high volume but may still be critical.
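
The questionnaire answers above can be turned into the first numbers of a workload model. The Python sketch below, offered as one possible starting point rather than a prescribed method, converts peak-hour transaction counts into per-second arrival rates and a transaction mix, then uses Little's Law for interactive systems (N = X × (R + Z)) to estimate how many concurrent users a load test would need to simulate. The transaction names, counts, response time, and think time are all assumed values for illustration.

```python
def per_second_rates(peak_hour_counts):
    """Convert peak-hour transaction counts into arrival rates per second."""
    return {name: count / 3600 for name, count in peak_hour_counts.items()}


def transaction_mix(peak_hour_counts):
    """Share of each transaction type in the total peak-hour volume."""
    total = sum(peak_hour_counts.values())
    return {name: count / total for name, count in peak_hour_counts.items()}


def concurrent_users(throughput_per_s, response_time_s, think_time_s):
    """Little's Law for an interactive system: N = X * (R + Z)."""
    return throughput_per_s * (response_time_s + think_time_s)


# Hypothetical questionnaire answers: transactions during the busiest hour.
peak_hour_counts = {"search": 7200, "approval": 1440, "report": 360}

rates = per_second_rates(peak_hour_counts)
mix = transaction_mix(peak_hour_counts)
total_rate = sum(rates.values())

# Assumed 2 s average response time and 28 s average think time.
users_to_simulate = concurrent_users(
    total_rate, response_time_s=2.0, think_time_s=28.0
)
```

Keeping the mix separate from the total rate makes it easy to rerun the same model against a forecast growth rate: scale the counts, and the mix and user estimate follow.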

Tier Three: Analyze & Define Alternatives

Whereas the Tier Two effort aims to locate the issue, Tier Three aims to address it.  To do this, alternative solutions must be defined and benchmarked. Alternatives can include coding changes, infrastructure changes, configuration changes, or a combination of these.  Benchmarking is then an iterative process of testing and analyzing alternatives until an acceptable result is achieved.  Table 4 below is a summary of the key tasks for the Tier Three effort.

Table 4

Tier Three Key Tasks

  1. Analyze and define alternatives to benchmark

  2. Business Review

  3. Project Planning

  4. Repeat Tier Two activities

  5. Refine workload models

  6. Implementation planning for phased rollouts
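
The iterative benchmarking described in this tier needs a consistent way to decide when a result is acceptable. As a hedged sketch, assuming each alternative has been tested and its response times collected, the Python below compares the 90th-percentile response time of each alternative against a service-level target. The choice of the 90th percentile, the 2.5-second target, the alternative names, and the sample timings are assumptions for illustration only.

```python
import statistics


def p90(samples):
    """90th percentile, using the standard library's inclusive method."""
    return statistics.quantiles(samples, n=10, method="inclusive")[-1]


def evaluate(alternatives, target_s):
    """Map each alternative to (p90 response time, meets target?)."""
    report = {}
    for name, samples in alternatives.items():
        value = p90(samples)
        report[name] = (value, value <= target_s)
    return report


# Made-up response times in seconds from two benchmark runs.
alternatives = {
    "baseline": [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 3.0, 4.5],
    "config_change": [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 2.0],
}

report = evaluate(alternatives, target_s=2.5)
acceptable = [name for name, (_, ok) in report.items() if ok]
```

A percentile is used here instead of an average because a few slow requests can hide behind a healthy mean; the target and percentile should come from the service levels gathered in the earlier tiers.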

Parting Thoughts

Performance assessments can be very exciting: there is much to learn, and access to a wide range of people is a rare opportunity. On the other hand, there is also a lot of pressure on the assessment team to deliver a solution when senior management has made the project a high priority.  The key to success, then, is to use a methodical, requirements-driven, project-based approach as outlined in this article.