MeasureIT

Analyzing and Reporting Servers and Applications in a Linux z/Series Environment
February, 2007
by Rick Isom

About the Author
Rick Isom

Rick Isom is a capacity planner with the capacity planning group at Nationwide Insurance in Columbus, OH. He has over 35 years of experience in computer operations, computer performance reporting, and capacity planning. He has published articles in Mainframe Journal, Enterprise Systems Journal, and the Journal of Computer Resource Management. He has presented several times at CMG's yearly conference. He can be reached at 614-249-6580.

Introduction
A major Linux virtualization project was initiated to move a large number of servers to two z/Series IFL processors. The key to the project is the flexible infrastructure of a z/Series IFL processor, which can house numerous individual Linux servers, share resources among them, and greatly reduce resource waste. Unlike the typical discrete server farm environment, where the utilization rate of an individual server can average as low as 5%, a Linux server running in a virtual machine can grow or shrink its consumption of available system resources based on application demand. Processor cycles that would otherwise be wasted during times of low utilization are available for other servers.

Development of timely reports to keep management and support staff informed on computer resource usage is one of the few constants in the Information Technology environment. As the Linux project evolved, and the virtualization of server farms moved to the Linux z/Series environment, the three charts that follow provided timely updates for management. Chart A shows the weekday averages for each LPAR and the maximum CPU utilization in MIPS by hour for both the current month and the previous month.

Chart A

Chart B shows the weekday averages for each LPAR and the maximum real memory utilization by hour for both the current month and the previous month.

Chart B

Chart C is a weekly report showing the average CPU utilization in MIPS for the entire processor during prime time for each day.

Chart C

Server Resource Usage
In addition to providing the traditional reports to keep management informed of the Linux z/Series environment, capacity planners created different reporting approaches showing processor, application, and server resource usage for the support groups. The dynamics of a computing environment loaded with dozens of applications and hundreds of servers required some unique approaches to reporting.

One of the two z/Series IFL processors consolidating the Linux servers is used for Linux production applications. Presently, there are nearly 20 applications, but only two dominate the processor's CPU usage: application T, with 6 servers, and application AG, with 15 servers. This processor has four LPARs with nearly 100 servers, and servers continue to be added. The other z/Series IFL processor is dedicated to Linux testing, with nearly 200 servers. Chart D, which follows, is a sample daily report covering 24 hours. It shows the major resource variables summarized at the hourly level for those servers using over 25 MIPS per hour.

Chart D

Additionally, there is another 24 hour daily report wherein the Linux servers are mapped to an application showing the same variables for those applications over 25 MIPS.
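The two daily reports described above can be sketched as a small aggregation: interval samples are averaged per server per hour, the server averages are summed per application, and only rows over the 25 MIPS cutoff are kept. This is an illustrative sketch only. The server names, the sample values, and the choice to sum server averages into an application figure are assumptions, not details taken from the actual reports.

```python
from collections import defaultdict

# Hypothetical ten-minute samples: (server, application, hour, cpu_mips).
# Server names and values are invented for illustration only.
SAMPLES = [
    ("srv01", "T", 9, 40.0), ("srv01", "T", 9, 50.0), ("srv01", "T", 9, 60.0),
    ("srv02", "AG", 9, 10.0), ("srv02", "AG", 9, 12.0), ("srv02", "AG", 9, 14.0),
    ("srv03", "AG", 9, 20.0), ("srv03", "AG", 9, 22.0), ("srv03", "AG", 9, 24.0),
]

THRESHOLD_MIPS = 25.0  # reporting cutoff mentioned in the article


def server_hourly(samples):
    """Average the interval CPU samples into one value per (server, hour)."""
    sums = defaultdict(lambda: [0.0, 0])
    apps = {}
    for server, app, hour, mips in samples:
        entry = sums[(server, hour)]
        entry[0] += mips
        entry[1] += 1
        apps[server] = app
    averages = {key: total / count for key, (total, count) in sums.items()}
    return averages, apps


def application_hourly(averages, apps):
    """Sum the hourly server averages of every server mapped to an application."""
    totals = defaultdict(float)
    for (server, hour), mips in averages.items():
        totals[(apps[server], hour)] += mips
    return dict(totals)


def over_threshold(rows, threshold=THRESHOLD_MIPS):
    """Keep only the rows whose hourly CPU exceeds the reporting cutoff."""
    return {key: mips for key, mips in rows.items() if mips > threshold}


server_rows, server_to_app = server_hourly(SAMPLES)
app_rows = application_hourly(server_rows, server_to_app)
print("servers over cutoff:", over_threshold(server_rows))
print("applications over cutoff:", over_threshold(app_rows))
```

In this made-up data, neither AG server passes the cutoff on its own, yet application AG does once its servers are combined, which is one reason a separate application-level report can be useful.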

Application Resource Usage
The following report for the production z/Series IFL processor, Chart E, covers eight days and shows management the major applications and each application's CPU usage in MIPS by day during prime time.

Chart E

Chart E shows which applications are consuming the largest percentages of CPU usage, the number of servers, and the overall MIPS per server ratio. The report also shows that the average CPU usage for the Linux applications is 1056 MIPS.
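The derived columns of such a report (share of total CPU and MIPS per server) are simple ratios, sketched below. The server counts for applications T and AG come from the article. The MIPS figures and the "other" row are invented round numbers for illustration and are not the values in Chart E.

```python
# application: (number of servers, prime-time CPU in MIPS)
# Server counts for T and AG are from the article; all MIPS values and the
# "other" row are hypothetical.
APPLICATIONS = {
    "T": (6, 300.0),
    "AG": (15, 450.0),
    "other": (50, 250.0),
}


def chart_rows(applications):
    """Build report rows with CPU share and MIPS per server, largest consumer first."""
    total_mips = sum(mips for _, mips in applications.values())
    rows = []
    for name, (servers, mips) in applications.items():
        rows.append({
            "application": name,
            "servers": servers,
            "mips": mips,
            "share_pct": 100.0 * mips / total_mips,
            "mips_per_server": mips / servers,
        })
    return sorted(rows, key=lambda row: row["mips"], reverse=True)


rows = chart_rows(APPLICATIONS)
for row in rows:
    print(f'{row["application"]:<6} {row["servers"]:>3} servers '
          f'{row["mips"]:>7.1f} MIPS {row["share_pct"]:>5.1f}% '
          f'{row["mips_per_server"]:>6.1f} MIPS/server')
```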

Principal Component Analysis
The Linux z/Series environment is very dynamic, with dozens or sometimes hundreds of servers on a single z/Series IFL processor. Nevertheless, with the use of principal component analysis (PCA), reliable in-depth server or application analysis can be accomplished, and scenarios can be generated to determine the major variables that affect the CPU usage of an application or a server.

Principal component analysis has the following characteristics [1]:

  • It is a variable reduction procedure that results in a relatively small number of components and makes no assumptions.
  • Components are used as the predictor variables in a multiple regression analysis.
  • An eigenvalue represents the amount of variance that is attributed to a given component.
  • When the variables are standardized, the total variance is equal to the number of observed variables, which is also the number of components extracted.
  • A guideline for determining the number of components is that each retained component's eigenvalue should be greater than 0.7 and the cumulative percent should account for at least 70%, and sometimes 80%, of the total variance.
  • There should be at least three variables present to generate a multiple regression formula.
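The eigenvalue and retention guidelines listed above can be illustrated with a short NumPy sketch. It is not the SAS procedure the article relies on: it computes the eigenvalues of a correlation matrix directly, and the data is synthetic (six made-up variables driven by two hidden factors). The function names and the way the two guidelines are combined are assumptions made for this example.

```python
import numpy as np


def pca_eigenvalues(data):
    """Eigenvalues of the correlation matrix of `data`, largest first."""
    correlation = np.corrcoef(data, rowvar=False)
    # eigvalsh is meant for symmetric matrices and returns ascending real values.
    return np.linalg.eigvalsh(correlation)[::-1]


def retained_components(eigenvalues, min_eigenvalue=0.7, min_cumulative=0.70):
    """Count components above the eigenvalue cutoff and report their variance share."""
    kept = int(np.sum(eigenvalues > min_eigenvalue))
    cumulative = float(eigenvalues[:kept].sum() / eigenvalues.sum())
    return kept, cumulative, cumulative >= min_cumulative


# Synthetic stand-in data: 180 observations of 6 variables driven by 2 hidden factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(180, 2))
data = factors[:, [0, 0, 0, 1, 1, 1]] + 0.3 * rng.normal(size=(180, 6))

eigenvalues = pca_eigenvalues(data)
kept, cumulative, enough = retained_components(eigenvalues)
print("eigenvalues:", np.round(eigenvalues, 2))
print("total variance:", round(float(eigenvalues.sum()), 6))
print("components kept:", kept)
print("cumulative share:", round(cumulative, 2), "meets guideline:", enough)
```

Because the analysis works on a correlation matrix, each standardized variable contributes one unit of variance, so the eigenvalues always sum to the number of variables (six here).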

Principal component analysis can assist capacity planners in determining the relative importance of variables in order to predict CPU usage in a Linux z/Series environment. To ensure a large number of data samples, the data should be gathered and analyzed at the interval level. From a capacity planner's perspective, data provided in reports to senior management is usually more effective if summarized at the hourly level. To meet a requirement of at least fifty observations per variable, however, the raw data was gathered and analyzed at the ten minute interval level.

Like principal component analysis, multiple regression has been an important tool in the social sciences for many years, for example, in analyzing adult behavior data.[2] Multiple regression is a highly flexible procedure that allows researchers to address many different types of research questions with many different types of data.[3] Used together, these two data management tools can be employed to analyze data and elicit evidence of a significant relationship between the criterion variable and the multiple predictor variables.[4] In like manner, multiple regression and principal component analysis can assist a capacity planner in analyzing the behavior of a large application consisting of many servers, or a grouping of servers, running on a z/Series IFL processor.

Analyzing Servers
The following report covers five days during prime time and shows the major resource variables for the two major applications as well as all other applications. The statistics that follow are per server per hour. Based on the five days of data, the eigenvalues determined through principal component analysis for four variables were high enough to be used to generate a multiple regression formula.

Principal component analysis is primarily used to determine the eigenvalues, which represent the amount of variance attributable to a given component or variable. Using principal component analysis reduces the number of VM/Linux reported variables. For example, the variables reported in the Linux daily reports (Chart D) were reduced from nine to just three or four. With the variables reduced to three or four, it is easy to create a multiple regression formula.

The five days of prime time data, Monday through Friday from 9 am to 3 pm, were analyzed at the ten minute interval level, mapping servers to application T, application AG, and all other applications on the production processor.
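The effect of the interval choice on sample size can be checked with simple arithmetic. The sketch below assumes no missing intervals and reads the fifty-observation requirement as applying to each measured series from a single server. Both assumptions are made here for illustration.

```python
DAYS = 5                # Monday through Friday
HOURS_PER_DAY = 6       # prime time, 9 am to 3 pm
MIN_OBSERVATIONS = 50   # requirement quoted in the article


def observations(days, hours_per_day, interval_minutes):
    """Number of samples one server yields for one variable over the study window."""
    return days * hours_per_day * (60 // interval_minutes)


ten_minute = observations(DAYS, HOURS_PER_DAY, 10)
hourly = observations(DAYS, HOURS_PER_DAY, 60)

print("ten minute intervals:", ten_minute, ten_minute >= MIN_OBSERVATIONS)
print("hourly summaries:", hourly, hourly >= MIN_OBSERVATIONS)
```

Under these assumptions, hourly summaries alone would fall short of fifty observations for a one-week window, while ten minute intervals clear it comfortably.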

The following report, Chart F, shows averages for the six variables. Application T has 6 times the referenced resident frames at reset of application AG, and 16 times more than all the other applications. There are nearly twenty applications in the Linux production environment.

Chart F

After applying the PCA (principal component analysis) method, four variables were found to correlate strongly with CPU usage:

    - expanded storage frames occupied
    - referenced resident frames at reset
    - pages in
    - pages out

Chart G applies the multiple regression formula to two models and shows how CPU usage is lowered by reducing the referenced resident frames at reset variable for application T. Although application T has six servers, after PCA variable reduction the analysis included only the two servers where 80 percent of application T's activity takes place.

Chart G
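A two-model comparison of this kind can be sketched with ordinary least squares. The sketch below is not the formula behind Chart G: it fits a plain linear regression on the four named variables using synthetic data with invented coefficients, then compares predicted CPU usage for a hypothetical current profile against the same profile with referenced resident frames at reset lowered to 75,000, the target the article mentions. All numeric values other than that target are assumptions.

```python
import numpy as np

PREDICTORS = [
    "expanded_storage_frames_occupied",
    "referenced_resident_frames_at_reset",
    "pages_in",
    "pages_out",
]


def fit_linear_model(predictors, cpu_mips):
    """Ordinary least squares fit; returns the intercept followed by one slope per predictor."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    coefficients, *_ = np.linalg.lstsq(design, cpu_mips, rcond=None)
    return coefficients


def predict(coefficients, values):
    """Predicted CPU usage in MIPS for one set of predictor values."""
    return float(coefficients[0] + np.dot(coefficients[1:], values))


# Synthetic interval data with invented coefficients (not measured values).
rng = np.random.default_rng(1)
n = 180
x = np.column_stack([
    rng.uniform(1_000, 5_000, n),      # expanded storage frames occupied
    rng.uniform(50_000, 150_000, n),   # referenced resident frames at reset
    rng.uniform(0, 200, n),            # pages in
    rng.uniform(0, 200, n),            # pages out
])
assumed_slopes = np.array([0.002, 0.001, 0.05, 0.03])
y = 10.0 + x @ assumed_slopes + rng.normal(0.0, 2.0, n)

model = fit_linear_model(x, y)

current = np.array([3_000, 120_000, 100, 100])  # hypothetical current profile
reduced = current.copy()
reduced[1] = 75_000                             # frames lowered to the article's target

saving = predict(model, current) - predict(model, reduced)
print("model 1 (current):", round(predict(model, current), 1), "MIPS")
print("model 2 (reduced frames):", round(predict(model, reduced), 1), "MIPS")
print("estimated saving:", round(saving, 1), "MIPS")
```

Holding the other three variables fixed isolates the effect of the one variable being changed, which is the comparison the two models are meant to show.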

Analyzing Applications
Since a z/Series IFL processor could have hundreds of servers representing many applications, principal component analysis assists the capacity planner in projecting a CPU forecast by measuring just the major applications.

The following reports show that, after seven applications were identified out of 20 (previous Chart D), principal component analysis reduced the number of applications to four, as shown in Chart H. Because of the dynamic nature of the Linux environment, principal component analysis can greatly assist a capacity planner in making a forecast for all applications. It is generally not feasible to interview all personnel associated with 20 or more applications or to understand all the application characteristics, and most of the applications generally have low CPU usage.

Five days of data were analyzed during prime time, M-F from 9 am to 3 pm for all applications on the z/Series IFL production processor. Use of principal component analysis targeted seven of the applications and determined the eigenvalues were significant for four of the applications. Through use of the multiple regression formula in the following Chart H, the overall CPU usage was determined to be 1251 MIPS.

Chart H

For timely analysis and forecasts, the capacity planner can more productively concentrate on seven applications than on 20 or more. As a case example, the capacity planner had regular discussions with the application AG support people and was informed that when the rest of the AG users moved to the z/Series IFL processor, the CPU usage was expected to double. Chart I reflects this increase of 460 MIPS for application AG.

Chart I

However, in the week before the final transfer of the users, a major application software problem delayed the transfer. Let's assume that instead of application AG's CPU usage increasing to 460 MIPS, software changes before the transfer resulted in a lowered expected increase to 345 MIPS. It was also noted that the number of referenced resident frames at reset for application T was extremely high for two of the application's six servers. These two servers account for 75% of the application's CPU usage. As a result of including this data in reports sent by capacity planning to the application T support people, the high volume for referenced resident frames at reset is being investigated.

As shown in Chart G, if the referenced resident frames at reset variable could be reduced to 75,000, the CPU consumption would be reduced to 73 MIPS per the two main servers and 125 MIPS for the other four servers, or 275 MIPS for the T application. Chart J then shows a new application forecast of 1231 MIPS (versus 1401 MIPS) based on analyzing just four applications.

Chart J
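The change between the two forecasts can be broken down with the figures quoted in the text. The sketch below uses only those quoted numbers. Attributing the remainder to the application T change is an inference that assumes no other application's forecast moved, since the charts themselves are not reproduced here.

```python
FORECAST_BEFORE_MIPS = 1401   # earlier forecast quoted in the text
FORECAST_AFTER_MIPS = 1231    # Chart J forecast quoted in the text
AG_INCREASE_ORIGINAL = 460    # expected AG increase before the software changes
AG_INCREASE_REVISED = 345     # expected AG increase after the software changes

total_reduction = FORECAST_BEFORE_MIPS - FORECAST_AFTER_MIPS
ag_reduction = AG_INCREASE_ORIGINAL - AG_INCREASE_REVISED
# Assumption: whatever is not explained by AG comes from the application T change.
remaining_reduction = total_reduction - ag_reduction
reduction_pct = 100.0 * total_reduction / FORECAST_BEFORE_MIPS

print("total reduction:", total_reduction, "MIPS")
print("from application AG:", ag_reduction, "MIPS")
print("remaining:", remaining_reduction, "MIPS")
print("reduction vs earlier forecast:", round(reduction_pct, 1), "%")
```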

Conclusion
Capacity planning continually designs reports that present a wide range of computer resource statistics for the Linux servers in a z/Series IFL processor environment. The reports assist capacity planners in communicating application CPU usage concerns to management and to those who support the applications. These reports can initiate important discussions with the application groups and may result in the implementation of design changes to the applications which reduce prime time CPU usage. A variety of daily reports and the use of principal component analysis enable capacity planning to identify one variable, the referenced resident frames at reset, as a major contributor to an application's high CPU usage.

In addition to principal component analysis, the use of a multiple regression procedure gives capacity planning a reasonable means to analyze and interpret large amounts of data and then generate viable application CPU forecasts. These two tools provide capacity planning with efficient and effective methods to analyze a large number of variables and make useful, cost-effective suggestions to application groups that may enable them to reduce their CPU consumption and their application costs.

Principal component and multiple regression analysis give the capacity planner innovative, self-directed approaches to reduce the number of applications used to generate CPU forecasts within a dynamic Linux z/Series environment. Most important, these approaches may facilitate more effective use of already installed equipment and justify a reduction in a company's computer resource expenditures or a delay in a processor upgrade.

References:
[1] Hatcher, L. H., and Stepanski, E. J., "A Step-by-Step Approach to Using the SAS System for Univariate and Multivariate Statistics", SAS Institute Inc., Cary, NC, 1994, pp. 450-457.

[2-4] Hatcher, L. H., and Stepanski, E. J., pp. 381-383.

 

Last Updated 02/13/07



Computer Measurement Group