MeasureIT

Looping for Performance - A Tuning Methodology
May, 2006
by Greg Scriba

About the Author
Greg Scriba

Greg has over 25 years in data processing, and 15+ years of attending MCMG. He's held positions ranging from Operations to Performance Tuning to Capacity Planning. He has worked on OS/390, Unix and AS/400 systems. Currently he works for BMC as a Principal Software Consultant with the OS/390 performance and capacity planning product offerings.

In the computer performance arena we often think of "Looping" in a negative sense.  But is all looping really bad?  The loop control structure is actually one of the most often used control structures in programming today.  This paper suggests using the loop structure in our daily performance analysis to find the largest users of a system resource.  In our example we study the use of this technique as applied to identifying and reducing the CPU portion of your Total Cost of Ownership (TCO).  It can also be applied to your I/O and memory resources with similar success.  

So what is "Looping for Performance"?  It is an iterative approach that asks the question ‘Who is the biggest and why?’ at each level of a properly characterized application workload.  Once the question is answered at one level, we proceed to the next level and ask it again.  We continue in this manner until we find something that we can change.  It might be a program, a DB2 plan, a parameter, or even, the best of all finds, something that no longer needs to run.

Why do we care about finding resource excesses?  Well, in this financial environment, how often do you want to ask management for a new system?  If you do not know what is eating away at your resources, how do you plan?  When resources are tight, this process may find enough to delay that next upgrade or ensure that it is justified.  Then again, we’re all technical detectives in one way or another, and this is an enjoyable use of our talents that will benefit our employers.

Silos are good for the farmers.

Let’s look now at the reason we need an iterative solution, and why one-stop shopping just doesn’t cut it in the new Web-driven world.  Throughout the last three decades of computer performance analysis we have progressed far in our understanding of the platforms and subsystems we manage.  We can take pride in the stability of our systems and in our alert mechanisms when the unexpected happens.  We have developed groups of subject matter experts who know their areas of expertise to great depth.

But just when we start to get a handle on our environments, they change.  Gone are the single-subsystem applications.  Their replacements tend to be multi-subsystem (MQ, CICS, IMS and DB2) and even multi-platform (Windows, UNIX, WebSphere).  The single-stepped, vertical view into our silos, of which we have been so proud, needs to be replaced by a set of processes that can join the information we have in our silos horizontally.  We need this horizontal view because that is how our applications are being designed and, ultimately, how our customers will see our companies.

How do we get to this "horizontal view"?  The first step borrows an old ’60s term: We become ‘One’ with our applications.  We need to see our applications for what they are: A complicated, cross-platform mixture of architectures.  This may be your hardest challenge.  Your early focus should be on characterizing what is important to your organization.  Often what is important is also the largest user of CPU resources under your jurisdiction.  For this article, we’ll stick to z/OS and the subsystems of MQ, CICS, IMS, DB2 and WebSphere.  Another article could extend this topic into the distributed world.

Before we can loop effectively, we need to understand the interconnections between our subsystems and the business functions IT is chartered to deliver.  If you are new to the area, a good way to start learning about your applications is to begin a dialogue with the customer or applications area liaison.  They will know some pieces of the puzzle you have to solve. Another way is to find the transaction tables that control your internal cost accounting or billing system.  In these tables you will often find the clues to the "Business Function’s" usage of the computing environment, the applications.  Transactions in that table will be assigned to either an application, or a department or possibly a customer.  You may even find that a given transaction is used to service several clients and further clarification will be required.  Use this information to build a table like the one below, mapping the applications to the platforms and subsystems they use.


Application Knowledgebase

Application     Platforms       Subsystem Types         Servers, Subsystems Used
Manufacturing   z/OS            MQ, CICS, DB2           MQPROD, CICSPROD, DB2P
Payroll         z/OS            WebSphere, DB2DDF       WAS1, DB2W
Order Entry     z/OS, Windows   Web, DLLHOST, DB2DDF    OE45, OE46, OE47, DB2O
..
..
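One way to keep such a knowledgebase usable during analysis is to hold it as a small data structure that can be queried in both directions. The sketch below is only an illustration in Python: the rows come from the table above, while the class name, field names and helper functions are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AppEntry:
    # Field names are illustrative; they mirror the table columns.
    application: str
    platforms: tuple
    subsystem_types: tuple
    servers: tuple

# Rows taken from the Application Knowledgebase table.
KNOWLEDGEBASE = [
    AppEntry("Manufacturing", ("z/OS",), ("MQ", "CICS", "DB2"),
             ("MQPROD", "CICSPROD", "DB2P")),
    AppEntry("Payroll", ("z/OS",), ("WebSphere", "DB2DDF"),
             ("WAS1", "DB2W")),
    AppEntry("Order Entry", ("z/OS", "Windows"), ("Web", "DLLHOST", "DB2DDF"),
             ("OE45", "OE46", "OE47", "DB2O")),
]

def applications_using(server):
    """Return the applications that list the given server or subsystem."""
    return [e.application for e in KNOWLEDGEBASE if server in e.servers]

def servers_for(application):
    """Return the servers and subsystems recorded for one application."""
    for e in KNOWLEDGEBASE:
        if e.application == application:
            return list(e.servers)
    return []

print(applications_using("DB2W"))
print(servers_for("Manufacturing"))
```

The reverse lookup (server to application) is the one used most while looping: once a region or subsystem shows up as the biggest consumer, it tells you whose application you are looking at.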

Putting it together - the UOW

OK, as you can see, building this picture of how an application works is not easy.  But when you’ve finished, your knowledge of your applications and your value to your employer will have taken a big jump.  Let’s use this information to build another structure, a ‘Unit of Work’ (UOW).  This term is often used to describe the round trip of a CICS transaction through several CICS regions.  But remember, we are trying hard to tie together not just one subsystem type, but the entire life of that access through our systems.  So we would like to call a UOW the work initiated as a result of the initial transaction, regardless of the subsystems it visits.  The UOW ends when the initial user receives their answer.  A transaction can be initiated from an online access or through a batch job; hence, we’ll not only need to follow our applications across our online subsystems but also pull together the batch accesses of the databases.

Before we begin, we need to ensure that we collect the descriptive data from all of our silos.  Consider this the ‘initialization’ of the loop counters.  You are probably using some form of performance database already; there are several: BMC Software’s Analyzer (which used to be called Visualizer), CA’s NeuMICS, IBM’s Tivoli Decision Support and Barry Merrill’s MXG, to name a few.  Below are the z/OS records you will need to capture:

 

A quick examination of these databases and their input source, SMF, reveals that they are bursting with horizontal data in the form of transaction names, timestamps, even the URL of a requesting PC or server.  There is also information from the web.  The new WebSphere SMF interval records can be used to correlate activity from the web.  New metrics such as Warname, Servlet and Java Bean will allow us to check on the CPU usage from these new application types.

A simple example: A banking application that is submitted on the web enters the z/OS system through a CICS Terminal Owning Region (TOR), visits two Application Owning Regions (AORs) and then executes a DB2 plan.  Much of the CPU usage for this application is charged to the CICS region it executed in.  The CICS transaction’s usage can be obtained from the SMF 110 record, the MAINVIEW CICS detail record, Landmark’s TMON or Omegamon CICS.  DB2 usage can be obtained from the DB2 Accounting record, the 101.  Recent versions of the CICS Monitoring Facility (CMF) from IBM will provide the total DB2 usage in the 110 record, but that’s as far as they go.  If you want more detail on the ‘Packages’ of SQL code, you will have to pull in the 101 record also, and have Accounting Trace classes 7 and 8 turned on.

The chart below depicts our Banking Application:         

To tie the records together you can use a CICS variable called the Unit of Work timestamp. 

The UOW timestamp facility is automatically turned on in most CICS multi-region subsystems.  However, the timestamp is not automatically passed from CICS to DB2, where it is recorded in the 101 accounting trace record.  To do so requires setting the ACCOUNTREC parameter in the RDO to either TASK or UOW in CICS 5.1 or above.  In CICS 4.1 it was referred to as the "TOKENI" parameter.

This is just one of many different UOWs that can be created.  Some are easy, like this one; others are more tenuous, where you will be using the transaction name and the time processed to create the UOW.
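The join itself can be sketched in a few lines of Python. This is a minimal illustration and not a parser for real SMF data: it assumes the 110 and 101 records have already been reduced to dictionaries, and the field names (`uow`, `region`, `plan`, `cpu_sec`), region names, plan name and values are all hypothetical. The `uow` field stands in for the Unit of Work timestamp carried in both records.

```python
from collections import defaultdict

# Illustrative, already-parsed records. Field names and values are
# hypothetical; they are not the real SMF 110 / 101 field names.
cics_110 = [
    {"uow": "T1", "region": "TOR1", "tran": "BANK", "cpu_sec": 0.002},
    {"uow": "T1", "region": "AOR1", "tran": "BANK", "cpu_sec": 0.010},
    {"uow": "T1", "region": "AOR2", "tran": "BANK", "cpu_sec": 0.008},
    {"uow": "T2", "region": "TOR1", "tran": "INQY", "cpu_sec": 0.001},
]
db2_101 = [
    {"uow": "T1", "plan": "BANKPLAN", "cpu_sec": 0.050},
]

def build_uows(cics_records, db2_records):
    """Group CICS and DB2 records that share the same UOW token."""
    uows = defaultdict(
        lambda: {"regions": [], "plans": [], "cics_cpu": 0.0, "db2_cpu": 0.0}
    )
    for rec in cics_records:
        u = uows[rec["uow"]]
        u["regions"].append(rec["region"])
        u["cics_cpu"] += rec["cpu_sec"]
    for rec in db2_records:
        u = uows[rec["uow"]]
        u["plans"].append(rec["plan"])
        u["db2_cpu"] += rec["cpu_sec"]
    return dict(uows)

uows = build_uows(cics_110, db2_101)
for token, u in sorted(uows.items()):
    total = round(u["cics_cpu"] + u["db2_cpu"], 3)
    print(token, u["regions"], u["plans"], total)
```

For the more tenuous cases, the same structure applies, but the key would be built from the transaction name plus a time window instead of an exact token, and some mismatches should be expected.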

On the batch side there is a reasonably good link between the total address space usage found in the type 30 record and the DB2 101 record, where the batch jobname is the Correlation ID.  This gives you the ability to understand the DB2 part of that job’s usage.
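A rough Python sketch of the batch link follows. It assumes the type 30 and 101 records have already been summarized into dictionaries with hypothetical field names (`jobname`, `correlation_id`, `cpu_sec`), that each jobname ran once in the period being examined, and that the job names and CPU values are invented for illustration.

```python
def db2_share_by_job(type30_records, db2_101_records):
    """Return, per batch job, the fraction of its CPU attributed to DB2.

    Jobs are matched on jobname == DB2 correlation ID.
    """
    db2_cpu = {}
    for rec in db2_101_records:
        key = rec["correlation_id"]
        db2_cpu[key] = db2_cpu.get(key, 0.0) + rec["cpu_sec"]

    shares = {}
    for rec in type30_records:
        total = rec["cpu_sec"]
        if total > 0:
            shares[rec["jobname"]] = db2_cpu.get(rec["jobname"], 0.0) / total
    return shares

# Hypothetical sample data.
type30 = [
    {"jobname": "PAYDAILY", "cpu_sec": 120.0},
    {"jobname": "NIGHTRPT", "cpu_sec": 40.0},
]
db2_101 = [
    {"correlation_id": "PAYDAILY", "plan": "PAYPLAN1", "cpu_sec": 30.0},
    {"correlation_id": "PAYDAILY", "plan": "PAYPLAN2", "cpu_sec": 30.0},
]

print(db2_share_by_job(type30, db2_101))
```

A job with a high DB2 share is a candidate for the DBA conversation; a job with a low share points back at the program itself.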

Analyzing the data

Now that we have our data gathered and have a little understanding of our applications, let’s start "Looping for Performance".

We start at the highest level of usage on the system, the Service Class, assuming you are in "Goal Mode".  The type 72 records viewed with Visualizer, SAS or even RMF reports can tell you where to start.  Just ask this question for your peak ‘Prime Shift’ hour: What’s the biggest one and why?

Why just the biggest ones?  Time.  You do not have time to fight all the small battles, and you need some successes to show value.  During WWII the Allies in the Pacific had a plan called "Island Hopping".  They focused on their "Critical Path": those islands that truly determined the course of the war.  They spent their time and effort on those, bypassing the others.  The result: Victory.  Today, in your day-to-day performance battles, you can employ "Silo Hopping".

1. What’s the biggest service class:  A review of the type 72 records tells us that the "Online" service class consumes the most CPU for the 2:00 PM hour, our peak.

OK, one question answered. We need to focus on the "Online" service class.  Let’s drop down a level and ask our question again:

2. Which address space running under the "Online" Service Class consumes the most CPU?  Here an analysis of the type 30 interval records can help.  We find that the CICSPRD4 region has the most CPU time.  Again we need to know why this region is so heavily used.  So we drop down another level, down to the transactions, and ask our question again.

3. What transaction in the CICSPRD4 region consumes the most CPU?  Here is where we run into a little trouble.  We are running CICS Transaction Server (TS) 1.3.  The sum of all the CPU used by the transactions equates to only 20% of the CPU time the region has used.  What’s the problem?  The problem is that this CICS version’s 110 record does not reflect the DB2 portion of the transaction.  The CICS region gets charged the CPU cost of the DB2 plan that was called, but not until CICS TS 2.2 does the 110 transaction record provide this information.
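The check that exposes this gap is simple arithmetic: divide the summed transaction CPU by the region CPU for the same interval. The Python sketch below uses invented numbers chosen to reproduce the 20% figure; the function name is hypothetical.

```python
def captured_fraction(region_cpu_sec, transaction_cpu_secs):
    """Fraction of a region's CPU explained by its transaction records."""
    if region_cpu_sec <= 0:
        raise ValueError("region CPU must be positive")
    return sum(transaction_cpu_secs) / region_cpu_sec

# Hypothetical values for one peak hour of a single region.
region_cpu = 3600.0                  # from the type 30 interval records
tran_cpu = [400.0, 200.0, 120.0]     # summed per transaction from the 110 records

fraction = captured_fraction(region_cpu, tran_cpu)
print(f"{fraction:.0%} of region CPU is accounted for by transactions")
```

A low fraction like this is the signal to go looking for work that the region is charged for but the transaction records do not show, which in this case is the DB2 portion.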

To bring in this DB2 accounting information we need to use our UOW timestamp. After doing so we find that DB2 plan BM10D17 was used and when combined with the CICS transaction BM10 which called it, it is by far the largest user of CPU resources within our CICS Region.

Let’s recap what we just did:  We started at the Service Class and just kept asking what’s big and why.  We used the already available SMF records, and possibly a performance database and/or an application data analysis tool, to get to the DB2 plan and/or package causing the usage.
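The recap can be expressed as an actual loop. The Python sketch below is only an illustration of the drill-down: the names Online, CICSPRD4 and BM10 come from the example above, while every other name and all CPU values are invented, and the nested-dictionary layout is an assumption about how summarized type 72, type 30 and 110/101 data might be held.

```python
def biggest(usage):
    """Return the (name, cpu) pair with the highest CPU in a mapping."""
    return max(usage.items(), key=lambda item: item[1])

def loop_for_performance(levels, start):
    """Ask 'who is the biggest?' at each level, then drop down one level.

    levels: ordered list of (level_name, {parent: {child: cpu}}).
    Returns the path of biggest consumers, one entry per level.
    """
    path = []
    parent = start
    for level_name, children_of in levels:
        usage = children_of.get(parent)
        if not usage:
            break  # nothing recorded below this point
        name, cpu = biggest(usage)
        path.append((level_name, name, cpu))
        parent = name
    return path

# Hypothetical CPU seconds for the peak hour.
levels = [
    ("service class", {"SYSTEM": {"Online": 500, "Batch": 300, "TSO": 50}}),
    ("address space", {"Online": {"CICSPRD4": 300, "CICSPRD1": 120, "IMSPROD": 80}}),
    ("transaction",   {"CICSPRD4": {"BM10": 180, "PAY1": 60, "INQ2": 40}}),
]

for level, name, cpu in loop_for_performance(levels, "SYSTEM"):
    print(f"{level}: {name} ({cpu} CPU sec)")
```

The loop only finds the "who"; the "why" at each level still takes a person, and usually a conversation with the owner of that silo.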

Where do we go from here?  Crossing the silos, we go talk to the DBA who is familiar with this application.  The DBAs have tools from IBM, BMC, CA or others that can "Explain" our BM10D17 plan.  Upon closer examination, the BM10D17 plan turned out to be returning over 100,000 rows per ‘interactive’ transaction.  The plan was also scanning a very large table.  A quick addition of a field to the index on the table produced a 30 percent reduction in the CPU used by this transaction.  It was also found that only one of the 100,000 returned rows was used.  This was an obvious plea for further scrutiny, where some DB2 query magic could possibly eliminate the search through those 100,000 rows.

This transaction caused 100 MIPS of usage on one of the 7 LPARs it was found operating on.  The 30 MIPS saved there will be the smallest savings as the change is applied to the other LPARs running this application.  All the other LPARs are bigger, and the application runs more transactions on them than on the one used in the example.  The total savings will be at least 200 MIPS, or about one 2064 engine.  There are even larger savings to be had as soon as the return of the 100,000 rows is clarified.
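The arithmetic behind that estimate can be checked quickly. The Python sketch below uses only the figures quoted above and assumes, as the text states, that every other LPAR saves at least as much as the sample LPAR; the variable names are illustrative.

```python
# Figures quoted in the example.
mips_on_sample_lpar = 100   # MIPS used by the transaction on the sample LPAR
reduction_percent = 30      # CPU reduction from the index change
lpar_count = 7              # LPARs running the application

# Savings on the sample LPAR (integer arithmetic avoids rounding noise).
sample_savings = mips_on_sample_lpar * reduction_percent // 100

# Lower bound: assume each LPAR saves at least as much as the sample LPAR.
minimum_total_savings = sample_savings * lpar_count

print(sample_savings, "MIPS saved on the sample LPAR")
print(minimum_total_savings, "MIPS saved in total, as a lower bound")
```

The lower bound computed this way sits just above the "at least 200 MIPS" quoted in the text, so the stated figure is on the conservative side.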

Summary

We have introduced a simple method of "Looping" through the SMF data to find our "Critical Path" to victory.  There is nothing new about the record types used or the data contained in them.  What is new is a means of quickly getting through the data and turning it into information.  Also new is our view of our application from a wider scope than what we have used in much of our past.  The horizontal view of an application is critical to our success.  Future enhancements to this process will need to take into account the outboard systems such as Windows and UNIX and new environments such as WebSphere.  The addition of these new data sources will again require new metric knowledge and new changes to the UOW.  However, we are already part of the way there.  We know what question to ask:  Who’s the biggest and why?

 

Last Updated 06/05/09

