As a system programmer, you look at RMF and might wonder why DB2 consumes so much CPU or issues so many I/O requests. Unfortunately, RMF won’t tell you the reason, but DB2 traces do. Many users collect DB2 traces to SMF for monitoring, but only a few use the wealth of recorded information to go further. This session covers what to measure, the key fields you need to monitor for both CPU and throughput, the basic steps of DB2 system performance tuning, and how to identify opportunities for zIIP offload based on measurement data from Lab and field support.
9177 - Management and Reporting
The SAP license management process facilitates managing, tracking and distributing SAP licenses while ensuring contractual compliance and cost optimization. With different price lists and user categories, it’s important that the organization understands the license types available under its contract with SAP and has clear guidelines on distributing licenses based on usage profile. This paper outlines the author’s experience in managing the measurement program on different versions of SAP for large organizations.
The IBM System z10™ mainframe provides an architectural framework for 1MB virtual pages. This paper reviews computer architecture trends and describes various implementations of large pages across the industry to explain why large pages are needed in today’s operating system environments. It also reports results from work focused on which workloads are subject to the largest performance boost from the use of large pages and how large-page support has been implemented on the System z10 platform and in the z/OS operating system. Results from recent performance tests are presented and analyzed.
Millisecond latency can mean millions of dollars over time for some applications. However, before you can manage and reduce latency, you must first measure it. This involves synchronizing all processes involved, hardware and software, and precisely time tagging the data exchanges. The presentation examines the various technologies available to address synchronization and how they can be applied. Examples will range from simple relative time tagging within a single data center to worldwide absolute sync to Coordinated Universal Time (UTC) across multiple centers with sub-microsecond accuracy.
The cost of hardware is trending down while other costs are rising rapidly. The result is that cost saving opportunities are shrinking while the analysis takes increasingly more time, effort and money. This panel of world-renowned experts in application and systems modeling will candidly discuss this and other questions related to the future of modeling as a tool to achieve business objectives. Topics: How close is good enough? Business vs. Math: What’s the trade-off? How does complexity affect the value? Is modeling headed to the clouds? What’s driving costs du jour?
9173 - Management and Reporting
With the advent of HiperDispatch and specialty engines such as zIIPs and zAAPs, performance analysts need to update how they interpret processor-related measurements. During this session the speaker will provide updates on and insights into the latest z/OS processor measurements. Areas discussed will include LPAR and workload measurements relating to HiperDispatch, as well as measuring zIIPs, zAAPs, CPs, and ICF processors.
When analyzing DASD I/O performance, many performance analysts concentrate on looking at I/O measurements from a logical volume and control unit point-of-view. However, it is always useful to flip the viewpoint around and examine I/O performance from a workload point-of-view to better understand and correlate the effects that poor logical volume performance has on individual workloads. During this session the speaker will provide an approach for examining I/O performance from a workload point-of-view.
9170 - Management and Reporting
Are you challenged deciding which platform should host your next application? Are you constantly asked for a simple platform selection flow chart? Has it become harder to technically position different platforms? While platform selection will always be client specific, this session will give you a starting point for mapping applications to key characteristics of servers and a place to start the discussion. It will include an exploration of key deployment models, the role of virtualization, and the impact of non-functional requirements, along with workload characteristics and their associated trade-offs.
9169 - Performance Engineering and Load Testing
The size and complexity of software has increased tremendously, and it is common to have applications built with millions of lines of code. Similarly, quality concerns about the released product are also increasing. Moreover, an ad hoc approach to testing is still very common in the industry. The engineering approach I have chosen for performance testing is called the “Measurement technique”. A systematic engineering approach to performance testing has 3 main aspects: planning and test case design; execution of test cases; and analysis of data and reporting.
A study of where, when and how to use different variations of cursors in PL/SQL code to improve performance, based on a comparative case study in which cursor variations are used in place of normal selects.
Trading volumes today are rising by more than an order of magnitude. Next generation trading systems are expected to process more than a million orders per second with end to end latencies of less than one millisecond. A messaging framework is the backbone of a trading system. It enables communication between the various processes comprising a trading application. Hence it is essential that the messaging platform is capable of meeting more than just the end to end throughput and latency requirements. In this paper the authors share their experiences in building such a messaging solution.
If you have a System/z performance question, this is the panel to ask. Some of the many performance related questions the panel of experts can answer include: System/z processors, processor configurations, general Sysplex, z/OS system performance, WLM anything, variable Workload License Charges, WebSphere, etc... Come prepared with questions, email them as soon as you can to zos_panel@cmg.org, or drop a written question into the Q&A box you will find at various z/OS track sessions, or hand your written questions to any z/OS session monitor.
If you have a DB2 question regarding performance tips, functionality, insights, or DB2 special features such as stored procedures, zIIPs, version 8 conversion and version 9 performance, then this is THE panel to ask. The members will be able to answer your questions and enlighten you with their valuable expertise. Come prepared with questions that affect your work life; email them as soon as you can to db2_panel@cmg.org, or drop a written question into the Q&A box you will find at various z/OS track sessions, or hand your written questions to any z/OS session monitor.
9164 - Performance Engineering and Load Testing
In this presentation we describe how to use Capacity Planning models to extrapolate Load Testing results from test environments to production environments. Indeed, Load Testing by definition is performed in a dedicated and isolated environment, without any performance interactions with other services. On the other hand, production environments have different hardware configurations and can leverage virtualization technologies to consolidate different services, therefore creating performance interactions between different services.
9163 - Performance Engineering and Load Testing
The Storage Performance Council (SPC) is a cross-vendor team of storage performance experts that has built the industry’s first benchmarks for storage. These benchmarks are the standard for decision making in many organizations. The SPC has used real-world workloads as the basis of the benchmarks that are vendor-neutral and platform independent. Many SPC-1, SPC-2 and SPC-C (component benchmark) results have been published to date. This panel session will discuss the status of the SPC and the benchmarks available and under development including SPC-3 and the Energy Extension to SPC-C.
In this paper we consider issues that may arise in I/O workload characterization for Parallel Access Volumes. For a single exposure, a characterization of I/O device service time limited to the mean and the standard deviation is sufficient to correctly predict the average I/O time in a realistic model; things are less simple when one considers multiple parallel servers.
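One reading of the single-exposure claim, offered as an illustrative sketch rather than the authors' model: if arrivals are assumed to be Poisson (an M/G/1 queue, which the abstract does not state), the Pollaczek-Khinchine formula gives the mean I/O response time from only the mean and standard deviation of service time. The function and parameter names below are hypothetical.

```python
def mg1_mean_response_time(arrival_rate, mean_service, sd_service):
    """Mean response time of an M/G/1 queue (Pollaczek-Khinchine).

    Assumes Poisson arrivals and a single server (one exposure).
    arrival_rate is in requests per second; service times in seconds.
    """
    utilization = arrival_rate * mean_service
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    # Second moment of service time: E[S^2] = variance + mean^2
    second_moment = sd_service ** 2 + mean_service ** 2
    mean_wait = arrival_rate * second_moment / (2 * (1 - utilization))
    return mean_wait + mean_service


# Same mean service time, different variability, at 50% utilization
exponential_like = mg1_mean_response_time(50, 0.010, 0.010)
constant_service = mg1_mean_response_time(50, 0.010, 0.0)
```

With the mean held at 10 ms, raising the standard deviation from 0 to 10 ms lengthens the mean response time, which is why the second moment matters even for a single server. No comparably simple closed form covers several parallel exposures with a general service-time distribution.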
Many CMG members identify themselves as IT performance and capacity professionals. Ultimately the objective of our work is to make the organization’s IT systems more reliable. Today we are much too focused on IT resource consumption metrics, which may be the “vital signs” of IT systems, but this is like checking a patient’s pulse and respiratory rate and saying that they are healthy enough even though the patient may be suffering from a terminal disease. The lights may be on, but are the IT systems you manage perceived as reliable by the end users and the business?
9159 - Performance Engineering and Load Testing
This community discussion will cover the best and worst practices of how software performance engineering is organized within a software development team. Discussion areas will include management requirements, budgeting and communicating ROI, staff composition, charter, methodology and tools.
With the increasing cost of energy and the increasing concern about global warming, people are beginning to consider energy efficiency as a factor in server upgrade decisions. This paper describes a simple yet accurate consumption model for servers that makes use of hardware performance counters and publicly available experimental measurements. The model relates processor utilization, disk activity, and system configuration to real-time predictions of power consumption.
This presentation is all about some of the statistical techniques that are useful in capacity management, so long as they are used with understanding. Leptokurtosis and litotes were selected as two particularly abstruse words from the repertoire of statistics and syntactics respectively. They are respectively simply ‘peakedness’ or ‘peakiness’ and ‘implying the positive by denying the negative’. This paper discusses some of the statistical techniques used by capacity managers and the dangers therein, with particular emphasis on the choice of metrics and the means of graphical presentation.
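As a small illustration of the first term (a sketch, not taken from the paper), leptokurtosis is commonly quantified by excess kurtosis: the fourth standardized moment minus 3, which is positive for distributions with a sharper peak and heavier tails than the normal distribution. The function name and the sample data below are hypothetical.

```python
from statistics import fmean


def excess_kurtosis(samples):
    """Population excess kurtosis: fourth standardized moment minus 3."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mean = fmean(samples)
    second_central = fmean((x - mean) ** 2 for x in samples)
    fourth_central = fmean((x - mean) ** 4 for x in samples)
    if second_central == 0:
        raise ValueError("all samples are identical; kurtosis is undefined")
    return fourth_central / second_central ** 2 - 3.0


# Mostly flat utilization with two extreme readings: leptokurtic (positive)
spiky = [0, 0, 0, 0, 0, 0, 0, 0, 10, -10]
# Evenly spread readings: platykurtic (negative)
even = [1, 2, 3, 4, 5]
```

A metric with strongly positive excess kurtosis is one where the mean and standard deviation alone understate how often extreme values occur, which bears on the abstract's warning about choice of metrics.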
9156 - Performance Engineering and Load Testing
Performance crises are expensive and stressful, and the associated hot-fixes often result in sub-optimal solutions, with a significant risk of project failure. The later in the application life-cycle the issue is discovered, the greater the severity of the crisis usually is. The Performance crisis prevention approach, introduced in the Portal domain projects of the Telecom company I work for via a formal but flexible System Performance Engineering (SPE) process, is the topic of this paper. Preliminary results and challenges in two pilot projects are also included.
SPEC benchmarks are often used to compare the relative performance of servers. Typical application areas are datacenter consolidation and what-if analysis. Unfortunately, each server comes in different configurations (e.g., number of processors, memory size), while published SPEC results are available only for a small subset of the configurations, typically the most powerful. The problem we are trying to solve in this work is how to scale a SPEC result down or up to account for a different number of processors than in the benchmarked configuration.
9154 - Performance Engineering and Load Testing
Performance crises are expensive and stressful, and the associated hot-fixes often result in sub-optimal solutions, with a significant risk of project failure. The later in the application life-cycle the issue is discovered, the greater the severity of the crisis usually is. The Performance crisis prevention approach, introduced in Swisscom’s Portal domain projects via a formal but flexible System Performance Engineering (SPE) process, is the topic of this paper. Preliminary results and challenges in two pilot projects are also included.
Transaction Aware Performance Modelling is a technique that allows us to analyse a transaction processing system in more detail, producing performance metrics for hundreds of individual transaction types. Without this technique, computational limitations prevent us from modeling information systems in such detail. The presented modeling technique meets this challenge. This presentation gives an overview, a focus on the corresponding measurement techniques and a brief overview of some results.
Everything you thought you knew about Cloud Computing and then some! This tutorial will look at this ’hot topic’, even though it has a long history. We’ll go over some highlights and advantages of going "cloud" ... and we’ll also look closely at what it takes - especially in terms of systems management - to keep the "cloud" in the air! With any luck, you’ll be armed with the right weapons to take on cloud computing correctly after attending this most entertaining session.
9150 - Management and Reporting
IT costs are becoming a major component of organizations today as the enterprise increasingly relies on IT systems and infrastructure to conduct its business activities and interact with its consumers and business partners. As a result of the significant size of the IT budget, there is an increasing focus on the cost and effectiveness of this aspect of the IT investment. This paper looks at how the IT organization can justify and implement on-going cost reduction initiatives that can be maintained in the enterprise through the development of a Sustainable Cost Reduction Strategy.
9149 - Performance Engineering and Load Testing
Performance testing forms a fundamental part of any organization’s software development lifecycle. Running a test is only a small part of the process. Understanding the workload mix and the data flow and performing risk assessment are also essential. Once tests have been run and data collected, it is vital to analyze and present the data properly and clearly. This paper lists guidelines that can be followed to run performance tests successfully. It covers creating better test plans, data analysis and reporting to find bottlenecks in the application.
For application performance and capacity management, an enormous amount of data is collected for known and unknown reasons. How to weed through, analyze, and use the data effectively has been a challenging issue for both performance tool vendors and IT organizations. In this paper, we focus on performance metric patterns and their performance implications. Recognizing those patterns can help us better classify performance issues and deal with them efficiently.
9147 - Performance Engineering and Load Testing
Goals: first, to run benchmark tests on popular, diverse, accessible systems (e.g., iPhone, z-Series, Blackberry, Linux, Windows, Clusters); second, to provide guidance on suggesting processor ratings that may help in the selection of a target platform; third, to submit our results to a worldwide supercomputer ranking body. The Challenges: first, to create a common metric that allows comparisons of different processors and platforms; second, to create very similar workloads for the different machines where identical workloads are impossible to create.
9146 - Performance Engineering and Load Testing
This paper presents a performance model interoperability framework that brings together performance model interchange formats and experiment specifications with the automatic generation of performance analysis results for presentation and publication. We present a standard approach to define an experiment consisting of a set of model runs and the output desired from them. We also present a mechanism for automatically transforming the tool output into useful results. A proof of concept example demonstrates the framework.
9145 - Management and Reporting
A comprehensive performance monitoring strategy that is simple and cost-effective to implement is quite critical in managing application performance. With an effective analytical engine, this can also help correlate between business, application & infrastructure views and thus help justify value to respective stakeholders. It can also be used as a means to plan capacity for future growth, expansion & mergers. This paper presents the need and the design of a lean and extensible performance monitoring framework for eBusiness applications in a large enterprise through a case study.
IT infrastructure capacity planning identifies key performance metrics of the service/s hosted on the IT infrastructure and correlates business goals to arrive at infrastructure needs. Most often this involves statistical and graphical trending and analysis. The gist of these graphs then needs to be conveyed to the management. This paper compiles a few “curves” that any capacity management exercise goes through and then derives how best these can be reported in management summary reports; aiming to provide an introductory view to performance analysis and capacity management.
Since 2004, we have provided a SaaS-style Internet banking system to Japanese banks. The system provides services for 130 large and small scale banks and 60,000 users, with more than 300,000 page views per day on a single server. As the number of users has grown, we have faced performance problems. Despite the difficulty of reproducing the system load, we still needed to carry out capacity planning. This paper describes how we carried out a capacity plan by analyzing the factors that caused performance problems in the past. We are now able to prevent performance problems proactively.
A virtualization maturity assessment can provide a scalar to show, at a customer or application level, the workloads that would benefit the most from virtualization. After the maturity assessment, there are many homegrown tools created to support p, x, and zSeries migrations. Some concentrate on the physical virtualization and some concentrate on processor utilization. This paper shows a case study from assessment through virtualization tooling application to actual virtualization migration.
Cloud computing requires that capacity planners plan in advance of the procurement cycle. Given that new customers could be added into the cloud at any moment, this cloud computing capacity planning algorithm forecasts the usage and the amount of required additional hardware resources. The simulation predicts the amount of resources that are required to achieve the utilization targets. This algorithm was used successfully to project equipment provisioning for the on-demand computing farm and projected monthly requirements for just-in-time equipment provisioning.
9140 - Performance Engineering and Load Testing
It should come as no surprise that, when it comes to performance, the software industry is in a pretty sorry state. Many software systems must go through an expensive and time-consuming tuning process before they can be used. Others must simply be abandoned. This tutorial presents a systematic, quantitative approach for cost-effectively building performance into software systems. It provides an overview of Software Performance Engineering (SPE) and illustrates the steps in the SPE process. A case study illustrates the SPE models and their solution.
This short talk describes how the in-depth performance tools available in Solaris were used to identify a performance mystery that was causing an application to run in 20 minutes instead of the expected 10 minutes.
Cloud computing is maturing, becoming a viable alternative to classic on-premise IT. Cloud facilitates lower fixed and variable costs while supporting enterprise growth. The cost savings are primarily due to reduced energy footprint and on-demand infrastructure provisioning. The benefits are compelling; however, a quantitative analysis is required. This paper describes a methodology for predicting performance, energy and cost for expanding on-premise IT into the Cloud. Our case study demonstrates how to quantify the effects of leveraging Cloud computing for scalability and energy efficiency.
By offering the potential of revolutionary savings in capacity requirements for backup, deduplication technology has become a major new focus for storage administrators or others concerned with the management of server environments. This paper gives an overview of deduplication effectiveness, as seen in early experiences with a specific deduplication product. In addition, we compare such experiences with the Storage Performance Council-3BR benchmark, whose goal is to provide a realistic testbed for exploring backup/restore performance.
At the Dutch Tax Office an Enterprise Service Bus (ESB) is being developed on System z. In order to secure performance, System Performance Engineering was applied. For this we created an mBrace Transaction Aware Performance Model to study the performance of combinations of message types. Consequently this model needs to be filled with appropriate data. Since no adequate measurement facilities are available I developed the ’ASCB-tool’ on z/OS. In this presentation I will outline the applied mBrace method, the ASCB-tool and show the results from the study on the performance of the ESB.
9135 - Performance Engineering and Load Testing
The presentation describes one performance engineering project in chronological order. The product under investigation was a three-tier Java application suggesting the best offer to a customer according to provided criteria. The performance issues found turned out to be database-related. PerfMon was used for initial monitoring; some aspects of Microsoft SQL Server and Oracle Database monitoring with PerfMonitor are discussed further.
9134 - Management and Reporting
Tracking data through the physical computing environment used to be fairly straight-forward. There was a simple 1:1 relationship between data flow and computing resources. However, this changed when advanced storage technologies came onto the scene. Today’s monitoring tools should report on virtualized LUNs, tiered subsystems, snapshots, replication, data deduplication, virtual tape libraries, storage security, and many other advancements. This session will discuss the evolution of storage monitoring and examine emerging technologies that will need to be tracked, managed, and reported.
In this presentation, we will look at capacity planning for servers that use Oracle databases. Oracle provides several tools that can be used to determine current usage as well as gather data for trend analysis and forecasting. Key activities involved in Oracle capacity planning include: * How much system resource to allocate to the database server? * Sizing of various objects * Oracle overhead for running PL/SQL, triggers etc. * Determining the number, size and contents of tablespaces * How much growth factor to account for? * How much data to keep? How long to keep it for?
As part of capacity planning practices, IT professionals need to identify KPIs and SLAs for a green data center’s mission critical applications and services. Green data centers don’t just save energy; they also reduce the need for expensive infrastructure and high-density server upgrades to deal with increased power and cooling demands as part of capacity planning and performance management. This tutorial teaches attendees about green data center infrastructure capacity planning, services, SLAs and KPIs to maintain the SLA defined for the workloads.
9131 - Management and Reporting
This paper talks about the Microsoft Windows Service Architecture in full length.
Communication is the key to healthy relationships between threads and the kernel; signals are the means they use to communicate. This paper discusses the Linux Signal Handling Model in detail. Signals are used to notify a process or thread of a particular event. Many computer science researchers compare signals with hardware interrupts, which occur when a hardware subsystem, such as a disk I/O (input/output) interface, generates an interrupt to a processor when the I/O completes.
9129 - Management and Reporting
This article discusses the Windows Event Handling Mechanism. I have tried to cover all aspects of the mechanism in this paper, including how to generate events and the different data structures Windows provides for programming purposes. The code is written in C/C++ and is well documented. Part of the code is also provided as WMI scripts. This document gives well-organized detail on the subject matter.
With real-life case studies from financial and non-financial sectors, this new paper discusses the challenges of achieving real capacity planning with a minimum of available resources in this particular economic climate. The focus of the paper will be on automating the process without sacrificing the quality and accuracy of the results. A range of technologies and applications will be featured using the latest performance analysis and capacity planning techniques.
This paper describes a methodology for transforming from a single app per server environment to a shared resource environment. Included are discussions about how to determine resource requirements, including factoring in the processor overhead associated with higher I/O and network consumption, as well as qualification criteria which take into account consolidation ratios, floor space objectives, energy consumption objectives, and more. Non-technical factors that influence server configurations and consolidation ratios, as well as changes to current processes, will also be addressed.
Two of the newer UNIX/Linux filesystems that aim at addressing today’s vast IO (SAN) challenges are Sun’s ZFS and the Linux filesystem Btrfs. The goal of this paper is threefold. 1st, to introduce the design, architecture, and features of Btrfs and ZFS. 2nd, to compare the 2 filesystem architectures and to elaborate on some of the key "performance by design" concepts that are embedded into the frameworks. 3rd, to conduct an actual empirical analysis, comparing the performance behavior of Btrfs and ZFS under varying workload conditions, utilizing an identical HW/SAN setup for all the benchmarks.
9125 - Performance Engineering and Load Testing
This paper examines the effective usage of an Oracle Data Warehouse with respect to performance and high availability, using Oracle 11g and the Oracle High Availability Architecture. We identify which new features in Oracle can boost the performance and availability of an existing data warehouse, whether it is already on 11g or on a lower version for which an upgrade is planned but the decision has stalled.
Like many enterprises, at BMO Financial Group we are migrating most of our IBM Mainframe processing onto the latest generation, System z10. As we migrate workload onto the newer machines we run tests to compare our actual performance with ratings from IBM (ITRR) and Gartner. We also did this when we migrated from z900 to z990 and from z990 to System z9. This paper will explore our actual performance measurements and their implications, discuss issues of how to rate footprints, and delve into whether and how to use such ratings for performance management & capacity planning purposes.
9123 - Management and Reporting
This would be a series of three half-hour lectures (after or during breakfast and before the first session, Tuesday through Thursday) to teach attendees stress coping mechanisms. Enabling them to deal with health issues (high blood pressure, weight, etc.) and stress will enhance their overall performance at work and home.
9122 - Management and Reporting
The economic turmoil has had a serious impact on enterprise IT. Businesses across the globe are taking action to reduce cost and improve efficiencies, but the challenge of effectively managing IT with reduced headcount and budgets is daunting. Business Transaction Management (BTM) can address these challenges. Attendees will learn how to use BTM to: - Boost business activity using existing resources - Avoid outages and improve IT management efficiency - Reduce total cost of ownership (TCO) of applications, servers and management tools - Expedite the adoption of shared services
9121 - Performance Engineering and Load Testing
This workshop introduces major concepts and principles of performance testing. The major topics include: • Definitions of performance testing • Performance requirements and SLAs • Different types of performance tests • Different types of performance testing tools • Guidelines for the selection of tools for different types of performance testing • Planning and management of performance testing • Hands-on exercises with easily available tools and utilities • Test planning exercises with EXCEL templates
Today’s IT design and deployment of new applications is dominated by managed application environments. These complex and business critical applications often include federated services across a diverse set of providers. The nature of this development method makes it difficult to identify the true root cause of performance problems and employ strategies to mitigate their impact. Learn the underlying characteristics of these environments, identify key threats to performance, and learn about a new methodology to mitigate their impact.
Nowadays, emerging applications on diverse platforms have split computing choices in the market. Managing application configurations through a single console has given Citrix infrastructure the opportunity to avoid added human effort. Integrating an additional layer into the environment increases performance bottlenecks and forces a rethink of capacity planning. This paper describes the different aspects through which constructive performance and capacity planning for Citrix-based applications can be done. The major emphasis is on collecting and interpreting performance statistics and metrics.
9118 - Performance Engineering and Load Testing
Given the earlier surge in the managed funds sectors of the monetary markets, the ability to accurately measure the risk-adjusted performance of IT applications has become vital. Without defined benchmarks, managing the performance of different service layers becomes a cumbersome task. This paper helps users understand an optimised model for labelling baselines. We have applied this model in multiple projects built on different technologies and found it a sound choice, and we discuss its pros and cons. The model can readily be adapted to any SOA-based IT application.
IT Organizations have struggled to create predictive models to characterize how market, user and other demand drivers will impact their business systems. The lack of effective modeling causes uncertainty that can lead to over or under investment in resources or worse – system outages. A new demand modeling method has been created that maps market and usage demand to the API level to deliver confidence in decisions and more reliable capacity planning. This presentation gives a brief overview of the use of Bayesian Influence modeling to deliver highly reliable demand predictions.
Instead of worrying about what the definition of cloud computing might be, let’s focus on how you can use it yourself to do large scale log processing. This tutorial will show how you can upload huge datasets to Amazon S3, crunch them with a large cluster of computers using Amazon EMR, and do it all from your web browser for a handful of dollars charged to a credit card.
9115 - Management and Reporting
Two significant trends are transforming the way that innovative IT operations groups function – laser-sharp focus on end user experience, and management of IT from a business transaction perspective. This session will focus on best practices for linking these two critical initiatives in order to ensure the efficiency and scalability of IT service delivery and to fulfill the goal of IT-business convergence.
Hardware and software technology has grown rapidly in the past few years and shows no sign of stopping. This paper discusses the transformation of performance engineering in the Linux world to support faster, cheaper, better technology.
The focus of this paper is to guide software architects and performance engineers through the important steps and methods for evaluating and enhancing an existing software application for a multi-core technology platform. Available tools for evaluating application scalability, together with performance reports, support the decision to upgrade an existing application to multi-core technology and help maximize the return on the dollars spent on it.
Monitoring applications with Synthetic robots is not enough, and relying on siloed infrastructure tools to isolate problems is a complicated and inconclusive process. To truly understand how an application performs, the impact of poor performance and to quickly identify the fault domain, one must correlate Synthetic and Real End User Monitoring. Learn how a holistic approach to APM brings visibility through the Front-End, Middleware and Back-End database calls. The result is good Service Quality: the ability of a business service to meet its goals and deliver value.
Few things about data processing have changed more in the past ten to fifteen years than storage. The discrete machines formerly known as DASD have morphed into a black box of complexity and indirection. Does the left hand even know what the right hand is doing? When you are faced with I/O performance problems, it doesn’t seem so. We will explore the complex bread crumb trail from database to disk and try to find out what tools illuminate what portion of the path. What do they know and when do they know it?
9110 - Management and Reporting
Online service provisioning requires large numbers of servers. It is necessary to monitor them, evaluate if and when they are overloaded and how many of them are underutilized. This is an enormous logistical and analytical task. Commercially available tools are unsuited for such unique needs, and human analysis of large-scale systems is problematic. We have designed a system capable of collecting data from a large number of servers and providing a meaningful and timely analysis - on a daily basis.
9109 - Management and Reporting
Claude Shannon was a technical expert who formulated a theory pertaining to the security of passwords. Although this was completed in 1948, Shannon’s Entropy (as the theory was called) has an impact on future capacity management and security from an access point of view. This paper addresses the theory and its implications toward future capacity planning.
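As an illustration of the entropy measure this abstract refers to, the following is a minimal Python sketch, not the paper's own method. It assumes the standard definition H = -Σ p·log2(p) and, for passwords, assumes each character is chosen independently and uniformly from a known alphabet; the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy in bits per symbol: H = -sum(p * log2(p))."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def uniform_password_bits(length, alphabet_size):
    """Entropy in bits of a password under the assumed model of
    independent, uniformly chosen characters."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    # Entropy of the observed symbol frequencies in a sample string.
    print(shannon_entropy("aabb"))
    # Entropy of an 8-character password over 26 lowercase letters.
    print(uniform_password_bits(8, 26))
```

Under the uniform model, each added character or each doubling of the alphabet raises the search space an attacker must cover, which is why the measure is useful when comparing password policies.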
In this allegory, the young man Funes finds himself desperately frustrated by his village’s blind subsistence on a nearby cloud. All of the villagers’ basic needs are fulfilled by the cloud. All that a person is expected to do is "enrich oneself", freed to do so by this arrangement with the sacred cloud. Funes, however, will not rely on something that he doesn’t understand. He chooses to go on a journey to the cloud to get answers to all his questions. As with many such journeys, some of the deepest answers are found along the way.
9107 - Management and Reporting
One of the key enabling technologies for cloud computing is “virtualization”, which provides an abstraction of the computing resources. It makes sure that the resources are available on demand to all the components of the loosely coupled Service Oriented Architecture (SOA) implementation to meet the business performance Service Level Agreement (SLA). This paper outlines an approach that was implemented to test the performance of an enterprise SOA application to understand the overhead of virtualization, the crux of any cloud setup.
In order to obtain more throughput from the CPUs on a system without adding faster or more CPUs, vendors have developed CPUs with multiple threads of execution. In this architecture, the threads of a particular CPU share certain resources, such as low-level cache. In this paper, we will give a basic comparison of single-thread and multi-thread CPUs. Then, we will review operational performance of a number of different types of multi-threaded CPUs on various operating systems. Finally, we will demonstrate considerations and methodologies for modeling multi-threaded CPUs.
9105 - Performance Engineering and Load Testing
While there is a large variety of load testing tools aimed at the general web/web services based application today, there is a surprising dearth of options for load testing complex, composite custom-built systems that use non-standard messaging protocols. The paper will focus on a real case of implementing a Visual Studio 2008-based performance testing solution for an electronic discovery application built in house, the challenges and lessons learned, and the unique aspects of using a performance testing tool that is really a part of a well-integrated development and testing framework.
Hard drives have been with us for over 40 years. Modern disk arrays first appeared nearly 20 years ago and revolutionized the storage industry, dramatically improving availability and performance. Now another revolutionary change in the storage industry is headed our way. A nanotechnology breakthrough has just been made that will make it possible to make hard drives that have 15 times the areal density that is possible today, have better environmentals, and have better performance.
System Virtualization allows multiple O/S images to execute on a single physical host computer. Measuring the host resource usage is straightforward and the necessary tools are included with most virtualization environments. However, complexities introduced by the different virtualization techniques create problems with measurements within the guests, resulting in missing model parameters. This paper shows how to use the Menascé technique that computes missing parameters with the Simalytic Modeling technique to predict application performance more effectively in virtualization environments.
Several computing clouds are available for putting processing in the internet between server farms and typical client locations. These include Akamai, Amazon, Google, and a range of others. The paper discusses some measurements that were made on these clouds, among them measures of latency and uptime.
9101 - Management and Reporting
It is very hard to make a go-live decision when an IT project finishes, and similar uncertainty arises when migrating to new servers or making a big change to the systems. Inspired by the Apgar Score, a simple and quick method to assess the health of a newborn baby in the delivery room, the Pears Score helps those who are making a go-live decision or choosing a proper treatment after a migration or big change. Both the Apgar Score and the Pears Score are designed as simple and quick methods to assess a baby or a new IT system and to decide quickly on a proper post-treatment.
We have an interface between two systems that is single threaded. The response time of this interface is required to be under 30 seconds 98% of the time. This is a description of the steps that were taken to document and report the response time of the interface, identify delays and the changes that were made to improve the response time.
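A requirement of this kind (under 30 seconds, 98% of the time) can be checked directly against measured response times. The following is a minimal Python sketch, assuming the measurements are available as a list of seconds; the function names and sample data are hypothetical and are not taken from the paper.

```python
def sla_compliance(response_times, limit_seconds=30.0):
    """Fraction of responses that finished strictly under the limit."""
    if not response_times:
        raise ValueError("no measurements supplied")
    under = sum(1 for t in response_times if t < limit_seconds)
    return under / len(response_times)


def meets_sla(response_times, limit_seconds=30.0, target=0.98):
    """True when the observed compliance reaches the target fraction."""
    return sla_compliance(response_times, limit_seconds) >= target


if __name__ == "__main__":
    # Made-up sample: 97 fast responses and 3 slow ones.
    sample = [12.0] * 97 + [45.0] * 3
    print(sla_compliance(sample), meets_sla(sample))
```

Reporting the compliance fraction per interval, rather than a mean response time, keeps the report aligned with how the requirement itself is worded.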
Survival Analysis is a statistical technique that deals with biological events and mechanical failure. It uses a nonparametric approach that handles continuously distributed and discrete data, detects significance, enables predictions, evaluates time to events (response time) and allows event censoring (equipment swaps). This paper will provide insight into how Cox Regression Analysis works and how it was used with SAS to identify critical components of computer performance and capacity problems. This methodology worked and helped improve performance in previously intractable areas.
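The paper applies Cox regression in SAS. As a simpler illustration of how survival analysis handles censored observations, the sketch below swaps in the Kaplan-Meier estimator, a different nonparametric survival technique, written in plain Python. The function name and the sample data are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time until the event, or until the subject was censored
    observed:  True if the event happened, False if censored
               (for example, equipment swapped out before failing)
    """
    pairs = sorted(zip(durations, observed))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events = 0
        leaving = 0
        # Group all subjects that share the same time t.
        while i < len(pairs) and pairs[i][0] == t:
            if pairs[i][1]:
                events += 1
            leaving += 1
            i += 1
        if events:
            # Multiply by the fraction of at-risk subjects surviving t.
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without counting as events.
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    print(kaplan_meier([1, 2, 3, 4], [True, False, True, True]))
```

Censored subjects reduce the number at risk for later times but do not lower the survival estimate themselves, which is the property that lets swapped equipment stay in the analysis.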
9098 - Performance Engineering and Load Testing
The Software Tuning Agent (STA) is a performance analysis tool that infers an explanatory statistical model from hardware performance counters. It builds supervised learning models for single or multiple workloads. STA identifies and ranks performance events (e.g., branch mispredicts, cache misses) in terms of their contribution to execution cycles and optionally locates their source-code origins to aid in optimization. This is done through the use of model trees to account for the multi-phase nature of workload performance and for statistical interactions between performance events.
Customers expect a fast response from their systems, but what are the limits of the system? The paper analyzes a multi-class queuing model, which evaluates whether jobs’ response time goals are reachable and, if not, what would be, in some sense, optimal alternatives. This model may be useful to estimate the limits of tuning, to compare alternatives between hardware upgrade and tuning, or to estimate the minimal level of hardware upgrade needed to reach a given set of response time goals.
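The paper's model is not specified in this abstract. As a rough illustration of testing whether response time goals are reachable, the sketch below assumes a single processor-sharing server with open workload classes, for which the textbook result is R = S / (1 - U) per class, with U the total utilization. The function names, class names and numbers are hypothetical.

```python
def class_response_times(classes):
    """classes: {name: (arrival_rate_per_sec, service_time_sec)}.

    Assumes one processor-sharing server shared by all open classes.
    """
    utilization = sum(rate * service for rate, service in classes.values())
    if utilization >= 1.0:
        raise ValueError("server is saturated; no steady state exists")
    return {
        name: service / (1.0 - utilization)
        for name, (rate, service) in classes.items()
    }


def unreachable_goals(classes, goals):
    """Return {name: predicted_time} for classes that miss their goal."""
    times = class_response_times(classes)
    return {name: times[name] for name in goals if times[name] > goals[name]}


if __name__ == "__main__":
    workload = {"online": (2.0, 0.1), "batch": (0.5, 0.6)}
    goals = {"online": 0.15, "batch": 2.0}
    print(class_response_times(workload))
    print(unreachable_goals(workload, goals))
```

In this simplified model a goal smaller than the class's own service time can never be met by reducing contention alone, which gives one concrete sense of a limit of tuning.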
This paper is a continuation of several papers I have written at previous conferences on the use of "R" (see http://www.r-project.org/) for performance analysis and data visualization. There were a couple of papers at CMG 2008 that got me thinking about how the functionality could be done with "R". Specifically these were papers on the use of pivot tables in Excel and on using "sparklines" to present information in a concise format. Through the use of examples, I hope to whet the reader’s appetite in considering adding "R" to their toolbox.
9095 - Management and Reporting
ITIL based process definitions have measurement & reporting mechanisms like Key Performance Indicators, Service Level Measurements, etc. As organizations adopt ITIL & mature, they realise a need for a “governance enabler” that takes a holistic view of all the measurement mechanisms & provides a single objective view of the process. ITIL Process Audit framework is a tool to ensure transparent measurement of process efficiency & effectiveness. This paper discusses Primer to Audit framework development, Developing an Audit framework, Executing the audit & Measurement and management reporting.
Every datacenter faces pressure to reduce costs as IT budgets are cut. In addition, datacenters cannot avoid paradigm shifts in IT technology such as virtualization and cloud computing. To resolve these impending IT issues, every datacenter needs automation. Datacenter automation increases operational productivity. It also keeps up with emerging technology by raising infrastructure agility. This session reviews strategy, organization, methodologies, measurement and evaluation for successful datacenter automation as a life cycle of automation implementation.
Cloud computing is based on the notion of shared computational, storage, network, and application resources provided by a third party. This paper explores in detail the concept of cloud computing, its advantages and disadvantages and describes several existing cloud computing platforms. It then describes quantitative experiments carried out using PlanetLab, a cloud computing platform. The paper concludes by discussing how the methods of capacity planning are impacted by the advent of cloud computing.
9092 - Management and Reporting
This article explains the use of information and data in the Capacity Management process (CMP). This data and information is known as metrics. These metrics are often used during the standard management and reporting process. The article recalls a basic principle: if something cannot be measured, it cannot be planned for. We apply this principle to our CMP and give more detail on how metrics are used and presented during the execution of the Capacity Management life cycle (delivering the right information to the right audiences).
Cloud computing represents a paradigm shift in application software architecture. Although one can hide the complexities behind developing, deploying and maintaining the underlying, Internet-based cloud services exposed to users transparently, one cannot build a highly reliable, dynamically scalable cloud computing infrastructure without considering various performance and scalability challenges. In this paper, some of the essential performance and scalability prescriptions for cloud computing are discussed to help design and develop high-performance, scalable cloud-based services.
To many application developers, an Oracle database is just a "data store" with an API that they call when they need to persist an object. It’s a helpful abstraction for managing functional complexity, but it can lead to some horrible performance problems. You can avoid those problems by better understanding what’s going on inside the Oracle kernel. It’s not that hard to do. The key is understanding how to measure how your code spends time inside Oracle. Once you’ve done that, your response time profile leads you exactly to your performance improvement opportunities.
9089 - Performance Engineering and Load Testing
Performance testing forms a fundamental part of any organization’s software development lifecycle. Running a test is only a small part of the process. Understanding the workload mix and the data flow, and performing risk assessment, are also essential. Once tests have been run and data collected, it is vital to analyze the data properly and present it with clarity. This paper lists guidelines that can be followed to run performance tests successfully. It covers creating better test plans, data analysis and reporting to find bottlenecks that lie in the application.
This paper covers our experiences with supplementing 3rd party tools with our own tool and processes. It will discuss the use of both application and system data for capacity management, anomaly detection/analysis and performance optimization. It will discuss a reporting architecture including an early detection feature leveraging "R" statistical analysis/graphics package and also a unique approach to the reporting user interface. Finally, it will include case studies related to our experiences monitoring over 2000 hosts running various versions of UNIX and Windows.
9087 - Performance Engineering and Load Testing
This paper discusses recommendations and considerations to plan and prepare for valuable performance and load testing. It includes a guide to identifying and developing the: 1. Testing Purposes 2. Roles and Responsibilities 3. Business Performance Requirements 4. Scope of Testing 5. Testing Environment Requirements 6. Usage Patterns and Transaction Mix 7. Proposed Test Scenarios 8. Load Generation Requirements 9. Proposed Monitoring, Tracking, and Reporting Once this planning is done the scripting and execution is “easy”. Scripting not included.
When examining the CPU utilization of midrange servers in an enterprise, at first glance, they often appear to be significantly under-utilized. Is that really the case? This paper discusses how High Availability, Load Balancing, Multiple Standards (platforms, operating systems, middleware services, etc.), and Application Incompatibility affect the amount of computing capability which is actually usable. Can you explain to executive management why 25-30% peak CPU utilization is reasonable if you engage in a lot of HA, load balanced, and multi-platform solutions? Would you like to be able to?
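The arithmetic behind such a low peak utilization can be sketched as follows. This is an illustrative Python calculation, not the paper's method: it assumes load is spread evenly, that each site must survive a given number of server failures, and that the remaining sites must absorb the load of one failed site at the same time. The function name, parameters and numbers are hypothetical.

```python
def usable_peak_utilization(safe_ceiling, servers, failures_tolerated, sites=1):
    """Highest normal-day per-server utilization that still leaves room
    for failover, under the assumptions stated in the lead-in.

    safe_ceiling:       utilization a server may reach after failover
    servers:            load-balanced servers per site
    failures_tolerated: server failures each site must absorb
    sites:              number of sites; one whole site may be lost
    """
    if not 0 <= failures_tolerated < servers:
        raise ValueError("failures_tolerated must be below server count")
    if sites < 1:
        raise ValueError("at least one site is required")
    # Survivors within a site take over the failed servers' share.
    within_site = safe_ceiling * (servers - failures_tolerated) / servers
    # Surviving sites take over the failed site's share.
    across_sites = (sites - 1) / sites if sites > 1 else 1.0
    return within_site * across_sites


if __name__ == "__main__":
    # Four servers per site, one failure tolerated, two sites,
    # and an 80% post-failover ceiling.
    print(usable_peak_utilization(0.8, 4, 1, sites=2))
```

With these made-up inputs the usable peak comes out near 30%, in the range the abstract mentions, before any allowance for platform standards or application incompatibility.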
This paper presents approaches to consider as you decide the most efficient and effective Capacity Planning method to use to evaluate a physical environment that is shared by many workloads. Which of the following approaches will work best for your environments? • Analyze the system as though it contained one large workload • Analyze each workload individually • Analyze the two or three largest, most important, or most representative workloads While considering these options, keep in mind that the potential benefits achieved by the analysis should justify the time and effort invested.
The efficient utilization of computing resources is a major challenge faced by organizations today, as many resources remain idle while consuming power. Software testing requires dedicated test labs with a need for constant scale-up of resources. This paper discusses an approach that combines Grid Computing with Virtualization technologies to create Virtual Test Labs that use the spare CPU cycles of underutilized systems. Virtual Test Labs powered by idle desktops reduce operating cost and energy expenses, with the added benefit of a positive impact on the environment.
The paper discusses implementation strategies for Solid State Devices, considering both the performance improvements that the SSDs provide and the reduction in hard (spinning) disk activity that results from moving the most active data sets to solid state. The paper will discuss the pros and cons of data set, volume and storage group level migration approaches. The paper will show the savings in disks that are possible, using actual customer data.
A discussion of the architecture, installation and usage of Hyper-V with special reference to the performance metrics that Microsoft provides. The management tools that are available for Hyper-V are described. This paper describes various load tests that were run on Hyper-V and analyzes the performance data captured. Both Windows and Linux guests were used within these tests. It highlights the issues the capacity analyst faces when trying to understand guest domain performance. Some general guidelines on what to look for when assessing Hyper-V performance are discussed.
Since 2004, we have provided a SaaS-style Internet banking system to Japanese banks. The system serves 130 large and small banks and 60,000 users, with more than 300,000 page views per day on a single server. As the number of users grew, we faced several performance problems. Although it was difficult to reproduce the system load, we still needed to carry out capacity planning. This paper describes how we developed a capacity plan by analyzing the factors that caused performance problems in the past. We are now able to prevent performance problems proactively.
This paper evaluates a capacity forecast based on resource utilization in the test environment and extrapolates it to production based on load profiles. Statistical analysis of the performance test outcome computes a growth factor based on the rise of resource utilization as load grows, while the scaling factor is based on historical production resource utilization compared with a similar load on test. The production capacity is predicted from the growth and scaling factors together. Business transactions and volumes in line with the market forecast are pumped into the test environment.
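One way to read the growth and scaling factors described above is sketched below in Python. This is an interpretation under stated assumptions, not the paper's actual procedure: the growth factor is taken as the least-squares slope of test utilization against load, and the scaling factor as the ratio of production to test utilization at a comparable reference load. All function names and numbers are hypothetical.

```python
def fit_line(loads, utilizations):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(loads)
    mean_x = sum(loads) / n
    mean_y = sum(utilizations) / n
    sxx = sum((x - mean_x) ** 2 for x in loads)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(loads, utilizations))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_production_utilization(test_loads, test_utils,
                                   prod_util_at_ref, test_util_at_ref,
                                   forecast_load):
    """Extrapolate test results to production at the forecast load."""
    # Growth factor: utilization added per unit of load in test.
    growth, intercept = fit_line(test_loads, test_utils)
    # Scaling factor: production versus test at a similar load.
    scaling = prod_util_at_ref / test_util_at_ref
    return scaling * (intercept + growth * forecast_load)


if __name__ == "__main__":
    # Made-up test runs: CPU % measured at 100, 200 and 300 users.
    print(predict_production_utilization(
        [100, 200, 300], [10.0, 20.0, 30.0],
        prod_util_at_ref=15.0, test_util_at_ref=20.0,
        forecast_load=400))
```

A straight-line fit is the simplest choice and only holds while the resource is far from saturation; a measured curve that bends upward would call for a different model.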
9079 - Management and Reporting
With the rising consumption and price of electric power, electricity bills have become a significant cost for today’s data centers. We present a framework to evaluate the power consumption of storage systems. The framework takes four factors into consideration: 1. Configuration of the storage system, such as adding or removing a disk or disk enclosure. 2. Performance and capacity of the storage system, measured by IOPS and throughput. 3. Typical I/O activity, since different I/O activities need different amounts of power. 4. Active disks in the storage system, under different RAID configurations.
While much hype exists in the computer industry around cloud computing, several emerging cloud computing models are becoming popular with early adopters. This paper provides an overview of cloud computing, a discussion of the different types of cloud computing models, and the impact of these models on capacity managers. Specifically, the paper will discuss how the three sub-processes of the ITIL model for IT computing are impacted by the three emerging cloud computing models.
9077 - Performance Engineering and Load Testing
Performance and availability monitoring are vital to ensure smooth running of all the business critical applications of an enterprise. Amid evolving application functionality and technology upgrades, it becomes difficult to manage application performance using primitive performance management techniques. It therefore becomes necessary to implement a cost effective monitoring solution that can adapt to today’s complex computing environments. This paper presents an approach to monitor performance and availability of globally deployed applications using HP LoadRunner and custom built utilities.
Server virtualization is commonly used for server consolidation. VMware ESX Server is the leading solution for Windows platforms. In general, the best practice is not to consolidate I/O intensive workloads onto ESX Server. However, there is no quantified measurement so far of the overhead incurred by I/O activities. Based on theoretical analysis and benchmark testing, this paper quantifies the overhead caused by I/O, and provides guidelines on how to consolidate I/O intensive workloads onto ESX Server.
VMware ESX Server is a leading solution for server consolidation on x86 platforms. Most performance analysis so far has focused primarily on CPU. In reality, memory poses more severe constraints on capacity. This paper discusses the major performance concerns regarding memory based on the architecture of ESX Server memory management. Although VMware Virtual Center provides rich memory metrics for both the host and each individual guest, there is no clear explanation of their meaning. With benchmark testing results, the paper studies these metrics and gives guidelines on memory analysis.
Traditional Performance and Capacity Management (PCM) techniques have been (by and large) successfully adapted to today’s complex and interconnected computing environment. However, there are “clouds” on the horizon that will make PCM more difficult in the future. New technologies such as SaaS, virtualization, and cloud computing increase complexity even further and introduce new challenges such as loss of control over portions of your services. This paper will discuss these factors and present some strategies and tactics for effective PCM in these environments.
We consider issues that may arise in the I/O workload characterization for Parallel Access Volumes (PAVs). While for a single exposure I/O device service time characterization limited to the mean and the standard deviation is sufficient to predict the average I/O time in a realistic model, it is inadequate with PAVs. We also consider methods for matching real-life workload distributions to phase-type distributions, used to enable numerical solutions of queues with multiple servers, and we point out potential pitfalls related to their use in modeling the performance of PAVs.
Memory stubs are a specialisation of dynamic performance stubs, which provide a framework for the simulation of the performance behaviour of software modules and functions. They can be used for a cost-benefit analysis of the gain from performance optimization and therefore for a gain-oriented performance improvement. It is also possible to identify hidden bottlenecks and the most relevant optimization candidates. This paper discusses a possibility to simulate the memory and data cache access behavior and provides methodologies for using memory stubs to optimize memory bound systems.
9071 - Performance Engineering and Load Testing
In this paper, we first evaluate the implications of EPT for vConsolidation on a system based on the latest Intel processor. Second, we focus on characterizing vConsolidation with EPT enabled. The paper contains a first set of scaling characterization data for vConsolidation. Finally, we give a simple model to predict vConsolidation performance in the future.
9070 - Performance Engineering and Load Testing
Once a complex web-based system slows down, the system’s complexity makes it very hard to figure out why the slowness occurs and how to improve response time. However, there are easy ways to find what causes the system to slow. TCP/IP flow patterns and Java application response time patterns give useful clues. This research traces the patterns that initiate response time slowness in an online web-based system and offers guidance on solutions that help systems respond more quickly.
The paper describes methods for reviewing Oracle PL/SQL procedures effectively to obtain the maximum improvement in execution time, based on a performance-driven development methodology.
Next generation business applications are based on highly distributed multi-tiered architecture. It is vital to mitigate any potential performance risks by evaluating system scalability in terms of hardware and software resources. The JVM must be sized to meet the expected load for business applications. This paper demonstrates a modeling and simulation based approach for capacity planning of JVM resources, the challenges faced, and the benefits thereof. It is intended to help capacity planners and system designers to explore a different methodology for managing systems capacity.
Your z/OS System and WLM manage different types of transaction and server workloads with multiple dispatchable units - TCBs, SRBs, enclave SRBs. Multi address space application scenarios use a combination of velocity and response time goals across multiple WLM service classes to manage performance. In addition, some of these workloads are also eligible to be redirected to zIIP and zAAP specialty engines on your System z processor. Let’s connect all these pieces together to understand WLM management of enclaves and what makes work eligible for zIIPs and zAAPs.
Cloud computing is the latest buzzword in the industry. The appeal of the services available from Google and Amazon has fired users’ imagination and created expectations for every data center. Just what is cloud computing? How is it going to affect my mainframe environment? How is it going to change the way my business operates? Where does SOA, service-oriented architecture, fit into all this? This session will talk about cloud computing and SOA, how they are changing the industry and how they are likely to affect mainframe IT professionals.
In lean times of frugal economic measures, it is essential to focus on effective capacity management practices. In enlightened times of sustainability, it is an advantage to satisfy ‘green’ criteria. In tough times it is wise to adopt a mean approach to all practices. Thus, the pragmatic ‘lean, mean, green solution’ is to promote traditional core practices, updated as appropriate. Based on a new book (‘Capacity Management – Best Practice’), this session highlights practical experiences in capacity management consultancy and gap analyses at a number of sites.
Simulation models offer a number of benefits for performance modelling and forecasting and should therefore appeal to a large body of practitioners in this space. At the same time, practical considerations in developing simulation models and the associated costs and effort often prevent their more widespread use in a typical computing environment. This paper presents some common challenges and possible solutions when accounting for workload and the infrastructure detail. Finally, an implementation of a performance simulation model based on the Ptolemy II discrete event model is outlined.
This paper will focus entirely on the dark side of SOA, the most expensive application architecture thought out by man(?). So far. If you want to hear about the benefits of SOA, go somewhere else. This is the other story - the one you wish you’d heard before entering SOA. And it’s all 100% real life examples. Of clever, well educated and highly ranking professionals wasting millions of dollars. It’s also a little bit about how to avoid that.
Using Business Intelligence principles and functionality for IT management reporting can quickly enhance the understanding of "who’s using what, for how much and why?" throughout the entire organization (also on the business side). The primary purpose is, of course, to cut IT operational costs without negatively impacting the business. But one side effect, the basis for a better dialog between business and IT, may turn out to be the real long term catalyst for innovation in IT operations. This paper includes real life examples from different business types, installations and platforms.
Trading volumes today are rising by more than an order of magnitude. Next generation trading systems are expected to process more than a million orders/second with end-to-end latencies of less than 1 ms. A messaging framework is the backbone of a trading system. It enables communication between the various processes that make up a trading application. Hence it is essential that the messaging platform is capable of meeting more than just the end-to-end throughput and latency requirements. In this paper the authors share their experiences in building such a messaging solution.
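Little’s law (average number in the system = throughput × time in the system) gives a feel for what the throughput and latency targets above imply together. The short Python sketch below applies it to the figures quoted in the abstract; the function name is hypothetical and the calculation is an illustration, not part of the authors’ solution.

```python
def orders_in_flight(throughput_per_sec, latency_sec):
    """Little's law: mean number in the system = throughput * latency."""
    return throughput_per_sec * latency_sec


if __name__ == "__main__":
    # One million orders per second at 1 ms end-to-end latency.
    print(orders_in_flight(1_000_000, 0.001))
```

At these targets roughly a thousand orders are in flight at any instant, so the messaging layer has to sustain that level of concurrency without letting queueing delay consume the latency budget.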
9060 - Performance Engineering and Load Testing
The MPI open-system message passing interface is the dominant application programming interface (API) in recent generations of high performance computers (HPCs). MPI has a key role in exploiting HPCs’ massive parallelism and contributes to application performance. This paper presents synthetic kernel and natural benchmark test results from a highly parallel system, “Discover,” located at the NASA Goddard Space Flight Center in Greenbelt, MD and the results’ implications for performance tuning. Application programmers and system administrators both have a role in MPI performance management.
9059 - Performance Engineering and Load Testing
Multi-core processors dominate current mainframe, server, and high performance computing (HPC) systems. This paper provides synthetic kernel and natural benchmark test results from an HPC system at the NASA Goddard Space Flight Center that illustrate the performance impacts of multi-core (dual- and quad-core) vs. single core processor systems. Analysis of processor design, application source code, and synthetic and natural test results all indicate that multi-core processors can suffer from significant memory subsystem contention compared to similar single-core processors.
How does an IT organization get away from reactionary-based WAG planning, and implement ITIL Financial Management practices to predict current and future activities and costs, and the impact of change? Effectively implementing the ITIL framework’s ITFM disciplines requires a capacity-based planning model approach that aligns IT workload activities to business needs, defines IT workload drivers and related activity cost, and provides the capability to quickly and accurately define current costs, and estimate the impact of change. This session will describe this new IT cost modeling solution.
E-mail gateways are security appliances that are placed between a corporation’s internal e-mail infrastructure and the outside world. The message handling capacity of such appliances is a function of several factors. Ensuring that e-mail gateways have the right capacity is important in a corporate environment where the timely delivery of e-mail to business partners is vital to success. This paper describes the methodology used for evaluating and monitoring e-mail gateways, and shows a case study in planning for a number of business continuity scenarios at a large financial services company.
This paper describes the capacity planning methodology for providing Video-on-Demand (VoD) at a large financial services organization with over 40,000 employees and over 20 million customers. The methodology includes a trade-off between the cost of bandwidth and the cost of network caching devices. Selection of network caching devices, as well as their proper configuration and location, were key to providing VoD capability in a cost effective manner. Problems arising from high concurrency were addressed by modifying user behavior and by VoD content management.
9055 - Management and Reporting
With IT systems becoming the backbone of business operations, their performance plays a key role in driving business growth. Understanding this fact, organizations adopt different performance management strategies; however, in the long term they find it difficult to evaluate and improve them. This paper presents a strategy that will help organizations assess their processes on the key dimensions and provides guidance on evolving the management strategy to earn better returns on IT performance expenses. The paper also discusses a case study to explain the presented evolution strategy.
For enterprises, maintaining the health of servers is of utmost importance in meeting stringent performance SLAs. Though simple monitoring of servers and applications assists in identifying performance issues early, it does not give enterprises the timely and holistic view needed to overcome performance degradation and server downtime. So the need of the hour in IT operations is to comprehend, well ahead of time, the varying capacity requirements that result from business growth and transitions. Here we propose a solution for such proactive, continuous and holistic capacity management.
Traditional computer infrastructure management has been implemented by measuring and monitoring the infrastructure itself. The emergence of cloud-based services brings to the forefront the necessity to understand and manage the infrastructure at the level of business services, which are the services that the infrastructure delivers to the business. By managing at this higher level, “above the cloud,” the value of computing to the business becomes clear. This paper explains the concept of “business service governance” and a real-time “sense and respond” loop.
This paper describes the tools and techniques used for capacity planning of internet bandwidth at a large financial services organization with over 20 million customers. Since 1996 the variety and complexity of Web transactions has increased rapidly, along with network data volumes. Now over eighty percent of the customer contacts come through the company’s Web portal. In order to achieve 99.999% availability during stock market hours, a number of techniques have been used, including redundant network devices and firewall components, and routing over multiple Internet Service Providers.
9051 - Performance Engineering and Load Testing
The Storage Performance Council (SPC) is a cross-vendor team of storage performance experts that has built the industry’s first benchmarks for storage. These benchmarks are the standard for decision making in many organizations. The SPC has used real-world workloads as the basis of the benchmarks that are vendor-neutral and platform independent. Many SPC-1, SPC-2 and SPC-C (component benchmark) results have been published to date. This panel session will discuss the status of the SPC and the benchmarks available and under development including SPC-3 and the Energy Extension to SPC-C.
This is a real user experience. The initial need was to consolidate the usage of a very widespread and well-known suite of software products (SAS) onto a single mainframe machine, mostly for cost reasons. We will show all the steps and the methodology used to assess its usage, the capacity planning study to check whether we could afford the consolidation, and which workloads were going to be affected by the resultant CPU shortage. We close with the achieved results.
The mainframe can be a very smart platform for the integration of energy efficiency and performance. Many organizations are reaching the limits of available space and power at their data centers. With server virtualization and consolidation capabilities and a green footprint, the mainframe is well suited to address these requirements. This paper will discuss a case study of the impact of additional memory on the performance and energy efficiency of a workload using energy management tools and techniques.
Applications are the reason that IT infrastructure exists, so their performance is business critical. Applications are also the area where tuning can give the best results in terms of resource (and financial) savings. For these reasons a large part of EPV for z/OS is designed to allow users to control application performance and resource consumption and to help them quickly identify tuning opportunities and abnormal behavior. In this paper many techniques to control applications will be discussed.
eBay’s older appliance-based caching technology was migrated to Squid proxy caching. The result was a 34% reduction in power consumption, 49% improvement in capacity headroom, up to 75% reduction in rack space, improved response time, and lowered overall cost of operation. This paper describes the implementation approach, but concentrates on the evaluation criteria, the measurement techniques, and the quantification of performance gains from rolling out the Squid proxy on a commodity hardware platform. The authors do not intend to cover architectural and roadmap details that are confidential.
The production LPAR is running at or close to one hundred percent and CICS response times are elongating and degrading. The customer wants to know what can be done to improve CICS response times short of installing an upgrade. This paper will address how, through the use of Workload Manager, CICS response times and throughput were improved while the processor was running at above ninety-five percent capacity. A secondary benefit was the ability to create reports to better isolate and track the individual CICS complexes within the environment.
This paper presents a method for performance modeling and capacity planning for an e-business application built with an SOA architecture. Because an SOA application is built from many services and components, determining the best topology for deploying SOA components for optimum performance is always a challenging task for an SOA architect at the architectural stage of the software development life cycle. The method proposes a proactive approach, based on a Layered Queuing Network model, to performance modeling of an SOA application and capacity planning of servers at the architectural stage.
The performance of computer systems and communication networks depends on the speed of individual hardware components and the processing demands of user workloads. Most user workloads fluctuate unpredictably. These fluctuations are traditionally characterized by random variables, leading to stochastic models of system performance. This paper presents an alternative characterization of uncertainty that does not employ random variables or other standard probabilistic concepts. The approach provides a new way to separate risky predictions from safer predictions derived from the same model.
Capacity planning/modeling was first developed in order to predict system responsiveness for interactive work. But many businesses have critical work which runs periodically (daily, monthly), non-interactively, and with an elapsed time requirement (“batch window”). Elapsed time prediction utilizes different modeling and analysis techniques when compared with interactive workload analysis. These techniques are outlined and applied in three case studies (UNIX/Windows systems): (1) one very resource-intensive job (2) multiple simultaneous jobs and (3) hundreds of smaller jobs.
9042 - Performance Engineering and Load Testing
Agile methodologies are one of the hot topics in the software development field right now. The problem for people whose main occupation is system performance is that Agile doesn’t consider performance one of the issues to be addressed by a software development methodology. To make things worse, following an Agile methodology can create resource-intensive, non-scalable applications, which are the nightmare of any capacity planner. The paper discusses why Agile methodologies are so dangerous, and how to prevent and remedy that.
9041 - Management and Reporting
The current economic situation has forced IT organizations to critically look at their IT service management and how it can be transformed to reduce costs, people and resources without having an adverse impact on IT service delivery. This presentation discusses the authors’ experiences of the Mantras - “Strategy”, the Astras – “Tool” and the Shatras – “Techniques” that have helped some of the leading IT organizations achieve up to 30% productivity improvement in their IT operations especially at the Level 1 Service Desk.
9040 - Performance Engineering and Load Testing
This is a real-life example of how we reduced batch elapsed time for a US-based financial organization. The application was already tuned to its best, so accomplishing the goal was hard. We adopted a two-phase approach to solve the issue. We first looked into conventional ways of shortening the batch cycle: optimizing I/O, increasing parallelism and improving application efficiency. However, we achieved the most significant benefit by eliminating the DB2 database dependency of one of the most time-consuming jobs, using DB2 unload files instead of SQL to reduce I/O.
How does one go about applying current generation query tools to real world issues regarding the z/OS file system? Well, you dive into acronym hell, that’s how! See how a cocktail of ETL, SQL, TCP, DCL, VTOC, VVDS, DSCB, and SMS might yield some surprisingly interesting answers to your questions about z/OS file system utilization, chargeback, and capacity.
9038 - Management and Reporting
The information in this paper is based on my experience ‘fixing’ the storage consumption process for a large managed infrastructure services environment. The names have been changed to protect the innocent but the processes, tools, and data issues are very real. The accuracy of storage consumption reporting used for chargeback is essential in maintaining IT’s reputation and customer trust. This presentation discusses the challenges, processes and tools involved when establishing a robust SAN storage consumption process.
9037 - Management and Reporting
The paper will focus on two capacity management tools developed for an enterprise Windows environment. The first tool scans through the MOM database to find the number of samples over predetermined thresholds for specific Windows metrics. The other tool is a SAS program that produces heat charts for Windows servers. The scanner tool identifies servers that may exhibit potential capacity problems. The heat charts are used for subsequent lower-level capacity analyses.
IBM’s Design Center data center was running out of cooling capacity and needed to address continuing IT growth requirements. Energy efficiency for data centers has become an imperative area of focus as the price of energy increases and systems grow beyond the capacity of current facilities to supply their power and cooling needs. This case study discusses servers, storage, center layout, and thermal analysis and assessments. Energy management techniques including virtualization, consolidation and cloud computing are demonstrated, and measuring, monitoring and managing tools are highlighted.
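The scanning step described above can be sketched as follows. This is a minimal illustration only: the record layout (server, metric, value), the threshold values and the function name `count_threshold_breaches` are assumptions made for the example and are not taken from the paper or from the MOM database schema.

```python
from collections import Counter

# Hypothetical thresholds per metric; real values would be site-specific.
THRESHOLDS = {"cpu_pct": 80.0, "mem_pct": 90.0}


def count_threshold_breaches(samples, thresholds=THRESHOLDS):
    """Count, per (server, metric), the samples that exceed the metric's threshold.

    `samples` is an iterable of (server, metric, value) tuples. Metrics
    without a configured threshold are ignored.
    """
    breaches = Counter()
    for server, metric, value in samples:
        limit = thresholds.get(metric)
        if limit is not None and value > limit:
            breaches[(server, metric)] += 1
    return breaches


# Made-up sample data, standing in for rows read from a monitoring database.
samples = [
    ("web01", "cpu_pct", 95.0),
    ("web01", "cpu_pct", 85.0),
    ("web01", "mem_pct", 50.0),
    ("db01", "cpu_pct", 40.0),
    ("db01", "mem_pct", 93.0),
]

breaches = count_threshold_breaches(samples)
for (server, metric), count in sorted(breaches.items()):
    print(server, metric, count)
```

Servers with high breach counts would then be the candidates for the lower-level analysis that the heat charts support.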
9035 - Management and Reporting
Mainframe computing is going to be with us indefinitely, but there still is a need to develop new professionals and to familiarize them with a working knowledge of z/OS. Education in the elements that are peculiar to the mainframe platform is necessary, but the veteran practitioners must still provide guidance to the newer professionals entering this arena.
z/Series is here to stay, and it is now being used more and more as a platform to support World Wide Web activity with the use of WebSphere. Some basic measurements should be undertaken to report on, and effectively plan for capacity by measuring WebSphere application activity both on an enterprise basis and by lines of business.
9033 - Management and Reporting
Processes are dangerous if we start seeing implementation as the end rather than the starting point. Using problem management as an example, this presentation follows the evolution of a process from first steps to a mature, growing entity. It looks at why processes must be continually improved, the danger of believing processes will always deliver, and the fact that it’s people and processes together that truly deliver world-class results. It finishes with a look at process measurement: every process has a measure, every measure has a target, every target drives behaviours, and behaviours can hide the true picture.
9032 - Management and Reporting
Ever seen a great presenter and wondered – how does he do it? Drawing on experience from comedy club stand-up routines, Toastmasters and workshops, we will help you make your next presentation a career-builder. The speakers discuss preparation and speaking techniques and will also examine what makes PowerPoint sparkle. Glenn Anderson is a long-time instructor and presenter with a wealth of experience developing and delivering very technical material. Denise Kalm also has experience with technical speaking, as well as presenting general interest subjects to demanding audiences.
9031 - Management and Reporting
To remain competitive, aerospace firms are automating business processes with ever increasing levels of data and application integration, resulting in very large systems. They are also extending that integration to their partners around the globe. IT organizations must ensure such systems are available, yet cost effective. This paper is a study of an ISO9000 registered IT service implementation that provides measured quality, cost and cycle time for the service as well as continuous improvement. The result maximizes investment effectiveness and minimizes disruption to the revenue stream.
9030 - Management and Reporting
Are mainframe environments more expensive than other platforms? It doesn’t really matter what you believe. What is really important is to find a way to reduce IT costs; otherwise there are many risks you could incur, and eventually your datacenter could even be outsourced! Based on a real-life experience, this presentation describes the results of an analysis focused on reducing the costs of a mainframe infrastructure. The number of different ways to reduce IT costs we discovered will amaze you. Come and share our experience.
Parallel Sysplex has been an integral part of MVS systems for over a decade - primarily to improve the availability of systems and applications. Many smaller MVS installations without stringent availability requirements have not implemented Sysplex. This paper takes a look at using the Parallel Sysplex Resource Sharing and the Coupling Facility technology to improve the performance and throughput of MVS systems and applications without increasing the MIPS or MSUs of the processing complex.
9028 - Management and Reporting
A performance test is run and the results are reported, giving the users information about the test. The reported results show numbers that range from less than a second to a minute for certain actions. One user of the system will determine that the results are too slow and that something is wrong, while another user will accept the results, not knowing that there is a problem. It is proposed that part of the reason for this is that the results are quantitative in nature and do not give the users information about what is really seen in the application when the final report is complete.
Performance modeling is mostly used to predict system performance in scale-out environments, not in scaled-up systems, due to insufficient data. Besides, industry benchmarks are used to linearly estimate application utilizations on different hardware, with no insight into other performance metrics. Here we present a combination of these two to predict the application’s response time, throughput, utilization, etc. in vertically scaled systems. We illustrate the methodology by comparing the model’s output with actual test results for a standard J2EE application on various machines.
The LQN model is very effective for performance modeling of distributed systems. However, it lacks the depiction of some real-world scenarios such as heterogeneous clustering, service demand-hardware mapping, and re-use of existing components in multiple models. To address these, a plug-n-play approach is proposed, which de-couples LQN components for direct mapping to the system architecting process. The SPEC2004 model is used to demonstrate the flexibility it provides in tying together available software components for required functionality, deploying them on different hardware and performing what-if analysis.
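To make the idea of combining benchmark scaling with a performance model concrete, here is a minimal sketch. It is not the authors' method: it assumes a single CPU-bound resource, a service demand that scales inversely with a benchmark score, and an open single-queue (M/M/1-style) response time formula. The function names and the numbers are hypothetical.

```python
def scaled_service_demand(utilization, throughput, score_old, score_new):
    """Estimate the service demand (seconds per request) on new hardware.

    The Utilization Law gives the measured demand D = U / X. The demand is
    then scaled linearly by the ratio of benchmark scores (an assumption).
    """
    demand_old = utilization / throughput
    return demand_old * score_old / score_new


def predict(throughput, demand):
    """Return (utilization, response time) for an open single-queue model."""
    utilization = throughput * demand
    if utilization >= 1.0:
        raise ValueError("offered load saturates the resource")
    return utilization, demand / (1.0 - utilization)


# Made-up measurement: 60% CPU busy at 30 requests/s on a machine that
# scores 100 on some benchmark; the target machine scores 200.
new_demand = scaled_service_demand(0.60, 30.0, score_old=100.0, score_new=200.0)
new_util, new_response = predict(30.0, new_demand)
print(f"demand={new_demand:.4f}s utilization={new_util:.2f} response={new_response:.4f}s")
```

The benchmark ratio alone would only give the new utilization; feeding the scaled demand into a queueing formula is what also yields a response time estimate.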
DTrace™ is a powerful diagnostic tool introduced in Solaris 10. Since its introduction, it has been implemented in other operating systems, the most noteworthy being FreeBSD™ and Mac OS X™. In this paper the author will use DTrace to analyze several applications. The author will emphasize practical use of DTrace as opposed to an introduction to language syntax. Using mpstat and prstat information, the author will illustrate what questions one should ask and how one can use DTrace to answer these questions. These DTrace inspired answers will lead to increased performance.
9024 - Performance Engineering and Load Testing
The size and complexity of software have increased tremendously, and it is common to have applications built with millions of lines of code. Similarly, quality concerns about the released product are also increasing. An ad hoc approach to testing is still very common in the industry. Mature and effective testing methods can improve the quality of the software produced. The domain is medical imaging. The applications were developed in .NET and based on a client-server architecture. The engineering approach I have chosen for performance testing is called the “Measurement technique”.
9023 - Management and Reporting
Optimizing the mainframe data center environment can bring substantial and immediate savings with some incremental investment. Unfortunately, many IT organizations are not proficient at utilizing the levers available to reduce costs. During this interactive session we will explore opportunities for optimization in both outsourced and insourced environments along with the steps needed to implement your own successful mainframe optimization program.
9022 - Performance Engineering and Load Testing
An earlier paper provided an introductory tutorial on using the open-source tool JMeter to load test a web application. But that paper just scratched the surface of what JMeter is capable of. This paper continues where that paper left off, describing how you can make use of some of the other capabilities of JMeter to load test applications. This paper is provided as a tutorial giving specific steps to accomplish various tasks.
A common issue with getting Java applications to adhere to service level response time agreements is taming the garbage collection pause time. This paper examines ways to reduce the pause time, mentioning various options that are helpful in this regard. In addition, this paper covers the Concurrent Mark-Sweep collector and the new Garbage First collector, both of which perform most of their collection work concurrently with the application to keep pauses short.
This presentation demonstrates the advanced techniques applied using Excel 2007 to identify underutilized servers in a large server farm. Server performance metrics were collected from each of the 350 servers, running on at least three different operating systems, using various collection tools and stored as CSV files. We will show how Excel 2007 can be used in various stages, starting from data ETL and continuing through model building with what-if capability, analysis and validation.
This paper presents a regression technique for analyzing large volumes of historical time series data to compute monthly seasonality factors to be used in transaction forecasts for various midrange systems within a large GDS service provider. The proposed technique relies on analysis of air travel transaction data. The seasonality factors calculated will then be used in the internally developed forecasting tool as the organic growth component to forecast monthly TPS demand for each midrange system. This paper discusses data collection, model development in Excel and computational stages sequentially.
This presentation discusses the multi-core and multi-processing computing trends that affect application deployments, and their pronounced effect on the I/O subsystem. Topics such as data growth, access patterns, traffic characterization, and the working set are used to introduce the need for technologies such as larger drives, flash disks, multi-tiering, dynamic cache partitioning, virtual provisioning and the like. Further, it examines the technologies, their lifetime expectancies, and their suitability to various workloads and applications.
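One common way to derive monthly seasonality factors with regression is to fit a linear trend and average the actual-to-trend ratios per calendar month. The sketch below illustrates that general idea only; it is an assumption and not the paper's Excel model, and the function names and the synthetic series are hypothetical.

```python
from statistics import mean


def linear_trend(values):
    """Return the ordinary least-squares trend line evaluated at each point."""
    xs = list(range(len(values)))
    x_bar, y_bar = mean(xs), mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return [intercept + slope * x for x in xs]


def seasonality_factors(values, period=12):
    """Average the actual/trend ratio per position in the cycle.

    Factors are normalized so that they average to 1.0 over one period.
    Assumes the fitted trend stays positive over the whole series.
    """
    trend = linear_trend(values)
    ratios = [[] for _ in range(period)]
    for i, (actual, fitted) in enumerate(zip(values, trend)):
        ratios[i % period].append(actual / fitted)
    raw = [mean(r) for r in ratios]
    scale = mean(raw)
    return [f / scale for f in raw]


# Synthetic three-year monthly series: steady growth plus a peak in month 12.
series = [(100 + 2 * i) * (1.2 if i % 12 == 11 else 1.0) for i in range(36)]
factors = seasonality_factors(series)
print([round(f, 3) for f in factors])
```

A forecast for a given month would then multiply the projected trend value by that month's factor.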
9017 - Management and Reporting
This session will provide you with an understanding of performance management, and why performance management should be a critical, ongoing activity for your IT organization. The capability of IT organizations to develop, and successfully adopt, a meaningful performance management framework is increasing, but the alignment of IT performance measures to organizational strategies and related measures often proves elusive. Determining what to measure is the first critical step for your IT organization.
9016 - Management and Reporting
This session will provide you with a practical approach to using selected components of ITIL V3 to improve your IT business practices. ITIL V3 is a comprehensive approach to improving service management across the lifecycle of a given service. While improving business practices is important throughout the service lifecycle, a number of specific ITIL processes are critical to business management. In this session, we use core components of the ITIL V3 framework to show you how they link together to create improved business performance for your IT organization.
Figuring out why a process in a batch window ended late often requires tracing back through the critical path to look for anomalies. In a complicated batch schedule, determining which jobs are on the critical path can be a seemingly daunting task. Various commercial products can be used for this task, but they may not be necessary. The code needed to determine which predecessor jobs constitute a particular job’s critical path is relatively, perhaps surprisingly, simple.
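As a rough illustration of how little code such a trace-back can take, here is a minimal sketch. It is not the paper's code: it assumes each job starts once its last predecessor has ended, so the critical predecessor at every step is the one with the latest end time. The job names, the data layout and the function name `critical_path` are hypothetical.

```python
def critical_path(job, predecessors, end_time):
    """Trace back from `job` through the latest-ending predecessor at each step.

    `predecessors` maps a job name to a list of its direct predecessors;
    `end_time` maps a job name to its observed end time (any comparable unit).
    Returns the path in execution order, ending with `job`.
    """
    path = [job]
    while predecessors.get(job):
        # The predecessor that finished last is the one that gated the start.
        job = max(predecessors[job], key=lambda p: end_time[p])
        path.append(job)
    return list(reversed(path))


# Made-up schedule: D waits for B and C, which both wait for A.
predecessors = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
end_time = {"A": 10, "B": 30, "C": 25, "D": 40}

path = critical_path("D", predecessors, end_time)
print(" -> ".join(path))
```

Comparing the end times along this path against a normal night's run would then point to the job where the delay first appeared.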
This paper will demonstrate how an activity-based costing (ABC) solution can be applied to cloud computing, delivering a unified approach to performance management and cost management of cloud-based applications and services. It will present a highly extensible metering runtime for modeling the execution of software in terms of metered activities, cost groups (centers), resources and meters. It will demonstrate how such models can serve in managing the performance, cost and capacity of applications and systems. Various metering strategies will be introduced that address runtime concerns.
In 1989, Stephen R. Covey penned “The Seven Habits of Highly Effective People,” designed primarily to help people become more effective in interpersonal relationships. Since performance and capacity planning is so much more than metrics and machines, this paper explores using those same seven habits in the arena of CP/SM job success. As Covey does in his work, each “habit” will be explored with stories from the field. Step up your game and move to successful interdependence.
9012 - Performance Engineering and Load Testing
While not a direct capacity or performance problem, the performance analyst is frequently engaged when systems provide poor end-user performance during a partial failure. This paper identifies the main drivers of application "brownout" and identifies multiple strategies to attempt to resolve those issues.
9011 - Performance Engineering and Load Testing
A case study on evaluating virtualization in a 1,000 system environment where the virtual machines will each use two full CPUs. This is not normally considered a virtualization workload, but was worthy of full evaluation from virtualization management capabilities and technology refresh needs.
The conventional wisdom is wrong: you don’t need production-equivalent hardware in your test environments. Many papers and presentations at CMG have repeated, as if it were common knowledge, that test environments have to be equivalent to production. This paper outlines where the fallacy is in that assumption and how you can achieve lower costs by carefully architecting your performance testing environments. Examples of significant testing result improvements are also included.
SOAP and the demands of a real time world are causing a significant growth in the implementation of transactional systems. Some of these systems are designed to handle both batch and transactional usage, but many are architected with only proper transactional usage in mind. When it comes to usage by external entities sometimes their design is abused by trying to hook a batch process to them. This paper will provide a base set of analysis techniques to detect when a transactional system is being used for batch processing.
This study looks at a variety of IT performance data (OS metrics, QoS metrics, etc.) to determine how well these metrics can be modeled as normally distributed. This study came about as a result of multiple IT management vendors providing metric base-lining (or dynamic thresholding) capability based on the assumption of normally distributed data. This study will show that IT data does not behave normally and in fact takes on an infinite number of distributions. Thus any base-lining techniques using normal distribution assumption as the basis of their analysis will be significantly inaccurate.
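A simple way to check a metric against the normality assumption is a moment-based test such as Jarque-Bera, which combines sample skewness and excess kurtosis. The sketch below is an illustration only and is not claimed to be the study's method; the sample data are made up, and the 5% critical value of 5.991 is the chi-square value for 2 degrees of freedom, which is only an approximation for small samples.

```python
CHI2_95_DF2 = 5.991  # 5% critical value of the chi-square distribution, 2 d.o.f.


def jarque_bera(data):
    """Return the Jarque-Bera statistic n/6 * (S^2 + K^2/4).

    S is the sample skewness and K the excess kurtosis, both computed from
    central moments. The data must not be constant (zero variance).
    """
    values = list(data)
    n = len(values)
    avg = sum(values) / n
    m2 = sum((v - avg) ** 2 for v in values) / n
    m3 = sum((v - avg) ** 3 for v in values) / n
    m4 = sum((v - avg) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)


# Made-up, heavily right-skewed sample: mostly low values with a few spikes,
# a shape often seen in utilization or response time data.
skewed = [1.0] * 90 + [100.0] * 10
jb_skewed = jarque_bera(skewed)
print("reject normality:", jb_skewed > CHI2_95_DF2)
```

A statistic above the critical value suggests that thresholds derived from a mean and standard deviation under a normal assumption may be misleading for that metric.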
9007 - Management and Reporting
Today’s Microsoft® Windows® environments are riddled with legacy applications requiring Graphical User Interface interaction. These processes cannot be automated due to specific Windows® control manipulation and pre-packaged GUI only functionality. This functionality prevents many applications from utilizing scripting tools, such as VBScript and Perl, to automate manual processes. This leaves the IT professional with a plethora of manual processes to maintain. This article will provide the reader with the tools and knowledge necessary to automate many of these manual scenarios using AutoIt.
9006 - Management and Reporting
Data graphics can inform, influence, educate, warn, reassure, irritate and sometimes get the “messenger killed” because of what and how they present the myriads of IT service management (ITSM) metrics and the statistics derived from them. The emphasis of this presentation is on examples of graphics presentation techniques that help illustrate the service level metrics collected by any of the major computing platforms while also ensuring that the correct message is delivered.
This session includes presentation of the essential z/OS V1.10 RMF reports for performance management and capacity planning. For maximum effectiveness on the job, attendees will learn (a) important considerations for parameters affecting the data collection, (b) the minimum set of reports required to support particular IT system management (ITSM) activities, (c) what are the important fields on the key reports, and (d) how to avoid some potential pitfalls. The emphasis will be on quick techniques that help “mine” the metrics collected by RMF.
Workload Manager (WLM) can intentionally or unintentionally deliver pain, manifested as severe performance degradation, or pleasure, manifested as better than expected performance. For optimum and cost-effective operation, attendees will learn how to ensure service levels for business-critical applications. This session will dig into (a) the WLM service policy options by which such pain and pleasure can be controlled, (b) how to tailor these options for the results one seeks, (c) actual examples that delivered unintended results, and (d) the recommended solutions for the discussed problems.
9003 - Management and Reporting
Should businesses considering computer systems acquisitions be excited about the IBM z10 announcements? Should performance analysts and capacity planners in particular be more or less excited by the new offerings? What are the minimum requirements for exploitation of z10 features? These will be the three key questions among the many answered by the presented independent analysis of IBM’s announcements. You will get your questions answered in a session where everyone is encouraged to bring problems so you can go home with answers.
Performance management controls of CICS Transaction Server (TS) greatly affect performance and the effective capacity of a processor complex. This presentation focuses on the best practices for CICS controls and the top z/OS factors which affect a CICS region’s overall performance and maximum practical capacity. Workload Manager (WLM) definitions that may help or hinder CICS will be also discussed in a session where everyone is encouraged to bring questions so you can go home with answers.
This session includes presentation of the essential CICS statistics for performance management and capacity planning activities. For maximum effectiveness on the job, attendees will learn (a) important considerations for parameters affecting the data collection, (b) the minimum set of reports required to support a particular activity, (c) what are the important fields on the key reports, and (d) how to avoid some potential pitfalls. Samples of the most useful reports will be presented. The emphasis will be on quick techniques that help us “mine” the mountain of information collected by CICS.