Continuing from my previous blog, 'Dealing with Performance Challenges of Load Burst Scenarios', in this post I would like to share a few thoughts on how to deal with the performance challenges of load burst situations.
Let's start by figuring out whether assessing performance for load bursts is any different from traditional performance testing.
How is Load Burst Assessment different?
Traditional performance testing for IT systems typically focuses on benchmarking performance for "known peaks" and, to some extent, seasonal demands (i.e., 1x or 2x of peak loads). Performance assessment for load bursts, however, requires validation at very high concurrency and from various geo-locations. Hence the strategy to be followed is a bit different, and it requires special focus on the following aspects:
As load bursts typically represent loads in the order of hundreds of thousands of concurrent users (such as 100K or 200K), one of the key requirements is testing infrastructure that can simulate such high concurrent load, which is two-fold:
- A load testing tool licensed for a very high number of VUsers (in the order of hundreds of thousands)
- Load generators with adequate hardware and scale to simulate such high concurrency
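As a back-of-the-envelope illustration of the second point, the number of load generators needed is simply the target VUser count divided by what one generator can drive. The per-generator figure below (~2,000 VUsers) is an assumption for illustration; the real number depends on the tool, protocol, script weight, and machine size.

```python
# Rough sizing sketch for load-generator capacity (illustrative numbers,
# not vendor guidance): how many generator machines a burst test needs.

def generators_needed(target_vusers: int, vusers_per_generator: int) -> int:
    """Ceiling division: each generator hosts a bounded number of VUsers."""
    return -(-target_vusers // vusers_per_generator)

# Assume each cloud generator comfortably drives ~2,000 protocol-level VUsers.
print(generators_needed(100_000, 2_000))  # 50 generators for a 100K burst
print(generators_needed(200_000, 2_000))  # 100 generators for a 200K burst
```

Procuring and maintaining 50-100 machines on-premise just for occasional burst tests is exactly why cloud provisioning, discussed next, becomes attractive.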
Since simulating burst loads is not a regular or frequent scenario (unlike the standard peak load), it is not economical to acquire a load testing tool with a perpetual license for a very high number of VUsers. Hence, it is recommended to choose a cloud-based load testing tool that can:
- Provision load generators from the cloud on the fly without any prior setup, which lets IT teams focus primarily on the load testing scenarios rather than on setting up the testing infrastructure
- Support 'VUser-Hour' (VUH) licensing, a pay-per-use pricing model that lets you buy a license for a specific VUser count and a limited duration (and is hence cost-effective)
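The economics of VUH licensing are easy to sketch. The per-VUH price below is a hypothetical placeholder, not any vendor's actual rate, but the structure of the calculation holds regardless:

```python
# Illustrative cost model for pay-per-use VUser-Hour (VUH) licensing.
# The price per VUH is a hypothetical placeholder for the sake of the example.

def vuh_cost(vusers: int, hours: float, price_per_vuh: float) -> float:
    """Total cost = VUsers simulated x hours of testing x price per VUH."""
    return vusers * hours * price_per_vuh

# One 2-hour burst test at 100K VUsers, at an assumed $0.01 per VUH:
burst_cost = vuh_cost(vusers=100_000, hours=2, price_per_vuh=0.01)
print(f"Cost of one 2-hour 100K burst test: ${burst_cost:,.0f}")  # $2,000
```

If burst scenarios are tested only a handful of times a year, a few such pay-per-use runs will typically cost far less than a perpetual license sized for 100K VUsers.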
Besides, since the end users are global in nature, it is essential to simulate load from various geo-locations, which cloud-based load generators make possible and which reflects the real-world scenario. It is equally essential to assess performance across different kinds of end-user connectivity. This should be done using network virtualization, where network characteristics such as connection type (Wi-Fi, 2G, 3G, 4G, LAN, etc.), bandwidth, and packet loss are configured to match the given application landscape.
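To make the idea concrete, a test scenario can pair geo-distributed VUser groups with network profiles. The profile values and region names below are illustrative assumptions, not a specific tool's API:

```python
# Sketch of network-virtualization profiles applied to geo-distributed
# VUser groups. All field names and values are illustrative assumptions.

NETWORK_PROFILES = {
    "LAN":   {"bandwidth_kbps": 100_000, "latency_ms": 1,   "packet_loss_pct": 0.0},
    "Wi-Fi": {"bandwidth_kbps": 20_000,  "latency_ms": 10,  "packet_loss_pct": 0.5},
    "4G":    {"bandwidth_kbps": 10_000,  "latency_ms": 50,  "packet_loss_pct": 1.0},
    "3G":    {"bandwidth_kbps": 2_000,   "latency_ms": 120, "packet_loss_pct": 2.0},
    "2G":    {"bandwidth_kbps": 100,     "latency_ms": 400, "packet_loss_pct": 3.0},
}

# A mix of regions and profiles so the test reflects real end-user
# connectivity rather than data-center-to-data-center speeds.
scenario = [
    {"region": "us-east",  "vusers": 40_000, "profile": "Wi-Fi"},
    {"region": "eu-west",  "vusers": 35_000, "profile": "4G"},
    {"region": "ap-south", "vusers": 25_000, "profile": "3G"},
]

total = sum(group["vusers"] for group in scenario)
print(f"Total VUsers across regions: {total:,}")  # 100,000
```

The key design point is that response-time SLAs should be evaluated per profile: a page that is fast over LAN may be unusable over 3G, and only this kind of scenario exposes that.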
While simulating the load burst is one aspect, the ability to monitor the IT system across its technology stack and components, with deep-dive analytics, is another key capability. It helps identify performance hotspots, whether in code, configuration, or hardware resource usage, and eventually arrive at optimal tuning solutions. This can be done with Application Performance Management (APM) tools such as Dynatrace, HP Diagnostics, Wily Introscope, or AppDynamics, to name a few, which provide visibility into code, hardware resource utilization, and application runtime resource usage such as heap usage, garbage collection, thread usage, sessions, database connection pools, JCA connection pools, etc. This gives Dev and Support teams quick insight into the application infrastructure and the layer (i.e., web server, application server, database server, middleware, backend, third-party components, etc.) where corrective action needs to be taken, well in advance, to handle the load burst.
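At its simplest, hotspot identification is threshold-based flagging over the metrics an APM tool collects. The metric names and threshold values below are illustrative assumptions, not any APM product's actual schema:

```python
# Minimal sketch of threshold-based hotspot flagging over metrics that an
# APM tool might collect. Metric names and thresholds are assumptions.

THRESHOLDS = {
    "cpu_pct": 85.0,           # sustained CPU utilization
    "heap_used_pct": 90.0,     # JVM heap occupancy after GC
    "gc_pause_ms": 500.0,      # worst-case garbage-collection pause
    "db_pool_used_pct": 95.0,  # database connection-pool saturation
}

def flag_hotspots(sample: dict[str, float]) -> list[str]:
    """Return the metric names that breach their threshold in this sample."""
    return [m for m, v in sample.items() if v > THRESHOLDS.get(m, float("inf"))]

# A hypothetical app-server sample captured during the burst test:
app_server_sample = {"cpu_pct": 97.0, "heap_used_pct": 62.0,
                     "gc_pause_ms": 120.0, "db_pool_used_pct": 98.0}
print(flag_hotspots(app_server_sample))  # ['cpu_pct', 'db_pool_used_pct']
```

Real APM tools go much further (call-level tracing, baselining, anomaly detection), but even this simple view shows how flagged metrics point the team at the layer needing corrective action.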
For instance, consider a 3-tier web-based enterprise retail shopping application, accessed by end users through the Internet, that is to be tested for a load of 100K transactions in 1 hour to validate the performance anticipated during the holiday season. A load test using a cloud-based performance testing tool indicated very high response times when the load was increased to 100K from the known peak of 5,000 transactions/hour. Deep-dive analysis using one of the APM tools during the test indicated that the current 2 instances of the web server and 4 instances of the application server cannot handle this burst load, as they hit a CPU bottleneck. A few code-level hotspots were also identified using call-profile analysis. For this scenario, the suggestion is to tune the application code for the identified method calls, and then dynamically create additional VM instances for the web and application servers using virtualization.
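A rough capacity extrapolation for the retail example above might look as follows. The measured CPU figure and the linear-scaling assumption are both hypothetical simplifications (real capacity planning must account for tuning gains and non-linear effects), but they show the shape of the calculation:

```python
# Back-of-the-envelope instance sizing for the retail example, assuming
# roughly linear CPU scaling with load. All measured values are hypothetical.

import math

def instances_for_burst(current_instances: int, cpu_at_peak_pct: float,
                        peak_load: int, burst_load: int,
                        target_cpu_pct: float = 70.0) -> int:
    """Scale instance count so per-instance CPU stays under target_cpu_pct."""
    cpu_per_instance_at_burst = cpu_at_peak_pct * (burst_load / peak_load)
    return math.ceil(current_instances * cpu_per_instance_at_burst / target_cpu_pct)

# Hypothetical measurement: 4 app-server instances run at 20% CPU
# while serving the known peak of 5,000 transactions/hour.
print(instances_for_burst(4, 20.0, 5_000, 100_000))  # 23
```

In practice the code-level tuning mentioned above would be done first, the test re-run, and the scaling estimate recomputed from the improved CPU profile.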
Having said that, let's take a look at a summary of the benefits of implementing the above approach.
- Proactive load burst assessment eliminates surprises in production environments and increases predictability with respect to application performance, scalability, and reliability
- It gives insights into the weak links of the application that need immediate attention, thereby reducing time-to-market significantly
- It helps reduce cost through proactive hotspot identification, which would otherwise be costly if done in production
- The pay-per-use licensing model helps organizations lower their Total Cost of Ownership (TCO), especially with respect to testing tool licenses for very large loads
- Deep-dive analysis cutting across the technology stack helps uncover performance hotspots quickly and effectively, which in turn reduces cost and time-to-market
- Organizations can be more agile in meeting the needs of their business and end-user communities
Load bursts are becoming the norm for IT systems in domains such as Retail, Financial Services, and Insurance due to an ever-increasing user base and the globalization of business. It is essential to understand the performance and scalability challenges that load bursts pose for IT systems and to be prepared to address them proactively. After all, for business and IT teams, proactive information with actionable insights is the key to managing application performance effectively without losing out to the competition.
About the Author
Madhu Tanikella is a Senior Technology Architect at Infosys, Hyderabad. He has 15+ years of experience in providing delivery and consulting services in Performance Engineering & Performance Tuning for enterprise software systems that fall under client-server, multi-tier (Java/J2EE), SOA, Messaging Middleware architectures, Virtualization and Cloud models and custom-built applications with multiple COTS products. He has specialized expertise in the areas of NFR & SLA Validation, Workload Modeling, Performance Modeling, Bottleneck Identification and Performance Tuning (of Java/J2EE, Oracle DB, TIBCO Business Works and IBM MQ), Capacity Planning, Hardware Assessment and Sizing for distributed software systems. He also has expertise in assessing performance practices in an organization and setting up Performance Engineering CoEs for various clients in different Domains such as Financial Services, Manufacturing, Insurance, Healthcare, Retail etc. Madhu Tanikella is certified in IBM Cloud Computing Architecture and IBM SOA Solution Designer. He can be reached at [email protected]