Are Network and Application Performance Related?
Where to focus during Early Development for Network Performance Optimization
By Madhu Tanikella
The performance of any enterprise software system depends on multiple layers: Hardware, OS, Software (Web, App, DB and Middleware), Application (Code and Configuration) and Network. To make an IT system high-performing and scalable, it is essential to focus on all these layers proactively, from the Requirements phase through production rollout. In particular, ensuring that a given IT system is performance-optimized with respect to the Network Layer requires proactive attention to Application Design, Architecture, UI Requirements and Infrastructure Planning by Architects, QA Teams and Managers.
In this article, I would like to share a few insights about the Network Layer, its characteristics, and what needs to be thought out proactively during the Requirements and Design phases (where Application Performance Strategy and Planning activities take place). Doing so minimizes surprises from application performance issues that appear to arise from the Network Layer in Production environments.
‘Network Bandwidth’ is one of the most common terms in networking and is defined as the amount of data that passes through a network connection over time, measured in bits per second (bps). Each network link is rated with a certain capacity, which represents the maximum throughput that link can support. Based on capacity, different network types exist, such as Dial-up, Cellular, DSL and Broadband. However, due to many physical limitations, the actual throughput of a network link is generally ~20% less than the rated capacity, and it can vary.
Another important characteristic of a network is ‘Network Latency’, which typically represents the elapsed time to transmit a specific amount of data from one end of a network link to the other.
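As a back-of-the-envelope illustration of these two characteristics, transfer time can be estimated from the effective (rather than rated) bandwidth. A minimal sketch, assuming the ~20% overhead figure mentioned above and purely illustrative link speeds:

```python
def transfer_time_seconds(payload_bytes, rated_bps, overhead_fraction=0.20):
    """Estimate transfer time for a payload, discounting the rated link
    speed by an assumed fractional overhead (physical limitations,
    framing, retransmissions, etc.)."""
    effective_bps = rated_bps * (1 - overhead_fraction)
    return (payload_bytes * 8) / effective_bps

# A 1 MB payload over a 100 Mbps LAN vs. a 2 Mbps cellular link:
lan = transfer_time_seconds(1_000_000, 100_000_000)   # ~0.1 s
cell = transfer_time_seconds(1_000_000, 2_000_000)    # ~5 s
```

The same payload that is nearly instantaneous on a LAN takes seconds on a constrained link, which is why both bandwidth and latency must be considered per link, not just per application.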
With this context, I would like to list a few insights and solution approaches that should be kept in mind during early development (for applications built from scratch) or while strategizing Performance Engineering plans for applications already running in production, to ensure minimal performance issues due to the Network layer.
- Capture and understand the different types of Network Links that exist between the server components in a given IT system. For instance, in a payroll processing system that I worked with, a 100 Mbps link connects the Web, Middle and Database Tiers, and a 1 Gbps link connects the Middle Tier and the Distributed Caching Tier, primarily to avoid any sub-second delay from the Caching layer.
- Understand the Physical Location of Servers (across Tiers) to assess Network Latency, which can play a significant role in overall application performance. In a few scenarios, the Application Server tier interacts with WebServices hosted by a 3rd party vendor over a WAN, and the latency varies with the amount of data transferred in the WS calls.
- Be aware of the specific details of network elements such as Routers and Switches, and keep track of them in the physical architecture documents for QA and PROD environments, so that it’s easy to isolate any latency issues that arise over time.
Simple but effective tools that can be used in any IT infrastructure are ping and traceroute – they report network latency and the number of hops taken to transmit data over a given network link. In my experience, the component or layer causing latency can often be identified with just these two simple tools.
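When ICMP is blocked or root access is unavailable, a similar latency probe can be scripted. The sketch below times a TCP connection handshake as a rough stand-in for ping; the host and port in the usage comment are placeholders:

```python
import socket
import time

def tcp_rtt_ms(host, port, timeout=2.0):
    """Rough round-trip latency in milliseconds: time a TCP handshake.
    ping uses ICMP (which may need raw sockets/root); timing a TCP
    connect is a reasonable stand-in when an application port is open."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; we only care about elapsed time
    return (time.perf_counter() - start) * 1000.0

# Example (hypothetical host): tcp_rtt_ms("portal.example.internal", 8080)
```

Comparing such measurements hop by hop (or with and without an intermediate device, as in the case below) narrows the latency down to a specific network element.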
I would like to share my experience with a complex Payroll Processing System developed and hosted by one of the biggest Human Capital Management firms in the USA, where end-user response time deteriorated suddenly in one of the Test Environments, with high response times observed in the Application Server’s WS calls. This happened when a release was pushed to the Test Environment to certify its performance prior to deployment in Production.
This was a multi-tiered, distributed enterprise application with Web, Application, Portal, Database and Distributed Caching Servers connected over a 100 Mbps LAN. WebService (WS) calls take place between the Application Server and the Portal Server, and their response time went as high as 40 sec, where it used to take < 1 sec prior to the new release. To determine whether the latency came from the network layer, the Application Server or the Portal Server, ping was run from the App Server to the Portal Server under two scenarios (scenario 1: through the Router; scenario 2: bypassing the Router). We found that the Router sitting between the Application Server and the Portal Server was causing the latency because multiple requests were flooding it. When we bypassed the Router (and physically reset it), WS response time came down to < 1 sec, confirming the issue was at the Router layer. The complete exercise took 2-3 hours. Further analysis of the request flooding revealed that the retry logic in the Application Server code was not limited to a MAX number of attempts, which caused the request flood on the Router.
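The flood described above can be prevented with a simple guard on the retry loop. A minimal sketch (the operation being retried and the backoff parameters are hypothetical, not the actual application code):

```python
import time

def call_with_retry(operation, max_attempts=3, base_delay=0.5):
    """Invoke `operation`, retrying on failure with exponential backoff.
    Capping the attempt count (and spacing attempts out) prevents a
    failing dependency from turning the retry loop into a flood of
    requests on the network path."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # give up: surface the error instead of retrying forever
            time.sleep(base_delay * (2 ** (attempt - 1)))
```

With an unbounded loop, every slow response breeds more traffic, which makes responses slower still; a MAX-attempt cap breaks that feedback cycle.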
- Analyze end-user types and their modes of application access over different network connections, which can have a definite impact on end-user response time.
For instance, in the Banking and Insurance domains, Field Officers generally use PDAs/Smart Phones for registration/quote-creation activities in the field over 2G/3G networks. For such cases, plan performance benchmarking across different network bandwidths to understand end-user response behavior, in addition to standard Load/Stress/Volume tests.
Alternatively, plan to create a mobile application that is lightweight compared to the Web version and host it on a different server. By doing so, Mobile Users are isolated from Web Users, and the data transferred over the network can be minimized to provide faster end-user response.
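A quick way to scope such a benchmarking matrix before running real tests is to estimate transfer times for representative pages across link speeds. The page sizes and link speeds below are hypothetical:

```python
# Hypothetical page weights and link speeds used to scope a
# bandwidth-benchmarking matrix before running actual load tests.
PAGE_KB = {"quote_form": 350, "dashboard": 1200}
LINKS_KBPS = {"2G": 100, "3G": 2000, "LAN": 100_000}

def estimate_seconds(page_kb, link_kbps):
    """Ideal transfer time: kilobytes * 8 bits/byte / kilobits-per-second."""
    return (page_kb * 8) / link_kbps

for page, kb in PAGE_KB.items():
    for link, kbps in LINKS_KBPS.items():
        print(f"{page} over {link}: ~{estimate_seconds(kb, kbps):.1f}s")
```

Even this idealized arithmetic (real links add latency and overhead on top) flags which page/network combinations deserve dedicated benchmark runs.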
- Analyze and review UI Requirements (during the Requirements & Design phases), primarily from the “web page size” aspect, so that the application can be optimized early on. The bulkier the page, the more data must be transferred over a given network link, which leads to sluggish end-user response times. Tools such as HTTPWatch can be used to analyze the number of hits to the server (for different resources such as .css, .jpg, .js) and their respective sizes.
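A minimal sketch of such a page-weight analysis, assuming (URL, bytes) pairs as a tool like HTTPWatch or a HAR export might list them; the entries below are made up:

```python
from collections import defaultdict

# Hypothetical resource list for one page load: (url, transferred bytes).
resources = [
    ("/app/main.css", 45_000),
    ("/app/vendor.js", 310_000),
    ("/img/hero.jpg", 520_000),
    ("/app/app.js", 180_000),
]

def page_weight_by_type(entries):
    """Group resource sizes by file extension to spot the heaviest
    contributors to total page weight."""
    totals = defaultdict(int)
    for url, size in entries:
        ext = url.rsplit(".", 1)[-1]
        totals[ext] += size
    return dict(totals)

# page_weight_by_type(resources) -> {'css': 45000, 'js': 490000, 'jpg': 520000}
```

Tallies like this make it obvious whether images, scripts, or stylesheets dominate the page, and therefore where to optimize first.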
- Analyze and review the application for design aspects such as network chattiness, which has a significant impact on application performance. Patterns such as many fine-grained Web Service calls, or remote EJB calls within the LAN or to external (3rd party) systems, can impose a heavy penalty on the network, especially when concurrent user load or transaction load suddenly increases (typically under bursty loads).
For instance, an Auto Insurance Dealer Application used by one of the biggest Auto Manufacturing Companies in the USA to generate Insurance Quotes for different vehicle types, customers and countries had a business flow that invoked a Web Service (WS) from the Application Server to the Rules Engine for every dropdown selection. There are around 10-12 such selections on a single UI page, and for every WS call an XML with 10,000 elements was sent over the LAN, whereas the Rules Engine actually required only 10 elements. This caused unnecessary network latency and impacted server-side response time for that page. By changing the application design to use XML templates, performance of the webpage improved by almost 50%. This illustrates how a bad application design can significantly impact the network layer.
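One way to cut such a payload down, sketched here with Python’s standard XML library. The element names are hypothetical, and the fix in the case above used XML templates, so this is an illustrative alternative rather than the actual implementation:

```python
import xml.etree.ElementTree as ET

# Hypothetical set of elements the rules engine actually consumes.
REQUIRED = {"vehicleType", "country", "customerSegment"}

def trim_request(xml_text, required=REQUIRED):
    """Rebuild the request keeping only the elements the downstream
    service needs, instead of shipping the full document over the wire."""
    root = ET.fromstring(xml_text)
    slim = ET.Element(root.tag)
    for child in root:
        if child.tag in required:
            slim.append(child)
    return ET.tostring(slim, encoding="unicode")

full = ("<quote><vehicleType>SUV</vehicleType>"
        "<color>red</color><country>US</country></quote>")
# trim_request(full) -> '<quote><vehicleType>SUV</vehicleType><country>US</country></quote>'
```

Whether done by trimming, templating, or defining a leaner service contract, the principle is the same: send only what the receiving component needs.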
- Enable HTTP Compression on Web Servers so that web clients receive data in compressed format, which utilizes the available bandwidth effectively and reduces network latency.
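The payoff is easy to demonstrate with gzip, the scheme most HTTP servers use for compression; the sample payload below is synthetic but representative of repetitive HTML:

```python
import gzip

# Repetitive text (HTML tables, CSS, JS) compresses dramatically,
# so far fewer bytes cross the network link when HTTP compression is on.
page = ("<tr><td>employee</td><td>salary</td></tr>\n" * 500).encode()
compressed = gzip.compress(page)
ratio = len(compressed) / len(page)
print(f"{len(page)} bytes -> {len(compressed)} bytes ({ratio:.0%})")
```

The compression itself costs some CPU on the server and client, but for text-heavy responses over constrained links the bandwidth savings normally dominate.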
- Understand whether the Network Link is configured as Simplex, Half Duplex or Full Duplex – this configuration constrains the actual bandwidth that is available compared to the Rated/Theoretical capacity.
o Simplex mode indicates that data can be transmitted in only one direction.
o Half-Duplex indicates data can be transmitted in both directions, but in only one direction at any given time.
o Full-Duplex indicates data can be transmitted in both directions at the same time.
The Network layer and Application characteristics have a strong correlation; to minimize performance issues arising from the Network layer in a PROD environment, where peak and bursty workloads are typically expected, proactive focus on application design, infrastructure, and UI requirements is essential.