Dr. Terry Critchley
The theory of queuing, as covered in many articles and books, assumes particular types of distributions for arrivals at a server in order to calculate the parameters of interest in service performance analysis. Get the distribution wrong and any calculation based on it is fairly useless. That said, it is sometimes possible to shape the input distribution if you have control over the input, for example, how many people can enter data over what period.
In other cases, for example a public web site, no such regulating mechanism is readily available. What follows is my view, based on experience of online work, of how applicable standard arrival queuing theory is to it.
- normal data entry and business as usual (BAU) OLTP (online transaction processing) work. This is dictated by the number of people involved and the availability of work to be processed. My feeling is that the arrival rate will be constant, or constant in phases: like a bar chart with zero-activity gaps between the bars rather than a continuous distribution.
- web-based work with a large enough population of users will come closest to the standard queuing arrival distributions, mainly because of the size of that population and the randomness of its requests.
- standard queries and ad hoc queries might be random, but probably come from a relatively small population of users.
Whatever the arrival pattern, once the first server has processed work arriving with profile f(t), the outgoing profile presented to the next server will not necessarily be the same f(t). An example is a server processing transactions and passing requests on to a disk, then passing the resulting output to a user or another system via communications, and so on through the system.
I have called this phenomenon dispersion, for want of a better term. I have never seen it discussed anywhere in the vast literature on queuing theory, although I have seen it hinted at in the analysis of road traffic queues. For example, the approach pattern of traffic at a roundabout is not the same as the exit pattern at any exit point from that roundabout, even if there were only one exit, that is, straight ahead.
How this can be treated rigorously is anyone's guess, and so arrival distributions must be assumed for each server - the first server in the chain and each succeeding one. The lesson, I feel, is not to try to model several resources in the chain as a single exercise.
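The dispersion effect is easy to demonstrate in a small simulation. The sketch below (parameters are my own illustrative choices) feeds a single FCFS server with Poisson arrivals and a fixed service time, then compares the variability of the inter-arrival times with that of the inter-departure times. If the departure stream kept the arrival profile, the two coefficients of variation would match; they do not.

```python
import random

# Sketch, assumptions mine: one FCFS server, Poisson arrivals,
# constant service time, utilisation 0.8.
random.seed(42)

ARRIVAL_RATE = 1.0   # mean arrivals per time unit
SERVICE_TIME = 0.8   # constant service time => utilisation 0.8
N = 50_000

def cv(xs):
    """Coefficient of variation: standard deviation / mean."""
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / len(xs)
    return var ** 0.5 / m

arrivals, departures = [], []
t = 0.0
server_free = 0.0
for _ in range(N):
    t += random.expovariate(ARRIVAL_RATE)   # Poisson arrival stream
    arrivals.append(t)
    start = max(t, server_free)             # wait if the server is busy
    server_free = start + SERVICE_TIME
    departures.append(server_free)

inter_arr = [b - a for a, b in zip(arrivals, arrivals[1:])]
inter_dep = [b - a for a, b in zip(departures, departures[1:])]

print(f"CV of inter-arrival times:   {cv(inter_arr):.2f}")  # ~1.0 (exponential)
print(f"CV of inter-departure times: {cv(inter_dep):.2f}")  # noticeably lower
```

The departures are more regular than the arrivals because, during busy periods, work leaves at fixed service-time intervals; the next server downstream therefore sees a different profile from f(t).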
Sample OLTP Scenario
Assume we have an organization with 200 users accessing the system for transaction-based work and internet work. Let us examine just the transaction work as an example, to see where input distributions and service time distributions lead us.
Firstly, the pattern of access will be dictated by the working day and by the availability of data, queries and so on that require system access. This pattern may or may not conform to any distribution; in my experience it will be either a steady stream or very little activity, depending on what needs to be done. My feeling is that such an environment generates a bar-chart sort of distribution rather than the neat mathematical functions outlined above.
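The bar-chart idea can be sketched as follows. The phase boundaries and rates below are invented for illustration: activity comes in steady bursts separated by zero-activity gaps, so the daily profile is a set of bars, not one continuous distribution.

```python
import random

# Sketch with invented rates: arrivals per hour across a working day,
# in phases separated by zero-activity gaps.
random.seed(7)

# (start_hour, end_hour, mean arrivals per hour) for each phase
PHASES = [(9, 11, 120), (11, 12, 0), (12, 14, 60), (14, 15, 0), (15, 17, 150)]

counts = {}
for start, end, rate in PHASES:
    for hour in range(start, end):
        # Within a phase the arrivals may well be random (Poisson-like):
        # here a simple binomial draw with the phase's mean rate.
        counts[hour] = sum(1 for _ in range(rate * 2) if random.random() < 0.5)

# Across the day, though, the profile is bars with gaps between them.
for hour in sorted(counts):
    print(f"{hour:02d}:00  {'#' * (counts[hour] // 10)}  ({counts[hour]})")
```

No single textbook arrival distribution describes the whole day; at best one applies within a phase.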
Secondly, look at the journey of a transaction (Xn) to and from the system. A typical query might have an input of 100 or so characters and an output of many hundreds; data entry might have hundreds of characters in and fewer out to the submitter. If all the messages were the same length, theory would give a constant service time. I do not believe this holds for transactions (Xns) as the arriving 'customers'.
An Xn is driven by program invocation: the user input tells it how to proceed, and then the Xn executes. The service time for the input is likely to be fixed (fixed length), but the service time for the ensuing Xn might not be. For example, one payroll transaction (Xn1) may arrive with the flag 'No tax due' on its input while a second carries 'Tax due', meaning extra processing (and probably extra I/O and network traffic too) to handle the tax part of its work.
The input length may be fixed, but the service time will not be constant in this case. In fact, depending on how many paths and related actions there are in the Xn, the service time could vary quite markedly and will almost certainly not follow a standard mathematical distribution.
There may also be a mix of transactions, Xn-a, Xn-b, ..., Xn-n, each with its own variations in path length and hence service time, to confuse the issue further.
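A small sketch makes the point. The transaction names, branch probabilities and timings below are all invented: a mix of transaction types, each with branch-dependent path lengths, yields a lumpy, multi-modal service-time distribution rather than the smooth exponential decay standard queuing models assume.

```python
import random

# Illustrative sketch - transaction names and timings are hypothetical.
random.seed(1)

def service_time():
    kind = random.choice(["Xn-a", "Xn-b", "Xn-c"])
    if kind == "Xn-a":                    # e.g. payroll transaction
        base = 10.0                       # ms, fixed main path
        if random.random() < 0.3:         # 'Tax due' branch: extra work
            base += 25.0
        return base
    if kind == "Xn-b":                    # simple lookup, near-constant
        return 4.0 + random.uniform(-0.5, 0.5)
    return 40.0 + random.uniform(0, 10)   # Xn-c: heavy update path

samples = [service_time() for _ in range(20_000)]
mean = sum(samples) / len(samples)

# Crude histogram: counts per 10 ms bucket show distinct clusters.
buckets = {}
for s in samples:
    bucket = int(s // 10) * 10
    buckets[bucket] = buckets.get(bucket, 0) + 1
for lo in sorted(buckets):
    print(f"{lo:3d}-{lo + 10:3d} ms: {buckets[lo]:6d}")
print(f"mean service time: {mean:.1f} ms")
```

The histogram has separate clusters around each path length; no single exponential (or other textbook) service-time distribution fits it.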
I could be miles off target in my assertions here, but I am willing to be convinced by reasoned argument of the error of my ways. This is illustrated schematically in Figure 2.
There will, in general, be a pattern of work from the people entering these transactions which is often flat across the whole working day or, in some circumstances, a series of flat peaks throughout it.
The view expressed above is formed by the difference between formal mathematics and the actual behavior of IT equipment. The overarching concern is not to take the theory too far, since it mainly rests on the following assumptions:
- the arrivals fit a distribution pattern - Poisson, exponential etc., which may not be true
- the service time also follows a mathematical pattern, which may not be true
- a fixed dispatching mechanism is used - first come, first served (FCFS) etc.
- the queue length can be unlimited
- the incoming population is assumed either infinite or of a known finite size
- even simple transactions/web requests travel through several resources where queues may form and the arrival patterns may change from one resource to the next. For example, if data arrives at a network for transmission with arrival pattern f(t), there is no guarantee it will arrive at the next resource after transmission with the same pattern.
The combination of these choices is analogous to taking six or so different medications each day without anyone really knowing how they combine in effect (and in reality, nobody does). The variances from the stated assumptions may (or may not) assume great significance in the final outcome of the calculations if their effects are amplified by these differences.
None of this is meant to decry tools and simulations, but their limitations should be recognized. Web site work may well follow queuing theory to some extent, but transaction/query work may not, resembling the distributions in Figure 1 above rather than Poisson and the rest. The potential variation in service time for a single transaction muddies the water still further.
The safest way would be to seek some consensus, where feasible, from:
- similar situations and configurations ('our system looks just like that one, whose behavior is understood')
- rules of thumb (ROTs), guidelines developed from experience and consensus
- simple calculations (in the appropriate circumstances)
- modeling and simulations (as above)
- performance testing tools
- standard benchmarks of the appropriate type, that is, ones which resemble the workload you are trying to size and understand
- personalized benchmarks which match your requirements closely.
Consensus means looking at different approaches (opinions in real life) and seeing where optimum agreement lies.
High Performance IT Services
High Availability IT Services
The reference link below has a table which tries to show the applicability of various theoretical and simulation approaches to IT issues, and it implies that there are several holes in a wholly theoretical approach.
Traffic Behavior and Queuing in a QoS Environment
About the Author
Dr. Terry Critchley is a retired IT consultant living near Manchester in the UK. He joined IBM as a Systems Engineer after gaining a PhD and spent 24 years there in a variety of accounts and specialisations, before joining Oracle for 3 years. He joined his last company, Sun Microsystems, in 1996, left in 2002, and then spent a year at a major UK bank. He is the author of numerous IT articles and the following books: