With the rapid growth of cloud technologies, the Internet of Things, and modern concepts like bring your own device, the role and purpose of the service desk have evolved. In previous generations, many organizations considered mature IT service management tools and processes ‘nice to haves’, often setting them aside in favor of higher-profile and more visible initiatives and priorities. However, in the modern IT environment, for an organization to truly achieve agile IT operations and to adequately support new and emerging technology requirements, mature service management must become an organizational imperative. I’ve taken over 200 clients in both commercial and government organizations on this journey, and there are fundamental truths that can dramatically impact risk, user experience, and functionality when upgrading a service desk. To help others expand their skills and navigate this effort successfully, I organized my recommendations around three key domains:
From my experience, these best practices will help you plan for crucial needs and ensure that the product serves the demands of the enterprise. By planning for these big rocks, your upgrade program will be more likely to stay on target and achieve the results necessary to advance your service desk to best-in-class.
The process for conducting requirements analysis is very similar to that of building a home. If you begin building without a solid design, or without knowing the buyer’s desires, the local zoning laws, or the electrical and plumbing codes, the finished product is not likely to be well received or effectively executed. When designing a service desk upgrade, it is important to understand and clarify the requirements so that you can deploy the right system for your company.
So, which requirements do I recommend examining? I believe it is critical to understand requirements related to Performance, Availability, Scalability, Security, Environments, and Functionality (PASS-EF).
Performance – Yes, Performance, Availability, and Scalability overlap. Performance is a requirement that needs to be regularly monitored because of its impact on overall service desk effectiveness. Failure to meet the performance expectations of end-users can have a significant negative impact on adoption. Every company has had its share of performance issues, and even a well-known, highly mature online retailer had availability and performance issues this past July (2018) during a summer retail holiday. Thus, it is very important to understand what the performance expectations are for your service desk and to outline those requirements in the design process. An ITSM system needs to respond quickly to requests, and this can be achieved by correctly sizing all resources, servers, and databases. Ideally, additional loads should be modeled to ensure performance goals are achieved. Additionally, once the upgraded service desk is launched, I suggest analyzing daily, weekly, quarterly, and seasonal performance reports and mitigating any bottlenecks they reveal. This also helps to show the relationship between performance, availability, and scalability.
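To make a performance expectation concrete, it helps to state it as a number that a report can check. Here is a minimal sketch, assuming response times are collected in milliseconds and the target is expressed as a 95th percentile. The function names, the sample values, and the 800 ms target are all hypothetical and are not taken from any particular ITSM product.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]

def meets_target(samples, target_ms, pct=95):
    """True when the chosen percentile of response times is within the target."""
    return percentile(samples, pct) <= target_ms

# Hypothetical response times (ms) for one reporting period.
daily_samples = [120, 135, 140, 150, 155, 160, 170, 180, 190, 200,
                 210, 220, 240, 260, 300, 350, 400, 600, 900, 2500]

p95 = percentile(daily_samples, 95)
within_target = meets_target(daily_samples, target_ms=800)
print(f"p95 = {p95} ms, within target: {within_target}")
```

Running the same check on daily, weekly, and seasonal data makes it easier to see whether a bottleneck is constant or tied to peak periods.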
Availability – Most environments are available 24×7, even if the support staff isn’t. Keeping the ITSM environment accessible around the clock creates efficiencies for the company. So, how do you make that possible? The addition of self-service capabilities makes 24×7 availability more cost-effective because you can forgo 24×7 staffing. Tasks like resetting passwords or searching for knowledge articles can all be done without human intervention.
When designing for high availability, it is important to eliminate all single points of failure. As an example, a fault-tolerant design would include load balancers, dual routing paths, clustered databases, and redundant single sign-on components. In all cases, having a Disaster Recovery Site or a Continuity of Operations (COOP) Site is recommended.
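The effect of removing single points of failure can be illustrated with some simple arithmetic. The sketch below assumes that component failures are independent and that each component is up 99% of the time; both are simplifying assumptions for illustration, and real figures should come from your own monitoring data.

```python
from math import prod

def serial_availability(availabilities):
    """Availability of a chain in which every component must be up."""
    return prod(availabilities)

def redundant_availability(availability, copies):
    """Availability of a group of identical, independent copies where one survivor is enough."""
    return 1 - (1 - availability) ** copies

# Hypothetical chain: load balancer, single sign-on, database, each 99% available.
single_path = serial_availability([0.99, 0.99, 0.99])

# Same chain with every tier duplicated.
duplicated_tier = redundant_availability(0.99, copies=2)
redundant_path = serial_availability([duplicated_tier] * 3)

print(f"single path:    {single_path:.4%}")
print(f"redundant path: {redundant_path:.4%}")
```

Under these assumptions, three unduplicated tiers in a row are less available than any one of them alone, while duplicating each tier pushes the overall figure well above that of a single component.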
Scalability – It is critical to ensure that the ITSM environment can handle a broad range of activity, from resource lows to peak demand. Every company, and sometimes different divisions of a company, will have different scalability needs. Regardless, the plan for scalability needs to ensure that all end users experience optimal and consistent response times no matter when they are interacting with the service desk. Virtualization technology can allow dynamic resource allocation to relieve server bottlenecks and should be part of the scalability considerations. In addition to the current load, any growth should be factored in. For instance, if additional departments are going to be on-boarded, the ITSM system must be able to expand horizontally to accommodate the increased load. Given that, expansion plans must be included as a scalability requirement when designing a service desk upgrade.
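A rough sizing calculation can turn an expansion plan into a scalability requirement. The following is only a sketch: the function name, the per-server capacity, the growth rate, and the 75% target utilization are hypothetical inputs that would need to be replaced with measured values from load testing.

```python
import math

def servers_needed(peak_concurrent_users, users_per_server,
                   growth_rate=0.0, target_utilization=0.75):
    """Estimate how many application servers cover peak load plus planned growth.

    growth_rate is the expected increase in users (0.25 means 25% more).
    target_utilization keeps each server below full capacity at peak.
    """
    projected_users = peak_concurrent_users * (1 + growth_rate)
    usable_capacity = users_per_server * target_utilization
    return math.ceil(projected_users / usable_capacity)

# Hypothetical figures: 2,000 concurrent users at peak, 500 users per server.
today = servers_needed(2000, 500)
after_onboarding = servers_needed(2000, 500, growth_rate=0.25)
print(today, after_onboarding)
```

The difference between the two results is the horizontal expansion the design has to allow for before the new departments arrive.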
Security – When identifying security requirements for your service desk, I recommend incorporating these three methods to secure it: multi-factor authentication, single sign-on, and a secure protocol like HTTPS (Hypertext Transfer Protocol Secure). Multi-factor authentication requires more than one factor for authentication from independent categories of credentials. The categories can be chosen from ‘something you know’, ‘something you have’ and ‘something you are’. Logging in to a system with a token (something you have) and a password (something you know) would satisfy the multi-factor requirement. Once an individual is authenticated with single sign-on, they won’t have to re-authenticate, which makes the end-user experience not only more secure but more efficient. HTTPS allows for secure communication over the internet by encrypting the data and has become the standard. Data should be encrypted wherever possible for greater security. Used in combination, these items are the key components in securing your service desk.
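The key point about multi-factor authentication is that the factors must come from different categories, so two passwords do not count. A small sketch of that rule follows; the credential names and the mapping are hypothetical examples, not the configuration of any real identity provider.

```python
# Hypothetical mapping of credential types to the three factor categories.
FACTOR_CATEGORIES = {
    "password": "something you know",
    "pin": "something you know",
    "hardware_token": "something you have",
    "phone_app": "something you have",
    "fingerprint": "something you are",
}

def is_multi_factor(presented_credentials):
    """True when the credentials span at least two independent categories."""
    categories = {
        FACTOR_CATEGORIES[credential]
        for credential in presented_credentials
        if credential in FACTOR_CATEGORIES
    }
    return len(categories) >= 2

print(is_multi_factor(["password", "hardware_token"]))  # token plus password
print(is_multi_factor(["password", "pin"]))             # two things you know
```

The first call satisfies the requirement because it combines something you know with something you have. The second does not, because both credentials fall into the same category.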
Environments Needed for Testing and Mapping – Another important decision is to determine the number of environments you will need to effectively test the service desk before it is launched. At a minimum, you should plan for User Functionality Testing and Load Testing. It is common to have a Test environment and a Simulated Production environment, which provide dedicated platforms for building and testing and thus reduce the risk when functionality is promoted into production. I recommend a minimum of two environments and, if there’s a need and the budget can support it, up to four. Further, the purpose of each environment needs to be documented so that expectations can be set appropriately. For instance, User Functionality could be tested in the Test and Simulated Production environments, while Load Testing can be performed in Simulated Production.
Functionality – It is important that your IT Service Management system looks and feels modern because most users demand social media-like interfaces. They become frustrated when the service desk at the office doesn’t give them an experience similar to posting on Facebook. So, investing in the latest technology helps you get the newest capabilities that will ensure your service desk delights end-users. That said, being clear on the functionality requirements prior to selecting the service technology will set you up for success. The companies and agencies I’ve consulted that completed this step successfully were able to leverage more of BMC’s advanced functionality. Among the mistakes I have seen, the most common occurs when the company doesn’t establish the functionality requirements ahead of time. Then, after purchasing their technology, they add a series of complex integrations that burden the system and ultimately slow performance. This also makes their future upgrades incredibly complex and expensive. Customizations should be limited as much as possible, as this will help the service desk function at its peak. One module that makes sense to customize is the service catalog because that is what enables efficient self-service. A colleague, Michael Roper, has created a two-part blog on Building an IT Service Catalog, and the first part can be accessed here.
Reporting is a key component of any well-designed Service Management system. Regardless of which tool you choose, the architecture and design should be discussed with key decisions agreed to before beginning the upgrade. There are many types of reports to consider for your organization. The out-of-the-box reports provided on the ITSM system are a great start. These encapsulate use cases including:
“The number of critical incidents that are open.”
“The total number of tickets closed per day.”
“Number of incidents that are assigned per support group.”
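As an illustration of how simple these counts are once the ticket data is accessible, here is a sketch of the three reports above over a handful of made-up records. The field names and values are hypothetical and do not reflect any vendor's schema.

```python
from collections import Counter

# Hypothetical ticket records; field names are illustrative only.
tickets = [
    {"id": 1, "priority": "critical", "status": "open", "group": "network", "closed_on": None},
    {"id": 2, "priority": "low", "status": "closed", "group": "desktop", "closed_on": "2018-08-01"},
    {"id": 3, "priority": "critical", "status": "closed", "group": "network", "closed_on": "2018-08-01"},
    {"id": 4, "priority": "medium", "status": "open", "group": "database", "closed_on": None},
    {"id": 5, "priority": "high", "status": "closed", "group": "desktop", "closed_on": "2018-08-02"},
]

def open_critical_count(records):
    """Number of critical incidents that are still open."""
    return sum(1 for r in records if r["priority"] == "critical" and r["status"] == "open")

def closed_per_day(records):
    """Tickets closed, counted per closing date."""
    return Counter(r["closed_on"] for r in records if r["status"] == "closed")

def assigned_per_group(records):
    """Tickets assigned, counted per support group."""
    return Counter(r["group"] for r in records)

print(open_critical_count(tickets))
print(dict(closed_per_day(tickets)))
print(dict(assigned_per_group(tickets)))
```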
Additionally, there are reporting tools to capture the number of assets that are deployed. This provides IT departments with enhanced visibility to allow for the following use cases:
“The number of Cisco Routers that are deployed.”
“Total number of Oracle Databases that are running.”
“Total number of Adobe Pro applications that are deployed.”
By having this information, IT departments can have better accounting of their assets and leverage this information to negotiate more favorable terms with their vendors.
CMDBs – After 25 years in IT, I can say that I’ve seen the whole spectrum of CMDB (Configuration Management Database) maturity. CMDBs are more beneficial to an organization the more accurate and complete they are. However, they don’t have to be 100% accurate or 100% complete to deliver benefits. It is important not to overpopulate a CMDB; in other words, don’t add attributes that don’t serve a specific purpose. For example, some of my clients have populated performance information into a CMDB. The issue with doing that is that performance information, like CPU utilization rates, changes moment by moment and is better housed in an Availability Management or Capacity Management database. You should also consider which configuration items to include and how individual attributes are defined. A good rule is to never collect information if you aren’t going to do anything with the results.
When it comes to the service desk, you will definitely need and want to build some integrations. Here are the two that I believe are must-dos, along with important considerations for executing them.
Automated Event Management – Over the years, I’ve seen more clients integrate an Event Management System into ITSM. The benefit is that detected events automatically trigger system-generated incidents that are assigned to the appropriate support personnel with the correct priority, all without human intervention. There are several considerations to make this function effectively because it is crucial to avoid what IT folks call an ‘event storm’. An event storm is when your system is overwhelmed with trivial items, making it tougher to discern important incidents from mere “status indicators.” Not every event that an event management system detects should be classified as an incident. An incident record should only be created if it is actionable. Specifically, the creation of the record should lead to a system administrator making a change to resolve the discrepancy. If the administrator reviews the incident and says to herself “There will be at least one more ticket created before the system crashes,” that’s an excellent indication you may be headed toward an event storm. Many of these events are notifications that shouldn’t be fed as incidents into an ITSM system because they do not meet the ITIL standard. Remember, with ITIL (IT Infrastructure Library) an Incident is defined as “An unplanned interruption to an IT service or a reduction in the quality of an IT Service. Hence, failure of a CI (Configuration Item) that has not yet affected service is also an incident.”
Earlier this month, I was on a call with a client and we discussed event sizing. I asked the client for the number of incidents their event management system was expected to feed into their incident module, and the client asked me if I wanted the number of events per second. “Yikes!” I told him, “If you are able to give me that number of incidents per second, then you may be identifying too many events as incidents and as a result are going to tax the system and the supporting process, thereby creating an event storm.” He later agreed that most of the events were not truly incidents as defined by ITIL. Automated Event Management is very powerful when done correctly, by following the ITIL definition of an incident very closely and not widening that definition arbitrarily.
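The filtering rule described here can be sketched in a few lines. This is a simplified illustration, not the behavior of any specific event management product: the `Event` fields are hypothetical, the `actionable` flag stands in for whatever rule your process uses to decide that an administrator must act, and suppressing repeats for a configuration item that already has an open incident is an extra assumption I have added to show one way of keeping ticket volume down.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    ci: str            # configuration item the event refers to
    condition: str     # what was detected, e.g. "disk_full"
    actionable: bool   # does someone need to make a change to resolve it?

def incidents_to_create(events, open_incident_keys=()):
    """Return only the events that should become new incident records."""
    seen = set(open_incident_keys)
    to_create = []
    for event in events:
        key = (event.ci, event.condition)
        if not event.actionable:
            continue  # status notification, not an incident
        if key in seen:
            continue  # an incident for this CI and condition already exists
        seen.add(key)
        to_create.append(event)
    return to_create

# Hypothetical event stream.
stream = [
    Event("db01", "backup_completed", actionable=False),
    Event("web02", "service_down", actionable=True),
    Event("web02", "service_down", actionable=True),
    Event("db01", "disk_full", actionable=True),
]
new_incidents = incidents_to_create(stream)
print([(e.ci, e.condition) for e in new_incidents])
```

Four events go in and two incidents come out: the completed backup is only a notification, and the repeated outage on the same server does not need a second ticket.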
Automated Discovery of CIs (Configuration Items) into the CMDB – The most beneficial CMDBs are those that are populated by automated discovery of CIs. It is important to determine when to run the discovery and when to reconcile and normalize the results into the CMDB. For instance, running a discovery once a day should be sufficient for most environments. It is also worth noting that when you populate the CIs, there is no need to repopulate the same information into the CMDB; instead, populate just the delta from the previous job. When a client tells me that their reconciliation process is taking too long, the first place I look for improvement is whether every CI is being repopulated, or just the deltas.
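The idea of populating only the delta can be sketched as a comparison between the previous discovery result and the current one. This is an illustrative sketch only: the function name and the dictionary layout (CI name mapped to its attributes) are hypothetical, and real discovery and reconciliation tools have their own data models.

```python
def discovery_delta(previous, current):
    """Compare two discovery snapshots and return only what changed.

    Each snapshot maps a CI name to a dictionary of its attributes.
    """
    added = {ci: attrs for ci, attrs in current.items() if ci not in previous}
    changed = {
        ci: attrs
        for ci, attrs in current.items()
        if ci in previous and previous[ci] != attrs
    }
    removed = sorted(set(previous) - set(current))
    return {"added": added, "changed": changed, "removed": removed}

# Hypothetical snapshots from two consecutive daily discovery jobs.
yesterday = {
    "router-01": {"model": "A", "os": "15.1"},
    "db-01": {"engine": "X", "version": "12"},
    "web-01": {"os": "linux"},
}
today = {
    "router-01": {"model": "A", "os": "15.2"},   # attribute changed
    "db-01": {"engine": "X", "version": "12"},   # unchanged, so skipped
    "web-02": {"os": "linux"},                   # newly discovered
}

delta = discovery_delta(yesterday, today)
print(delta)
```

Only the changed router, the new web server, and the retired web server need to go through reconciliation; the unchanged database is left alone.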
If you’re like me and enjoy preventing issues before they start, then this blog has hopefully given you some key items to think about when designing the upgrade for your service desk. I didn’t learn all of this overnight, and I learned a great deal from others while working on service desk efforts. Developing an effective service desk design not only takes time, but also requires working with a broad set of stakeholders, with reviews and agreements made along the way. I hope that by reading my recommendations here, you can establish a plan forward for your organization.