Even years later, we remember when Gmail went down for part of a day. That rare service failure stood out because Google has an outstanding reputation for service quality and IT uptime. Your company may not offer online services, but IT uptime is still critically important.
Why Does IT Uptime Matter for Your Company?
Poor IT uptime hurts your company in several ways. The extent of the damage depends on the systems impacted.
- Staff Productivity. Imagine if your company intranet went down for half the day. Your staff could not access company documents or databases. They might be able to work temporarily with phones and notebooks, but that kind of temporary fix is not sustainable.
- Customer Service. “Our system is down… Please be patient.” If we had a dollar for all the times that companies have told that to us, we could pay for a nice dinner out! IT uptime failures mean slow customer service at the very least. At worst, you may not be able to serve customers at all if you require access to databases and online tools to access customer profiles and related information.
- IT Department Credibility. As an IT manager, you probably have a long list of projects you want to deliver. To be taken seriously for your innovative ideas, the business needs to see that you can perform on the fundamentals. If there are severe IT uptime problems, your innovation agenda will not go anywhere. If the issues are bad enough, IT managers may be fired from the company.
- Brand Reputation. What happens when failed IT services are visible to the public? Customers, critics, and the news media will not hold back in pointing out failures. Recovering from these incidents requires costly programs, including hiring consultants and providing special programs to customers.
- Inability to Support Growth and Peak Demand. Many companies dream of hitting a growth spurt as a result of media coverage. What if you had front page coverage on CNN.com, Fox or TechCrunch? If that flood of new visitors crashes your website, you might lose hundreds or thousands of potential new customers. Such crashes tell the world that you are not ready for significant growth.
Now that we know how IT uptime failures can impact your company, it is time to ask some blunt questions about your vulnerabilities.
IT Manager, Know Thy Department – Assessing Your IT Uptime Performance
Before you look at using containers and other methods to improve IT uptime, consider the big picture. Does your organization have satisfactory IT uptime performance? If so, are you positioned for further growth as your organization takes on additional technology in the future? To assess your IT uptime situation, review the following questions:
- IT Uptime Metrics. What specific quantitative measures do you have to determine if your systems are “up”? Instead of using a simple “up or not” measure, we recommend developing a more nuanced “green, yellow, and red” approach to metrics. This approach will help your staff to detect problems before they become disasters.
- IT Uptime Reporting. As the IT manager, what reports do you receive on a monthly and quarterly basis to understand uptime? Even more important, do these reports come with meaningful commentary that you can act on?
- IT Uptime Coverage. If your organization has ten critical systems, does your IT uptime program fully cover those systems? Make sure that you have coverage of outsourced and third-party systems that you rely on to keep your business running.
- IT Uptime Third Party Review. This is an advanced point to consider if you have a robust IT uptime program in place. Seek out an expert consulting firm or ask your internal audit group to review your IT uptime program. They may point out that your program relies on out-of-date monitoring tools. Alternatively, they may observe that you should report uptime data to senior management or the board.
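To make the "green, yellow, and red" idea from the metrics question concrete, here is a minimal Python sketch. The thresholds (99.9% and 99.0%), the `UptimeSample` record, and the system names are all illustrative assumptions, not recommendations; set them to match your own service-level targets.

```python
from dataclasses import dataclass

# Hypothetical thresholds; tune these to your own service-level targets.
GREEN_MIN = 99.9   # percent uptime at or above which a system is "green"
YELLOW_MIN = 99.0  # percent uptime at or above which a system is "yellow"


@dataclass
class UptimeSample:
    """Downtime recorded for one system over a reporting period (illustrative)."""
    system: str
    total_minutes: int
    down_minutes: int


def uptime_percent(sample: UptimeSample) -> float:
    """Return uptime as a percentage of the reporting period."""
    if sample.total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    up_minutes = sample.total_minutes - sample.down_minutes
    return 100.0 * up_minutes / sample.total_minutes


def classify(sample: UptimeSample) -> str:
    """Map an uptime percentage to a green, yellow, or red status."""
    pct = uptime_percent(sample)
    if pct >= GREEN_MIN:
        return "green"
    if pct >= YELLOW_MIN:
        return "yellow"
    return "red"


if __name__ == "__main__":
    month = 30 * 24 * 60  # minutes in a 30-day reporting period
    samples = [
        UptimeSample("intranet", month, 30),
        UptimeSample("crm", month, 200),
        UptimeSample("billing", month, 720),
    ]
    for s in samples:
        print(f"{s.system}: {uptime_percent(s):.2f}% -> {classify(s)}")
```

A yellow status is the early warning: the system is still mostly available, but it is drifting toward the point where staff and customers will notice.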
Resource: To put IT uptime into a broader focus, look at the Uptime Institute’s reports. You may not apply their entire framework, but it gives you an excellent benchmark for evaluating your company. Their recent reports have highlighted that physical risks like fire suppression systems and criminal intrusion remain important issues.
What are your options to improve IT uptime? The easy answer is to ask for a larger budget: buy more servers, hire more support staff, and so forth. If you are only just meeting your uptime targets, this approach makes sense. If you have periodic failures or a concern that you are not set up to scale the business, take a closer look at containers.
Reducing IT Uptime Risk With Containers
Here’s the unpleasant reality. Human error and carelessness are major contributors to IT uptime failures. An administrator forgets to test a new operating system update on a handful of servers. At first, that neglect will be hard to spot. Eventually, a server will go down, and IT managers will have the tough task of tracking down the problem. Containerization helps in the following ways.
- Reduce Operating System Issues. With virtual machines, you have to configure and test the operating system over and over again. Are the test environment and production environment aligned? Are you consistent in your approach to patches? Keeping track of these OS issues is a headache. With containers, you have far fewer operating systems to manage, so there is less potential for crashes that hurt uptime.
- Reduce Application Performance Problems. By saving time on operating system management, you will have more time to focus on applications. When an application fails, you will not have to waste time testing whether an operating system issue is the cause.
- Contain Security Problems. By consistently applying identity management to all of your containers, you can have peace of mind that your IAM program has no material gaps.
Speaking of security, make sure that your containerization strategy includes identity management. Without identity monitoring and controls, it is tough to track whether the right people are accessing your containers. To make the process easier, use Avatier’s Identity Anywhere.