For many in-house IT departments, it can be incredibly tempting to create a DIY IT infrastructure monitoring solution. The truth is, a DIY monitoring solution usually isn’t formally planned and implemented. Instead, it starts in one department, monitoring one set of equipment or applications, and then slowly evolves as business requirements dictate. It is usually the responsibility of a small number of people (if not one person) and consumes an outsized portion of their (highly valuable) time.
In many organizations, the monitoring tools deployed will differ from one department to the next, with different individuals responsible for each. When an outage occurs, no one really knows where to look for a resolution. What started as a well-intentioned project to monitor your IT infrastructure can quickly turn into a constraint on your organization as your network grows in size and complexity. Whether you’re trying to save on budget or want to recapture and optimize your internal resources, what’s the real cost of building and maintaining an IT monitoring solution at scale?
The cost of hardware, software, and recurring software maintenance is just the tip of the iceberg. When it comes to the real cost of DIY IT infrastructure monitoring, there’s so much more to consider, including:
Many DIY solutions rely on having staff members on call to respond to alerts after hours. That sounds great on paper, but on-call arrangements come with slow response times, poor performance from groggy engineers the next day, and a poor work/life balance for your employees. An on-call rotation affects more than just your employees’ free time; it impacts their families too. Many engineers also wind up with more than one night on call. It could be three or four nights, or even a week or more, before they can hand off to the next person, and the negative effects multiply with each day. This is an avoidable source of job dissatisfaction.
To avoid on-call burnout and provide true 24×7 monitoring coverage, you would need to dedicate at least seven employees. That adds the cost of not just wages, but benefits, equipment, training, and more. An outsourced monitoring services partner keeps an eye on your network 24×7 for a fraction of the cost of doing it yourself. With a monitoring services partner, your employees can focus on value-adding projects every day instead of being worn out from nights, weekends, and holidays spent responding to alerts.
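The staffing arithmetic behind round-the-clock coverage can be sketched roughly as follows. The 40-hour week and 46 productive weeks per year are illustrative assumptions, not figures from this article, and `staff_needed` is a hypothetical helper.

```python
import math

HOURS_PER_WEEK = 24 * 7  # 168 hours of coverage are needed every week


def staff_needed(seats=1, weekly_hours=40.0, productive_weeks=46.0):
    """Minimum headcount to keep `seats` chairs filled around the clock, all year.

    weekly_hours and productive_weeks are assumed values; adjust them for
    your own vacation, sick-leave, and training allowances.
    """
    coverage_hours = HOURS_PER_WEEK * 52 * seats
    hours_per_person = weekly_hours * productive_weeks
    return math.ceil(coverage_hours / hours_per_person)


one_seat = staff_needed()          # a single engineer on shift at any time
two_seats = staff_needed(seats=2)  # two engineers on shift at any time
print(one_seat, two_seats)
```

Under these assumptions, a single seat already needs about five people with no overlap at all. Shift handovers, training time, and any requirement for more than one person on shift push the number higher.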
DIY monitoring solutions can generate a lot of nuisance alarms. These are alarms that trigger and clear without needing to be addressed, or that are very low priority and don’t require immediate attention. And it’s never just one alert at a time. It’s 30 alerts coming in because 30 devices are connected to a common circuit or switch.
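One common way to tame the "30 alerts for one failed switch" problem is to group alerts by their upstream dependency. The sketch below is a minimal illustration; the `correlate` function, the device names, and the `parent_of` topology map are all hypothetical.

```python
def correlate(alerts, parent_of):
    """Group alerting devices under the highest upstream device that is also alerting.

    alerts: iterable of device names that are currently alerting.
    parent_of: dict mapping a device to its upstream device (assumed topology).
    Returns a dict of root-cause device -> list of suppressed dependants.
    """
    alerting = set(alerts)
    grouped = {}
    for device in sorted(alerting):
        root = device
        seen = {device}  # guards against loops in the topology map
        # Walk upstream for as long as the parent is alerting too.
        while parent_of.get(root) in alerting and parent_of[root] not in seen:
            root = parent_of[root]
            seen.add(root)
        grouped.setdefault(root, [])
        if root != device:
            grouped[root].append(device)
    return grouped


# Thirty servers behind one switch, and the switch itself is down.
servers = [f"srv-{n:02d}" for n in range(1, 31)]
topology = {name: "switch-1" for name in servers}
pages = correlate(servers + ["switch-1"], topology)
print(len(pages), "page(s) instead of", len(servers) + 1)
```

Here the engineer receives a single page for `switch-1`, with the 30 dependent servers listed as context rather than as separate alarms.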
If your staff is continuously pestered with alarms that don’t need to be dealt with immediately, they can become indifferent to them. They’ll learn to ignore them and go on with their other work duties or personal activities. As a result, the likelihood that an engineer misses an actionable alarm goes up significantly.
If your staff is required to respond to every alert, it can take a real toll. When an engineer keeps getting up throughout the night to respond to mostly false alerts, they’re not getting the sleep they need to be at peak performance. By the third night of their on-call rotation, they could easily sleep through alerts, with the pager or smartphone going off and nobody awake to hear it.
Outsourcing your monitoring to a partner that specializes in custom solutions gives you features that eliminate false alerts and reduce the noise, something you typically don’t get with DIY or out-of-the-box monitoring software. For example, your NOC partner can put a delay on alerts or apply a time-based threshold to reduce the number of false positives. A NOC partner can also take it one step further and work with you to root out the causes of false alerts and fix them, rather than just masking them.
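The alert delay mentioned above can be sketched as a small state machine that pages only when a check has been failing continuously for a set hold time. This is an illustrative sketch, not any particular vendor's implementation; `DelayedAlert`, `hold_seconds`, and the 60-second poll interval are assumptions.

```python
class DelayedAlert:
    """Page only when a check has failed continuously for `hold_seconds`."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        self.failing_since = None  # time of the first failure in the current run
        self.fired = False         # True once this run of failures has paged

    def observe(self, healthy, now):
        """Record one poll result; return True exactly once per sustained failure."""
        if healthy:
            # Any successful poll clears the condition and re-arms the alert.
            self.failing_since = None
            self.fired = False
            return False
        if self.failing_since is None:
            self.failing_since = now
        if not self.fired and now - self.failing_since >= self.hold_seconds:
            self.fired = True
            return True
        return False


# Polls every 60 seconds as (time, healthy): a one-poll blip, then a real outage.
alert = DelayedAlert(hold_seconds=300)
polls = [(0, False), (60, True), (120, False), (180, False), (240, False),
         (300, False), (360, False), (420, False), (480, False)]
paged_at = [t for t, healthy in polls if alert.observe(healthy, t)]
print(paged_at)  # [420]
```

The blip at time 0 clears before the hold time elapses and never pages. The sustained failure that starts at 120 pages once, five minutes later, and does not page again while it persists.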
Skills & Training Investment
Your engineers are probably capable of setting up and running an out-of-the-box monitoring software platform without much trouble. But that responsibility typically falls on one person or a small team, so even though your network is in capable hands, it becomes a specialized skill set to keep up with updates and continually tweak the software to satisfy your needs. That specialization comes at a price.
Since only a select number of people in your organization will be well versed in the solution, they’ll likely get pulled away from their other projects any time something comes up that needs their attention. If they leave for another job opportunity, get sick, or take a vacation, who will you have in place to step in and take over the monitoring platform? Your organization must make the costly choice to either train additional engineers or wait it out until that key person comes back.
With today’s hybrid environments, the skills necessary to monitor those networks go beyond simple ping and SNMP polls. In a cloud environment, you may be better served by API-based polling. For example, AWS offers CloudWatch and CloudTrail. On Azure, it might be Azure Monitor or Application Insights. Will your team have the expertise in these APIs to use them effectively?
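To give a sense of what API-based polling involves, the sketch below evaluates a response shaped like the one CloudWatch's GetMetricStatistics call returns (a `Datapoints` list with `Timestamp` and `Average` fields). The sample data, the 80% threshold, and the helper names are assumptions for illustration. The code makes no AWS call; in practice the response would come from something like boto3's `get_metric_statistics`.

```python
from datetime import datetime, timedelta, timezone


def latest_average(response):
    """Return the newest 'Average' value in the response, or None if there is no data.

    Datapoints are not guaranteed to arrive in time order, so pick the
    newest one by timestamp instead of trusting list position.
    """
    points = response.get("Datapoints", [])
    if not points:
        return None
    newest = max(points, key=lambda p: p["Timestamp"])
    return newest["Average"]


def cpu_status(response, warn_at=80.0):
    """Map a metric response to a simple status string (threshold is an assumption)."""
    value = latest_average(response)
    if value is None:
        return "unknown"  # missing data deserves attention in its own right
    return "warning" if value >= warn_at else "ok"


# Hand-built sample in the same shape as a real response (values are made up).
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
sample = {
    "Label": "CPUUtilization",
    "Datapoints": [
        {"Timestamp": now, "Average": 91.5, "Unit": "Percent"},
        {"Timestamp": now - timedelta(minutes=5), "Average": 42.0, "Unit": "Percent"},
    ],
}
print(cpu_status(sample))
print(cpu_status({"Datapoints": []}))
```

Even this small case shows details that ping and SNMP polling never raise: unordered datapoints, gaps in the data, and choosing the right statistic and period for each metric.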
Lack of Customization
As the sayings go, square pegs don’t fit round holes and one size doesn’t fit all. Your commercial software platform may monitor your IT infrastructure wonderfully when you first set it up. But you need to have a platform that’s scalable and flexible enough to evolve as your needs change.
Delayed paging, time-based thresholds, and classes of service are flexible features that your NOC service provider can offer and that may not be as easy to achieve with an out-of-the-box solution. If those features are available in a commercial solution, chances are it will cost you more to add those capabilities, whereas they’re built into the platform provided by an outsourced NOC partner.
One Solution for All
Often, each department within an organization operates its own monitoring platform. There’s no consistency across those silos, so when outages occur, there’s often confusion around where to look and who is responsible. You check one tool for the network, two others for servers, and yet one more for applications. That kind of tool sprawl, with multiple tools all doing the same thing, is not just a waste of money. It also takes up precious time to compare the different platforms and find out what’s going on whenever there is an issue.
Your Weakest Link
If your internal monitoring system goes down or loses its ability to send emails, will you know? If the hard drive running your monitoring software fails, do you have a backup so you can restore it? Are you monitoring your monitoring? If alerts suddenly stop, you need confidence that you’ll know about it. If you have a major outage and don’t know about it, you could end up paying for it in the long run through lost opportunities and lost customer confidence. That’s especially true if you experience an outage after hours or on a holiday, when no one is in the office to catch it.
If your professional NOC partner loses the ability to communicate with their on-site equipment or complete their monitoring processes, they will notify you immediately. Your NOC partner will also back up their monitoring configuration so that if something does fail, they can get you back up and running without missing a beat.
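"Monitoring your monitoring" is often done with a heartbeat check: the monitoring system reports in on a schedule, and a separate watcher raises the alarm when the reports stop. The sketch below is a minimal illustration; `HeartbeatWatchdog`, the source names, and the 180-second limit are hypothetical, and the watcher is assumed to run on different hardware from the system it watches.

```python
class HeartbeatWatchdog:
    """Track the last check-in from each source and flag the ones that go quiet."""

    def __init__(self, max_silence_seconds):
        self.max_silence = max_silence_seconds
        self.last_seen = {}  # source name -> time of the most recent heartbeat

    def beat(self, source, now):
        """Record a heartbeat from `source` at time `now` (in seconds)."""
        self.last_seen[source] = now

    def silent_sources(self, now):
        """Return the sources whose last heartbeat is older than the allowed silence."""
        return sorted(
            source
            for source, seen in self.last_seen.items()
            if now - seen > self.max_silence
        )


watchdog = HeartbeatWatchdog(max_silence_seconds=180)
watchdog.beat("monitoring-server", now=0)
watchdog.beat("mail-relay", now=0)
watchdog.beat("mail-relay", now=240)  # the mail relay keeps checking in
quiet = watchdog.silent_sources(now=300)
print(quiet)  # ['monitoring-server']
```

The key design choice is that silence is itself the alarm condition. A monitoring server that has crashed cannot send an alert about its own failure, so something outside it has to notice the gap.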
At the end of the day, outsourced NOC partners come with fresh eyes to monitor your environment 24×7 and can leverage their expertise with new ways to monitor your infrastructure you might not have considered before. Avoid alert fatigue, focus your staff’s attention on other value-adding projects, and eliminate the high costs of DIY monitoring while remaining confident your IT infrastructure is operating at peak performance.