Last week I wrote an article about the really embarrassing issue of not knowing when your website is down – particularly before your boss does. Does this still happen to you?
Have you got a story along these lines? And how did you solve it?
Here is a story that Charlie Steed, an IT Ops Manager based out of Fresno, emailed in:
An emerging aspect of our organization’s proposition is providing our clients access to real-time trading information through the web channel, so our customers have come to expect 24×7 access. Like many SMEs though, we can’t afford to have a 24×7 operation so we do what we can. About 6 months ago, we kept seeing outages on our site, but the worst of it was that the outages were reported by key customers to my boss, and then it filtered down to me – all this happened before our own equipment noticed [and alerted us to] the failure! Holy crap! So I had to employ new measures. The quick and easy one was to use an external service that monitored our website continuously and sent me an SMS when it was down – this way we could react quickly. It made all the difference! It also meant I didn’t get the crap kicked out of me by my boss.
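For anyone curious what such an external check does under the hood, here is a minimal sketch in Python. It is not how Charlie's service (or any particular product) works. The class name UptimeMonitor, the fetch_status helper, the two-failures-before-alert rule and the alert callback are all my own assumptions for illustration. In real use the alert callback would hand the message to whatever SMS or email gateway you have, and the script would need to run from outside your own network.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, or 0 if the site is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500 or 503.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout and similar network errors.
        return 0


class UptimeMonitor:
    """Hypothetical poller: alerts once when a site goes down, once when it recovers."""

    def __init__(
        self,
        url: str,
        fetch: Callable[[str], int] = fetch_status,
        alert: Callable[[str], None] = print,  # assumption: swap in an SMS sender
        failures_before_alert: int = 2,        # assumption: ignore a single blip
    ) -> None:
        self.url = url
        self.fetch = fetch
        self.alert = alert
        self.failures_before_alert = failures_before_alert
        self.consecutive_failures = 0
        self.is_down = False

    def check_once(self) -> bool:
        """Run one check and return True if the site looked healthy."""
        status = self.fetch(self.url)
        healthy = 200 <= status < 400
        if healthy:
            if self.is_down:
                self.alert(f"RECOVERED: {self.url} (HTTP {status})")
            self.consecutive_failures = 0
            self.is_down = False
        else:
            self.consecutive_failures += 1
            threshold_reached = self.consecutive_failures >= self.failures_before_alert
            if threshold_reached and not self.is_down:
                self.is_down = True
                self.alert(f"DOWN: {self.url} (status {status})")
        return healthy

    def run(self, interval_seconds: float = 60.0) -> None:
        """Check forever, sleeping between checks."""
        while True:
            self.check_once()
            time.sleep(interval_seconds)
```

To try it, create UptimeMonitor("https://www.example.com") with a placeholder URL of your choice and call run(). Waiting for two failed checks in a row before alerting is a design choice to avoid a text message for every momentary blip, at the cost of hearing about a real outage one interval later.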
This is a familiar story for me – I can recall being in the same precarious position 7 years ago when my employer was growing fast – too fast for its investment in fault-tolerant infrastructure to keep up. The nature of high-value online business propositions is that when the website is down, someone is in for a kicking. If you have a story, then let us hear it! If you have a solution you want to share, then let us hear it!
If you’re looking for a top-grade web monitoring solution, then why not take a look at ObservePoint – for free? Why not monitor your competitors’ sites too – for free?
Just recently we began proactively monitoring the hardware and applications running in our IT department. Nagios (http://www.nagios.org) is a powerful open source tool for monitoring your servers and apps. We have a dashboard of all servers and applications across all of our locations, complete with email and/or SMS alerts and a WAP interface. Gone are the days of end-users notifying us when a server or service goes offline.
-Eric
@Eric – Thanks for the tip on the Nagios package, I’ll take a look. How does it compare with commercial products?
To be honest, I’ve never used a commercial package because Nagios meets our monitoring requirements.
I am going to have a look at that Nagios package too as I am looking for an open-source solution. I wonder if anyone knows of any other solid open-source solutions so I can compare them?
@Asif – I don’t know of one myself. Maybe somebody else does?
I tried that ObservePoint tool you suggested and it was really good. This is a bit of a crowded market, but I think ObservePoint concentrates on measuring things other than just % uptime, like SLAs and performance. Thanks for the tip.
@SirKumspect – Good to hear you have a success story with ObservePoint. It is a nifty tool.