I am often asked by customers and prospects: “What should we be monitoring?” This is a billion-dollar question, and everyone seems to have their own answer. I have seen different approaches, some better than others.
In this blog post I want to share with you what I think is a “good” methodology for monitoring. To illustrate it I will use a simple webpage, but you can easily apply it to web-based applications.
Before we talk about what to monitor, let's quickly cover why you should monitor in the first place:
In a webpage there are multiple components that determine availability, speed and integrity. If we were to dissect a webpage we would see the following:
All of the above rely on HTTP requests to one or more hosts, and the browser executing the HTTP responses properly. If we were to analyze the loading of the webpage we would see all these requests being issued, answered, and executed – and some of them will have a major impact while others might be very limited. So how do we go about monitoring this rich ecosystem?
It is obviously important to monitor your webpage from an actual browser to get a clear picture of the availability, speed, and integrity of a webpage. This will help you answer questions like:
However, while monitoring the webpage on a browser is important, it will not be sufficient. The main reason is that the complexity of the webpage, with its different hosts or requests, brings complexity to troubleshooting. If you rely on third-party vendors and partners then you will want to monitor them independently of each other – to avoid finger-pointing between them.
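To make the point about multiple hosts concrete, here is a minimal sketch that groups the requests of a page load by host, so you can see which hosts account for most of the time and bytes. The tuple layout, the function name `summarize_by_host`, and the sample hosts and numbers are all assumptions made for illustration; in practice the data would come from whatever your browser or monitoring tool exports.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def summarize_by_host(requests):
    """Aggregate request count, total time and total bytes per host.

    `requests` is a list of (url, elapsed_ms, size_bytes) tuples.
    This layout is an assumption, not a standard export format.
    """
    totals = defaultdict(lambda: {"requests": 0, "ms": 0.0, "bytes": 0})
    for url, elapsed_ms, size_bytes in requests:
        host = urlsplit(url).hostname or "(unknown)"
        entry = totals[host]
        entry["requests"] += 1
        entry["ms"] += elapsed_ms
        entry["bytes"] += size_bytes
    # Sort by total time, largest first, so the biggest contributors lead.
    return sorted(totals.items(), key=lambda item: item[1]["ms"], reverse=True)


# Hypothetical sample data: hosts and numbers are made up.
sample = [
    ("https://www.example.com/", 120.0, 15000),
    ("https://www.example.com/style.css", 40.0, 8000),
    ("https://cdn.example.net/app.js", 90.0, 60000),
    ("https://ads.example.org/widget.js", 450.0, 30000),
]

for host, stats in summarize_by_host(sample):
    print(f"{host}: {stats['requests']} requests, "
          f"{stats['ms']:.0f} ms, {stats['bytes']} bytes")
```

Summing per-request times ignores the fact that browsers fetch in parallel, so treat the totals as a rough ranking of hosts rather than as their real contribution to page load time.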
Therefore we recommend that you not only monitor the webpage itself in a browser, but also individual requests and hosts that have an impact on your webpage. Thus we suggest the following three-step process to identify what to monitor:
So far we have covered why to monitor and where to monitor. Next we describe what to monitor for each of the requests.
Obviously you have to monitor the HTTP requests, and the performance of the webpage or webpage-like content (a widget or an ad) in the browser.
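As a rough illustration of what a single HTTP check might record, the sketch below grades one response on the three dimensions used in this post: availability, speed, and integrity. The function name `evaluate_check`, the rule that any 2xx or 3xx status counts as available, and the use of an expected text snippet as the integrity test are all assumptions; real monitors define these differently.

```python
def evaluate_check(status_code, elapsed_ms, body, expected_text, max_ms):
    """Grade one HTTP response on availability, speed and integrity.

    Assumptions: 2xx/3xx means "available", speed is a simple threshold,
    and integrity means the body contains an expected snippet.
    """
    available = 200 <= status_code < 400
    return {
        "available": available,
        "fast_enough": available and elapsed_ms <= max_ms,
        "intact": available and expected_text in body,
    }


# Hypothetical response values, as if already fetched by a monitor.
result = evaluate_check(
    status_code=200,
    elapsed_ms=850.0,
    body="<html><title>Home</title></html>",
    expected_text="<title>Home</title>",
    max_ms=1000.0,
)
print(result)
```

Keeping the three results separate matters: a page that returns 200 quickly but is missing its expected content is up and fast, yet still broken for the user.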
Additionally, we recommend you monitor your DNS servers or DNS providers. DNS is often forgotten, but it can make a huge difference to users in different geographic locations, and to the availability of the requests for your webpage. It is best to monitor the DNS servers directly (this is why we have a separate DNS monitoring solution) rather than relying on the browser/HTTP monitors. The main reason is that DNS can be resolved by one of multiple servers. If you have two DNS servers resolving your domain and one has a response time of 100 ms while the other takes 500 ms, there is a 1 in 2 chance that a given lookup sees DNS at 500 ms, assuming lookups are split evenly between the two. Because of DNS TTLs and caching, you might have an even harder time discovering DNS performance problems.
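The two-server example can be sketched in a few lines to show why a blended number hides the slow server. The function name `dns_summary` and the name server hostnames are hypothetical; the 100 ms and 500 ms figures are the ones from the paragraph above, and the samples assume lookups are split evenly between the two servers.

```python
from collections import defaultdict
from statistics import mean


def dns_summary(samples):
    """Return (blended_mean_ms, {server: mean_ms}) from (server, ms) samples."""
    per_server = defaultdict(list)
    for server, ms in samples:
        per_server[server].append(ms)
    blended = mean(ms for _, ms in samples)
    by_server = {server: mean(values) for server, values in per_server.items()}
    return blended, by_server


# Hypothetical name servers, answering alternately at 100 ms and 500 ms.
samples = [("ns1.example.com", 100), ("ns2.example.com", 500)] * 5

blended, by_server = dns_summary(samples)
print(f"blended: {blended} ms")
for server, avg in by_server.items():
    print(f"{server}: {avg} ms")
```

The blended average comes out at 300 ms, which matches neither server. Only the per-server view shows that one server is healthy and the other is five times slower, which is the argument for querying each DNS server directly.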
Monitor not only the webpage but also the key hosts and requests that impact the performance of your webpage. Don’t be limited to HTTP monitoring; expand to DNS monitoring for the key domains to ensure speed and availability. Don’t just monitor to keep baselines; look for places where you can act and save some milliseconds or bytes. Reducing delay by 100 milliseconds every release can be very empowering: you are making end users happy and making your company money.
I apologize for the length of the post, but the topic deserves no less…
Mehdi – Catchpoint