Synthetic Monitoring is a methodology of actively probing servers, services, and applications by simulating an end user. Blackbox, aka Active, aka Proactive, aka Synthetic methodology emulates an action and performs a functional test without worrying about the internal workings, very much like an end user (“I am here to buy a plane ticket; I do not care about what servers you use”).
So whether it’s a DNS query, a web transaction, an API test, an SSL handshake, etc., Synthetic Monitoring is capable of telling you if it’s working or not working, fast or slow, normal or abnormal, based on a baseline. Google defines blackbox monitoring as, “Testing externally visible behavior as a user would see it…blackbox monitoring is symptom-oriented and represents active—not predicted—problems: The system isn’t working correctly, right now.” Running synthetic tests helps us detect variability, deterioration, malfunctions, (i.e. PROBLEMS) before it’s too late.
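To make the "working or not, fast or slow, normal or abnormal" idea concrete, here is a minimal Python sketch of a single synthetic check judged against a baseline. The names (`ProbeResult`, `run_probe`, `classify`) and the "slower than 2x the baseline median" rule are assumptions chosen for illustration, not how any particular product does it.

```python
from dataclasses import dataclass
from statistics import median
from typing import Callable, Optional, Sequence
import time


@dataclass
class ProbeResult:
    ok: bool                      # did the simulated user action succeed?
    elapsed_ms: Optional[float]   # how long it took; None when it failed


def run_probe(action: Callable[[], None]) -> ProbeResult:
    """Run one simulated user action and time it.

    `action` is any callable standing in for a DNS query, web transaction,
    API test, and so on. Any exception it raises counts as a failure.
    """
    start = time.perf_counter()
    try:
        action()
    except Exception:
        return ProbeResult(ok=False, elapsed_ms=None)
    return ProbeResult(ok=True, elapsed_ms=(time.perf_counter() - start) * 1000.0)


def classify(result: ProbeResult, baseline_ms: Sequence[float],
             slow_factor: float = 2.0) -> str:
    """Label a result 'down', 'slow', or 'normal' relative to the baseline.

    Assumption: 'slow' means slower than slow_factor times the median of
    earlier measurements. Real baselines are usually more sophisticated.
    """
    if not result.ok:
        return "down"
    if result.elapsed_ms > slow_factor * median(baseline_ms):
        return "slow"
    return "normal"
```

In practice you would call `run_probe` on a schedule from each vantage point, keep recent timings as the baseline, and alert when `classify` returns anything other than "normal".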
What does too late mean? It means you’ve already annoyed your customers and wasted their time.
There are many reasons to rely on Synthetic methodology: baselining, benchmarking, SLA monitoring, CDN steering, alerting, third-party validation, ISP intelligence, new market validation, cloud migration, DDoS detection, SLM compliance, web optimization…the list goes on.
But at its core, a good synthetic monitoring solution should be able to help you answer four questions around Reachability, Availability, Performance, and Reliability. These are the four key areas of any Digital Experience Monitoring practice.
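As a rough illustration of those four questions, the Python sketch below rolls a batch of probe samples up into one number per area. The working definitions are my assumptions for the example: reachability is "the probe got to the service", availability is "the service answered correctly", performance is the median response time, and reliability is the share of all probes that succeeded within a time budget. The `Sample` and `summarize` names are hypothetical.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional, Sequence


@dataclass
class Sample:
    reached: bool                 # assumption: the network path to the service worked
    succeeded: bool               # assumption: the service returned a correct response
    elapsed_ms: Optional[float]   # response time; None when the probe failed


def summarize(samples: Sequence[Sample], slow_ms: float) -> dict:
    """Summarize probe samples into the four areas (illustrative definitions)."""
    total = len(samples)
    if total == 0:
        raise ValueError("need at least one sample")
    # Only successful probes contribute a response time.
    times = [s.elapsed_ms for s in samples
             if s.succeeded and s.elapsed_ms is not None]
    # Probes that succeeded AND stayed within the time budget.
    within_budget = [t for t in times if t <= slow_ms]
    return {
        "reachability_pct": 100.0 * sum(s.reached for s in samples) / total,
        "availability_pct": 100.0 * sum(s.succeeded for s in samples) / total,
        "performance_median_ms": median(times) if times else None,
        "reliability_pct": 100.0 * len(within_budget) / total,
    }
```

Computing the same summary per vantage point (backbone, broadband, wireless, cloud) is what lets you see whether a problem is with your service or with the path your users take to reach it.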
I have always loved this cartoon. Everyone is talking about customer experiences and how important it is to put your customers first. So do you really want to use them as lab rats? Because that is what you’re doing if you rely entirely on Real User Monitoring (i.e. passive monitoring) to understand the customer experience.
It’s true that RUM is the only way to truly understand what end users are experiencing, which is why we’ve invested heavily in building a first-rate RUM solution that offers custom metrics and doesn’t rely on sampling, and why we worked with Google to be the first to implement NEL (Network Error Logging). But why wait until AFTER their experience has been ruined before doing something about it? Besides, if your users can’t reach your service or load your library, then you’re not collecting RUM data anyway.
Deploying a Synthetic Monitoring solution is the only way to be a truly customer-centric company.
All APM vendors today (Dynatrace, New Relic, Datadog, AppD) have something they call synthetic, but it’s only running from cloud providers such as AWS, Azure, and Google Cloud. It’s a pure convenience play. It’s cheaper and easier – that’s all. And it will always be about cost.
There’s certainly merit to cloud-based monitoring. Catchpoint has over 150 nodes hosted with cloud providers (Azure, AWS, Google, Oracle, IBM, Alicloud, Tencent), so we’re well aware that it’s cheaper to operate them (and we are passing those cost savings to our customers).
But it’s not better. It does not provide cleaner, more powerful, or more stable data (in fact, dealing with noisy neighbors is a big problem in the cloud). The only real advantage to cloud-based monitoring is that it’s more cost-effective for the monitoring vendor.
Who the hell wants to deal with providers in Mexico or India or China that take three months to set up servers and circuits? Trust me, we know from having set up over 100 nodes on backbone, broadband, and wireless providers in China that it is a laborious, almost Herculean task.
But that does not mean you throw in the towel and go the easy route, and THEN try to make your customers believe it’s the best thing for them! After all, if those nodes are so meaningless, why would you spend over $700M buying Gomez and Keynote in the first place? Oops?
These providers try to justify offering a lesser product by downplaying the need to have visibility into ISP performance. But lack of ISP visibility means lack of routing, network, and BGP information as well – all critical services that really matter to your customers.
Just look at what happened in the last few weeks when a BGP leak impacted Cloudflare, or when Google Cloud went down. Do you want to wait until your customers are pissed at you before doing something, or would you prefer to know about it quickly and execute failover systems that you’ve put in place for just such an occasion?
Since when did deploying monitoring systems become about making your numbers look better? For me, it was and will always be about uncovering issues and fixing them. We conducted an interesting study comparing Synthetic Monitoring from cloud nodes with Synthetic Monitoring from backbone and broadband nodes.
I recently asked a customer about this move to exclusively cloud-hosted monitoring agents. His answer:
“This feels like a very large gap for me. If I am hosting in the cloud, why monitor from there? Cross-cloud? My customers’ users are not in the cloud, so why is that a valid vantage point, especially for SLA monitoring?
Also, for cloud-based agents, do you see issues with noisy neighbors affecting the accuracy of monitoring results from those nodes hosted on the cloud? Does VM isolation matter?”
On top of this, why limit synthetic to just web stuff? The internet is far more complex than that; it relies on various services (DNS, CDNs, APIs, BGP, etc.). Again, synthetic monitoring is not a tool, it’s a methodology. That’s why our philosophy has always been to expand the use cases of Synthetic to any type of service that end users rely on. Our motto was and still is: MONITOR AS MANY THINGS FROM AS MANY PLACES AS POSSIBLE. After all, our job is to help customers detect a problem, identify the problem, escalate to the proper channels, fix the problem, and validate that things are working again.
The choice is yours. You can choose Evidence-Based Monitoring or Convenience-Based Monitoring. Just remember that the convenience benefits lie entirely with the vendor – not with you, and certainly not with your customers.