From 4c1612d87da89e47ad52e762b9a3c7533a4aea17 Mon Sep 17 00:00:00 2001 From: Nocnica Mellifera Date: Mon, 23 Dec 2024 16:15:22 -0800 Subject: [PATCH] real index content, ready to review --- site/content/learn/monitoring/_index.md | 13 +- .../synthetic-transaction-monitoring.md | 186 ++++++++++++++++++ 2 files changed, 198 insertions(+), 1 deletion(-) create mode 100644 site/content/learn/monitoring/synthetic-transaction-monitoring.md diff --git a/site/content/learn/monitoring/_index.md b/site/content/learn/monitoring/_index.md index 76b72bf68..8fea0050f 100644 --- a/site/content/learn/monitoring/_index.md +++ b/site/content/learn/monitoring/_index.md @@ -7,4 +7,15 @@ author: Nocnica Mellifera githubUser: serverless-mom displayDescription: --- -This is a temporary placeholder for the Monitoring section of Guides \ No newline at end of file +Monitoring your production services is a key part of a good user experience. Traditional stopgaps like 'pingers' or simply waiting for user reported issues are insufficient. These guides are an introduction to the basic concepts of monitoring. + +## On the Checkly Blog + +
+{{< doc-card class="three-column-card" title="How to Fight Alert Fatigue with Synthetic Monitoring" description="7 best practices." link="https://www.checklyhq.com/blog/alert-fatigue/" >}} + +{{< doc-card class="three-column-card" title="Observability as Code Explained" description="Traditional monitoring has become insufficient for managing complex systems. " link="https://www.checklyhq.com/blog/observability-as-code/" >}} + +{{< doc-card class="three-column-card" title="Software Deployment Best Practices for Modern Engineering Teams" description="Five best practices to help you deploy your software more securely and reliably." link="https://www.checklyhq.com/blog/software-deployment-best-practices/" >}} + +
\ No newline at end of file diff --git a/site/content/learn/monitoring/synthetic-transaction-monitoring.md b/site/content/learn/monitoring/synthetic-transaction-monitoring.md new file mode 100644 index 000000000..022d95356 --- /dev/null +++ b/site/content/learn/monitoring/synthetic-transaction-monitoring.md @@ -0,0 +1,186 @@ +--- +title: Synthetic Transaction Monitoring - Components, Benefits & Challenges +displayTitle: What is Synthetic Transaction Monitoring? +navTitle: Synthetic Transaction Monitoring +description: Explore the what and why of synthetic monitoring. +date: 2024-12-15 +author: Nocnica Mellifera +githubUser: serverless-mom +displayDescription: Explore the what and why of synthetic monitoring. +menu: + learn_monitoring +weight: 30 +--- + +## What is Synthetic Transaction Monitoring? + +Synthetic transaction monitoring is a proactive approach to assessing the performance and reliability of applications by simulating user transactions. These simulations replicate real-world interactions, such as logging into an account, performing a search, or completing a checkout process, allowing organizations to detect issues before they impact actual users. + +Unlike traditional monitoring of a site or service, which relies on observing real user activities, synthetic transaction monitoring generates pre-defined transactions to test specific workflows, ensuring consistent and repeatable measurements. Synthetic transaction monitoring has distinct advantages when compared to Real User Monitoring for frontend applications, and requires work to integrate with other observability tools. + +## Synthetic Monitoring vs. Synthetic Transaction Monitoring + +First some language confusion: ‘Synthetic Monitoring’ isn’t a particularly widely used term, what’s with the addition of ‘transaction?’ [Synthetic monitoring](https://www.checklyhq.com/blog/what-is-synthetic-monitoring/) broadly refers to any monitoring that uses automated tests to emulate users, sometimes a simple as a page load. Synthetic transaction monitoring is a subset focused on multi-step workflows or transactions, such as filling out forms, completing e-commerce purchases, or using APIs. While synthetic monitoring may involve simple checks like testing website uptime, synthetic transaction monitoring goes deeper into the user experience by validating the performance and functionality of complex processes. + +## How Does Synthetic Transaction Monitoring Work? + +Synthetic transaction monitoring involves several key steps: + +1. **Defining Monitoring Requirements**: Product owners, support teams, and product managers define the user paths that are most crucial for the user’s experience. These should be defined as simply as possible. +2. **Script Development**: Engineers or testers write scripts defining the transactions to be monitored. These scripts simulate user actions like clicks, inputs, and navigation. Ideally a modern framework should be used to write these scripts, like [Playwright](https://www.checklyhq.com/learn/playwright/what-is-playwright/). +3. **Test Execution**: A synthetic monitoring engine executes these scripts at regular intervals, either from a single location or multiple geographies. Ideally, test execution should be run on a service well separated from your core services; you don’t want your monitoring to go down at the same time as your site! +4. **Data Collection**: Performance metrics, including response times, success rates, and error codes, are gathered during test runs. +5. **Analysis and Reporting**: Results are analyzed to identify anomalies, trends, or potential bottlenecks. +6. **Alerting**: If thresholds are breached, alerts notify relevant teams for investigation and resolution. For almost all monitoring, alerts should wait on at least one retry, so as not to cause alert fatigue. + +## Why Synthetic Transaction Monitoring Matters + +Synthetic transaction monitoring offers several advantages, especially over real user monitoring. + +- **Proactive Issue Detection**: Real user monitoring fundamentally relies on a user to have a Identifying problems before real users are impacted reduces downtime and enhances user satisfaction. +- **Benchmarking and Trend Analysis**: Regular monitoring helps establish performance baselines and detect deviations over time. +- **Geographic Performance Insights**: Monitoring from various locations ensures applications deliver consistent experiences globally. +- **Validation of Updates**: Testing new features or patches in synthetic environments ensures they function as intended before deployment. + +However, challenges include maintaining up-to-date scripts as applications evolve and ensuring synthetic tests accurately reflect real-world usage. + +## Components of Synthetic Transaction Monitoring + +Effective synthetic transaction monitoring relies on several core components: + +### Synthetic Transaction Engine + +The synthetic transaction engine executes predefined scripts, mimicking user interactions. It must handle complex workflows involving multiple steps, conditional logic, and diverse input data. At Checkly, this is a system of ‘[Checkly runners](https://www.checklyhq.com/docs/monitoring/)’ that execute checks against your service at your selected cadence and geographic regions. + +### Scripting Language + +A flexible scripting language allows testers to define transactions. Languages like JavaScript, Python, or typescript are commonly used, with support for [integrating APIs](https://www.checklyhq.com/blog/what-is-api-monitoring/), [handling cookies](https://www.checklyhq.com/learn/playwright/authentication/), and [managing dynamic content](https://www.checklyhq.com/learn/playwright/iframe-interaction/). + +### Test Execution Scheduler + +The scheduler automates the execution of scripts at specified intervals. This ensures continuous monitoring and provides insights into performance trends over time. At Checkly this is controlled either [via the web UI](https://www.checklyhq.com/docs/testing/) or via our [Monitoring as Code](https://www.checklyhq.com/guides/monitoring-as-code/#) process. + +### Test Results Analyzer + +The analyzer processes collected data, visualizing metrics like response times, error rates, and transaction success rates. It helps teams identify patterns, anomalies, and potential issues. For most use cases test results won’t require a complex analysis since pass-fail standards should be clearly defined in the scripting language. There are some edge cases like [soft assertions in Playwright](https://playwright.dev/docs/test-assertions#soft-assertions), but still a single check should be quickly to categorize as passing or failing. + +### Alerting System + +An alerting system notifies teams when performance thresholds are breached. Alerts can be configured for various conditions, such as increased latency, failed transactions, or regional performance degradation. + +### Integration with Other Tools + +Synthetic transaction monitoring tools often integrate with observability platforms, incident management systems, and CI/CD pipelines, enabling streamlined workflows and comprehensive insights. At Checkly, the results of synthetics checks can integrate with your OpenTelemetry data via [Checkly Traces](https://www.checklyhq.com/docs/traces-open-telemetry/how-it-works/#understand-checkly-traces-and-opentelemetry---checkly-docs). + +## Synthetic Transaction Monitoring Use Cases with Examples + +1. **E-Commerce Checkout Process** +A retailer uses synthetic transaction monitoring to simulate a user adding items to a cart, entering payment details, and completing a purchase. This ensures the checkout process remains functional, even as the underlying datastores and visual polish on the e-commrce experience change. +2. **Banking Application Login** +A financial institution monitors its login functionality by simulating user logins with synthetic scripts. This helps identify authentication delays or errors before customers are affected. +3. **API Performance Monitoring**: +A SaaS company tests its public APIs by simulating client requests. This ensures consistent response times and detects any unexpected behaviors. +4. **Geographic-specific Testing**: +A video streaming service tests its platform from multiple regions to ensure video playback performance meets user expectations globally. +5. **Visual Regression Monitoring** + + A System for monitoring every pixel of a frontend experience. Generally visual regression monitoring is both more brittle and less user-centric than some of the examples mentioned above. After all, your users aren’t logging in to your site expecting certain typefaces or exact color shades. These things do matter, however, so [with Playwright you can monitor for visual regressions](https://www.checklyhq.com/docs/browser-checks/visual-regression-snapshot-testing/#visual-regression--snapshot-testing---checkly-docs) as an advanced test type. + + +Synthetic transaction monitoring is an essential tool for organizations aiming to ensure optimal performance and reliability of their digital services. By proactively addressing issues and gaining insights into user workflows, businesses can deliver superior experiences to their customers. + +## Benefits of Synthetic Transaction Monitoring + +### Early Detection of Issues + +Years ago I was the on-call engineer during a holiday weekend at an ecommerce service. I kept an eye on system performance and I saw good things: latency was low and recorded errors were way down. Unfortunately, it turned out that network config had locked out thousands of users. Our performance was great because most people couldn’t reach the site. While internal measurements of system performance are useful tools for measuring internal efficency, they aren’t great at measuring your users’ experience. The end result is a system where many failure states will only be detected by your users’ reports. + +Synthetic transaction monitoring allows teams to identify problems before users experience them. By simulating user interactions, businesses can proactively detect performance degradation, outages, or errors in critical workflows. This minimizes downtime and helps maintain service reliability. After all, isn’t it better to be able answer every user reported issue with ‘we’re already working on it?’ + +### Proactive Approach to Monitoring + +Rather than waiting for user-reported issues, synthetic monitoring enables a proactive approach by constantly testing systems in real-time. We can also use synthetic transaction monitoring to find trending issues: if the load times for certain actions is slowly increasing, we can identify that problem with synthetics before it causes outright errors for users. This ensures that potential issues are addressed promptly, leading to improved user satisfaction and trust in the platform. + +### Performance Monitoring + +Synthetic monitoring tracks key performance metrics like response time, latency, and availability. By generating consistent and repeatable test scenarios, it provides actionable insights into the health of applications and services under varying conditions. Performance like response time shouldn’t be used as hard limits for passing a test, but these metrics are still generated on synthetics checks, and you can use [Checkly’s ‘degraded state’](https://www.checklyhq.com/docs/browser-checks/degraded-state/) to find user paths that are repsonding slowly without needing to alert teams as if there were an outage. + +### Finding Third-party Changes + +While it’s nice to imagine our service or website as self-contained, albeit with multiple internal microservices, the modern web app has more dependencies than ever, including dependencies on third party services all the way to the front end. A missing npm package or a service outage from Intercom can all cause our user experience to shift in unpredictable ways. Detecting any problems with third-party services before your users do is one of the many benefits of synthetic transaction monitoring. Some useful checks to add: + +- End-to-end testing of key user transaction to make sure, for example, that users can find items, add them to a cart, and check out on your e-commerce service. +- [Visual regression tests](https://www.checklyhq.com/blog/visual-regression-testing-with-playwright/) to make sure third-party javascript add ons haven’t caused problems for your front end experience. +- Synthetics checks pointed directly at the third party services. If responses from the your payment provider are often unreliable, set up a monitor (and an alert) to know as soon as it has a chance to affect your users. + +### Compliance + +Synthetic transaction monitoring can aid in compliance with service-level agreements (SLAs) by consistently measuring uptime and performance against defined benchmarks. This ensures businesses meet contractual obligations and regulatory requirements. You can also use synthetic transaction monitoring to make sure you’re compliant with accesibility requirements, [in Playwright you can use the Axe testing engine to automatically check your pages for accessibility issues](https://www.checklyhq.com/blog/integrating-accessibility-checks-in-playwright-tes/). + +## Challenges of Synthetic Transaction Monitoring + +### Need for Validation + +In the early days of my career I was working on a telephone automation system; I was nearly done when I asked a sales person if our business often received inbound calls. It turned out that since our userbase was older Americans, the majority of our sales were actually made by site visitors calling us on the phone. If I had been designing synthetic monitoring for my service, my lack of knowledge of real user behavior would have made effective monitoring difficult. + +Although synthetic monitoring provides simulated data, it cannot replace actual user feedback. Synthetic tests assume predefined workflows and may miss issues that occur in real-world usage scenarios. Either by directly observing users interacting with your service, or detailed analytics, you need to see how users *actually* use your service. + +### Core Problem Analysis + +When [retries are set up correctly](https://www.checklyhq.com/docs/alerting-and-retries/), we should only get a synthetic transaction monitoring alert when there’s a real problem that users will notice. But detecting an issue through synthetic monitoring is only the first step. Pinpointing the root cause of problems requires additional diagnostic tools and processes. Synthetic monitoring should be run as a separate service from all of your internal services and tools, and as such will lack insight into how your back end is handling requests. Once you’ve detected a problem, backend logs, traces, and other information from something like an [OpenTelemetry monitoring system](https://www.checklyhq.com/learn/opentelemetry/getting-started-with-observability/), will be needed to find a root cause. + +### Tool Integration Needs + +Integrating synthetic monitoring tools with existing systems can pose challenges, especially if the ecosystem includes multiple platforms and technologies. Seamless integration is crucial for accurate monitoring and efficient workflows, but it often requires customization and ongoing maintenance. As ‘table stakes’ your synthetic transaction monitoring tool should make its results available on a shared dashboqard, and route alerts intelligently either natively or through integration with tools like [Rootly](https://www.checklyhq.com/docs/integrations/rootly/) and [Pagerduty](https://www.checklyhq.com/docs/integrations/pagerduty/#send-alerts-to-pagerduty-with-checkly---checkly-docs). Ideally your synthetic monitoring could automatically connect back end tracing that was started by a synthetics request. With [Checkly Traces](https://www.checklyhq.com/docs/traces-open-telemetry/how-it-works/#understand-checkly-traces-and-opentelemetry---checkly-docs), you can do just that: integrate back end traces with synthetic transaction monitoring, adding insight into the root cause of errors. + +By understanding these benefits and challenges, organizations can better implement synthetic transaction monitoring as part of a balanced observability strategy. + +## Best Practices for Synthetic Transaction Monitoring + +### Defining Clear Objectives + +Synthetic transaction monitoring begins with a clear understanding of what your users expect. You also need to define the retry and cadence expectations, starting with your SLA with your users. + +### Crafting Realistic Test Scenarios + +Design test scenarios that mimic real-world user interactions as closely as possible. This includes simulating different devices, browsers, and network conditions to ensure comprehensive coverage. In playwright, use the [user agent](https://www.checklyhq.com/learn/playwright/challenging-flows/#bot-detection) and [viewport](https://www.checklyhq.com/learn/playwright/taking-screenshots/#generating-and-saving-screenshots) settings to simulate different scenarios. + +### Prioritizing Business-Critical Transactions + +Focus monitoring efforts on transactions that directly impact your business, such as login flows, checkout processes, or API integrations. Prioritizing these ensures you catch issues that could significantly affect your customers and revenue. This matters not just for the infrastructure costs of running synthetic transaction monitoring, but alert fatigue for your operations or SRE teams. + +### Balancing Test Frequency and Resources + +Set a monitoring cadence that balances timely issue detection with efficient use of resources. High-frequency tests may be necessary for critical applications, but over-monitoring can lead to unnecessary costs. On Checkly, make sure your geographic locations make sense for your userbase. + +### Setting Actionable Alert Thresholds + +Define alert thresholds that reflect acceptable performance levels for your business. Avoid overly sensitive thresholds that generate false positives, and ensure alerts are actionable by including relevant context for troubleshooting. Avoid hard waits and other rigid standards that result in brittle testing. + +### Regular Review and Update of Test Scripts and Golden Files + +Test scripts should evolve alongside your application. Regularly review and update them to account for new features, UI changes, or updates to external dependencies. This helps maintain the relevance and accuracy of your monitoring. You’ll also maintain a set of ‘golden files’ that define ideal responses, either as stored responses (for example a saved JSON response from an API) or screenshots where everything is loading correctly. With Checkly, you can [update these golden files with a single command from the Checkly CLI](https://www.checklyhq.com/docs/cli/). + +### Leveraging Integration with Observability Tools + +Integrate synthetic monitoring with broader observability platforms to enhance troubleshooting and root cause analysis. Connections to logging, tracing, and metrics tools can help paint a complete picture of your application’s health. + +### Using Synthetic Monitoring for SLA Validation + +Synthetic monitoring is a powerful tool for validating Service Level Agreements (SLAs). By continuously testing key metrics, you can provide objective evidence of SLA adherence to stakeholders. + +## Synthetic Transaction Monitoring Tools + +Numerous tools are available to implement synthetic transaction monitoring effectively. Solutions like Catchpoint and Dynatrace offer advanced features such as global test locations, detailed reporting, and integration with other observability tools. Open-source options, such as those leveraging CNCF projects, can provide flexibility and customization for Kubernetes-based environments. + +### Checkly’s Role in Synthetic Monitoring + +[Checkly](https://www.checklyhq.com/product/synthetic-monitoring/) is a purpose-built platform for synthetic monitoring that simplifies the creation and management of test scripts. With features like a developer-friendly CLI, seamless integration with CI/CD pipelines, and out-of-the-box support for the multi-language Playwright automation framework. Checkly enables teams to monitor critical user journeys effectively. Its dashboards and alerting systems ensure you’re always aware of issues before they impact customers. + +## How Checkly Can Help with Synthetic Transaction Monitoring + +Checkly provides a robust set of tools to implement synthetic monitoring seamlessly. Its scripting capabilities allow you to craft realistic user scenarios while its cloud-based infrastructure ensures consistent monitoring from multiple geographies. Integration with observability platforms and APIs enables you to incorporate Checkly into your existing workflows, making it an efficient choice for synthetic monitoring needs. + +## Conclusion + +Synthetic transaction monitoring is a critical practice for ensuring application reliability and user satisfaction. By following best practices and leveraging tools like Checkly, organizations can gain actionable insights and proactively address performance issues. Combining synthetic monitoring with real user data and broader observability efforts enhances its value, enabling teams to maintain high service quality and meet business objectives. \ No newline at end of file