We use a combination of methods to monitor the health of our system.

Runbook

Alerting

Alerts appear in our Slack channel.

Health and performance

The health and performance of our system are monitored by a number of dashboards:

- Pa11y
- Speedtracker
- Datastudio
- Application bundle size
  - Catalogue

Troubleshooting

We follow a roll-forward approach: we try to fix and improve the site rather than roll back.

However, if core functionality is at stake, rolling back is the quickest way to relieve pressure. It gives us time to fix the problem and then roll forward.

Rolling back

We can roll back to any Docker image we have built. The images are publicly hosted on Docker Hub.

To roll back, find the tag of the Docker image you need*, then run ./deploy/terraform_deploy_service.sh <SERVICE_NAME> <DOCKER_CONTAINER_TAG>.

Once rolled back, we can start diagnosing the problem.

* The process here still needs work, as we have no way of indicating the last Docker build we considered healthy.
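
As a guard against typos when rolling back, the two arguments can be checked before they reach the deploy script. The sketch below is only an illustration, not part of the repository: rollbackCommand is a hypothetical helper, and rejecting latest is an assumption (it would presumably redeploy the newest build rather than a known earlier one). The tag pattern follows Docker's documented rules for tags.

```typescript
// Hypothetical helper: builds the argument list for the deploy script named in
// this runbook. Nothing here exists in the repository.
const DEPLOY_SCRIPT = "./deploy/terraform_deploy_service.sh";

// Docker tags are up to 128 characters drawn from letters, digits, "_", "."
// and "-", and may not start with "." or "-".
const DOCKER_TAG = /^[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}$/;

function rollbackCommand(serviceName: string, containerTag: string): string[] {
  if (serviceName.trim() === "") {
    throw new Error("A service name is required");
  }
  if (!DOCKER_TAG.test(containerTag)) {
    throw new Error(`Not a valid Docker tag: "${containerTag}"`);
  }
  // Assumption: "latest" points at the newest build, so it is not a rollback target.
  if (containerTag === "latest") {
    throw new Error('Pick an explicit earlier tag instead of "latest"');
  }
  return [DEPLOY_SCRIPT, serviceName, containerTag];
}
```

For example, rollbackCommand("content", "a1b2c3d") returns the script path followed by the two arguments, ready to hand to a process runner. Both values are made up for illustration.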

Diagnosing the problem

If the error was reported by CloudWatch, the logs you need are stored in S3. Go into Athena and run the "ALB Logging - create" saved query, followed by the "5xx errors" saved query. You should be able to spot the location of the error from there**.

** These saved queries should be defined in Terraform, but currently aren't.
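
The saved queries themselves are not reproduced in this runbook. As a rough illustration of what a 5xx search over ALB access logs does, the sketch below filters raw log lines by the load balancer status code. The field positions follow the AWS ALB access log format. The function names and the sample lines are made up, and the sample lines are shortened (real entries carry more trailing fields).

```typescript
// Field positions in an ALB access log entry (0-based), after splitting on
// spaces while keeping quoted strings together.
const TIME = 1;
const ELB_STATUS_CODE = 8;
const REQUEST = 12;

interface ServerError {
  time: string;
  status: number;
  method: string;
  url: string;
}

function splitFields(line: string): string[] {
  return line.match(/"[^"]*"|\S+/g) ?? [];
}

// Returns one record per log line whose load balancer status code is 5xx.
function findServerErrors(lines: string[]): ServerError[] {
  const errors: ServerError[] = [];
  for (const line of lines) {
    const fields = splitFields(line);
    if (fields.length <= REQUEST) continue; // skip malformed or truncated entries
    const status = Number(fields[ELB_STATUS_CODE]);
    if (!(status >= 500 && status <= 599)) continue; // also skips "-" (NaN)
    // The request field looks like "GET https://host:443/path HTTP/1.1".
    const [method, url] = fields[REQUEST].replace(/"/g, "").split(" ");
    errors.push({ time: fields[TIME], status, method, url });
  }
  return errors;
}

// Made-up, shortened sample entries: one 200 and one 502.
const sampleLines = [
  'https 2024-01-01T12:00:00.000000Z app/example-alb/abc123 203.0.113.10:50000 10.0.0.5:80 0.001 0.020 0.000 200 200 120 3000 "GET https://example.org:443/works HTTP/1.1" "curl/8.0"',
  'https 2024-01-01T12:00:01.000000Z app/example-alb/abc123 203.0.113.10:50001 10.0.0.5:80 0.001 0.250 0.000 502 502 120 300 "GET https://example.org:443/works/abc HTTP/1.1" "curl/8.0"',
];

const serverErrors = findServerErrors(sampleLines);
```

Here serverErrors holds a single record, for the 502 on /works/abc. Grouping the records by URL is usually the quickest way to see which page or route is failing.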

Testing

Updown.io

Updown is a tool that continuously monitors the availability and performance of specified pages. You may access our dashboard here. When a page goes down, an alert is sent to either #wc-platform-alerts or #wc-platform (or both), and an email is sent to digital@.

Developers can find more information, including how to add a page, in the project's README file. A page should be added whenever a new page type is released.

Cardigan/Storybook

Our instance of Storybook is called Cardigan; the two names tend to be used interchangeably in the team. It serves two purposes: as a publicly accessible UI library and as a testing tool.

Paired with Chromatic (a tool built by the Storybook maintainers), we get a build of Cardigan for every branch that has an open PR. On each commit, Chromatic checks for visual differences between the old and new versions of each component, in both desktop and mobile formats. Any changes found need to be approved by the developer, who is alerted by email.

Each build also provides a shareable link that can be sent to the designer, or anyone else who needs to approve changes before a merge.

Pa11y

Pa11y is a package we use for accessibility (a11y) testing. On every deployment to production, our dashboard is updated with a report of Pa11y's accessibility tests against specified pages (see our README to learn how to add pages). A developer who adds a page type should ensure it is added to Pa11y and that the report comes back clean. At the time of writing, any new issues are raised by the delivery manager on a daily basis.

End-to-end tests

We use Playwright for our end-to-end (e2e) tests. They run on deployments to staging and production, and as an optional step in each PR. Three sets of tests run: desktop tests, mobile tests and Check URL.

Desktop and mobile tests

The desktop and mobile tests share the same test cases, which are written to account for both formats. You may find them here.
These tests should cover real-world scenarios from beginning to end, ideally based on use cases identified by a user researcher.

Check URL

The code for Check URL (or url-checker) can be found here. It takes a list of pages and runs several tests against each of them, checking for things such as JavaScript errors and request failures. Any new page type should be added to the list to ensure we're monitoring it.
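
The kind of check described above can be sketched as follows. This is not the url-checker's actual code. It relies only on the pageerror and requestfailed page events that Playwright emits; watchPage, PageLike, FailedRequest and Problem are hypothetical names, and the small interfaces stand in for Playwright's own types so the sketch has no dependencies.

```typescript
// Structural types covering only the two Playwright page events used below.
interface FailedRequest {
  url(): string;
}

interface PageLike {
  on(event: "pageerror", handler: (error: Error) => void): unknown;
  on(event: "requestfailed", handler: (request: FailedRequest) => void): unknown;
}

interface Problem {
  pageUrl: string;
  kind: "javascript-error" | "request-failure";
  detail: string;
}

// Register the listeners before navigating, so that nothing raised while the
// page loads is missed. Every problem seen is appended to `problems`.
function watchPage(page: PageLike, pageUrl: string, problems: Problem[]): void {
  page.on("pageerror", (error: Error) => {
    problems.push({ pageUrl, kind: "javascript-error", detail: error.message });
  });
  page.on("requestfailed", (request: FailedRequest) => {
    problems.push({ pageUrl, kind: "request-failure", detail: request.url() });
  });
}
```

In a Playwright test, one would call watchPage(page, url, problems) before await page.goto(url) for each page in the list, then fail the check if problems is not empty.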

Unit/Integration testing

We use Jest with React Testing Library to run our unit and integration tests. The tests live in various places in the code: some sit within the component folder, others within the test folder of the common, content and identity repositories. Run them all with yarn test:all:unit at the root of the project.
