- Chaperone
- Features
- Main Concepts
- Example Usage
- Tips
- Template Checks
- Interval and Schedule Configuration
- Script writing process and debugging
- Contributions
Chaperone is a monitoring application intended to be deployed as a docker container.
- Execution of arbitrary checks configured with cron or interval (e.g. 30s, 5m, 8h) syntax.
- Execute commands directly in a check, or call out to your own scripts.
- Simple configuration using TOML files.
- Configurable output destinations for check results, e.g. stdout, InfluxDB, and Slack.
Each check is a TOML file that looks like this:
name = "basic example"
description = "basic example showing how to run a command/script"
command = "ls | head -n 1" # the command exit code is used to determine status. 0 = OK, anything else = FAIL
interval = "1m" # the command will run every minute. alternatively, you can configure a cron schedule.
timeout = "5s" # the maximum amount of time the check can run before being canceled/failed
tags = {env="dev"} # optional - tags let you categorize the output in tools like InfluxDB/Grafana
debug = true # optional - defaults to false. If set to true, this logs the commands as they're run.
The command just needs to be executable, so the sky's the limit. Add any apps or scripts you want to the container and call them. Like bash, curl, and jq? You're already covered. Prefer Python or Kotlin Script? Just add them to your docker container and call them instead.
You might not even need to call a script file. For example, if you want a basic HTTP health check, try this:
command = '''[[ $(curl -sL -w '%{http_code}' -o /dev/null 'https://httpbin.org/status/200') == "200" ]]'''
For those unfamiliar with curl and bash, this makes an HTTP call and validates that the response code was 200.
For those new to TOML, the triple single quotes (''') delimit a multi-line literal string, which lets us use single and double quotes in
the command without having to escape them. This is why we use TOML and not JSON or YAML.
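Once a one-liner like this grows, it can be easier to move it into a small script and call that from the check's command. The following is a minimal sketch, assuming hypothetical function names (fetch_status, status_matches) that are not part of Chaperone. It relies only on the exit-code convention described above: 0 means OK, anything else means FAIL.

```shell
#!/usr/bin/env bash
# Hypothetical helper functions for an HTTP health check; the names are
# illustrative and not part of Chaperone.

# Prints only the HTTP status code for the given URL, following redirects.
fetch_status() {
  curl -sL -w '%{http_code}' -o /dev/null "$1"
}

# Returns 0 (OK) when the actual status code equals the expected one,
# and a non-zero exit code (FAIL) otherwise.
status_matches() {
  [ "$1" = "$2" ]
}

# Example use from a check's command:
#   status_matches "$(fetch_status 'https://httpbin.org/status/200')" 200
```

Keeping the comparison separate from the network call makes it possible to test the pass/fail logic locally without hitting a real endpoint.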
You place all your checks in a directory, and when the app starts up, it runs each check on its schedule or interval.
Outputs are where the results of your checks go. A check's result consists of its status (OK or FAIL) and any output from the command. You can pick and choose from these output destinations in the global config file, e.g.:
[outputs.log]
destination="stdout" # options are stdout or a file path. defaults to stdout
format="logstash" # options are pretty or logstash. defaults to pretty
[outputs.influxdb]
db="metrics"
defaultTags={app="myproject-chaperone"} # optional tags applied to all your checks
uri="http://localhost:8086"
[outputs.slack]
webhook="https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"
onlyWriteFailures=true # optional - defaults to false. useful when you only want message for check failures
[outputs.command]
workingDirectory="/usr/local/bin" # required - the working directory to run the command from
command="./my-custom-command.sh" # required - command
onlyWriteFailures=true # optional - defaults to false.
See the docs on output writers for more details.
The base image uses Ubuntu. We publish the latest tag with each release, but don't recommend you use it.
Please use a specific version and stay updated with the latest stable release. This gives you
version control, and eliminates instability and debugging issues that could arise from
unknown changes. The latest tag should only be used for testing.
See the example usage directory for an example project, with some sample checks and output configuration. Take that example, then replace its checks with whatever you want to do.
- Each check's command is executed from the directory of where that check is defined, so you can use relative paths in the command definition. This makes it easier to test your command locally in a terminal, then copy/paste the command into a check config when it's ready.
- The checks.d directory can contain subdirectories of files. Especially for more complicated checks that might have supporting files used by the check, a suggested organizational starting point is to have a directory per check, like this:
  checks.d/
    check-a/
      scripta1.sh
      scripta2
    check-b/
      scriptb.sh
    etc...
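To try a check's command locally the way the first tip describes, you can run it from the check's own directory. The sketch below only mimics that documented behavior with a subshell and is not Chaperone's implementation; run_from_check_dir is a hypothetical helper name.

```shell
# Runs a command string from inside a check's directory so that relative
# paths resolve the same way they would for the check.
# Hypothetical helper for local testing, not part of Chaperone.
run_from_check_dir() {
  # The subshell keeps the directory change from leaking into the caller.
  ( cd "$1" && sh -c "$2" )
}

# Example: run_from_check_dir checks.d/check-a './scripta1.sh'
```

Once the command behaves as expected here, it can be pasted into the check's TOML file unchanged.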
In simple mode, you call a single script, and it returns a single result. That's fine for simple checks, but to create multiple related checks you'll want to read up on template checks. See Template Checks.
You configure each check to run on an interval or a schedule. You must choose one, and you can't choose both.
If you set an interval in your check configuration, the check runs immediately when Chaperone starts, and then
runs every XT after that, where X is the interval value and T is the unit of time.
The unit can be one of s (seconds), m (minutes), h (hours), or d (days).
For example, interval = "10m" will run the check every 10 minutes.
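To make the XT syntax concrete, here is a small sketch that converts an interval string into seconds. It is an illustration of the syntax only, not Chaperone's own parser, and interval_to_seconds is a hypothetical name.

```shell
# Converts an interval such as 30s, 5m, 8h, or 1d into seconds.
# Illustrative only: this is not Chaperone's parser.
interval_to_seconds() {
  value="${1%[smhd]}"      # strip a trailing unit letter, if any
  unit="${1#"$value"}"     # whatever remains after the number is the unit
  case "$value" in
    ''|*[!0-9]*) echo "invalid interval: $1" >&2; return 1 ;;
  esac
  case "$unit" in
    s) echo "$value" ;;
    m) echo $((value * 60)) ;;
    h) echo $((value * 3600)) ;;
    d) echo $((value * 86400)) ;;
    *) echo "invalid interval: $1" >&2; return 1 ;;
  esac
}

# Example: interval_to_seconds 10m
```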
If you configure your check with a schedule, the UNIX crontab convention is supported.
The syntax is a string containing 5 fields, which represent, in this order:
- minute
- hour
- day of month
- month
- day of week
For example, to run every day, five minutes after midnight:
schedule = "5 0 * * *"
Limitation: special string values like "@hourly" or "@daily" are not supported. Everything else should be supported, though,
including wildcards (*), ranges (1-5), lists (2,4), and step values (*/2).
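As a quick sanity check before putting a schedule into a check's TOML, the sketch below verifies the two rules stated above: the string has exactly 5 fields, and it is not an "@" shortcut. It does not validate the values inside each field, and is_five_field_schedule is a hypothetical helper, not something Chaperone provides.

```shell
# Returns 0 when the argument looks like a 5-field crontab expression and
# non-zero otherwise. Field contents are not validated.
# The function body runs in a subshell so "set -f" does not affect the caller.
is_five_field_schedule() (
  # Reject special strings such as @hourly or @daily.
  case "$1" in
    @*) exit 1 ;;
  esac
  set -f        # disable globbing so "*" is kept as a literal field
  set -- $1     # split the schedule on whitespace into positional parameters
  [ "$#" -eq 5 ]
)

# Example: is_five_field_schedule "5 0 * * *" && echo "looks valid"
```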
As mentioned earlier, get your scripts running first via command line or unit test, and then configure your check TOML. If a check isn't behaving as expected, you have a couple of debugging options:
- In an individual check's TOML config, you can set debug = true. Doing this sets the bash -x flag when calling the script, causing it to output variable values as the script is being evaluated.
- You can also set the CHAPERONE_LOG_LEVEL environment variable to a value of DEBUG, which will output more information to the log destination as each check is called.
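To see what kind of output the debug option produces, you can run a command with bash's -x (xtrace) option yourself. This sketch assumes bash is on the PATH; trace_command is a hypothetical helper, and how Chaperone invokes bash internally is not shown here.

```shell
# Runs a command string with bash's xtrace option (-x) enabled, which prints
# each command to stderr, prefixed with "+", after variables are expanded.
trace_command() {
  bash -x -c "$1"
}

# Example: trace_command 'name=world; echo "hello $name"'
```

The trace lines go to stderr, so redirect with 2>&1 if you want to capture them together with the command's normal output.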
The app is written in Kotlin and uses the standard Kotlin code format. Questions, comments, and pull requests are welcome. See Development.md for some docs on developing locally.