-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
133/prometheus #173
133/prometheus #173
Changes from 8 commits
069c742
417706f
f997a1b
af01aad
6eabfcb
2d0d331
4d93ebd
af3e18c
92287b8
db96856
19664fd
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,7 +14,6 @@ the relayer works, but the Quick Start probably gives a better intro. | |
- RPC addresses of 2 full nodes on compatible, IBC-enabled chains | ||
- See [Chain Requirements below](#Chain-Requirements) for details of what chains are supported | ||
|
||
|
||
## Installation | ||
|
||
### NPM | ||
|
@@ -66,7 +65,7 @@ Reads the configuration and starts relaying packets. | |
- pulls default `registry.yaml` to relayer's home | ||
- funds addresses on both sides so relayer can pay the fee while relaying packets | ||
|
||
> **NOTE:** Test blockchains `relayer_test_1` and `relayer_test_2` are running in the public. You do not need to start any blockchain locally to complete the quick start guide. | ||
> **NOTE:** Test blockchains `relayer_test_1` and `relayer_test_2` are running in the public. You do not need to start any blockchain locally to complete the quick start guide. | ||
|
||
> **NOTE:** Run `ibc-setup balances` to see the amount of tokens on each address. | ||
|
||
|
@@ -88,15 +87,16 @@ Reads the configuration and starts relaying packets. | |
### Send tokens between chains | ||
|
||
1. Make sure `wasmd` binary is installed on your system | ||
|
||
- you must be running Linux or OSX on amd64 (not arm64/Mac M1) | ||
- [install Go 1.15+](https://golang.org/doc/install) and ensure that `$PATH` includes Go binaries (you may need to restart your terminal session) | ||
- clone and install `wasmd`: | ||
```sh | ||
git clone https://github.com/CosmWasm/wasmd.git | ||
cd wasmd | ||
git checkout v0.15.1 | ||
make install | ||
``` | ||
- clone and install `wasmd`: | ||
```sh | ||
git clone https://github.com/CosmWasm/wasmd.git | ||
cd wasmd | ||
git checkout v0.15.1 | ||
make install | ||
``` | ||
|
||
2. Create a new account and fund it | ||
|
||
|
@@ -125,6 +125,7 @@ Reads the configuration and starts relaying packets. | |
## Configuration overview | ||
|
||
The relayer configuration is stored under relayer's home directory. By default, it's located at `$HOME/.ibc-setup`, however, can be customized with `home` option, e.g.: | ||
|
||
```sh | ||
# initialize the configuration at /home/user/relayer_custom_home | ||
ibc-setup init --home /home/user/relayer_custom_home | ||
|
@@ -136,19 +137,45 @@ ibc-relayer start --home /home/user/relayer_custom_home | |
There are 3 files that live in the relayer's home. | ||
|
||
- **registry.yaml** (required) | ||
Contains a list of available chains with corresponding information. The chains from the registry can be referenced by `ibc-setup` binary or within the `app.yaml` file. [View an example of registry.yaml file.](demo/registry.yaml) | ||
|
||
Contains a list of available chains with corresponding information. The chains from the registry can be referenced by `ibc-setup` binary or within the `app.yaml` file. [View an example of registry.yaml file.](demo/registry.yaml) | ||
|
||
- **app.yaml** (optional) | ||
|
||
Holds the relayer-specific options such as source or destination chains. These options can be overridden with CLI flags or environment variables. | ||
|
||
Holds the relayer-specific options such as source or destination chains. These options can be overridden with CLI flags or environment variables. | ||
|
||
- **last-queried-heights.json** (optional) | ||
|
||
Stores last queried heights for better performance on relayer startup. It's constantly overwritten with new heights when relayer is running. Simply delete this file to scan the events since forever. | ||
|
||
[Learn more about configuration.](spec/config.md) | ||
|
||
## Monitoring | ||
|
||
The relayer collects various metrics that a [Prometheus](https://prometheus.io/docs/introduction/overview/) instance can consume. | ||
|
||
To enable metrics collection, pass the `--enable-metrics` flag when starting the relayer: | ||
|
||
```sh | ||
ibc-relayer start --enable-metrics | ||
``` | ||
|
||
> **NOTE:** Metrics can also be enabled via an environment variable `RELAYER_ENABLE_METRICS=true`, or with an `enableMetrics: true` entry in the `app.yaml` file, as explained in the [config specification](./spec/config.md#configuration). | ||
|
||
The `GET /metrics` endpoint will be exposed by default on port `26660`, which you can override with `--metrics-port` flag, `RELAYER_METRICS_PORT` env variable, or `metricsPort` entry in `app.yaml`. | ||
|
||
### Local setup | ||
|
||
1. Start the relayer with metrics enabled | ||
2. Spin up the Prometheus instance: | ||
```sh | ||
docker run -it -v $(pwd):/prometheus -p9090:9090 prom/prometheus --config.file=prometheus.yaml | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Very cool, a default prometheus config committed in repo. Better than any docs. |
||
``` | ||
> **NOTE:** Ensure that `the --config.file=<path>` flag points at the existing configuration file. You can find an example here: [prometheus.yaml](prometheus.yaml). | ||
3. Open the Prometheus dashboard in a browser at [http://localhost:9090](http://localhost:9090) | ||
|
||
## Development | ||
|
||
[Refer to the development page.](DEVELOPMENT.md) | ||
|
||
## Chain Requirements | ||
|
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
@@ -0,0 +1,10 @@ | ||||
global: | ||||
scrape_interval: 15s | ||||
evaluation_interval: 15s | ||||
|
||||
scrape_configs: | ||||
- job_name: confio_relayer | ||||
|
||||
scrape_interval: 5s | ||||
static_configs: | ||||
- targets: ['host.docker.internal:26660'] | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is an example configuration file. Is this the right place to store it? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good to commit it to this repo, so it is maintained. You could put it in a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I realized that
One way to "fix" this would be to use parametrized link with version: `https://raw.githubusercontent.com/confio/ts-relayer/tree/v${packageJson.version}/demo/registry.yaml` But it fixes the future versions only. Everything till 👉 For now, I moved There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. And let's keep it in demo for now (marked deprecated), and make the main copy in another directory (final location) I am not sure about fixing to a version. We can think about a better location, but nice to be able to roll out registry without requiring upgrades |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,6 +16,7 @@ import { resolveHomeOption } from '../../utils/options/shared/resolve-home-optio | |
import { resolveKeyFileOption } from '../../utils/options/shared/resolve-key-file-option'; | ||
import { resolveMnemonicOption } from '../../utils/options/shared/resolve-mnemonic-option'; | ||
import { signingClient } from '../../utils/signing-client'; | ||
import { Metrics, setupPrometheus } from '../setup-prometheus'; | ||
|
||
type ResolveHeightsParams = { | ||
scanFromSrc: number | null; | ||
|
@@ -67,53 +68,57 @@ function resolveHeights( | |
} | ||
|
||
type LoopFlags = { | ||
poll?: string; | ||
maxAgeSrc?: string; | ||
maxAgeDest?: string; | ||
once: boolean; | ||
readonly poll?: string; | ||
readonly maxAgeSrc?: string; | ||
readonly maxAgeDest?: string; | ||
readonly once: boolean; | ||
}; | ||
type LoopOptions = { | ||
// how many seconds we sleep between relaying batches | ||
poll: number; | ||
readonly poll: number; | ||
// number of seconds old the client on chain A can be | ||
maxAgeSrc: number; | ||
readonly maxAgeSrc: number; | ||
// number of seconds old the client on chain B can be | ||
maxAgeDest: number; | ||
readonly maxAgeDest: number; | ||
// if set to 'true' quit after one pass | ||
once: boolean; | ||
readonly once: boolean; | ||
}; | ||
|
||
type Flags = { | ||
interactive: boolean; | ||
home?: string; | ||
src?: string; | ||
dest?: string; | ||
keyFile?: string; | ||
mnemonic?: string; | ||
srcConnection?: string; | ||
destConnection?: string; | ||
scanFromSrc?: string; | ||
scanFromDest?: string; | ||
readonly interactive: boolean; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will was here 😄 |
||
readonly home?: string; | ||
readonly src?: string; | ||
readonly dest?: string; | ||
readonly keyFile?: string; | ||
readonly mnemonic?: string; | ||
readonly srcConnection?: string; | ||
readonly destConnection?: string; | ||
readonly scanFromSrc?: string; | ||
readonly scanFromDest?: string; | ||
readonly enableMetrics: boolean; | ||
readonly metricsPort?: string; | ||
} & LoggerFlags & | ||
LoopFlags; | ||
|
||
type Options = { | ||
home: string; | ||
src: string; | ||
dest: string; | ||
mnemonic: string; | ||
srcConnection: string; | ||
destConnection: string; | ||
heights: RelayedHeights | null; | ||
readonly home: string; | ||
readonly src: string; | ||
readonly dest: string; | ||
readonly mnemonic: string; | ||
readonly srcConnection: string; | ||
readonly destConnection: string; | ||
readonly heights: RelayedHeights | null; | ||
readonly enableMetrics: boolean; | ||
readonly metricsPort: number; | ||
} & LoopOptions; | ||
|
||
// some defaults for looping | ||
export const defaults = { | ||
// check once per minute | ||
poll: 60, | ||
// once per day: 86400s | ||
maxAgeSrc: 86400, | ||
maxAgeDest: 86400, | ||
metricsPort: 26660, | ||
}; | ||
|
||
export async function start(flags: Flags, logger: Logger) { | ||
|
@@ -171,6 +176,20 @@ export async function start(flags: Flags, logger: Logger) { | |
flags.scanFromDest, | ||
process.env.RELAYER_SCAN_FROM_DEST | ||
); | ||
const enableMetrics = | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why don't you use Ah, there is no support for boolean, only string and integer? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Exactly! |
||
flags.enableMetrics || | ||
Boolean(process.env.RELAYER_ENABLE_METRICS) || | ||
app?.enableMetrics || | ||
false; | ||
const metricsPort = resolveOption('metricsPort', { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Very nice. |
||
integer: true, | ||
required: true, | ||
})( | ||
flags.metricsPort, | ||
process.env.RELAYER_METRICS_PORT, | ||
app?.metricsPort, | ||
defaults.metricsPort | ||
); | ||
|
||
const heights = resolveHeights({ scanFromSrc, scanFromDest, home }, logger); | ||
|
||
|
@@ -189,12 +208,20 @@ export async function start(flags: Flags, logger: Logger) { | |
maxAgeDest, | ||
once, | ||
heights, | ||
enableMetrics, | ||
metricsPort, | ||
}; | ||
|
||
await run(options, logger); | ||
} | ||
|
||
async function run(options: Options, logger: Logger) { | ||
const metrics = setupPrometheus({ | ||
enabled: options.enableMetrics, | ||
port: options.metricsPort, | ||
logger, | ||
}); | ||
|
||
const registryFilePath = path.join(options.home, registryFile); | ||
const { chains } = loadAndValidateRegistry(registryFilePath); | ||
const srcChain = chains[options.src]; | ||
|
@@ -224,10 +251,15 @@ async function run(options: Options, logger: Logger) { | |
logger | ||
); | ||
|
||
await relayerLoop(link, options, logger); | ||
await relayerLoop(link, options, logger, metrics); | ||
} | ||
|
||
async function relayerLoop(link: Link, options: Options, logger: Logger) { | ||
async function relayerLoop( | ||
link: Link, | ||
options: Options, | ||
logger: Logger, | ||
metrics: Metrics | ||
) { | ||
let nextRelay = options.heights ?? {}; | ||
const lastQueriedHeightsFilePath = path.join( | ||
options.home, | ||
|
@@ -262,5 +294,7 @@ async function relayerLoop(link: Link, options: Options, logger: Logger) { | |
logger.info(`Sleeping ${options.poll} seconds...`); | ||
await sleep(options.poll * 1000); | ||
logger.info('... waking up and checking for packets!'); | ||
|
||
metrics?.pollCounter?.inc(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. With nullish coalescing operator, we can collect metrics nicely w/o condition blocks (if metrics are disabled). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is neat, but I find it less clear than an |
||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
import http from 'http'; | ||
|
||
import client from 'prom-client'; | ||
|
||
import { Logger } from '../create-logger'; | ||
|
||
let initialized = false; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm, do we need some kind of global state? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's just nice to have to make sure we don't initialize the HTTP server more than once: https://github.com/confio/ts-relayer/pull/173/files#diff-95ad111838746b4a0030c8e37ff66c5de9df11b1a2eab5aa3f59895c9f70016cR28-R32 I thought of implementing a singleton here but since it takes arguments like |
||
|
||
function getMetrics() { | ||
return { | ||
pollCounter: new client.Counter({ | ||
name: 'poll_counter', | ||
help: 'Poll counter', | ||
}), | ||
Comment on lines
+11
to
+14
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is only an example of a custom metric that should be removed. Let's replace this with some real ones. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Very good start! Counter makes sense here for this metric. 👍 Some notes for a future version:
With a common prefix we can find all relayer metrics easily. Otherwise some context is missing as a prometheus instance is likely not exclusive to the relayer. No idea if the relayer is always bidirectional but as this would involve 2 chains the semantics of poll is not fully clear to me. You can use Naming is always hard. You don't need to carry the type inside the name. Maybe "_count" or "_total" is a better fit. I always found this page very helpful: https://prometheus.io/docs/practices/naming/ |
||
}; | ||
} | ||
|
||
export type Metrics = { | ||
pollCounter: client.Counter<string>; | ||
} | null; | ||
export function setupPrometheus({ | ||
enabled, | ||
port, | ||
logger, | ||
}: { | ||
enabled: boolean; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we make these There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Of course! 4d93ebd |
||
port: number; | ||
logger: Logger; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Logger is not yet used. Do you plan to use it in this PR or a direct follow up? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is used here: https://github.com/confio/ts-relayer/pull/173/files#diff-95ad111838746b4a0030c8e37ff66c5de9df11b1a2eab5aa3f59895c9f70016cR57 or do you think it's better to do console log? I would rather keep the logger to make sure the endpoint is mentioned in the logs. |
||
}): Metrics { | ||
if (initialized) { | ||
throw new Error( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This makes sense to find bugs in our code fast. I was going to suggest to only set |
||
`"setupPrometheus" func shouldn't be initialized more than once.` | ||
); | ||
} | ||
initialized = true; | ||
|
||
if (!enabled) { | ||
return null; | ||
} | ||
|
||
client.collectDefaultMetrics({ prefix: 'confio_relayer_' }); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not sure if we want to collect these? Reference: |
||
const server = http.createServer(async (request, response) => { | ||
if (request.method === 'GET' && request.url === '/metrics') { | ||
const metrics = await client.register.metrics(); | ||
response.writeHead(200, { | ||
'Content-Type': client.register.contentType, | ||
}); | ||
response.end(metrics); | ||
|
||
return; | ||
} | ||
|
||
response.writeHead(404); | ||
response.end('404'); | ||
}); | ||
server.listen(port); | ||
logger.info(`Prometheus GET /metrics exposed on port ${port}.`); | ||
|
||
return getMetrics(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very nice list of options to set the port. 🌷
Just a note: port
26660
is used by Tendermint as their default port for Prometheus and therefore for any blockchain based on cosmos-sdk like wasmd. This may cause some conflicts when both run on the same machine. 🤷If you want to change this then I would go with
8080
where people expect conflicts anyway :-)