Prod Release 03/07/24 #851
Merged
Also runs `cargo fmt` (1st commit). Closes: #822
This PR introduces back pressure to the Redis Stream in Block Streamer, ensuring that the stream does not exceed a specified maximum length. This is achieved by blocking the `redis.publish_block()` call, intermittently polling the Stream length, and publishing once it falls below the configured limit.

To aid testing, the current `RedisClient` struct has been split into two:
- `RedisCommands` - a thin wrapper around Redis commands to make mocking possible.
- `RedisClient` - provides higher-level Redis functionality, e.g. "publishing blocks", utilising the above.

In most cases, `RedisClient` will be used; the split just allows us to test `RedisClient` itself.
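A minimal sketch of the back-pressure idea. The Block Streamer itself is Rust; this TypeScript/ioredis version, with its assumed stream key, limit, and poll interval, only illustrates the "poll until below the limit, then publish" approach:

```ts
import Redis from "ioredis";

const MAX_STREAM_LENGTH = 100;  // assumed configured limit
const POLL_INTERVAL_MS = 500;   // assumed polling interval

const redis = new Redis();

// Blocks until the stream has capacity, then publishes the block.
async function publishBlock(streamKey: string, blockHeight: number): Promise<void> {
  // Intermittently poll the stream length until it falls below the limit.
  while ((await redis.xlen(streamKey)) >= MAX_STREAM_LENGTH) {
    await new Promise((resolve) => setTimeout(resolve, POLL_INTERVAL_MS));
  }

  await redis.xadd(streamKey, "*", "block_height", blockHeight.toString());
}
```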
This PR adds a Node script to Runner for suspending Indexers due to inactivity. The script will:
1. Call Coordinator to disable the Indexer
2. Write to the Indexer's logs table to notify of the suspension

Note that as Coordinator is in a private network, you must tunnel to the machine to expose the gRPC server. This can be achieved by running the following in a separate terminal:
```sh
gcloud compute ssh ubuntu@queryapi-coordinator-mainnet -- -L 9003:0.0.0.0:9003
```

The following environment variables are required:
- `HASURA_ADMIN_SECRET`
- `HASURA_ENDPOINT`
- `PGPORT`
- `PGHOST`

All of these can be found in the Runner compute instance metadata:
```sh
gcloud compute instances describe queryapi-runner-mainnet
```

Usage: `npm run script:suspend-indexer -- <accountId> <functionName>`
… Separate Concerns (#830) Refactored the Editor component to TypeScript. This refactoring involved breaking the Editor file down into smaller chunks and separating concerns into distinct components. Also did some minor work converting the validators to TypeScript, as they are a major consumer within the Editor; this sets things up to later iterate on additional tests for the validators.
Promises without rejection handlers, i.e. `.catch` or `try`/`catch`, will throw "unhandled rejection" errors, which bubble up to the worker thread and cause it to exit. This PR adds handlers to the various `simultaneousPromises` triggered within the Executor, to avoid the behaviour described above.
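A minimal sketch of the pattern in TypeScript; the task names below are hypothetical stand-ins for the Executor's actual concurrent work:

```ts
// Hypothetical async tasks standing in for the Executor's concurrent work.
async function writeLogs(): Promise<void> { /* ... */ }
async function updateBlockHeight(): Promise<void> { /* ... */ }

async function runExecutorIteration(): Promise<void> {
  const simultaneousPromises: Promise<void>[] = [];

  // Each promise gets its own rejection handler, so a failure is logged
  // instead of surfacing as an "unhandled rejection" that exits the worker thread.
  simultaneousPromises.push(
    writeLogs().catch((err) => console.error("Failed to write logs", err)),
  );
  simultaneousPromises.push(
    updateBlockHeight().catch((err) => console.error("Failed to update block height", err)),
  );

  await Promise.all(simultaneousPromises);
}
```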
…843) The current methods for determining both Block Stream and Executor health are flawed. This PR addresses these flaws by adding new, more reliable metrics for use within Grafana.

### Block Streams

A Block Stream is considered healthy if `LAST_PROCESSED_BLOCK` is continuously incremented, i.e. we are continuously downloading blocks from S3. This is flawed for the following reasons:
1. When the Redis Stream is full, we halt the Block Stream, preventing it from processing more blocks
2. When a Block Stream is intentionally stopped, we no longer process blocks

To address these flaws, I've introduced a new dedicated metric: `BLOCK_STREAM_UP`, which:
- is incremented every time the Block Stream future is polled, i.e. the task is doing work. A static value means unhealthy.
- is removed when the Block Stream is stopped, so that it doesn't trigger the false positive described above

### Executors

An Executor is considered unhealthy if it has messages in the Redis Stream and no reported execution durations, the latter only being recorded on success. The inverse of this is used to determine "healthy". This is flawed for the following reasons:
1. We can't distinguish between a genuinely broken Indexer and one broken due to system failures
2. "Health" is only determined when there are messages in Redis, meaning we catch the issue later than we could

To address these, I have added the following metrics (see the sketch after this list):
1. `EXECUTOR_UP`, which is incremented on every Executor loop; like above, a static value means unhealthy.
2. `SUCCESSFUL_EXECUTIONS`/`FAILED_EXECUTIONS`, which track successful/failed executions directly, rather than inferring them from durations. This will be useful for tracking the health of specific Indexers, e.g. the `staking` indexer should never have failed executions.
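A minimal sketch of how such counters could be wired up with `prom-client` in Runner; the metric names follow the PR, but the label names, helper functions, and loop structure are assumptions rather than the actual implementation:

```ts
import { Counter } from "prom-client";

// Hypothetical stand-ins for the Runner's real message fetching and execution.
declare function fetchNextMessage(): Promise<object | undefined>;
declare function executeIndexerFunction(message: object): Promise<void>;

const EXECUTOR_UP = new Counter({
  name: "queryapi_runner_executor_up",
  help: "Incremented on every Executor loop; a static value means the loop has stalled",
  labelNames: ["indexer"],
});

const SUCCESSFUL_EXECUTIONS = new Counter({
  name: "queryapi_runner_successful_executions",
  help: "Count of successful executions per Indexer",
  labelNames: ["indexer"],
});

const FAILED_EXECUTIONS = new Counter({
  name: "queryapi_runner_failed_executions",
  help: "Count of failed executions per Indexer",
  labelNames: ["indexer"],
});

async function executorLoop(indexer: string): Promise<void> {
  while (true) {
    // Incremented unconditionally, so health can be judged even when
    // there are no messages in the stream.
    EXECUTOR_UP.labels(indexer).inc();

    const message = await fetchNextMessage();
    if (message === undefined) continue;

    try {
      await executeIndexerFunction(message);
      SUCCESSFUL_EXECUTIONS.labels(indexer).inc();
    } catch {
      FAILED_EXECUTIONS.labels(indexer).inc();
    }
  }
}
```

Incrementing `EXECUTOR_UP` before checking for messages is also what the follow-up change below ensures.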
We skip reporting metrics if there are no messages in the pre-fetch queue/Redis Stream. This is especially problematic for `EXECUTOR_UP`, as we won't increment the metric even though we are still processing. This PR moves the metrics logic so that metrics are always reported, even when there are no messages in the stream.
Set the "scroll past last line" flag in Monaco to true.
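For reference, the corresponding option in the Monaco editor API is `scrollBeyondLastLine`; a minimal sketch using the `monaco-editor` package directly (the container element and the other options here are assumptions):

```ts
import * as monaco from "monaco-editor";

// Allow the editor to scroll past the last line of the document.
const editor = monaco.editor.create(document.getElementById("editor")!, {
  value: "",
  language: "typescript",
  scrollBeyondLastLine: true,
});
```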
chore: Remove `println` (#838)
fix: Add `catch` blocks to prevent unhandled rejections (#842)