refactor: Spawn dedicated control loops per Indexer (#866)
This PR introduces dedicated, self-contained control loops per Indexer, replacing the single combined control loop. The motivation for this change is described in #811. Overall, there is lots of clean-up still to be done, but I wanted to get this out the door as quickly as possible so as not to block the features that need to be built on top of it. I've discussed the major concerns below.

## `LifecycleManager`

The dedicated control loops are managed by the `LifecycleManager` struct. This is a state machine which progresses the Indexer through different states depending on the context. The states and their transitions are described on the `LifecycleState` enum:

```rust
/// Represents the different lifecycle states of an Indexer
#[derive(Default, Debug, Clone, serde::Serialize, serde::Deserialize, PartialEq, Eq)]
pub enum LifecycleState {
    /// Pre-requisite resources, i.e. Data Layer, are being created.
    ///
    /// Transitions:
    /// - `Running` on success
    /// - `Repairing` on Data Layer provisioning failure
    #[default]
    Initializing,
    /// Indexer is functional; the Block Stream and Executors are continuously monitored to
    /// ensure they are running the latest version of the Indexer.
    ///
    /// Transitions:
    /// - `Stopping` if suspended
    /// - `Running` if the Block Stream or Executor fails to synchronise, essentially triggering
    ///   a retry
    /// - `Running` on success
    Running,
    /// Indexer is being stopped; the Block Stream and Executors are being stopped.
    ///
    /// Transitions:
    /// - `Stopping` on failure, triggering a retry
    /// - `Stopped` on success
    Stopping,
    /// Indexer is stopped; the Block Stream and Executors are not running.
    ///
    /// Transitions:
    /// - `Running` if unsuspended
    Stopped,
    /// Indexer is in a bad state; currently requires manual intervention, but should eventually
    /// self-heal. This is a dead-end state.
    ///
    /// Transitions:
    /// - `Repairing` continuously
    Repairing, // TODO Add `error` to enable reparation
    /// Indexer is being deleted; all resources are being cleaned up.
    ///
    /// Transitions:
    /// - `Deleting` on failure, triggering a retry
    /// - `Deleted` on success
    Deleting,
    /// Indexer is deleted, all resources are cleaned up, and the lifecycle manager will exit
    Deleted,
}
```

The logic in this `struct` is very light: it triggers the high-level actions required within each state, and then returns the next desired state. Most of the "doing" logic has been encapsulated in the other related `struct`s, as discussed below.

The lifecycle state is stored in Redis so that the Indexer can pick up where it left off. A migration has been added to accommodate this new field, which replaces the existing `provisioned_state` field.

## `Handler`s

Previously, the "handlers", e.g. `BlockStreamsHandler`, were lightweight `struct`s which wrapped the gRPC client/methods. In this PR, I've moved all "synchronisation" logic into these structs. So rather than calling e.g. the `data_layer_handler.start_provisioning_task()` method, we can call `ensure_provisioned()`, which manages all related logic. I feel this improves encapsulation, and allows the `LifecycleManager` to stay light.

I've had to remove `automock`, so we don't have mocked versions of these right now. Cloning mocked versions is tricky and requires manual mocking; rather than bloat this PR, I've left that out. Eventually, I'll separate the "sync" logic from the "client" logic, so that the latter can be easily mocked and the sync logic covered by unit tests.

Additionally, I've added `get` methods for both the Block Streamer and Executors RPC, as listing is no longer convenient given that we are managing Indexers individually. The getters use `account_id` and `function_name` as opposed to IDs. I'm considering moving away from IDs entirely, as the only way to get them is via list, which isn't helpful. Right now it's somewhat of a transitory state.
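To make the state-machine shape concrete, here is a minimal, hypothetical sketch of a pure transition function over the `LifecycleState` variants described above. The `Context` struct, `next_state` function, and the specific flags it consults are illustrative assumptions for this sketch, not the actual `LifecycleManager` implementation:

```rust
// Hypothetical sketch: a pure "current state + context -> next state" function.
// The real LifecycleManager also performs the side effects (provisioning,
// starting/stopping Block Streams and Executors) before choosing the next state.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LifecycleState {
    Initializing,
    Running,
    Stopping,
    Stopped,
    Repairing,
    Deleting,
    Deleted,
}

/// Assumed external context consulted on each loop iteration.
struct Context {
    suspended: bool,      // has the Indexer been suspended?
    deleted: bool,        // has deletion been requested?
    provisioned_ok: bool, // did Data Layer provisioning succeed?
    stopped_ok: bool,     // did stopping the Block Stream/Executors succeed?
}

fn next_state(current: LifecycleState, ctx: &Context) -> LifecycleState {
    use LifecycleState::*;
    // Deletion takes priority over every non-terminal state.
    if ctx.deleted && current != Deleting && current != Deleted {
        return Deleting;
    }
    match current {
        Initializing => {
            if ctx.provisioned_ok { Running } else { Repairing }
        }
        // Staying in `Running` on sync failure is what triggers the retry.
        Running => {
            if ctx.suspended { Stopping } else { Running }
        }
        Stopping => {
            if ctx.stopped_ok { Stopped } else { Stopping }
        }
        Stopped => {
            if ctx.suspended { Stopped } else { Running }
        }
        Repairing => Repairing, // dead end until repair logic lands
        Deleting => Deleted,    // assume cleanup succeeds in this sketch
        Deleted => Deleted,
    }
}

fn main() {
    let ctx = Context { suspended: false, deleted: false, provisioned_ok: true, stopped_ok: true };
    println!("{:?}", next_state(LifecycleState::Initializing, &ctx)); // Running
}
```

One nice property of keeping the transition decision pure like this is that it can be unit tested without Redis or gRPC, which is roughly the direction the "separate sync logic from client logic" clean-up points towards.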