Prod Release 06/02/24 #546
Commits on Oct 21, 2023
- fix: upgrade @next/font from 13.1.6 to 13.5.3 (c71b7d1)
  Snyk has created this PR to upgrade @next/font from 13.1.6 to 13.5.3. See this package in npm; see this project in Snyk: https://app.snyk.io/org/gabehamilton/project/f1490843-1830-4eb0-a957-99816aa5edcc?utm_source=github&utm_medium=referral&page=upgrade-pr
- fix: upgrade @types/node from 18.13.0 to 18.18.1 (d4ff2ba)
  Snyk has created this PR to upgrade @types/node from 18.13.0 to 18.18.1. See this package in npm; see this project in Snyk: https://app.snyk.io/org/gabehamilton/project/f1490843-1830-4eb0-a957-99816aa5edcc?utm_source=github&utm_medium=referral&page=upgrade-pr
- fix: upgrade @types/react from 18.0.28 to 18.2.23 (d4c40e2)
  Snyk has created this PR to upgrade @types/react from 18.0.28 to 18.2.23. See this package in npm; see this project in Snyk: https://app.snyk.io/org/gabehamilton/project/f1490843-1830-4eb0-a957-99816aa5edcc?utm_source=github&utm_medium=referral&page=upgrade-pr
- fix: upgrade eslint from 8.34.0 to 8.50.0 (64c06e5)
  Snyk has created this PR to upgrade eslint from 8.34.0 to 8.50.0. See this package in npm; see this project in Snyk: https://app.snyk.io/org/gabehamilton/project/f1490843-1830-4eb0-a957-99816aa5edcc?utm_source=github&utm_medium=referral&page=upgrade-pr
Commits on Oct 26, 2023
- fix: frontend/package.json to reduce vulnerabilities (34cb33d)
  The following vulnerabilities are fixed with an upgrade: https://snyk.io/vuln/SNYK-JS-NEXT-6032387
Commits on Jan 24, 2024
- feat: Capture errors thrown within Coordinator (#515) (aac2273)
  Currently, errors thrown within Coordinator V2 bubble up to `main()` and cause the entire application to exit. This PR captures those errors and handles them in one of two ways (sketched after this entry):
  1. Exponential retry - data which is 'critical' to the control loop, i.e. the indexer registry and the executor/stream lists, will be continuously retried, blocking the control loop from further progress, as it would not make sense to continue without this information.
  2. Swallow - actions such as starting/stopping executors/streams will be logged and swallowed. This is preferable to exponential retries, as individual failures will not block the progress of the control loop, allowing other indexers to be acted on. Skipping should be fine in this case, as the action will be retried in the next loop.
  I expect this behaviour to evolve over time as we learn more about the system; the important thing here is that Coordinator will not crash on errors.
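  A minimal sketch of the two handling strategies described above, assuming the control-loop structure from the PR description; the type names (`Registry`), helper functions, and backoff values are hypothetical, not Coordinator's actual API.
  ```rs
  use std::thread::sleep;
  use std::time::Duration;

  // Hypothetical stand-ins for Coordinator's real types and calls.
  struct Registry;
  fn fetch_registry() -> Result<Registry, String> { Ok(Registry) }
  fn start_executor(_account_id: &str) -> Result<(), String> { Ok(()) }

  /// Critical data: block the control loop and retry with exponential backoff.
  fn fetch_registry_with_retry() -> Registry {
      let mut delay = Duration::from_millis(500);
      loop {
          match fetch_registry() {
              Ok(registry) => return registry,
              Err(err) => {
                  eprintln!("failed to fetch registry, retrying: {err}");
                  sleep(delay);
                  delay *= 2; // exponential backoff
              }
          }
      }
  }

  /// Non-critical action: log the error and move on; the next loop iteration retries it.
  fn try_start_executor(account_id: &str) {
      if let Err(err) = start_executor(account_id) {
          eprintln!("failed to start executor for {account_id}, skipping: {err}");
      }
  }

  fn main() {
      let _registry = fetch_registry_with_retry();
      try_start_executor("morgs.near");
  }
  ```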
- feat: Only start indexers set within the `allowlist` (#518) (acec01a)
  This PR updates Coordinator V2 to read an `allowlist` from Redis, and only start accounts which have been added to that list. The `allowlist` is a single key in Redis stored as stringified JSON, e.g.:
  ```json
  [{ "account_id": "morgs.near" }]
  ```
  I've used JSON as each entry will eventually contain more fields, such as an acknowledgement from Coordinator V1, and a flag to set when completing the migration. Additionally, I've added various logs across Coordinator V2.
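  As a rough illustration of the format above, here is a minimal sketch of parsing that Redis value with `serde`/`serde_json`; the fetch from Redis is stubbed out and the function names are assumptions, not Coordinator's actual code.
  ```rs
  use serde::Deserialize;

  // Shape of one allowlist entry as described above; more fields may be added later.
  #[derive(Debug, Deserialize)]
  struct AllowlistEntry {
      account_id: String,
  }

  /// The whole allowlist is a single stringified JSON array stored under one Redis key.
  fn parse_allowlist(raw: &str) -> Result<Vec<AllowlistEntry>, serde_json::Error> {
      serde_json::from_str(raw)
  }

  fn main() {
      // In Coordinator this string would come from a GET on the allowlist key in Redis.
      let raw = r#"[{ "account_id": "morgs.near" }]"#;
      let allowlist = parse_allowlist(raw).expect("allowlist should be valid JSON");
      for entry in &allowlist {
          println!("starting indexers for {}", entry.account_id);
      }
  }
  ```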
Configuration menu - View commit details
-
Copy full SHA for 32d6fa9 - Browse repository at this point
Copy the full SHA 32d6fa9View commit details
Commits on Jan 25, 2024
- feat: Support Deployment of Block Streamer (#516) (a431a77)
  Block Streamer needs a Dockerfile in order to build the image and be deployable either locally or in GCP. I've created one for the service and updated the compose file to set the correct env variables. This also serves as a record of which env variables need to be set during Terraform deployments to GCP.
  In addition, there's a small issue with running QueryApi locally: if the user runs Runner locally through `yarn start`, provisioning of resources through Hasura fails, because Runner populates the Postgres host as 'localhost' when it should be 'postgres', since the calls are processed inside the Hasura docker container. This is due to PGHOST being both used by Runner and passed to Hasura. To fix this, I created a separate env variable, PGHOST_HASURA, which can be set to 'postgres'. This way Runner can use 'localhost' while passing 'postgres' to Hasura's container.
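  Runner itself is a Node service, but the env-var split described above boils down to a lookup with a fallback; a minimal sketch in this document's Rust notation (the variable names come from the description, the default values are assumptions):
  ```rs
  use std::env;

  /// Host Runner itself connects to (e.g. "localhost" when run via `yarn start`).
  fn runner_pg_host() -> String {
      env::var("PGHOST").unwrap_or_else(|_| "localhost".to_string())
  }

  /// Host handed to Hasura for provisioning; falls back to PGHOST when the
  /// Hasura-specific override is not set.
  fn hasura_pg_host() -> String {
      env::var("PGHOST_HASURA").unwrap_or_else(|_| runner_pg_host())
  }

  fn main() {
      println!("runner connects to: {}", runner_pg_host());
      println!("hasura is told to use: {}", hasura_pg_host());
  }
  ```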
- feat: Enable Block Streams Start from V1 Interruption (#517) (920407a)
  Coordinator V2 currently defaults to starting block streams from the start_block_height value populated in the registry. For a smooth migration from V1 to V2, we instead want Coordinator V2 to create block streams starting where the current indexers left off. Coordinator V1 now writes a last_published_block value for all indexers it manages on each matching block. Coordinator V2's start_block_height workflow needs refactoring, because the existing flow defaults to using start_block_height as it is present in the registry, even after we begin processing a stream from that block height. We need to refactor the flow so that the start block height is used only once, and the Redis last published block height is used otherwise, with a fallback if it is not present. A new flow to handle edge cases will be tracked in #521.
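  A minimal sketch of the selection order described above (prefer where V1 left off, otherwise the registry's start height, otherwise a fallback); the types, field names, and the +1 continuation are illustrative assumptions, not Coordinator's actual code.
  ```rs
  // Illustrative stand-ins for the registry config and the Redis value.
  struct IndexerConfig {
      start_block_height: Option<u64>,
  }

  /// Pick the height a new block stream should start from: prefer continuing
  /// after V1's last_published_block, otherwise the registry's
  /// start_block_height, otherwise a caller-supplied fallback.
  fn stream_start_height(
      config: &IndexerConfig,
      last_published_block: Option<u64>,
      fallback: u64,
  ) -> u64 {
      last_published_block
          .map(|height| height + 1) // continue after the last block V1 published
          .or(config.start_block_height)
          .unwrap_or(fallback)
  }

  fn main() {
      let config = IndexerConfig { start_block_height: Some(100_000_000) };
      assert_eq!(stream_start_height(&config, Some(110_000_000), 0), 110_000_001);
      assert_eq!(stream_start_height(&config, None, 0), 100_000_000);
      println!("start height resolution works as sketched");
  }
  ```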
Configuration menu - View commit details
-
Copy full SHA for 0b6c08a - Browse repository at this point
Copy the full SHA 0b6c08aView commit details -
- Merge pull request #314 from near/snyk-upgrade-4ba709d36fa2e38a5ac1105f6ae38549 (c5b1e41)
  [Snyk] Upgrade @types/node from 18.13.0 to 18.18.1
Configuration menu - View commit details
-
Copy full SHA for f568bd0 - Browse repository at this point
Copy the full SHA f568bd0View commit details -
- Merge pull request #313 from near/snyk-upgrade-c7dc22e6846ada68552465a4cbc363b0 (a23c51e)
  [Snyk] Upgrade @next/font from 13.1.6 to 13.5.3
- feat: Logging & Error updates (#526) (46e363c)
  - feat: Cap exponential retry to 30 secs
  - feat: Debug log stream/executor responses
  - feat: Add more logging to coordinator
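  The 30-second cap mentioned above amounts to clamping the backoff delay; a minimal sketch in which the base delay and doubling factor are assumptions, not Coordinator's actual values.
  ```rs
  use std::time::Duration;

  const MAX_RETRY_DELAY: Duration = Duration::from_secs(30);

  /// Exponential backoff delay for a given attempt, capped at 30 seconds.
  fn retry_delay(attempt: u32) -> Duration {
      let base = Duration::from_millis(500);
      // 500ms, 1s, 2s, 4s, ... clamped to the 30s cap.
      base.checked_mul(2u32.saturating_pow(attempt))
          .map(|delay| delay.min(MAX_RETRY_DELAY))
          .unwrap_or(MAX_RETRY_DELAY)
  }

  fn main() {
      for attempt in 0..10 {
          println!("attempt {attempt}: wait {:?}", retry_delay(attempt));
      }
  }
  ```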
Commits on Jan 26, 2024
- b6fc249
- Merge pull request #315 from near/snyk-upgrade-a7fd032540f59a0e0521b8386f1e3465 (6116c53)
  [Snyk] Upgrade @types/react from 18.0.28 to 18.2.23
- Merge pull request #334 from near/snyk-fix-415cfdc1a838c227b64a7b254d253c56 (2317f36)
  [Snyk] Security upgrade next from 13.1.6 to 13.5.0
Configuration menu - View commit details
-
Copy full SHA for 102409a - Browse repository at this point
Copy the full SHA 102409aView commit details -
- Merge pull request #316 from near/snyk-upgrade-fdf2121163d14312e34f796d4e4ec0e4 (70ea36d)
  [Snyk] Upgrade eslint from 8.34.0 to 8.50.0
Commits on Jan 31, 2024
- fix: Prevent Coordinator from stopping V1 executors (#544) (593f305)
  As V1 executors are returned via the Runner gRPC service, Coordinator V2 will attempt to synchronise them. These executors should be ignored completely within the synchronisation logic.
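  A minimal sketch of the kind of filtering described above; the `ExecutorInfo` shape and the rule for spotting a V1 executor (a version tag here) are assumptions for illustration only.
  ```rs
  // Illustrative executor description returned by Runner's executor list call.
  struct ExecutorInfo {
      account_id: String,
      version: u8, // 1 for legacy V1 executors, 2 for V2-managed ones
  }

  /// Keep only the executors Coordinator V2 is responsible for synchronising.
  fn executors_to_synchronise(executors: Vec<ExecutorInfo>) -> Vec<ExecutorInfo> {
      executors.into_iter().filter(|e| e.version != 1).collect()
  }

  fn main() {
      let executors = vec![
          ExecutorInfo { account_id: "legacy.near".to_string(), version: 1 },
          ExecutorInfo { account_id: "morgs.near".to_string(), version: 2 },
      ];
      for executor in executors_to_synchronise(executors) {
          println!("synchronising {}", executor.account_id);
      }
  }
  ```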
Commits on Feb 1, 2024
- fix: Resolve duplicate processing of messages in Runner (#545) (10e2963)
  It was observed that Runner would execute the same block multiple times. This was verified again when @Kevin101Zhang saw that the components indexer, which he modified, incorrectly incremented the star counter. I noticed that some indexers, such as sweat_blockheight, very rarely had duplicate runs.
  To validate whether Runner was the problem, I shut down Runner in dev and triggered some social feed interactions. I verified that the block only appeared in Redis once, indicating Coordinator was not the problem. When I started Runner again, the problem appeared again.
  After that, I worked on replicating the issue locally. I built up 3 messages on a social feed indexer and modified Runner to write its executions to a file. Afterward, I ran Runner. I searched the file and found that the 3 blocks appeared in sequence like so: block 1, block 2, block 3, block 1, block 2, block 3, and so on. This indicated that duplicate workers were not the problem, but rather that the same messages were being read into the block array again after they had all been read.
  With that, I found the problem. When Runner fills its array of S3 promises, it reads a stream message and increments a stored message ID. Subsequent stream fetches specify to fetch messages after that stream ID. This is needed because deletion of stream messages can only take place after a message is successfully processed, to avoid problems if Runner were reset. However, the code which handles the case where no messages are available in the stream reset that counting stream ID to '0', as a mechanism to ensure messages that were somehow skipped are definitely read. This resetting of the ID turned out to be the cause. To illustrate, here's the scenario:
  1. The stream has 3 messages in it.
  2. Runner starts with 0 and then reads all 3 messages. The counting stream ID is the ID of the last read message + 1.
  3. The messages are added to the array as promises.
  4. The first message begins to process.
  5. The producer loop sees that no messages are found anymore.
  6. **The producer waits 100ms and then resets the ID.**
  7. The producer, while the first message is still processing, fetches the same 3 messages again.
  8. This repeats as long as no new messages appear, while messages remain in the promise array unprocessed.
  The fix is simple: remove the resetting of the stream ID. It is no longer necessary, as the problem of messages remaining in the stream was fixed back in December. A sketch of the corrected producer loop follows this entry.
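  Runner is a TypeScript service; the sketch below only restates the corrected producer-loop logic in this document's Rust notation, with an entirely hypothetical `Stream` type standing in for the Redis stream client. The key point is that the last-read ID is only ever advanced, never reset to 0 when the stream looks empty.
  ```rs
  use std::collections::VecDeque;
  use std::thread::sleep;
  use std::time::Duration;

  // Hypothetical stand-in for the Redis stream: (message id, block height) pairs.
  struct Stream {
      messages: Vec<(u64, u64)>,
  }

  impl Stream {
      /// Fetch up to `count` messages with an id strictly greater than `after_id`.
      fn read_after(&self, after_id: u64, count: usize) -> Vec<(u64, u64)> {
          self.messages
              .iter()
              .filter(|msg| msg.0 > after_id)
              .take(count)
              .copied()
              .collect()
      }
  }

  fn main() {
      let stream = Stream { messages: vec![(1, 100), (2, 101), (3, 102)] };
      let mut last_read_id = 0u64; // advanced as messages are read, never reset
      let mut pending: VecDeque<u64> = VecDeque::new();

      for _ in 0..3 {
          let batch = stream.read_after(last_read_id, 10);
          if batch.is_empty() {
              // Previously the code reset `last_read_id` to 0 here, which caused the
              // same messages to be fetched again while they were still being processed.
              sleep(Duration::from_millis(100));
              continue;
          }
          for (id, block_height) in batch {
              pending.push_back(block_height);
              last_read_id = id; // only move forward
          }
      }
      println!("blocks queued exactly once: {pending:?}");
  }
  ```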
Commits on Feb 5, 2024
- fix: Executors would crash when DmlHandler.create times out (#547) (2c28dc3)
  In each invocation of runFunction on a block, a DML Handler is created. It is responsible for making calls to the database. Part of setting it up involves fetching the credentials for the user's DB from Hasura and creating a PG client. This takes time, so the process was run as an unawaited async request: while other setup and some of the user's code ran, the setup of the DML Handler would complete in the background. The first context.db call would await its completion, and subsequent calls would have it ready.
  However, it was observed that when the call to Hasura for the DB credentials timed out, the error, instead of propagating into a try/catch, was treated by the JavaScript runtime as an unhandled promise rejection, which terminated the worker thread and stopped the indexer.
  To fix this, we need to move away from keeping DmlHandler.create as an unresolved Promise across multiple contexts. The approach I've taken is to defer creating the Hasura call promise until the first call of context.db. This adds latency to the first context.db call, as it now must wait for the entire setup to complete. On the other hand, it does not penalize indexers that don't use context.db, as their code no longer connects to Hasura unless needed. A sketch of this lazy-initialisation pattern follows this entry.
  Very soon we will overhaul this logic by migrating the Hasura credentials call out of runFunctions, which eliminates the underlying problem of unresolved promises entirely. So the focus here is to address the bug, which is a critical problem, without too much change, as the workflow will be refactored again soon anyway.
  I also fixed a small bug where context.db calls were logged under the wrong indexer logs table function name.
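  The DML Handler itself is TypeScript, so the following is only a sketch (written in Rust like the other examples here) of the lazy-initialisation idea: the expensive credential fetch is not started eagerly, but on the first database call, where its error can be caught and handled. All names and types are hypothetical stand-ins.
  ```rs
  // Hypothetical stand-ins for the Hasura credential fetch and the PG client.
  struct DbCredentials;
  struct PgClient;

  fn fetch_credentials_from_hasura() -> Result<DbCredentials, String> {
      // In the real service this is a network call that can time out.
      Ok(DbCredentials)
  }

  fn connect(_creds: &DbCredentials) -> Result<PgClient, String> {
      Ok(PgClient)
  }

  struct DmlHandler {
      client: Option<PgClient>, // not created until the first database call
  }

  impl DmlHandler {
      fn new() -> Self {
          DmlHandler { client: None }
      }

      /// The first call performs the (fallible) setup; later calls reuse the client.
      /// Any setup error surfaces here, where the caller can handle it, instead of
      /// escaping as an unobserved failure from eager background setup.
      fn execute(&mut self, query: &str) -> Result<(), String> {
          if self.client.is_none() {
              let creds = fetch_credentials_from_hasura()?;
              self.client = Some(connect(&creds)?);
          }
          println!("running query: {query}");
          Ok(())
      }
  }

  fn main() {
      let mut handler = DmlHandler::new();
      if let Err(err) = handler.execute("SELECT 1") {
          eprintln!("context.db call failed without crashing the worker: {err}");
      }
  }
  ```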
Commits on Feb 6, 2024
- feat: Auto migrate indexers to Control Plane (#527) (9ef1f29)
  This PR builds on the Redis `allowlist` to auto migrate indexers from the existing infrastructure to the Control Plane. An account migration requires coordination between both the V1 and V2 architecture - the indexer must be removed/ignored from V1, and then correctly configured within V2.

  ## Allowlist shape
  Each `allowlist` entry now contains the following properties:
  ```rs
  pub struct AllowlistEntry {
      account_id: AccountId, // The account which should be migrated
      v1_ack: bool,          // True if Coordinator V1 has acknowledged the entry
      migrated: bool,        // True if the migration was successful
      failed: bool,          // True if the migration failed
  }
  ```

  ## Coordinator V1
  For Coordinator V1, the `allowlist` is really a denylist, and the code/types have been named as such. Accounts within the "denylist" should be ignored completely by V1. Because we cannot guarantee the timing of when this "ignore" actually happens, a flag (`v1_ack`) will be set from V1; V2 will only take over once this flag has been set. Accounts within the "denylist" will be filtered out of the in-memory registry, and any new indexer "registrations" will also be ignored. In-progress historical backfills haven't been considered, as we'll disable this functionality anyway.

  ## Coordinator V2
  Once acknowledged by V1, Coordinator V2 will attempt to migrate all functions under the relevant account. The steps for migration are:
  1. Remove the streams from the Redis `streams` set - preventing Runner from starting these indexers implicitly
  2. Stop the existing V1 executors which have already been started via the `streams` set
  3. Merge the existing historical (if it exists) and real-time streams
  Once migrated, accounts which have `v1_ack && migrated && !failed` will be exposed to the control loop, prompting V2 to act on these indexers (see the sketch after this entry).

  ### `migrated` flag
  For now, the `migrated` flag will not be set on success, preventing V2 from running the indexer on the new architecture. There are some issues around V2 continuing from the right block correctly, so it's best not to run them for now. This allows us to test the migration in isolation, without worrying about what V2 does after that. I'll add this logic back in once #536 is complete.

  ### `failed` flag
  If any part of the migration fails, the `failed` flag will be set for that account. It would take a significant amount of time to cover all the edge cases in code, so it is faster to set this flag to ignore the account, fix the migration manually, and then reset the `failed` flag.
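  As a small illustration of the flag check above, here is a sketch of filtering allowlist entries down to the accounts the control loop should act on, reusing the `AllowlistEntry` shape from the PR description (with `AccountId` simplified to a `String` for the example):
  ```rs
  // Simplified version of the entry shown above; AccountId is just a String here.
  struct AllowlistEntry {
      account_id: String,
      v1_ack: bool,
      migrated: bool,
      failed: bool,
  }

  /// Accounts acknowledged by V1 and migrated successfully are the only ones
  /// exposed to the V2 control loop.
  fn accounts_for_control_loop(allowlist: &[AllowlistEntry]) -> Vec<&str> {
      allowlist
          .iter()
          .filter(|entry| entry.v1_ack && entry.migrated && !entry.failed)
          .map(|entry| entry.account_id.as_str())
          .collect()
  }

  fn main() {
      let allowlist = vec![
          AllowlistEntry { account_id: "morgs.near".into(), v1_ack: true, migrated: true, failed: false },
          AllowlistEntry { account_id: "pending.near".into(), v1_ack: true, migrated: false, failed: false },
      ];
      println!("act on: {:?}", accounts_for_control_loop(&allowlist));
  }
  ```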
- fix: Crashed Runner Executors would continue to display RUNNING (#550) (fd64a8c)
  Runner sets the status of an indexer after a successful run, or after a failed run whose error was caught. However, if the executor itself crashed, the error would not be caught, as the worker is terminated. As a result, the status of the indexer would continue to display RUNNING, which was incorrect and misleading. I updated the status handling so that crashed workers have the STOPPED status set. In addition, I added a new FAILING status, which is set when an indexer is still running but failing on the same block.
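  The statuses mentioned above roughly map onto an enum like the sketch below; Runner is a TypeScript service, and this enum and the update rule are illustrative assumptions rather than its actual status values.
  ```rs
  #[derive(Debug, Clone, Copy, PartialEq)]
  enum IndexerStatus {
      Running,
      Failing, // still running, but repeatedly failing on the same block
      Stopped, // the worker crashed or was terminated
  }

  /// Decide the status to record after a run attempt.
  fn status_after_run(worker_crashed: bool, run_failed: bool) -> IndexerStatus {
      if worker_crashed {
          IndexerStatus::Stopped
      } else if run_failed {
          IndexerStatus::Failing
      } else {
          IndexerStatus::Running
      }
  }

  fn main() {
      assert_eq!(status_after_run(true, false), IndexerStatus::Stopped);
      assert_eq!(status_after_run(false, true), IndexerStatus::Failing);
      assert_eq!(status_after_run(false, false), IndexerStatus::Running);
      println!("status transitions behave as sketched");
  }
  ```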