
feat: Count S3 get requests made by near-lake-framework #662

Merged

merged 6 commits into main on Apr 17, 2024

Conversation

Collaborator

@morgsmccauley morgsmccauley commented Apr 15, 2024

Depends on near/near-lake-framework-rs#102

This PR exposes a new metrics which counts the number of Get requests made to S3 by near-lake-framework. I wanted to start tracking this metric before I merge the change which reduces them, so I can measure the impact of that change. The easiest way to track these requests was to pass a custom S3Client to near-lake-framework, so we can hook in to the actual requests made.

The custom S3Client (LakeS3Client) is exactly the same as the default implementation in near-lake-framework itself, but with the added metric. This is essentially part 1 for #419, as the "reduction" in requests will build on this custom client, adding caching/de-duplication.
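As a minimal, std-only sketch of the counting-wrapper idea described above (the `ObjectStore` trait and all names here are hypothetical stand-ins for near-lake-framework's `S3Client` trait, not its real API, and the real PR increments a Prometheus counter rather than an `AtomicU64`):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Hypothetical stand-in for the S3Client trait exposed by near-lake-framework.
trait ObjectStore {
    fn get_object(&self, key: &str) -> Vec<u8>;
}

// Fake "S3" backend so the sketch is self-contained.
struct InMemoryStore;
impl ObjectStore for InMemoryStore {
    fn get_object(&self, _key: &str) -> Vec<u8> {
        Vec::new()
    }
}

// Wrapper that bumps a counter on every Get request, then delegates.
struct CountingStore<S: ObjectStore> {
    inner: S,
    get_requests: Arc<AtomicU64>,
}

impl<S: ObjectStore> ObjectStore for CountingStore<S> {
    fn get_object(&self, key: &str) -> Vec<u8> {
        self.get_requests.fetch_add(1, Ordering::Relaxed);
        self.inner.get_object(key)
    }
}

fn demo() -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    let store = CountingStore {
        inner: InMemoryStore,
        get_requests: Arc::clone(&counter),
    };
    store.get_object("000000000001/block.json");
    store.get_object("000000000001/shard_0.json");
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("GET requests: {}", demo()); // prints "GET requests: 2"
}
```

The wrapper pattern keeps the default client's behaviour untouched, which is why the PR can copy the upstream implementation verbatim and only add the metric.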

@morgsmccauley morgsmccauley requested a review from a team as a code owner April 15, 2024 03:53

// FIX: near lake framework now infinitely retries - we need a way to stop it to allow the test
Collaborator Author

Will try to address this in a future PR, will probably need to add a config option to Near Lake which prevents infinite retries.

Collaborator

What change to Lake caused infinite retries?

}

#[cfg(test)]
mockall::mock! {
Collaborator Author

Unable to use the usual #[automock] due to the multiple impl blocks and trait implementation, so I have to use this custom mock! block. The end result is essentially the same.

  || wildmatch::WildMatch::new(account_id)
-     .matches(&outcome_with_receipt.receipt.predecessor_id)
+     .matches(outcome_with_receipt.receipt.predecessor_id.as_str())
Collaborator Author

The latest near-lake-framework changed these from String to AccountId.

@morgsmccauley morgsmccauley linked an issue Apr 15, 2024 that may be closed by this pull request
Collaborator

@darunrs darunrs left a comment

Sweet work! Excited to see the next PR with the caching!

@@ -116,6 +116,7 @@ impl BlockStream {
}
}

#[allow(clippy::too_many_arguments)]
Collaborator

Is this call-out an indication that we need to rethink the function declaration? We could either encapsulate some or all of these arguments into a BlockStreamContext, or roll some of them into each other (the Redis stream into IndexerConfig, the Lake Framework stream created before being passed into start_block_stream, etc.)?

Collaborator Author

Yeah - this suppresses the Clippy warning, which I was getting sick of. BlockStreamContext is a good idea.
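A sketch of the BlockStreamContext idea: bundling the long argument list into one struct so start_block_stream takes a single parameter and the clippy::too_many_arguments allow can be dropped. Field names here are illustrative, not taken from the PR:

```rust
// Illustrative context struct bundling what start_block_stream would
// otherwise take as many separate arguments.
struct BlockStreamContext {
    start_block_height: u64,
    account_id: String,
    redis_stream: String,
    // ...remaining dependencies (clients, config) would live here too.
}

fn start_block_stream(ctx: &BlockStreamContext) -> String {
    // Stub body: just shows the call site shrinking to one argument.
    format!(
        "streaming {} from block {} into {}",
        ctx.account_id, ctx.start_block_height, ctx.redis_stream
    )
}

fn main() {
    let ctx = BlockStreamContext {
        start_block_height: 100,
        account_id: "example.near".to_string(),
        redis_stream: "block_stream:example.near".to_string(),
    };
    println!("{}", start_block_stream(&ctx));
}
```

Grouping the arguments also means adding a new dependency later changes one struct rather than every call site.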


// FIX: near lake framework now infinitely retries - we need a way to stop it to allow the test
Collaborator

What change to Lake caused infinite retries?

let response = self
.s3_client
.list_objects_v2()
.max_keys(1000)
Collaborator

Is this a value we set somewhere previously or was 1000 the default by the API?

Collaborator Author

This is the default from the API; essentially just a copy-paste from the near_lake_framework code.

Ok(bytes)
}

async fn list_common_prefixes(
Collaborator

What is this function used for? Just testing or something else?

Collaborator Author

This is part of the S3Client trait exposed by near_lake_framework. Under the hood it's used to list block heights, and the trait was originally exposed so those returned heights could be mocked. We don't do anything with it now, but still need to supply it.

We may want to add caching here too, but it's a bit more complicated because the arguments will likely vary widely across all near_lake_framework instances.

@morgsmccauley morgsmccauley merged commit 733c3c6 into main Apr 17, 2024
4 checks passed
@morgsmccauley morgsmccauley deleted the feat/count-lake-requests branch April 17, 2024 01:39
Development

Successfully merging this pull request may close these issues.

Reduce duplicate Lake requests across dedicated streams