
fix: Reduce requests made to Near Lake S3 #665

Merged: 10 commits merged into main from fix/duplicate-lake-requests on Apr 18, 2024

Conversation

@morgsmccauley (Collaborator) commented Apr 15, 2024

Each BlockStream uses its own dedicated near-lake-framework instance and therefore manages its own connection to S3. This leads to many duplicate S3 requests, particularly across the large majority of Indexers that follow the network tip and therefore request the same block data at the same time.

This PR introduces a shared S3 client to be used across all near-lake-framework instances. SharedLakeS3Client ensures that duplicate requests made within a short time-frame, including those made in parallel, result in only a single request to S3.

Cache Strategy

This implementation will mostly impact BlockStreams following the network tip, i.e. From Latest. These streams all wait for new data in Near Lake S3 and request it as soon as it becomes available, at the same time. Caching the result alone would therefore not be enough: by the time we actually prime the cache, all the other requests would have missed it and fired requests of their own. Holding a lock while the request is in-flight is not feasible either, as that would force every request to execute in sequence.

Instead of caching the result of the request, we cache its computation. The first caller initiates the request and stores its Future in the cache; all subsequent callers retrieve that Future and await its result, ensuring at most one underlying request to S3.
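A minimal sketch of the idea, assuming a bare `Mutex<HashMap<..>>` as the futures cache and a placeholder `fetch_from_s3` (the actual `SharedLakeS3Client` uses a size-bounded cache and the real S3 client):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

use futures::future::{BoxFuture, FutureExt, Shared};

// The shared future's output must be Clone, so errors are stringified here.
type SharedBytesFuture = Shared<BoxFuture<'static, Result<Vec<u8>, String>>>;

#[derive(Clone, Default)]
struct FuturesCache {
    inner: Arc<Mutex<HashMap<String, SharedBytesFuture>>>,
}

impl FuturesCache {
    // Lookup and insert happen under a single lock acquisition, so only the
    // first caller for a given key ever constructs (and fires) the request.
    fn get_or_set_with(
        &self,
        key: String,
        init: impl FnOnce() -> SharedBytesFuture,
    ) -> SharedBytesFuture {
        self.inner.lock().unwrap().entry(key).or_insert_with(init).clone()
    }
}

async fn get_object_bytes_cached(
    cache: &FuturesCache,
    bucket: String,
    prefix: String,
) -> Result<Vec<u8>, String> {
    let future = cache.get_or_set_with(prefix.clone(), || {
        // Owned values are moved into the async block so the future is
        // 'static and can be cloned out of the cache by later callers.
        async move { fetch_from_s3(&bucket, &prefix).await }
            .boxed()
            .shared()
    });

    // The lock is already released here; every caller awaits the same shared
    // future, and the underlying request runs only once.
    future.await
}

// Placeholder for the real GetObject call against Near Lake S3.
async fn fetch_from_s3(_bucket: &str, _prefix: &str) -> Result<Vec<u8>, String> {
    Ok(Vec::new())
}
```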

Performance Impact

My main concern with this implementation is its impact on performance. Each request must briefly take a lock to check the cache, introducing contention and potential delays. The lock is only held while checking the cache, not while the request itself is in flight, so my hope is that the impact is small. This may be something that needs to be iterated on over time.

From local testing the impact seemed negligible, but that was with only 5 Indexers; it may be worse with many more. I've added a metric to measure lock wait time, so we can tell whether this contention becomes a problem.
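A sketch of what recording that metric could look like, assuming the prometheus crate and a hypothetical metric name (not necessarily what the PR ships):

```rust
use std::sync::Mutex;
use std::time::Instant;

use prometheus::{register_histogram, Histogram};

lazy_static::lazy_static! {
    // Hypothetical metric name for illustration.
    static ref LAKE_CACHE_LOCK_WAIT_SECONDS: Histogram = register_histogram!(
        "lake_cache_lock_wait_seconds",
        "Time spent waiting to acquire the shared futures-cache lock"
    )
    .unwrap();
}

// Wraps a cache operation, recording how long the caller waited for the lock.
fn with_cache_lock<T, R>(cache: &Mutex<T>, f: impl FnOnce(&mut T) -> R) -> R {
    let started = Instant::now();
    let mut guard = cache.lock().unwrap(); // contention shows up as wait time here
    LAKE_CACHE_LOCK_WAIT_SECONDS.observe(started.elapsed().as_secs_f64());
    f(&mut guard)
}
```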

morgsmccauley changed the base branch from main to feat/count-lake-requests on April 15, 2024 23:04
morgsmccauley changed the title from "fix/duplicate lake requests" to "fix: Reduce S3 requests made by near-lake-framework" on Apr 16, 2024
morgsmccauley changed the title from "fix: Reduce S3 requests made by near-lake-framework" to "fix: Reduce requests made to Near Lake S3" on Apr 16, 2024
morgsmccauley force-pushed the fix/duplicate-lake-requests branch 2 times, most recently from 05628d3 to 9c436e3 on April 16, 2024 08:18
Self::new(s3_client)
}

fn get_object_bytes_shared(&self, bucket: &str, prefix: &str) -> SharedGetObjectBytesFuture {
@morgsmccauley (author):

This is essentially an async fn and can be awaited as such, but the returned Future can also be cloned. To achieve this, the referenced values must be cloned into the future to ensure they live long enough.
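A minimal sketch of that shape, with stand-in types: the borrowed arguments and the inner client are cloned into an `async move` block so the returned future is `'static`, cacheable, and cloneable.

```rust
use futures::future::{BoxFuture, FutureExt, Shared};

type GetObjectBytesResult = Result<Vec<u8>, String>;
type SharedGetObjectBytesFuture = Shared<BoxFuture<'static, GetObjectBytesResult>>;

// Stand-in for the real S3 client; assumed to be cheap to clone.
#[derive(Clone)]
struct S3Client;

impl S3Client {
    async fn get_object_bytes(&self, _bucket: &str, _prefix: &str) -> GetObjectBytesResult {
        Ok(Vec::new()) // placeholder for the real GetObject call
    }
}

struct LakeS3Client {
    s3_client: S3Client,
}

impl LakeS3Client {
    fn get_object_bytes_shared(&self, bucket: &str, prefix: &str) -> SharedGetObjectBytesFuture {
        // Clone everything the future needs so it borrows nothing from `self`,
        // `bucket`, or `prefix`: it has to outlive this call and be shareable.
        let s3_client = self.s3_client.clone();
        let bucket = bucket.to_owned();
        let prefix = prefix.to_owned();

        async move { s3_client.get_object_bytes(&bucket, &prefix).await }
            .boxed()
            .shared()
    }
}
```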

async fn get_object_bytes_cached(&self, bucket: &str, prefix: &str) -> GetObjectBytesResult {
let get_object_bytes_future = self
.futures_cache
.get_or_set_with(prefix.to_string(), || {
@morgsmccauley (author):

Get/set must be done in one operation at the cache level. Doing them in sequence leaves room for multiple cache writes and therefore multiple requests.
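To illustrate the race (generic over the cached value, not the PR's actual cache type):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Racy, for illustration only: the lock is dropped between the lookup and the
// insert, so two concurrent callers can both miss, both call `init`, and fire
// two S3 requests for the same key.
fn get_or_set_racy<V: Clone>(
    cache: &Mutex<HashMap<String, V>>,
    key: String,
    init: impl FnOnce() -> V,
) -> V {
    if let Some(existing) = cache.lock().unwrap().get(&key) {
        return existing.clone();
    }
    // Another caller can slip in here, between the two lock acquisitions.
    let value = init();
    cache.lock().unwrap().insert(key, value.clone());
    value
}

// Atomic: lookup and insert happen under one lock acquisition, so `init` runs
// at most once per key.
fn get_or_set_atomic<V: Clone>(
    cache: &Mutex<HashMap<String, V>>,
    key: String,
    init: impl FnOnce() -> V,
) -> V {
    cache.lock().unwrap().entry(key).or_insert_with(init).clone()
}
```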

let call_count_clone = s3_get_call_count.clone();

let mut mock_s3_client = crate::s3_client::S3Client::default();
mock_s3_client.expect_clone().returning(move || {
@morgsmccauley (author):

We can't truly clone this mock instance, and therefore can't simply assert that the handler was called once(); each clone gets a completely new instance.

To work around this, I use an atomic counter to count the actual number of requests made.
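Roughly how the counter-based assertion fits together, as a self-contained sketch (the mock method name and return type here are illustrative, not the project's exact S3Client API):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

mockall::mock! {
    pub S3Client {
        // Illustrative method; the real client's signature differs.
        fn get_object(&self, bucket: &str, prefix: &str) -> Result<Vec<u8>, String>;
    }

    impl Clone for S3Client {
        fn clone(&self) -> Self;
    }
}

#[test]
fn counts_requests_across_clones() {
    let s3_get_call_count = Arc::new(AtomicUsize::new(0));
    let call_count_clone = s3_get_call_count.clone();

    let mut mock_s3_client = MockS3Client::default();
    mock_s3_client.expect_clone().returning(move || {
        // Each clone is a brand new mock, so a `.times(1)` expectation on the
        // original would never see its calls; every clone's handler bumps the
        // shared counter instead.
        let call_count = call_count_clone.clone();
        let mut cloned = MockS3Client::default();
        cloned.expect_get_object().returning(move |_, _| {
            call_count.fetch_add(1, Ordering::SeqCst);
            Ok(Vec::new()) // stand-in response body
        });
        cloned
    });

    // In the real test, many parallel requests are driven through the shared
    // client; here a single clone and call keeps the sketch self-contained.
    let cloned = mock_s3_client.clone();
    let _ = cloned.get_object("bucket", "prefix");

    assert_eq!(s3_get_call_count.load(Ordering::SeqCst), 1);
}
```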


let shared_lake_s3_client = SharedLakeS3ClientImpl::new(LakeS3Client::new(mock_s3_client));

let barrier = Arc::new(Barrier::new(10));
@morgsmccauley (author):

Blocks execution until wait() has been called the specified number of times (10), making all the requests fire in parallel.

@darunrs (collaborator):

Yeah we covered this in the meeting. We can probably try bumping the parallel threads/requests to something like 50 to more consistently test this behavior.
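For illustration, the barrier pattern in isolation, with a stub standing in for SharedLakeS3ClientImpl (the names and the 10-task count are just for the sketch):

```rust
use std::sync::Arc;
use tokio::sync::Barrier;

// Stand-in for SharedLakeS3ClientImpl; sleeps briefly to simulate an S3 call.
#[derive(Clone)]
struct FakeLakeS3Client;

impl FakeLakeS3Client {
    async fn get_object_bytes(&self, _bucket: &str, _prefix: &str) -> Vec<u8> {
        tokio::time::sleep(std::time::Duration::from_millis(10)).await;
        Vec::new()
    }
}

#[tokio::main]
async fn main() {
    let client = FakeLakeS3Client;
    let barrier = Arc::new(Barrier::new(10));

    let handles: Vec<_> = (0..10)
        .map(|_| {
            let barrier = barrier.clone();
            let client = client.clone();
            tokio::spawn(async move {
                // Every task parks here until all 10 have arrived, then they
                // are released together, so the requests fire in parallel.
                barrier.wait().await;
                client.get_object_bytes("some-bucket", "some-prefix").await
            })
        })
        .collect();

    for handle in handles {
        handle.await.unwrap();
    }
}
```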

@@ -52,7 +51,7 @@ impl S3ClientImpl {
.list_objects_v2()
.delimiter("/")
.bucket(bucket)
.prefix(prefix);
@morgsmccauley (author):

This limits the keys to only those which begin with the prefix, so when listing block heights, only one value is ever returned.

This may have been intentional for DeltaLakeClient; I still need to confirm this doesn't break anything there.

@morgsmccauley (author):

This was actually a breaking change, so I reverted it: existing functionality uses the original implementation, and this PR uses a new method.

@@ -120,3 +119,42 @@ impl S3ClientImpl {
Ok(results)
}
}

#[cfg(test)]
mockall::mock! {
@morgsmccauley (author):

Changed to "manual" mock implementation so that .clone() can also be mocked. The default mock doesn't implement Clone.
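For reference, the manual form looks roughly like this; listing the `impl Clone` block inside `mockall::mock!` is what generates `expect_clone()` on the mock (the method shown is illustrative):

```rust
mockall::mock! {
    pub S3Client {
        fn get_object(&self, bucket: &str, prefix: &str) -> Result<Vec<u8>, String>;
    }

    // Declaring the impl here makes Clone mockable; the default generated
    // mock would not implement Clone at all.
    impl Clone for S3Client {
        fn clone(&self) -> Self;
    }
}
```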

morgsmccauley marked this pull request as ready for review on April 16, 2024 08:50
morgsmccauley requested a review from a team as a code owner on April 16, 2024 08:50
morgsmccauley linked an issue on Apr 16, 2024 that may be closed by this pull request
#[cfg(not(test))]
pub use LakeS3ClientImpl as LakeS3Client;
/// Number of files added to Near Lake S3 per hour
const CACHE_SIZE: usize = 18_000;
@morgsmccauley (author):

With 1 block produced per second: 60 seconds x 60 minutes x 5 files per block = 18,000 files.

So roughly caching 1 hour's worth of blocks?

@darunrs (collaborator):

I'm trying to think about how backfills would impact what the right cache size is. Backfill block requests might force out some of the cache so it ends up serving catch-ups instead, but I think a lot of them would need to occur simultaneously for that to happen. I think this should work in the main scenarios where I'd expect us to gain the most benefit:

  1. When Block Streamer starts up after a pause.
  2. When many streams are up to date.

@morgsmccauley (author):

Backfills use Delta Lake rather than Near Lake, so they won't actually use this cache. But after that, you're right, they will start to flood it.

We can definitely adjust this value, or the whole caching strategy, as we go :)
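Spelling out the CACHE_SIZE arithmetic from the thread above as constants (the 1 block per second rate is the stated assumption):

```rust
/// Assumes ~1 block produced per second and 5 files per block in Near Lake S3.
const FILES_PER_BLOCK: usize = 5;
const BLOCKS_PER_HOUR: usize = 60 * 60;
const CACHE_SIZE: usize = BLOCKS_PER_HOUR * FILES_PER_BLOCK; // 18_000, roughly 1 hour of blocks
```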

Base automatically changed from feat/count-lake-requests to main April 17, 2024 01:39
@darunrs (collaborator) left a review comment:

Super cool change!

morgsmccauley merged commit 29d853c into main on Apr 18, 2024 (4 checks passed)
morgsmccauley deleted the fix/duplicate-lake-requests branch on April 18, 2024 08:08
Successfully merging this pull request may close these issues:

Reduce duplicate Lake requests across dedicated streams