Optimize Runner Streamer Message Acquisition #204

Closed
3 tasks done
Ishatt opened this issue Sep 26, 2023 · 1 comment

Ishatt commented Sep 26, 2023

Description

Currently the Indexer Runner fetches and constructs the Streamer Message from S3. Instead, we can store real-time messages in Redis to avoid the duplicate requests and their latency cost. The message can have a short TTL to avoid consuming too much memory, and we can fall back to fetching the message from S3 if it doesn't exist in the cache. Processing of the streamer message, such as the call to renameUnderscoreFieldsToCamelCase, should be done before caching to further reduce latency.

Publish cache hits/misses to Prometheus so we can fine-tune the TTL.

Upgrade the AWS SDK from v2, which is slated for deprecation.

Pre-fetch historical blocks from S3 during historical processing.
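
A rough sketch of the intended read path, assuming an ioredis client and prom-client counters; fetchStreamerMessageFromS3 is a hypothetical stand-in for the runner's existing S3 fetch, renameUnderscoreFieldsToCamelCase stands in for the existing post-processing helper, and the key format, metric names, and TTL are illustrative:

```typescript
import Redis from 'ioredis';
import { Counter } from 'prom-client';

// Hypothetical stand-ins for existing runner code (names are illustrative):
declare function fetchStreamerMessageFromS3 (blockHeight: number): Promise<object>;
declare function renameUnderscoreFieldsToCamelCase (message: object): object;

const redis = new Redis();

// Published to Prometheus so the TTL can be tuned against the observed hit rate.
const cacheHits = new Counter({ name: 'streamer_message_cache_hits_total', help: 'Streamer message cache hits' });
const cacheMisses = new Counter({ name: 'streamer_message_cache_misses_total', help: 'Streamer message cache misses' });

const STREAMER_MESSAGE_TTL_SECONDS = 60; // short TTL so the cache does not consume too much memory

async function getStreamerMessage (blockHeight: number): Promise<object> {
  const key = `streamer:message:${blockHeight}`;

  const cached = await redis.get(key);
  if (cached !== null) {
    cacheHits.inc();
    return JSON.parse(cached); // already processed before it was cached
  }

  cacheMisses.inc();
  // Fall back to S3, process once, then cache so later reads skip both steps.
  const raw = await fetchStreamerMessageFromS3(blockHeight);
  const processed = renameUnderscoreFieldsToCamelCase(raw);
  await redis.set(key, JSON.stringify(processed), 'EX', STREAMER_MESSAGE_TTL_SECONDS);
  return processed;
}
```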

Tasks

  (Subtask list: 7 of 7 complete; assigned to darunrs)

darunrs commented Sep 27, 2023

PR: #241
Ignore. PR links are found in the subtasks.

darunrs linked a pull request on Sep 27, 2023 that will close this issue
darunrs changed the title from "Store Streamer Messages in Redis" to "Cache Streamer Messages in Redis" on Oct 2, 2023
darunrs added a commit that referenced this issue Oct 5, 2023
The streamer message is used by both the coordinator and runner. However, both currently poll the message from S3, which has a large latency impact. To improve this, the streamer message will now be cached in Redis with a TTL and pulled by runner from Redis. Only on a cache miss will runner pull from S3 again.

Pulling from S3 currently takes 200-500ms, which is roughly 80-85% of the overall execution time of a function in runner. With caching, a cache hit loads the data in 1-3ms in local testing, which corresponds to about 3-5% of the execution time, an improvement in latency of roughly 1100%.

Reducing network-related activity to a much smaller share of execution time also greatly reduces the variability of a function's execution time. Cache hits and misses will be logged so the TTL can be tuned to reduce misses. In addition, processing the block takes around 1-3ms; this processing has been moved to before caching, saving an extra 1-3ms each time the block is read from cache. That improvement will matter for historical backfill, which is planned to be optimized soon.

Tracking Issue: #262
Parent Issue: #204
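
A minimal sketch of the write-side ordering described above, i.e. processing the block before caching it. The Redis client usage, key format, and function names are assumptions; renameUnderscoreFieldsToCamelCase stands in for the existing helper:

```typescript
import Redis from 'ioredis';

// Hypothetical stand-in for the existing post-processing helper:
declare function renameUnderscoreFieldsToCamelCase (message: object): object;

// Process the block once, before it is cached, so the ~1-3ms of renaming is
// paid on the write path only and never again on a cache hit.
async function cacheStreamerMessage (
  redis: Redis,
  blockHeight: number,
  rawMessage: object,
  ttlSeconds: number
): Promise<void> {
  const processed = renameUnderscoreFieldsToCamelCase(rawMessage);
  await redis.set(`streamer:message:${blockHeight}`, JSON.stringify(processed), 'EX', ttlSeconds);
}
```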
darunrs changed the title from "Cache Streamer Messages in Redis" to "Optimize Runner Streamer Message Acquisition" on Oct 5, 2023
darunrs added a commit that referenced this issue Oct 30, 2023