Optimize Runner Streamer Message Acquisition #204
Comments
darunrs changed the title from "Store Streamer Messages in Redis" to "Cache Streamer Messages in Redis" on Oct 2, 2023
darunrs added a commit that referenced this issue on Oct 5, 2023:
The streamer message is used by both the coordinator and the runner, but both currently pull it from S3, which carries a large latency cost. To improve this, the streamer message is now cached in Redis with a TTL and pulled by the runner from Redis; only on a cache miss does the runner fall back to S3.

Pulling from S3 currently takes 200-500ms, which is roughly 80-85% of a function's overall execution time in the runner. With caching, a cache hit loads the data in 1-3ms in local testing, which corresponds to about 3-5% of the execution time, or a 1100% improvement in latency. Reducing network-related activity to a much smaller share of execution time also greatly reduces the variability of a function's execution time. Cache hits and misses will be logged for further tuning of the TTL to reduce cache misses.

In addition, processing the block takes around 1-3ms. This processing has been moved to before caching, saving an extra 1-3ms each time that block is read from cache. That improvement will be important for historical backfill, which is planned to be optimized soon.

Tracking Issue: #262
Parent Issue: #204
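As a rough illustration of the read path described in this commit, here is a minimal cache-aside sketch in TypeScript. The cache key format, bucket name, object layout, and TTL are assumptions for illustration (the real streamer message is assembled from the block plus its shards), not the actual runner implementation:

```typescript
import Redis from 'ioredis';
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';

const redis = new Redis();
const s3 = new S3Client({});
const STREAMER_MESSAGE_TTL_SECONDS = 60; // assumed TTL

async function getStreamerMessage(blockHeight: number): Promise<any> {
  const cacheKey = `streamer:block:${blockHeight}`; // hypothetical key format

  // Cache hit: the pre-processed message is already in Redis.
  const cached = await redis.get(cacheKey);
  if (cached !== null) {
    return JSON.parse(cached);
  }

  // Cache miss: fall back to S3, then repopulate the cache with a short TTL.
  const response = await s3.send(new GetObjectCommand({
    Bucket: 'near-lake-data-mainnet', // assumed bucket name
    Key: `${blockHeight.toString().padStart(12, '0')}/block.json`, // assumed key layout
  }));
  const message = JSON.parse(await response.Body!.transformToString());
  await redis.set(cacheKey, JSON.stringify(message), 'EX', STREAMER_MESSAGE_TTL_SECONDS);
  return message;
}
```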
darunrs changed the title from "Cache Streamer Messages in Redis" to "Optimize Runner Streamer Message Acquisition" on Oct 5, 2023
darunrs added a commit that referenced this issue on Oct 30, 2023 (same commit message as above).
Description
Currently the Indexer Runner fetches and constructs the Streamer Message from S3. Instead, we can store real-time messages in Redis to avoid duplicate requests and the associated latency cost. The message can have a short TTL to avoid consuming too much memory, and we can fall back to fetching from S3 if it doesn't exist in the cache. Processing of the streamer message, such as the call to renameUnderscoreFieldsToCamelCase, should be done before caching to further reduce latency.
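A minimal sketch of the write side under these assumptions: the block is pre-processed once (e.g. the underscore-to-camelCase conversion), and the already-processed JSON is what gets cached with a short TTL. The key format, TTL value, and the simplified helper below are illustrative stand-ins, not the runner's actual code:

```typescript
import Redis from 'ioredis';

const redis = new Redis();
const STREAMER_MESSAGE_TTL_SECONDS = 60; // short TTL to bound memory usage; value is illustrative

// Simplified stand-in for the runner's renameUnderscoreFieldsToCamelCase helper.
function renameUnderscoreFieldsToCamelCase(value: any): any {
  if (Array.isArray(value)) return value.map(renameUnderscoreFieldsToCamelCase);
  if (value && typeof value === 'object') {
    return Object.fromEntries(Object.entries(value).map(([k, v]) => [
      k.replace(/_([a-z])/g, (_, c) => c.toUpperCase()),
      renameUnderscoreFieldsToCamelCase(v),
    ]));
  }
  return value;
}

// Process once, then cache the processed form so readers skip the 1-3ms of processing.
async function cacheStreamerMessage(blockHeight: number, rawMessage: object): Promise<void> {
  const processed = renameUnderscoreFieldsToCamelCase(rawMessage);
  await redis.set(
    `streamer:block:${blockHeight}`, // hypothetical key format
    JSON.stringify(processed),
    'EX',
    STREAMER_MESSAGE_TTL_SECONDS,
  );
}
```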
Tasks
- Publish cache hits/misses to Prometheus so we can fine-tune the TTL (a metrics sketch follows this list).
- Upgrade the AWS SDK from v2, which is slated for deprecation.
- Pre-fetch historical blocks from S3 during historical processing.
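For the Prometheus task above, a minimal sketch using prom-client; the metric names and the exposed helpers are placeholders, not the runner's actual metric schema:

```typescript
import { Counter, register } from 'prom-client';

// Hypothetical metric names; the real runner may expose different ones.
const cacheHits = new Counter({
  name: 'queryapi_streamer_message_cache_hits_total',
  help: 'Streamer messages served from the Redis cache',
});
const cacheMisses = new Counter({
  name: 'queryapi_streamer_message_cache_misses_total',
  help: 'Streamer messages that fell back to S3',
});

// Call these from the read path; the hit/miss ratio guides TTL tuning.
export function recordCacheHit(): void { cacheHits.inc(); }
export function recordCacheMiss(): void { cacheMisses.inc(); }

// Expose the default registry, e.g. from the service's metrics HTTP handler.
export async function metricsText(): Promise<string> {
  return register.metrics();
}
```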