Client-side data caching / optimisation

Client-side data caching and optimisation for web apps, built using Test Driven Development.

Analysis

Cache optimisation works at multiple levels. It can be as simple as reducing seek time by moving data from network to disk to RAM, or as challenging as anticipating future data use, especially when access patterns are naturally random.

Done well, caching improves performance by raising the ratio of cache hits to cache misses. Client-server apps with a high volume of read/write requests on shared data, e.g. social networking apps, stock trading, multi-player online games, present exactly this opportunity. Optimising such high-volume read/write workloads with advanced caching is the focus of this project.

Design

  • Reduce seek time by caching network files to (1) RAM and (2) disk.
  • Cache Priority Index: anticipate future usage by observing usage history to generate a ‘cache priority index’, calculated from multiple indicators such as access count, last-accessed time, first-accessed time, etc.
  • A dictionary provides quick access to cached items; it can later be optimised to a file with saved indexes (a minimal sketch follows this list).
  • Test Driven Development needs a way to demonstrate cache performance: use mocking, etc.
  • A pluggable design keeps a clean separation between the app and this library, e.g. adapters over the network and file-access classes.
  • Cache performance (%) can be measured as cache-hits / total-requests (or, inversely, tracked as cache-misses / total-requests).
  • A priority heap as the underlying data structure could help optimise dynamic re-shuffling of items by ‘cache priority index’; TDD makes it safe to introduce this optimisation later.
  • The cache is I/O bound, so concurrency is welcome; this implies a request queue (producer) and a thread pool (consumers), see the sketch under ‘Measuring concurrency performance’ below.
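
The dictionary, priority-index and priority-heap bullets above could look roughly like the minimal Python sketch below. The class name, the capacity default, the scoring rule and the eviction policy are illustrative assumptions, not this repository's actual code:

```python
import heapq
import time

class PriorityCache:
    """Dictionary for O(1) lookup plus a heap ordered by a 'cache priority index'.
    The lowest-priority item is evicted first when the cache is full."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self.items = {}   # key -> cached value
        self.stats = {}   # key -> (access_count, last_accessed)

    def _priority(self, key):
        count, last = self.stats[key]
        # Assumed scoring rule: frequently and recently used items rank higher.
        return count + 1.0 / (time.time() - last + 1.0)

    def get(self, key):
        if key not in self.items:
            return None                       # cache miss
        count, _ = self.stats[key]
        self.stats[key] = (count + 1, time.time())
        return self.items[key]                # cache hit

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            self._evict()
        self.items[key] = value
        self.stats.setdefault(key, (0, time.time()))

    def _evict(self):
        # Rebuild a heap from current priorities and pop the lowest-priority item.
        heap = [(self._priority(k), k) for k in self.items]
        heapq.heapify(heap)
        _, victim = heapq.heappop(heap)
        del self.items[victim]
        del self.stats[victim]
```

The heap here is rebuilt only on eviction; a persistent heap that is re-shuffled incrementally is the optimisation mentioned in the design notes.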

Measuring cache performance (network cache specifically)

Make m repeat requests for each of n files. For each file i, measure ci = cache-hits / total-requests. The overall average cache performance is then C = (c1 + c2 + … + cn) / n.
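
As a rough illustration of this measurement, assuming a cache with get/put methods like the sketch above; the function name and the caller-supplied fetch callback are placeholders:

```python
def measure_cache_performance(cache, file_names, m, fetch):
    """Make m repeat requests per file; ci = cache-hits / total-requests for
    file i, and C is the average of all ci. 'fetch' loads a file on a miss."""
    ratios = []
    for name in file_names:
        hits = 0
        for _ in range(m):
            if cache.get(name) is not None:
                hits += 1                        # cache hit
            else:
                cache.put(name, fetch(name))     # cache miss: fetch and store
        ratios.append(hits / m)                  # ci for this file
    return sum(ratios) / len(ratios)             # overall average C
```

With a cold cache, the first request per file is a miss, so each ci approaches (m - 1) / m once all n files fit in the cache.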

Measuring concurrency performance

Make n (say 1000) requests and measure the total elapsed time. With a pool of around 100 workers it should typically be about n/100 of the sequential time, and it should at least beat n/c (where c is the number of cores), since the cache is I/O bound rather than CPU bound.
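
A hedged sketch of the producer/consumer setup from the design notes, assuming a thread-safe cache exposing a get method (the PriorityCache sketch above would need a lock around its internals before being used this way):

```python
import queue
import threading
import time

def run_concurrent_requests(cache, keys, workers=100):
    """Producer fills a request queue; a pool of worker threads consumes it.
    Returns the total elapsed time for all requests."""
    requests = queue.Queue()
    for key in keys:
        requests.put(key)                    # producer

    def consumer():
        while True:
            try:
                key = requests.get_nowait()  # consumer takes the next request
            except queue.Empty:
                return
            cache.get(key)                   # I/O-bound work overlaps across threads

    start = time.time()
    threads = [threading.Thread(target=consumer) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start
```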

Partial file caching under dynamic network behaviour (high speed, to no speed, to disconnected) (feature planned for later)

Cache priority index algorithm: what to cache and when

Predicting future need from past needs:

  • greedy heuristics (the more an item was required in the past, the more it will be required in the future; see the sketch below)
  • machine learning / deep learning

Predicting future needs from observed application usage patterns.
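
A small illustration of the greedy heuristic, using the indicators from the ‘cache priority index’ bullet in the Design section; the function name and the 0.5/0.5 weights are made-up assumptions to be tuned:

```python
import time

def greedy_priority_index(access_count, first_accessed, last_accessed, now=None):
    """Greedy heuristic: the more (and more recently) an item was needed in the
    past, the more likely it is to be needed in the future."""
    now = now if now is not None else time.time()
    recency = 1.0 / (now - last_accessed + 1.0)    # higher if used recently
    lifetime = max(now - first_accessed, 1.0)
    frequency = access_count / lifetime            # accesses per second so far
    return 0.5 * frequency + 0.5 * recency         # tunable weights
```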

Product Backlog Items

Similar to a WhatsApp user list or a Twitter follower/following list: stream a list of 500 user profiles, each containing info text, a small photo and a big photo.

Absorb network issues (speed changes, connection time-outs, re-connections) using caching.

On app close and re-open, the cache shall resume by being persisted to disk.
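
A minimal sketch of that disk persistence, assuming the items/stats dictionaries from the earlier PriorityCache sketch; the function names and the cache.dat path are placeholders:

```python
import pickle

def save_cache(cache, path="cache.dat"):
    """Persist cached items and usage stats so the app can resume after restart."""
    with open(path, "wb") as f:
        pickle.dump({"items": cache.items, "stats": cache.stats}, f)

def load_cache(cache, path="cache.dat"):
    """Restore a previously saved cache; do nothing on a first run."""
    try:
        with open(path, "rb") as f:
            state = pickle.load(f)
        cache.items, cache.stats = state["items"], state["stats"]
    except FileNotFoundError:
        pass  # no saved cache yet
```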
