Sakshi Sharma edited this page Sep 20, 2023 · 4 revisions

Thesis Plan

1. Abstract

  • Brief overview of LSM trees and their importance in modern database systems.
  • Significance of dynamic parameter optimization for LSM-based databases.
  • Problem statement
  • Submit thesis proposal

2. Literature Review

  • Review existing literature on LSM-based databases, RocksDB, and parameter optimization techniques like MLOS.

3. Data Generation and Preprocessing (2 weeks)

  • Develop a Python program to generate a representative database.
  • Populate RocksDB
  • Run workload
  • Collect performance metrics
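A minimal sketch of the generation and workload steps above, under stated assumptions: a plain Python dict stands in for RocksDB, keys are chosen uniformly, and the function names (`generate_records`, `generate_workload`, `run_workload`) are hypothetical. The actual program would write to RocksDB through whichever binding is chosen and would likely use a skewed key distribution.

```python
import random
import struct
import time


def generate_records(num_records, value_size=64, seed=0):
    """Yield (key, value) byte pairs: fixed-width keys, pseudo-random values."""
    rng = random.Random(seed)
    for i in range(num_records):
        # 8-byte big-endian key, so byte order matches numeric order.
        key = struct.pack(">Q", i)
        value = bytes(rng.getrandbits(8) for _ in range(value_size))
        yield key, value


def generate_workload(num_ops, num_keys, read_fraction=0.5, seed=0):
    """Yield ("get" | "put", key) operations over uniformly chosen existing keys."""
    rng = random.Random(seed)
    for _ in range(num_ops):
        op = "get" if rng.random() < read_fraction else "put"
        yield op, struct.pack(">Q", rng.randrange(num_keys))


def run_workload(store, workload, value_size=64):
    """Apply the workload to a dict-like store; return op counts and elapsed time."""
    counts = {"get": 0, "put": 0}
    start = time.perf_counter()
    for op, key in workload:
        if op == "get":
            store.get(key)
        else:
            store[key] = bytes(value_size)  # overwrite with a zero-filled value
        counts[op] += 1
    counts["elapsed_s"] = time.perf_counter() - start
    return counts
```

For example, `store = dict(generate_records(1000))` followed by `run_workload(store, generate_workload(10_000, 1000, read_fraction=0.8))` populates the stand-in store and runs a read-heavy workload against it. Fixing the seed keeps runs repeatable, which matters when comparing parameter settings later.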

4. Parameter Identification (2 weeks)

  • Go through options.h and advanced_options.h to identify dynamically configurable parameters.
  • Implement a profiling mechanism to identify critical parameters that impact performance.
  • Analyze the collected data to identify key parameters.
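One possible way to analyze the collected data is sketched below. It is only an illustration: it assumes each run is recorded as a (configuration, metric) pair and ranks parameters by how much the mean metric changes across each parameter's values. The function name `rank_parameters` and the record layout are hypothetical, and this simple effect-size measure ignores interactions between parameters.

```python
from collections import defaultdict
from statistics import mean


def rank_parameters(samples):
    """Rank parameters by the spread of the mean metric across their values.

    samples: list of (config, metric) pairs, where config is a dict mapping
    parameter name to the value used in that run.
    Returns a list of (parameter, effect) pairs, largest effect first.
    """
    effects = {}
    for param in samples[0][0]:
        # Group the observed metric by the value this parameter took.
        by_value = defaultdict(list)
        for config, metric in samples:
            by_value[config[param]].append(metric)
        means = [mean(values) for values in by_value.values()]
        effects[param] = max(means) - min(means)
    return sorted(effects.items(), key=lambda item: item[1], reverse=True)
```

Parameters at the top of the ranking are candidates for dynamic tuning; parameters with a near-zero effect can be left at their defaults.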

5. Interface Design (2 weeks)

  • The interface should run RocksDB in the background and collect performance metrics.
  • Run a second process in parallel that dynamically tunes parameters based on those metrics.
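The two-part design above could look roughly like the following sketch. It makes several assumptions: threads stand in for the separate processes, a queue carries metrics from the store side to the tuner, and the single parameter `buffer_mb`, the fake latency formula, and the doubling rule are all hypothetical placeholders rather than RocksDB behavior.

```python
import queue
import threading


def run_store(metrics_q, params, lock, num_intervals):
    """Stand-in for the side that drives the database and reports metrics."""
    for i in range(num_intervals):
        with lock:
            buffer_mb = params["buffer_mb"]
        # Placeholder metric: pretend latency shrinks as the buffer grows.
        metrics_q.put({"interval": i, "latency_ms": 100.0 / buffer_mb})
    metrics_q.put(None)  # sentinel: no more metrics


def run_tuner(metrics_q, params, lock, target_ms, history):
    """Consume metrics and adjust the shared parameter when latency is too high."""
    while True:
        metric = metrics_q.get()
        if metric is None:
            break
        history.append(metric)
        if metric["latency_ms"] > target_ms:
            with lock:
                params["buffer_mb"] *= 2  # hypothetical tuning rule


def run_demo(num_intervals=5, target_ms=10.0):
    """Run both sides concurrently and return the metric history and final params."""
    params = {"buffer_mb": 1}
    lock = threading.Lock()
    metrics_q = queue.Queue()
    history = []
    store = threading.Thread(target=run_store, args=(metrics_q, params, lock, num_intervals))
    tuner = threading.Thread(target=run_tuner, args=(metrics_q, params, lock, target_ms, history))
    store.start()
    tuner.start()
    store.join()
    tuner.join()
    return history, params
```

Because the two sides run concurrently, the exact sequence of latencies depends on scheduling. In the real interface the tuner would apply changes to the running database instead of mutating a shared dict.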

5. Optimization Strategy Design (4 weeks)

  • Define the optimization problem - cost model
  • Develop an optimization strategy for dynamically adjusting critical parameters.
  • Consider MLOS-based approaches for adaptive parameter tuning.
  • Create an objective function over the parameters to be optimized, e.g. size ratio and Bloom filter settings.
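As a starting point for the cost model, the sketch below shows what such an objective function might look like. It is a deliberately simplified model, not the final design: it assumes leveled compaction, uses the standard Bloom filter false-positive approximation for an optimal number of hash functions, ignores constant factors and any memory budget, and all function names are hypothetical.

```python
import math


def num_levels(num_entries, buffer_entries, size_ratio):
    """Approximate number of LSM levels for a given size ratio between levels."""
    if num_entries <= buffer_entries:
        return 1
    return max(1, math.ceil(math.log(num_entries / buffer_entries, size_ratio)))


def bloom_fpr(bits_per_key):
    """False-positive rate of a Bloom filter with an optimal hash count."""
    return math.exp(-bits_per_key * math.log(2) ** 2)


def objective(size_ratio, bits_per_key, num_entries, buffer_entries, read_fraction):
    """Weighted cost of a workload; lower is better (simplified, leveling only)."""
    levels = num_levels(num_entries, buffer_entries, size_ratio)
    # Expected wasted probes for a lookup of a missing key: one filter per level.
    lookup_cost = levels * bloom_fpr(bits_per_key)
    # Each entry is rewritten roughly size_ratio times per level (up to constants).
    write_cost = size_ratio * levels
    return read_fraction * lookup_cost + (1.0 - read_fraction) * write_cost


def best_size_ratio(candidates, bits_per_key, num_entries, buffer_entries, read_fraction):
    """Pick the candidate size ratio with the lowest objective value (grid search)."""
    return min(
        candidates,
        key=lambda t: objective(t, bits_per_key, num_entries, buffer_entries, read_fraction),
    )
```

In this simplified form more filter bits always lower the cost, so a realistic model would also need a memory constraint that trades filter memory against buffer memory. The grid search is a placeholder for whichever optimizer is eventually used.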

6. Parameter Optimization Implementation (3 weeks)

  • Implement the designed optimization strategy into RocksDB.
  • Develop an interface for real-time parameter adjustments.
  • Ensure the system can self-optimize based on performance metrics.
  • Plot graphs of performance: latency, throughput, and disk space usage.

7. Database Benchmarking Framework (3 weeks)

  • Develop a comprehensive benchmarking framework focused on latency and throughput.
  • Include various workloads, scalability tests, and performance metrics.
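The core of such a framework might be a small timing harness like the sketch below. It assumes each operation is a Python callable timed individually, and it reports throughput plus mean, median, and 99th-percentile latency using the nearest-rank method. The names `benchmark`, `summarize`, and `percentile` are hypothetical.

```python
import math
import time
from statistics import mean


def percentile(sorted_values, q):
    """Nearest-rank percentile of an ascending list, for q in (0, 100]."""
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies, elapsed_s):
    """Reduce per-operation latencies (seconds) to headline metrics."""
    ordered = sorted(latencies)
    ops = len(ordered)
    return {
        "ops": ops,
        "throughput_ops_per_s": ops / elapsed_s if elapsed_s > 0 else float("inf"),
        "mean_latency_s": mean(ordered),
        "p50_latency_s": percentile(ordered, 50),
        "p99_latency_s": percentile(ordered, 99),
    }


def benchmark(operation, inputs):
    """Time operation(item) for every item and return summary metrics."""
    latencies = []
    start = time.perf_counter()
    for item in inputs:
        t0 = time.perf_counter()
        operation(item)
        latencies.append(time.perf_counter() - t0)
    return summarize(latencies, time.perf_counter() - start)
```

Reporting tail latency alongside the mean is a deliberate choice here, since compaction stalls in an LSM tree tend to show up in the tail rather than in the average.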

8. Experimental Evaluation (3 weeks)

  • Conduct extensive experiments using the generated database.
  • Compare the performance of the optimized database against default settings.