-
Notifications
You must be signed in to change notification settings - Fork 0
Home
Sakshi Sharma edited this page Sep 20, 2023
·
4 revisions
Thesis Plan
1. Abstract
- Brief overview of LSM trees and their importance in modern database systems.
- Significance of dynamic parameter optimization for LSM-based databases.
- Problem statement
- Submit thesis proposal
2. Literature Review
- Review existing literature on LSM-based databases, RocksDB, and parameter optimization techniques like MLOS.
3. Data Generation and Preprocessing (2 week)
- Develop a Python program to generate a representative database.
- Populate RocksDB
- Run workload
- Collect performance metrics
4. Parameter Identification (2 week)
- go through with options.h/advanced_options.h file to identify dynamically configurable parameters
- Implement a profiling mechanism to identify critical parameters that impact performance.
- Analyze the collected data to identify key parameters.
5. Interface design (2 weeks)
- the interface should run rocksdb in the background and collect performance metrics
- run another process parallely to dynamically tune parameters based on the performance metrics
5. Optimization Strategy Design (4 weeks)
- Define the optimization problem - cost model
- Develop an optimization strategy for dynamically adjusting critical parameters.
- Consider MLOS based approaches for adaptive parameter tuning.
- Create an objective function that takes parameters to be optimzed - size ratio, bloom filter settings
6. Parameter Optimization Implementation (3 weeks)
- Implement the designed optimization strategy into RocksDB.
- Develop an interface for real-time parameter adjustments.
- Ensure the system can self-optimize based on performance metrics.
- Plot graphs for performace - latency, throughput, disk space usage
7. Database Benchmarking Framework (3 weeks)
- Develop a comprehensive benchmarking framework - we need to focus on latency and throughput.
- Include various workloads, scalability tests, and performance metrics.
8. Experimental Evaluation (3 weeks)
- Conduct extensive experiments using the generated database.
- Compare the performance of the optimized database against default settings.