This repository has been archived by the owner on Sep 27, 2019. It is now read-only.

[15721]Index Tuning with RL #1338

Open
wants to merge 393 commits into
base: master
393 commits
b5ba92a
Merge branch 'temp' of github.com:Blade-Lee/peloton into temp
Blade-Lee Apr 28, 2018
a5d7bec
optimized offset computation
Blade-Lee Apr 28, 2018
ff78367
added offset_to_index mapping, added test cases
Blade-Lee Apr 28, 2018
3b649c1
Merge branch 'master' of https://github.com/Blade-Lee/peloton into br…
saatviks May 1, 2018
3f5b3b5
Merge remote-tracking branch 'origin/temp' into brain_rl_testing_fram…
saatviks May 1, 2018
4e5932d
added comments
Blade-Lee May 1, 2018
6d65979
Adding Eigen components
saatviks May 1, 2018
a014177
changed to unique_ptr
Blade-Lee May 1, 2018
91088bc
Merge branch 'brain_rl_testing_framework' of https://github.com/Blade…
saatviks May 1, 2018
039eac6
Merge branch 'brain_rl_testing_framework' of https://github.com/Blade…
saatviks May 1, 2018
a6b93f6
Extra utility functions
saatviks May 1, 2018
d0fbf35
completed AddCandidates()
Blade-Lee May 1, 2018
825df53
fixed conflicts
Blade-Lee May 1, 2018
3fa965d
fixed AddIndex API issue
Blade-Lee May 1, 2018
a299a40
added DropCandidates()
Blade-Lee May 1, 2018
da99415
added tests for add/drop candidates (not finished)
Blade-Lee May 1, 2018
2f7818f
Begin LSPI Index Tuning Components
saatviks May 1, 2018
67fd803
Merge branch 'brain_rl_testing_framework' of https://github.com/Blade…
saatviks May 1, 2018
2ed594f
fixed AddCandidates() bug
Blade-Lee May 1, 2018
7c7c80a
Function template modifications
saatviks May 1, 2018
2bb4c42
Merge branch 'brain_rl_testing_framework' of https://github.com/Blade…
saatviks May 1, 2018
c7963f6
Merge branch 'brain_rl_testing_framework' of https://github.com/Blade…
saatviks May 1, 2018
c962b95
added strings to index_object function
Blade-Lee May 1, 2018
4ef9924
merge fix conflicts
Blade-Lee May 1, 2018
f49a106
Merge branch 'brain_rl_testing_framework' of https://github.com/Blade…
saatviks May 1, 2018
87b087f
Merge branch 'brain_rl_testing_framework' of https://github.com/Blade…
saatviks May 1, 2018
166cc22
completed tests for add/drop candidates
Blade-Lee May 1, 2018
5c9fa19
finished ignore_primary implementation in plan_util & tests
Blade-Lee May 2, 2018
bef78bc
Feature constructors
saatviks May 2, 2018
8113171
Merge branch 'brain_rl_testing_framework' of https://github.com/Blade…
saatviks May 2, 2018
77a1811
renamed as CompressedIndexConfigContainer
Blade-Lee May 3, 2018
8c35551
separate CompressedIndexConfigContainer from CompressedIndexConfigMan…
Blade-Lee May 3, 2018
7269730
Manager to Util class conversion + Further decoupling
saatviks May 3, 2018
1300bb9
renamed AddIndex->SetBit, RemoveIndex->UnsetBit
Blade-Lee May 3, 2018
27de70a
Optimal config search
saatviks May 3, 2018
34f60ae
added AdjustIndexes()
Blade-Lee May 3, 2018
975bf17
merged with Saatvik's code
Blade-Lee May 3, 2018
1df6bce
renamed to CompressedIdxConfigTest
Blade-Lee May 3, 2018
72ba3fa
Corrections + Value fn addition
saatviks May 3, 2018
5e407fd
added TunerTest
Blade-Lee May 3, 2018
7203775
merged with Saatvik's code
Blade-Lee May 3, 2018
3f4ba9d
fixed AdjustIndexes bug
Blade-Lee May 3, 2018
e3c2723
Formatting + Minor bug fixes
saatviks May 4, 2018
6cdb99a
fixed AdjustIndexes bug
Blade-Lee May 5, 2018
d18033d
added the files for cost evaluation
pbollimp Mar 29, 2018
5fdadea
llvm for mac
vkonagar Mar 29, 2018
ec6c94b
Basic classes
sivaprasadsudhir Mar 30, 2018
492b95f
added the configuration enumeration files
pbollimp Mar 30, 2018
8410136
Add Whatif API
vkonagar Mar 30, 2018
96eadf4
Add optimizer cost query func skeleton
vkonagar Mar 30, 2018
9087931
Complete what if API implementation. Testing pending.
vkonagar Apr 5, 2018
0908588
Ignore query planning
vkonagar Apr 5, 2018
5e2cbff
Analyze tables was missing. Fixed it
vkonagar Apr 6, 2018
fcfe058
fix the query
vkonagar Apr 6, 2018
04e49f8
add comments, fix some code style
vkonagar Apr 6, 2018
d62462b
Fix whatif API test
vkonagar Apr 8, 2018
2e19c1c
run formatter
sivaprasadsudhir Apr 8, 2018
ac653aa
Add index selection module skeleton
vkonagar Apr 9, 2018
4d44009
skeleton for admissible column parsing
vkonagar Apr 9, 2018
371fd38
adding cost model classes
sivaprasadsudhir Apr 9, 2018
c23cc36
cleanup and reorganize the code
sivaprasadsudhir Apr 10, 2018
4d694ec
Intermediate changes. Query parser not complete.
vkonagar Apr 10, 2018
a51fe84
Intermediate changes. Query parser not complete.
vkonagar Apr 10, 2018
d043128
removed cost model class
sivaprasadsudhir Apr 11, 2018
32f9040
Add IndexObject Pool
vkonagar Apr 11, 2018
324e430
Memoization support completed
sivaprasadsudhir Apr 11, 2018
5978d32
Complete query parser
vkonagar Apr 11, 2018
a24ded7
Complete query parser
vkonagar Apr 11, 2018
11bc159
multi column index, wip
sivaprasadsudhir Apr 11, 2018
e0cac79
Add tests for admissible indexes
vkonagar Apr 11, 2018
83c1b44
Fix what if index and admissive indexes test
vkonagar Apr 11, 2018
1e5925c
added outline for naive enumeration method
pbollimp Apr 11, 2018
4b463dc
Fix get admissible indexes test
vkonagar Apr 11, 2018
96a41b1
Fix get admissible indexes test
vkonagar Apr 11, 2018
12a343a
Added the IndexConfiguration set difference
pbollimp Apr 11, 2018
e98461a
Minor BUg Fix
sivaprasadsudhir Apr 11, 2018
1ec6f55
Split computing and getting const
sivaprasadsudhir Apr 11, 2018
d23d0dc
Fix compilation error and typos
vkonagar Apr 11, 2018
a94cac9
Finish Configuration Enumeration module
pbollimp Apr 11, 2018
11adba0
Fix the main index selection algorithm
vkonagar Apr 11, 2018
4c8dce7
Finish Merging
pbollimp Apr 12, 2018
6f67e0c
Merge
vkonagar Apr 12, 2018
aa63a5f
cleanup
sivaprasadsudhir Apr 12, 2018
f8a8180
Restructure code
vkonagar Apr 12, 2018
b619333
More refactoring
vkonagar Apr 12, 2018
d01d018
added comments to index selection context
sivaprasadsudhir Apr 12, 2018
d9d0cfc
Added the comparator for the candidate index enumeration
pbollimp Apr 12, 2018
d984e89
Adding comments
pbollimp Apr 12, 2018
11fdce2
Restructure generate candidate indexes
vkonagar Apr 12, 2018
afa1582
Fix merge
vkonagar Apr 12, 2018
3178695
partial test for multi columnindex generation
sivaprasadsudhir Apr 12, 2018
5f4a822
Add candidate index gen test
vkonagar Apr 12, 2018
fd2de46
Minor change to ComputeCost. Formatting and comments.
pbollimp Apr 12, 2018
3db49a7
Add comments
vkonagar Apr 12, 2018
b7c4f9c
comments
sivaprasadsudhir Apr 12, 2018
756ecb8
More formatting and comments.
pbollimp Apr 12, 2018
0d336d0
more comments
vkonagar Apr 12, 2018
f58cf77
brief comments.
pbollimp Apr 12, 2018
213a351
rename pl_assert to peloton_assert
sivaprasadsudhir Apr 12, 2018
e846956
Remove GetCost and rename ComputeCost to GetCost
pbollimp Apr 12, 2018
85705dd
fix multicolumnindex generation
sivaprasadsudhir Apr 12, 2018
920083a
minor fixes
sivaprasadsudhir Apr 12, 2018
93b2214
Fix admissible index and candidate pruning tests
vkonagar Apr 13, 2018
e3b43d0
Fix unused variables
vkonagar Apr 13, 2018
c907ef3
Add more tests to WhatIfAPI and IndexSelection
vkonagar Apr 16, 2018
342f6a3
Implement the suggestions mentioned in the code review
vkonagar Apr 16, 2018
c54f4e0
Uncomment the choose best plan call
vkonagar Apr 16, 2018
39259fb
Fix tests
vkonagar Apr 23, 2018
f323ed9
Add support for multi-column index
chenboy Apr 1, 2018
6330ab6
Fix conflicts after merge
chenboy May 2, 2018
b291f58
nit fixes
sivaprasadsudhir May 3, 2018
f4ce787
Fix what-if index tests
vkonagar May 4, 2018
c6915f7
Add more multi-column index sets in the test cases.
vkonagar May 4, 2018
49b95df
Add testing utility class for index suggestion tests
vkonagar May 4, 2018
a6da36d
Add to cmake for the files in the previous commit
vkonagar May 4, 2018
01c994e
Modify what-if tests to use the utility class
vkonagar May 4, 2018
e1dad43
Fix formatting
vkonagar May 4, 2018
90e7d65
Code review fix
vkonagar May 4, 2018
57c1c83
fix tests
sivaprasadsudhir May 4, 2018
4b4e256
nit
sivaprasadsudhir May 4, 2018
61786ae
Fix memory leaks and misc nit fixes
vkonagar May 5, 2018
fa1dbba
fixed the test temportarily for the index bug
sivaprasadsudhir May 5, 2018
6bbaa94
Rename IndexObject to HypotheticalIndexObject
vkonagar May 5, 2018
5591755
debugging the shared pointer issue
sivaprasadsudhir May 5, 2018
5d0d2b8
Fix segfault. Some more Renames
vkonagar May 5, 2018
28e818b
check the exact indexes
sivaprasadsudhir May 5, 2018
8fd0bf4
Fix the tests to use the util
vkonagar May 5, 2018
3f394f7
fixing the index selection
sivaprasadsudhir May 5, 2018
8f1b897
Fix formatting
vkonagar May 5, 2018
40576fe
Rebase and fix conflicts while rebasing
vkonagar May 5, 2018
10843ca
latest tests
sivaprasadsudhir May 5, 2018
3085a58
Better tests
sivaprasadsudhir May 6, 2018
1e9b959
Add get workload support to the testing utility class.
vkonagar May 6, 2018
55354b9
Fix stray
vkonagar May 6, 2018
96f500b
Comment out the debug code in optimizer
vkonagar May 6, 2018
eb3da24
Add index suggestion task skeleton
vkonagar May 7, 2018
2657e76
Add query history catalog GET methods.
vkonagar May 7, 2018
a564372
Fix formatting
vkonagar May 7, 2018
9f5bdc5
Update index suggestion task
vkonagar May 8, 2018
e290797
Add new workload
vkonagar May 8, 2018
57955b4
Add new test - incomplete
vkonagar May 8, 2018
ecec9ce
Add more than 3 columns cost model test
vkonagar May 8, 2018
4e3370c
Fix join query parsing for table name extraction
vkonagar May 8, 2018
818c583
Add more queries to workload D
vkonagar May 8, 2018
e4865c4
DEBUG -> TRACE
vkonagar May 8, 2018
53c1101
Changed the columns from a set to vector
sivaprasadsudhir May 8, 2018
ae3e26b
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
sivaprasadsudhir May 8, 2018
7152d46
Fix compilation error
vkonagar May 8, 2018
0062cc5
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
sivaprasadsudhir May 8, 2018
fee2bea
Complete the index suggestion task - RPC is pending.
vkonagar May 8, 2018
4642b34
Merge remote-tracking branch 'origin/auto_index' into auto_index
vkonagar May 8, 2018
490677f
Get args at RPC handler
vkonagar May 8, 2018
51d7f56
Refactored the tests
sivaprasadsudhir May 8, 2018
fc0d60e
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
sivaprasadsudhir May 8, 2018
a48e085
Fix compilation issue and list serialization
vkonagar May 8, 2018
a3ac507
Merge remote-tracking branch 'origin/auto_index' into auto_index
vkonagar May 8, 2018
f6b18d0
Complete RPC handler
vkonagar May 8, 2018
eb5239f
fix logs
sivaprasadsudhir May 8, 2018
693516b
Fix compilation error in peloton-bin
vkonagar May 8, 2018
6017790
Merge remote-tracking branch 'origin/auto_index' into auto_index
vkonagar May 8, 2018
b024304
Add dropIndex RPC
vkonagar May 9, 2018
8b2169c
run brain and server together in one process for testing
sivaprasadsudhir May 9, 2018
f718511
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
sivaprasadsudhir May 9, 2018
8639124
MOved tunable knobs into a separate structure
sivaprasadsudhir May 9, 2018
3a5227a
changed the arguments of the constructor
sivaprasadsudhir May 9, 2018
aeabd94
completed the refactor
sivaprasadsudhir May 9, 2018
7ee9b0f
Fix index selection job -- rename some stuff
vkonagar May 9, 2018
99be940
Merge branch 'auto_index' of github.com:sivaprasadsudhir/peloton into…
vkonagar May 9, 2018
1e3cd9c
minor style changes
sivaprasadsudhir May 9, 2018
bd4593b
Rename more stuff
vkonagar May 9, 2018
5fe0108
Merge remote-tracking branch 'origin/auto_index' into auto_index
vkonagar May 9, 2018
a8af555
More renames
vkonagar May 9, 2018
273b89b
Fix DML statement handling in workload
vkonagar May 9, 2018
7091c7f
Fix cost model bug for more than 2 column indexes
vkonagar May 9, 2018
67ff655
Add an extensive test on multi-column optimizer cost model test
vkonagar May 9, 2018
51139e6
concrete test case to show the issues with non-deterministic set of i…
sivaprasadsudhir May 9, 2018
f9b2c5e
Add drop indexes RPC
vkonagar May 9, 2018
cb8d209
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
sivaprasadsudhir May 9, 2018
3c3559e
Run formatter
vkonagar May 9, 2018
2da21af
Merge remote-tracking branch 'origin/auto_index' into auto_index
vkonagar May 9, 2018
71d4213
Fix drop indexes
vkonagar May 9, 2018
17ee0be
merged with master, but can't pass even plan_util_test
Blade-Lee May 9, 2018
59e20dc
merged with siva's auto_index branch
Blade-Lee May 10, 2018
69d6c2f
passed plan_util_test
Blade-Lee May 10, 2018
7d6fc37
Fix a bug in config enumeration for case where no index is better
pbollimp May 10, 2018
ec4951c
now passing plan_util_test, compressed_idx_config_test and lspi_test
Blade-Lee May 10, 2018
6d48e80
Fix formatter issue
vkonagar May 10, 2018
d22b7bb
Merge remote-tracking branch 'origin/auto_index' into auto_index
vkonagar May 10, 2018
1060627
Fix travis error
vkonagar May 10, 2018
0b12801
Fix the test that is failing non-deteministically due to the optimize…
pbollimp May 10, 2018
5029ed1
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
pbollimp May 10, 2018
1e31d2a
Use only one transaction for the entire run of the job. Also, generat…
vkonagar May 10, 2018
8b937da
hopefully, final version of the algorithm
sivaprasadsudhir May 11, 2018
b8e2afa
CompressedIndexRepresentation changes for MultiCol order issue
saatviks May 10, 2018
f8262cd
added multiple choices for the output
sivaprasadsudhir May 11, 2018
f4bca42
more index selection tests
sivaprasadsudhir May 11, 2018
4c37855
Add missing populate index
vkonagar May 11, 2018
38757ac
Consider non-equality predicates for index scan in the cost model
chenboy May 10, 2018
4792d91
Drop the indexes only if it is not suggested this time
vkonagar May 11, 2018
5460082
fixed precision issues
sivaprasadsudhir May 11, 2018
3b757f1
Merge branch 'auto_index' of https://github.com/sivaprasadsudhir/pelo…
sivaprasadsudhir May 11, 2018
6624be9
Addressing Review comments
saatviks May 11, 2018
c02c7ec
removed redundant headers
Blade-Lee May 11, 2018
2f370ce
Merge remote-tracking branch 'origin/brain_rl_testing_framework' into…
Blade-Lee May 11, 2018
5c3b188
code change according to PR's comments
Blade-Lee May 11, 2018
b64cacf
added TODOs
Blade-Lee May 11, 2018
b7373e5
using TestingIndexSuggestionUtil for the workload now
Blade-Lee May 12, 2018
461b3df
added what_if API to get cost
Blade-Lee May 12, 2018
8bc5170
minor fixes
sivaprasadsudhir May 12, 2018
229f456
added ToIndexConfiguration()
Blade-Lee May 12, 2018
51f5a1a
Fix the AnalyzeStats crash
vkonagar May 12, 2018
5c322c1
Fix: Index Selection returns empty set because the
vkonagar May 12, 2018
3ef9128
Fix a bug during where clause parsing to make it work with TPCC
pbollimp May 12, 2018
146100d
Fix the compilation error
vkonagar May 12, 2018
d805950
added max_index_size as member of lspi_tuner
Blade-Lee May 12, 2018
6c0ee06
Fixes + End to end testing
saatviks May 12, 2018
be3e299
added permutaion to AddCandidates() & test case
Blade-Lee May 12, 2018
065480f
Merge branch 'brain_rl_testing_framework' of github.com:Blade-Lee/pel…
Blade-Lee May 12, 2018
d250fbe
Address some of the code review comments
pbollimp May 12, 2018
3230ec3
Fix create/drop index -- running TPCC
vkonagar May 13, 2018
3d111cc
Fix for 'ToIndexConfiguration'
saatviks May 13, 2018
2a81b9e
Merge branch 'auto_index' into brain_rl_testing_framework
saatviks May 13, 2018
5dd7da7
Testing util additions
saatviks May 13, 2018
b704f01
Cyclic workload setup
saatviks May 13, 2018
2c68703
Additional test cases + Error Analysis
saatviks May 13, 2018
e2b08c6
fixed AddCandidates() empty index bug
Blade-Lee May 13, 2018
409cfd7
LSPI Tuning bug fixes
saatviks May 13, 2018
61a5c96
Merge branch 'brain_rl_testing_framework' of https://github.com/Blade…
saatviks May 13, 2018
c0f0443
LSPI Test additions for easier analysis
saatviks May 13, 2018
62e9f29
added dry_run flag
Blade-Lee May 14, 2018
1056b85
modified AdjustIndexes() accordingly
Blade-Lee May 14, 2018
4393db3
Added timing + Setting up exhaustive what-if search(ideal)
saatviks May 14, 2018
e639bfa
Merge branch 'brain_rl_testing_framework' of https://github.com/Blade…
saatviks May 14, 2018
6341b20
Added exhaustive what-if search
saatviks May 14, 2018
8b81661
added index_add/drop counter for AdjustIndexes()
Blade-Lee May 14, 2018
08f91ac
Merge branch 'brain_rl_testing_framework' of github.com:Blade-Lee/pel…
Blade-Lee May 14, 2018
dafa9ed
added Non-Exhaustive LSPI measurement
Blade-Lee May 14, 2018
91a5ad8
added add/drop index counter for LSPI measurement
Blade-Lee May 14, 2018
38fc86d
added What-If exhaustive no-dropping test
Blade-Lee May 14, 2018
3bb2c64
Simple rearrangements
saatviks May 14, 2018
ac2350d
Merge branch 'brain_rl_testing_framework' of https://github.com/Blade…
saatviks May 14, 2018
ef776d6
adjusted test cases
Blade-Lee May 14, 2018
522b155
added the files for cost evaluation
pbollimp Mar 29, 2018
dcee1cc
Merge branch 'auto_index' into brain_rl_testing_framework
saatviks Jun 12, 2018
375a794
Code/Tests cleanup
saatviks Jun 13, 2018
885d721
Merge branch 'brain_rl_testing_framework' of https://github.com/Blade…
saatviks Jun 13, 2018
19eb56e
Merge remote-tracking branch 'upstream/master' into brain_rl_testing_…
saatviks Jun 13, 2018
f1efdcd
Setting up changes for running TPCC
saatviks Jun 15, 2018
58a4ab5
Reverting to point where things working correctly
saatviks Jul 12, 2018
2706435
Hacky commit for online LSPI Index suggestion
saatviks Jul 23, 2018
488 changes: 488 additions & 0 deletions src/brain/index_selection.cpp

Large diffs are not rendered by default.

23 changes: 23 additions & 0 deletions src/brain/index_selection_context.cpp
@@ -0,0 +1,23 @@
//===----------------------------------------------------------------------===//
//
// Peloton
//
// index_selection_context.cpp
//
// Identification: src/brain/index_selection_context.cpp
//
// Copyright (c) 2015-2018, Carnegie Mellon University Database Group
//
//===----------------------------------------------------------------------===//

#include "brain/index_selection_context.h"
#include "common/logger.h"

namespace peloton {
namespace brain {

IndexSelectionContext::IndexSelectionContext(IndexSelectionKnobs knobs)
: knobs_(knobs) {}

} // namespace brain
} // namespace peloton
189 changes: 189 additions & 0 deletions src/brain/index_selection_job.cpp
@@ -0,0 +1,189 @@
//===----------------------------------------------------------------------===//
//
// Peloton
//
// index_selection_job.cpp
//
// Identification: src/brain/index_selection_job.cpp
//
// Copyright (c) 2015-2018, Carnegie Mellon University Database Group
//
//===----------------------------------------------------------------------===//

#include "brain/index_selection_util.h"
#include "brain/index_selection_job.h"
#include "brain/index_selection.h"
#include "catalog/query_history_catalog.h"
#include "catalog/system_catalogs.h"
#include "optimizer/stats/stats_storage.h"

namespace peloton {
namespace brain {

void IndexSelectionJob::OnJobInvocation(BrainEnvironment *env) {
LOG_INFO("Started Index Suggestion Task");

auto &txn_manager = concurrency::TransactionManagerFactory::GetInstance();
auto txn = txn_manager.BeginTransaction();

// Analyze stats for all the tables.
// TODO: AnalyzeStatsForAllTables crashes sometimes.
// optimizer::StatsStorage *stats_storage =
// optimizer::StatsStorage::GetInstance();
// ResultType stats_result = stats_storage->AnalyzeStatsForAllTables(txn);
// if (stats_result != ResultType::SUCCESS) {
// LOG_ERROR(
// "Cannot generate stats for table columns. Not performing index "
// "suggestion...");
// txn_manager.AbortTransaction(txn);
// return;
// }

// Query the catalog for new SQL queries.
// New SQL queries are the queries that were added to the system
// after the last_timestamp_
auto &query_catalog = catalog::QueryHistoryCatalog::GetInstance(txn);
auto query_history =
query_catalog.GetQueryStringsAfterTimestamp(last_timestamp_, txn);
if (query_history->size() > num_queries_threshold_) {
LOG_INFO("Tuning threshold has been crossed. Time to tune the DB!");

// Run the index selection.
std::vector<std::string> queries;
for (auto query_pair : *query_history) {
queries.push_back(query_pair.second);
}

// TODO: Handle multiple databases
brain::Workload workload(queries, DEFAULT_DB_NAME, txn);
LOG_INFO("Knob: Num Indexes: %zu",
env->GetIndexSelectionKnobs().num_indexes_);
LOG_INFO("Knob: Naive: %zu",
env->GetIndexSelectionKnobs().naive_enumeration_threshold_);
LOG_INFO("Knob: Num Iterations: %zu",
env->GetIndexSelectionKnobs().num_iterations_);
brain::IndexSelection is = {workload, env->GetIndexSelectionKnobs(), txn};
brain::IndexConfiguration best_config;
is.GetBestIndexes(best_config);

if (best_config.IsEmpty()) {
LOG_INFO("Best config is empty. No new indexes this time...");
}

// Get the index objects from database.
auto database_object = catalog::Catalog::GetInstance()->GetDatabaseObject(
DEFAULT_DB_NAME, txn);
auto pg_index = catalog::Catalog::GetInstance()
->GetSystemCatalogs(database_object->GetDatabaseOid())
->GetIndexCatalog();
auto cur_indexes = pg_index->GetIndexObjects(txn);
auto drop_indexes = GetIndexesToDrop(cur_indexes, best_config);

// Drop useless indexes.
for (auto index : drop_indexes) {
LOG_DEBUG("Dropping Index: %s", index->GetIndexName().c_str());
DropIndexRPC(database_object->GetDatabaseOid(), index.get());
}

// Create new indexes.
for (auto index : best_config.GetIndexes()) {
CreateIndexRPC(index.get());
}

last_timestamp_ = GetLatestQueryTimestamp(query_history.get());
} else {
LOG_INFO("Index Suggestion - not performing this time");
}
txn_manager.CommitTransaction(txn);
}

std::vector<std::shared_ptr<catalog::IndexCatalogObject>>
IndexSelectionJob::GetIndexesToDrop(
std::unordered_map<oid_t, std::shared_ptr<catalog::IndexCatalogObject>>
&index_objects,
brain::IndexConfiguration best_config) {
std::vector<std::shared_ptr<catalog::IndexCatalogObject>> ret_indexes;
// Collect the existing brain-suggested indexes that are absent from
// best_config; the caller drops them.
for (auto index : index_objects) {
auto index_name = index.second->GetIndexName();
// TODO [vamshi]: REMOVE THIS IN THE FINAL CODE
// This is a hack for now. Add a boolean to the index catalog to
// find out if an index is a brain suggested index/user created index.
if (index_name.find(brain_suggested_index_prefix_str) !=
std::string::npos) {
bool found = false;
for (auto installed_index : best_config.GetIndexes()) {
if ((index.second.get()->GetTableOid() ==
installed_index.get()->table_oid) &&
(index.second.get()->GetKeyAttrs() ==
installed_index.get()->column_oids)) {
found = true;
// One match is enough; stop scanning the suggested set.
break;
}
}
// Drop only indexes which are not suggested this time.
if (!found) {
ret_indexes.push_back(index.second);
}
}
}
return ret_indexes;
}

void IndexSelectionJob::CreateIndexRPC(brain::HypotheticalIndexObject *index) {
// TODO: Remove hardcoded database name and server end point.
capnp::EzRpcClient client("localhost:15445");
PelotonService::Client peloton_service = client.getMain<PelotonService>();

// Create the index name: concat - db_id, table_id, col_ids
std::stringstream sstream;
sstream << brain_suggested_index_prefix_str << "_" << index->db_oid << "_"
<< index->table_oid << "_";
// Append each column oid to the name (leaves a trailing underscore).
for (auto col : index->column_oids) {
sstream << col << "_";
}
auto index_name = sstream.str();

auto request = peloton_service.createIndexRequest();
request.getRequest().setDatabaseOid(index->db_oid);
request.getRequest().setTableOid(index->table_oid);
request.getRequest().setIndexName(index_name);
request.getRequest().setUniqueKeys(false);

auto col_list =
request.getRequest().initKeyAttrOids(index->column_oids.size());
for (auto i = 0UL; i < index->column_oids.size(); i++) {
col_list.set(i, index->column_oids[i]);
}

PELOTON_ASSERT(index->column_oids.size() > 0);
auto response = request.send().wait(client.getWaitScope());
}

void IndexSelectionJob::DropIndexRPC(oid_t database_oid,
catalog::IndexCatalogObject *index) {
// TODO: Remove hardcoded database name and server end point.
// TODO: Have to be removed when merged with tli's code.
capnp::EzRpcClient client("localhost:15445");
PelotonService::Client peloton_service = client.getMain<PelotonService>();

auto request = peloton_service.dropIndexRequest();
request.getRequest().setDatabaseOid(database_oid);
request.getRequest().setIndexOid(index->GetIndexOid());

auto response = request.send().wait(client.getWaitScope());
}

uint64_t IndexSelectionJob::GetLatestQueryTimestamp(
std::vector<std::pair<uint64_t, std::string>> *queries) {
uint64_t latest_time = 0;
for (auto query : *queries) {
if (query.first > latest_time) {
latest_time = query.first;
}
}
return latest_time;
}
}  // namespace brain
}  // namespace peloton
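The drop-selection rule in `GetIndexesToDrop` above can be sketched in isolation. The version below is a minimal sketch, not the PR's implementation: `ExistingIndex`, `SuggestedIndex`, `IndexesToDrop`, and the value of `kBrainPrefix` are hypothetical stand-ins for the catalog objects and for `brain_suggested_index_prefix_str`, whose real definitions are not shown in this diff. It keeps the two conditions visible in the source: only indexes whose name contains the brain prefix are candidates, and a candidate is dropped only if no suggested index has the same table oid and the same ordered column list.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

using oid_t = uint32_t;

// Hypothetical stand-in for catalog::IndexCatalogObject.
struct ExistingIndex {
  std::string name;
  oid_t table_oid;
  std::vector<oid_t> key_attrs;
};

// Hypothetical stand-in for brain::HypotheticalIndexObject.
struct SuggestedIndex {
  oid_t table_oid;
  std::vector<oid_t> column_oids;
};

// Assumed prefix value, for illustration only.
static const std::string kBrainPrefix = "brain_suggested_index";

// Returns the brain-created indexes that the latest suggestion no longer
// contains. Indexes without the prefix (user-created) are never returned.
std::vector<ExistingIndex> IndexesToDrop(
    const std::vector<ExistingIndex> &existing,
    const std::vector<SuggestedIndex> &best_config) {
  std::vector<ExistingIndex> to_drop;
  for (const auto &index : existing) {
    if (index.name.find(kBrainPrefix) == std::string::npos) {
      continue;  // not created by the brain; leave it alone
    }
    // Column order matters: vector equality compares element by element.
    bool still_suggested = std::any_of(
        best_config.begin(), best_config.end(),
        [&index](const SuggestedIndex &s) {
          return s.table_oid == index.table_oid &&
                 s.column_oids == index.key_attrs;
        });
    if (!still_suggested) {
      to_drop.push_back(index);
    }
  }
  return to_drop;
}
```

Matching on the name prefix is the same stopgap the source flags in its TODO; a boolean in the index catalog would remove the dependence on naming.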
167 changes: 167 additions & 0 deletions src/brain/index_selection_job_lspi.cpp
@@ -0,0 +1,167 @@
//===----------------------------------------------------------------------===//
//
// Peloton
//
// index_selection_job_lspi.cpp
//
// Identification: src/brain/index_selection_job_lspi.cpp
//
// Copyright (c) 2015-2018, Carnegie Mellon University Database Group
//
//===----------------------------------------------------------------------===//

#include "brain/indextune/lspi/lspi_tuner.h"
#include "brain/index_selection_job_lspi.h"
#include "catalog/query_history_catalog.h"
#include "catalog/system_catalogs.h"
#include "optimizer/stats/stats_storage.h"

namespace peloton {
namespace brain {

bool IndexSelectionJobLSPI::enable_ = false;

IndexSelectionJobLSPI::IndexSelectionJobLSPI(BrainEnvironment *env, uint64_t num_queries_threshold)
: BrainJob(env),
last_timestamp_(0),
num_queries_threshold_(num_queries_threshold) {}

void IndexSelectionJobLSPI::OnJobInvocation(UNUSED_ATTRIBUTE BrainEnvironment *env) {
LOG_INFO("Started Index Suggestion Task");
if (!enable_) {
LOG_INFO("Index Suggestion - not performing this time. Yet to be enabled");
return;
}

auto &txn_manager = concurrency::TransactionManagerFactory::GetInstance();
auto txn = txn_manager.BeginTransaction();

// Analyze stats for all the tables.
// TODO: AnalyzeStatsForAllTables crashes sometimes.
// optimizer::StatsStorage *stats_storage =
// optimizer::StatsStorage::GetInstance();
// ResultType stats_result = stats_storage->AnalyzeStatsForAllTables(txn);
// if (stats_result != ResultType::SUCCESS) {
// LOG_ERROR(
// "Cannot generate stats for table columns. Not performing index "
// "suggestion...");
// txn_manager.AbortTransaction(txn);
// return;
// }



// Query the catalog for new SQL queries.
// New SQL queries are the queries that were added to the system
// after the last_timestamp_
auto &query_catalog = catalog::QueryHistoryCatalog::GetInstance(txn);
auto query_history =
query_catalog.GetQueryStringsAfterTimestamp(last_timestamp_, txn);
if (query_history->size() > num_queries_threshold_) {
LOG_INFO("Tuning threshold has been crossed. Time to tune the DB!");

// Run the index selection.
std::vector<std::string> queries;
std::vector<double> query_latencies;
for (auto query_pair : *query_history) {
queries.push_back(query_pair.second);
}

if(!tuner_initialized_ && queries.size() > 0) {
tuner_initialized_ = true;
std::set<oid_t> ignore_table_oids;
CompressedIndexConfigUtil::GetIgnoreTables(DEFAULT_DB_NAME,
ignore_table_oids);
tuner_ = std::unique_ptr<LSPIIndexTuner>(new LSPIIndexTuner(DEFAULT_DB_NAME,
ignore_table_oids,
CandidateSelectionType::Simple,
3));
}

if(tuner_initialized_) {
auto container = CompressedIndexConfigUtil::ToIndexConfiguration(*tuner_->GetConfigContainer());
for(auto query: queries) {
auto query_latency = brain::CompressedIndexConfigUtil::WhatIfIndexCost(query,
container,
DEFAULT_DB_NAME);
query_latencies.push_back(query_latency);
LOG_DEBUG("Query: %s, What-If cost: %.5f", query.c_str(), query_latency);
}
// Run the tuner
std::set<std::shared_ptr<brain::HypotheticalIndexObject>> add_set, drop_set;
tuner_->Tune(queries, query_latencies, add_set, drop_set);
for(auto &index: add_set) {
LOG_DEBUG("Adding Index: %s", index->ToString().c_str());
CreateIndexRPC(index.get());
}
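// Skip dropping for now.
// NOTE: DropIndexRPC takes (database_oid, catalog::IndexCatalogObject *),
// so the call below must be adapted before it is re-enabled.
// for(auto &drop_index: drop_set) {
// LOG_DEBUG("Dropping Index: %s", drop_index->ToString().c_str());
// DropIndexRPC(drop_index.get());
// }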
}
last_timestamp_ = GetLatestQueryTimestamp(query_history.get());
} else {
LOG_INFO("Index Suggestion - not performing this time");
}
txn_manager.CommitTransaction(txn);
}

void IndexSelectionJobLSPI::CreateIndexRPC(brain::HypotheticalIndexObject *index) {
// TODO: Remove hardcoded database name and server end point.
capnp::EzRpcClient client("localhost:15445");
PelotonService::Client peloton_service = client.getMain<PelotonService>();

// Create the index name: concat - db_id, table_id, col_ids
std::stringstream sstream;
sstream << brain_suggested_index_prefix_str << "_" << index->db_oid << "_"
<< index->table_oid << "_";
// Append each column oid to the name (leaves a trailing underscore).
for (auto col : index->column_oids) {
sstream << col << "_";
}
auto index_name = sstream.str();

auto request = peloton_service.createIndexRequest();
request.getRequest().setDatabaseOid(index->db_oid);
request.getRequest().setTableOid(index->table_oid);
request.getRequest().setIndexName(index_name);
request.getRequest().setUniqueKeys(false);

auto col_list =
request.getRequest().initKeyAttrOids(index->column_oids.size());
for (auto i = 0UL; i < index->column_oids.size(); i++) {
col_list.set(i, index->column_oids[i]);
}

PELOTON_ASSERT(index->column_oids.size() > 0);
auto response = request.send().wait(client.getWaitScope());
}

void IndexSelectionJobLSPI::DropIndexRPC(oid_t database_oid,
catalog::IndexCatalogObject *index) {
// TODO: Remove hardcoded database name and server end point.
// TODO: Have to be removed when merged with tli's code.
capnp::EzRpcClient client("localhost:15445");
PelotonService::Client peloton_service = client.getMain<PelotonService>();

auto request = peloton_service.dropIndexRequest();
request.getRequest().setDatabaseOid(database_oid);
request.getRequest().setIndexOid(index->GetIndexOid());

auto response = request.send().wait(client.getWaitScope());
}

uint64_t IndexSelectionJobLSPI::GetLatestQueryTimestamp(
std::vector<std::pair<uint64_t, std::string>> *queries) {
uint64_t latest_time = 0;
for (auto query : *queries) {
if (query.first > latest_time) {
latest_time = query.first;
}
}
return latest_time;
}
}  // namespace brain
}  // namespace peloton
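Both `CreateIndexRPC` implementations above build the index name by concatenating a prefix, the database oid, the table oid, and each column oid, separated by underscores. The helper below is a minimal sketch of that naming scheme on its own. `BuildSuggestedIndexName` is a hypothetical function and the prefix is passed in as a parameter, because the actual value of `brain_suggested_index_prefix_str` is not shown in this diff. Unlike the source, the sketch checks for an empty column list before building the name rather than just before sending the request.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

using oid_t = uint32_t;

// Builds "<prefix>_<db>_<table>_<col>_<col>_...". The trailing underscore
// is kept because the loop in the source appends "_" after every column.
std::string BuildSuggestedIndexName(const std::string &prefix, oid_t db_oid,
                                    oid_t table_oid,
                                    const std::vector<oid_t> &column_oids) {
  // An index needs at least one key column.
  assert(!column_oids.empty());
  std::ostringstream name;
  name << prefix << "_" << db_oid << "_" << table_oid << "_";
  for (oid_t col : column_oids) {
    name << col << "_";
  }
  return name.str();
}
```

Because the name encodes the column order, two indexes over the same columns in a different order get distinct names, which matches the order-sensitive comparison used when deciding what to drop.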