This repository has been archived by the owner on Sep 27, 2019. It is now read-only.

[15721] TileGroup Compaction and GC Fixes #1349

Open · wants to merge 121 commits into master

Conversation


@dqrs dqrs commented May 5, 2018

Project: TileGroup Compaction, TileGroup Freeing, and Garbage Collection Fixes

Our project has 3 main aspects:

  1. Enabling the Garbage Collector to free empty tile groups and reclaim their memory (Done)
  2. Merging sparsely occupied tile groups to save memory (Done)
  3. Fixing several important bugs in the Garbage Collector and Transaction Manager (Done)

Status

Completed the 75%, 100%, and 125% goals

Summary

1) TileGroup Freeing
This PR enhances the Garbage Collector to free empty TileGroups once all of their tuple slots have been recycled. We also scanned the codebase for all points (outside of codegen) that fetch tile groups from the catalog without checking for a nullptr; TileGroups can no longer be assumed to live forever, so these checks must be done. We need to follow up with Prashanth to do a similar check in the codegen engine. The PR also introduces a new class, RecycleStack, to replace the Garbage Collector's recycle queues. This concurrent data structure offers a non-blocking, constant-time TryPop() to retrieve a recycled tuple slot, a blocking, constant-time Push(), and a concurrent RemoveAllWithKey() function that uses hand-over-hand locking. The performance of TryPop() matters most because it lies on the critical path of DataTable::Insert(). We had to write a new data structure instead of using the existing concurrent queue because we need to be able to iterate through and remove recycled slots from TileGroups that become immutable, and none of the existing concurrent structures support concurrent iteration, insertion, and removal.
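
For concreteness, here is a minimal, self-contained sketch of the RecycleStack interface described above. It is not the Peloton implementation: ItemPointer is reduced to a (block, offset) pair, a single head mutex stands in for the per-node hand-over-hand locking the real RemoveAllWithKey() uses, and the internal names are illustrative.

#include <cstddef>
#include <cstdint>
#include <mutex>

struct ItemPointer {
  std::uint32_t block;   // tile group id
  std::uint32_t offset;  // slot within the tile group
  bool IsNull() const { return block == 0 && offset == 0; }
};

class RecycleStack {
 public:
  // Non-blocking, constant-time pop used on the critical path of
  // DataTable::Insert(): returns a null ItemPointer if the lock is busy or
  // the stack is empty.
  ItemPointer TryPop() {
    if (!head_lock_.try_lock()) {
      return ItemPointer{0, 0};
    }
    std::lock_guard<std::mutex> guard(head_lock_, std::adopt_lock);
    if (head_ == nullptr) {
      return ItemPointer{0, 0};
    }
    Node *node = head_;
    head_ = node->next;
    ItemPointer location = node->location;
    delete node;
    return location;
  }

  // Blocking, constant-time push used by the GC when a slot is recycled.
  void Push(const ItemPointer &location) {
    std::lock_guard<std::mutex> guard(head_lock_);
    head_ = new Node{location, head_};
  }

  // Remove every recycled slot belonging to one tile group (e.g. a tile
  // group that became immutable or is being freed). The real class walks
  // the list with hand-over-hand per-node locking; a single lock is used
  // here to keep the sketch short.
  std::size_t RemoveAllWithKey(std::uint32_t block) {
    std::lock_guard<std::mutex> guard(head_lock_);
    std::size_t removed = 0;
    Node **link = &head_;
    while (*link != nullptr) {
      if ((*link)->location.block == block) {
        Node *victim = *link;
        *link = victim->next;
        delete victim;
        ++removed;
      } else {
        link = &(*link)->next;
      }
    }
    return removed;
  }

 private:
  struct Node {
    ItemPointer location;
    Node *next;
  };
  Node *head_ = nullptr;
  std::mutex head_lock_;
};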

2) TileGroup Compaction
This PR adds a class called TileGroupCompactor that performs compaction of tile groups. Compaction is triggered by the GarbageCollector, which submits a CompactTileGroup() task to the MonoQueuePool when the majority of a TileGroup is recycled garbage. The exact fraction of garbage required to trigger compaction is determined by a global setting, settings::SettingId::compaction_threshold. Compaction can also be enabled or disabled via another global setting, settings::SettingId::compaction_enabled.
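
A hedged sketch of the trigger condition only; the helper names here (ShouldCompact, MaybeSubmitCompaction, submit_compaction) are illustrative stand-ins, while in the PR the GarbageCollector reads the two settings above and enqueues a CompactTileGroup() task on the MonoQueuePool.

#include <cstddef>
#include <functional>

bool ShouldCompact(std::size_t recycled_slots, std::size_t slot_count,
                   double compaction_threshold, bool compaction_enabled) {
  if (!compaction_enabled || slot_count == 0) {
    return false;
  }
  // Compact once the recycled (garbage) fraction of the tile group exceeds
  // the configured threshold.
  return static_cast<double>(recycled_slots) / slot_count >
         compaction_threshold;
}

void MaybeSubmitCompaction(std::size_t recycled_slots, std::size_t slot_count,
                           double threshold, bool enabled,
                           const std::function<void()> &submit_compaction) {
  if (ShouldCompact(recycled_slots, slot_count, threshold, enabled)) {
    submit_compaction();  // e.g. enqueue a CompactTileGroup() task
  }
}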

3) Garbage Collection Fixes
This PR resolves almost all of the issues identified in issue #1325.
It includes the results of a thorough correctness audit we performed on Peloton's garbage collection system: an entirely new test suite for the Transaction-Level Garbage Collector and several important bug fixes to the Garbage Collector and Transaction Manager.

GC Fixes Summary:

  • Added 14 tests to transaction_level_gc_manager_test.cpp to handle more complex GC scenarios.
  • Currently 4 of these tests still fail because indexes are polluted with old keys, but we believe resolving this will require more significant changes at the execution layer.
  • We believe we have resolved all of the tuple-level GC bugs and most of the index bugs.
  • We have disabled the 4 checks that fail and will open a new issue describing those scenarios only.
  • GCManager::RecycleTupleSlot allows unused ItemPointers to be returned without going through the entire Unlink and Reclaim process.
  • Modified TOTransactionManager to pass tombstones created by deletes to the GCManager.
  • Modified DataTable's Insert to return the ItemPointer to the GCManager in the case of a failed insert.
  • Modified DataTable's InsertIntoIndexes to iterate through indexes and remove inserted keys in the event of a failure (see the sketch after this list).
  • Modified GCManager's Unlink function to clean indexes of garbage created by COMMIT_DELETE, COMMIT_UPDATE, and ABORT_UPDATE.
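
The index-cleanup behavior referenced in the InsertIntoIndexes bullet follows a simple insert-then-undo pattern. Below is a self-contained sketch of that pattern using a toy index type; it is illustrative only and does not use Peloton's Index API.

#include <cstddef>
#include <set>
#include <vector>

// Toy stand-in for an index with a unique constraint.
struct ToyIndex {
  std::set<int> keys;
  bool InsertUnique(int key) { return keys.insert(key).second; }
  void Delete(int key) { keys.erase(key); }
};

// Insert the key into every index; if any index rejects it (e.g. a unique
// constraint violation), remove the key from the indexes already updated so
// none of them is left polluted by a failed insert.
bool InsertIntoAllIndexes(std::vector<ToyIndex> &indexes, int key) {
  for (std::size_t i = 0; i < indexes.size(); ++i) {
    if (!indexes[i].InsertUnique(key)) {
      for (std::size_t j = 0; j < i; ++j) {
        indexes[j].Delete(key);
      }
      return false;
    }
  }
  return true;
}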


@suafeng suafeng left a comment


Overall good job, and the code quality seems good. I only added some comments about style and the design of functions.

} // namespace gc
} // namespace peloton



You may not need so many blank lines here.


// Used by GC Manager to add to recycle stack (can be slower)
void RecycleStack::Push(const ItemPointer &location) {


I notice a queue was used before, so I wonder if the reason for using a stack instead of a queue is that it is easier to implement, since managing a stack only involves manipulating the head of the linked list?


Same question here.
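
As an illustration of the stack-vs-queue point above: a linked-list stack only ever manipulates the head pointer (a queue would also have to maintain a tail). The Treiber-style push below is a generic sketch, not the PR's RecycleStack (which uses locks rather than CAS), and all names are made up.

#include <atomic>

struct Node {
  int value;
  Node *next;
};

std::atomic<Node *> stack_head{nullptr};

// Push touches only the head pointer; no other field of the structure is
// involved, which is what makes a stack the simpler structure to manage.
void PushValue(int value) {
  Node *node = new Node{value, stack_head.load(std::memory_order_relaxed)};
  // Retry the CAS until the head swap succeeds.
  while (!stack_head.compare_exchange_weak(node->next, node,
                                           std::memory_order_release,
                                           std::memory_order_relaxed)) {
  }
}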


constexpr auto kMinPauseTime = std::chrono::microseconds(1);
constexpr auto kMaxPauseTime = std::chrono::microseconds(100000);


I think using #define might be a better choice here.


LOG_TRACE("Moving Visible Tuple id : %u, Physical Tuple id : %u ",
visible_tuple_id, physical_tuple_id);


Where is this visible_tuple_id defined? I did not find it.


Same here.

}

bool success = MoveTuplesOutOfTileGroup(table, tile_group);


I find there is only one usage of MoveTuplesOutOfTileGroup(), and it is here. It seems that the tuples in the given tile group are always moved to that tile group's own table, so it may not be necessary to pass a data table as an argument to MoveTuplesOutOfTileGroup(): we already know the table is the tile group's table.

ItemPointer location = recycle_stack->TryPop();
if (location.IsNull()) {
return INVALID_ITEMPOINTER;
}


I think it may be better for GetRecycledTupleSlot() to have a while loop around TryPop() instead of having the while loop around GetRecycledTupleSlot(), since the function name suggests it guarantees to return a tuple slot.
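
A sketch of that suggested shape, reusing the simplified ItemPointer and RecycleStack types from the sketch in the summary above; the bounded retry count mirrors the max_attempts constant discussed below and is otherwise an assumption.

#include <cstddef>

// The retry loop lives inside GetRecycledTupleSlot() itself; the real
// function also takes the table and ultimately falls back to allocating a
// fresh tuple slot when nothing has been recycled.
ItemPointer GetRecycledTupleSlot(RecycleStack *recycle_stack,
                                 std::size_t max_attempts) {
  for (std::size_t attempt = 0; attempt < max_attempts; ++attempt) {
    ItemPointer location = recycle_stack->TryPop();
    if (!location.IsNull()) {
      return location;  // got a recycled slot
    }
  }
  return ItemPointer{0, 0};  // stands in for INVALID_ITEMPOINTER
}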

} // namespace gc
} // namespace peloton




You may not need so many blank lines here.


@ssyssy ssyssy left a comment


Nice job and really solid code. I only found some small issues; hope my comments can help.


// Used by GC Manager to add to recycle stack (can be slower)
void RecycleStack::Push(const ItemPointer &location) {


Same question here.


constexpr auto kMinPauseTime = std::chrono::microseconds(1);
constexpr auto kMaxPauseTime = std::chrono::microseconds(100000);


How about moving these constants somewhere in the gc namespace rather than inside the function? And I think max_attempts should also be defined as a constant.
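
For concreteness, the arrangement being suggested might look like the following; only the two pause-time constants appear in the PR, and the kMaxAttempts name is an assumption.

#include <chrono>
#include <cstddef>

namespace peloton {
namespace gc {

// Pause bounds and retry limit hoisted to namespace scope so they are not
// re-declared inside the function body.
constexpr auto kMinPauseTime = std::chrono::microseconds(1);
constexpr auto kMaxPauseTime = std::chrono::microseconds(100000);
constexpr std::size_t kMaxAttempts = 100;

}  // namespace gc
}  // namespace peloton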


size_t attempts = 0;
size_t max_attempts = 100;


I think 100 attempts is way too many; you could probably try 50.

// ensure that this is the latest version
bool is_latest_version = tile_group_header->GetPrevItemPointer(physical_tuple_id).IsNull();
if (is_latest_version == false) {
LOG_TRACE("Skipping tuple, not latest version.");

Why skip the tuple here? Why not just abort the transaction?


LOG_TRACE("Moving Visible Tuple id : %u, Physical Tuple id : %u ",
visible_tuple_id, physical_tuple_id);


Same here.

recycle_queue_map_.clear();
immutable_tile_group_queue_ = std::make_shared<LockFreeQueue<oid_t>>(INITIAL_TG_QUEUE_LENGTH);
recycle_stacks_ = std::make_shared<peloton::CuckooMap<
oid_t, std::shared_ptr<RecycleStack>>>(INITIAL_MAP_SIZE);

I think recycle_stack_map would be a clearer name.

"unique_btree_index", 1235, TEST_TABLE_OID, CATALOG_DATABASE_OID,
IndexType::BWTREE, IndexConstraintType::UNIQUE, tuple_schema,
key_schema, key_attrs, unique);


What is the 1235 here? Some random number?

/*
* @brief Set's Immutable Flag to True. Only used by the Garbage Collector
*/
inline bool SetImmutabilityWithoutNotifyingGC() {

Where would this function be called?

@bohanjason

As my teammates said, nice job!

@@ -154,6 +154,12 @@ bool SeqScanExecutor::DExecute() {
while (current_tile_group_offset_ < table_tile_group_count_) {
auto tile_group =
target_table_->GetTileGroup(current_tile_group_offset_++);

if (tile_group == nullptr) {


I am confused that tile_group can be nullptr in this seq scan executor, but is asserted to be non-null in the previous index and hybrid scan executors?

//};
//
//} // namespace storage
//} // namespace peloton


It seems that this file can be removed?

@@ -34,6 +34,8 @@ void Inserter::Init(storage::DataTable *table,
char *Inserter::AllocateTupleStorage() {
location_ = table_->GetEmptyTupleSlot(nullptr);

PELOTON_ASSERT(location_.IsNull() == false);

// Get the tile offset assuming that it is in a tuple format
auto tile_group = table_->GetTileGroupById(location_.block);


Assert that tile_group != nullptr?

@mbutrovich force-pushed the gc_fixes branch 4 times, most recently from a10a5ad to 36d546d on May 25, 2018 at 19:32

coveralls commented May 25, 2018


Coverage increased (+0.4%) to 77.489% when pulling 002881b on mbutrovich:gc_fixes into 4d61826 on cmu-db:master.

mbutrovich added 30 commits to mbutrovich/peloton that referenced this pull request between Jun 29 and Jul 10, 2018.