This repository has been archived by the owner on Sep 27, 2019. It is now read-only.

[15721] Logging and Recovery #1348

Open · wants to merge 95 commits into master
Conversation

@db-ol (Contributor) commented May 5, 2018

This PR implements logging and recovery for robustness to crashes.

Logging:

  • Create log records and write them into log buffers in TimestampOrderingTransactionManager while executing transactions. When a buffer fills up, submit it and acquire a new one for the current transaction (see the sketch after this list).
  • Push the callback for committing transactions from the network level to the worker level. Set logging callbacks and return QUEUING in CommitTransaction and AbortTransaction.
  • Adopt delta logging for updates. Pass values_buf, values_size and offsets all the way down to PerformUpdate.
  • Use tokens to guarantee FIFO order in the logical queue for multiple workers. The logical queue consists of one sub-queue per worker; buffers belonging to the same transaction are put into the same sub-queue.
  • Support the default codegen engine only.
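
As a sketch, the write path looks roughly like this (LogBuffer, SubmitBuffer, and AcquireBuffer are illustrative stand-ins, not the exact Peloton classes):

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Illustrative stand-in for the per-transaction log buffer.
class LogBuffer {
 public:
  explicit LogBuffer(size_t capacity) : data_(capacity) {}

  // Returns false when the record no longer fits, signaling the caller
  // to submit this buffer and acquire a fresh one.
  bool WriteRecord(const char *rec, size_t len) {
    if (used_ + len > data_.size()) return false;
    std::memcpy(data_.data() + used_, rec, len);
    used_ += len;
    return true;
  }

 private:
  std::vector<char> data_;
  size_t used_ = 0;
};

void SubmitBuffer(LogBuffer * /*buf*/) { /* enqueue for a log worker */ }
LogBuffer *AcquireBuffer() { return new LogBuffer(4096); }

// Caller side: write a record; on overflow, hand the full buffer to the
// logger and continue with a fresh one. Assumes a single record always
// fits in an empty buffer.
void Append(LogBuffer *&buf, const char *rec, size_t len) {
  if (!buf->WriteRecord(rec, len)) {
    SubmitBuffer(buf);
    buf = AcquireBuffer();
    buf->WriteRecord(rec, len);
  }
}
```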

Recovery:

  • Implement two-pass recovery (sketched below). The first pass filters out transactions that did not commit, while the second groups records by txn_id. After the two passes, all committed transactions are replayed.
  • Currently supports Create Table, Create Database, Insert, and Delete.
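
A minimal in-memory model of the two-pass scheme (the record layout and ApplyRecord are simplified stand-ins, not the actual Peloton structures):

```cpp
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

enum class LogRecordType {
  TRANSACTION_BEGIN, TUPLE_INSERT, TUPLE_DELETE,
  TRANSACTION_COMMIT, TRANSACTION_ABORT
};

struct LogRecord {
  LogRecordType type;
  uint64_t txn_id;
  // payload (table oid, tuple data, ...) omitted
};

void ApplyRecord(const LogRecord & /*rec*/) { /* replay into storage */ }

void Replay(const std::vector<LogRecord> &log) {
  // Pass 1: keep only transactions that reached TRANSACTION_COMMIT.
  std::unordered_set<uint64_t> committed;
  for (const auto &rec : log)
    if (rec.type == LogRecordType::TRANSACTION_COMMIT)
      committed.insert(rec.txn_id);

  // Pass 2: group the surviving records by txn_id, then replay each
  // committed transaction's records in log order.
  std::unordered_map<uint64_t, std::vector<const LogRecord *>> by_txn;
  for (const auto &rec : log)
    if (committed.count(rec.txn_id) != 0)
      by_txn[rec.txn_id].push_back(&rec);

  for (const auto &entry : by_txn)
    for (const LogRecord *rec : entry.second)
      ApplyRecord(*rec);
}
```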

gvos94 and others added 30 commits February 20, 2018 16:20
2. removed logging for read-only TXN.
Conflicts:
	src/codegen/updater.cpp
	src/concurrency/timestamp_ordering_transaction_manager.cpp
	src/storage/data_table.cpp
	test/brain/query_logger_test.cpp
case LogRecordType::TRANSACTION_BEGIN: {
  if (all_txns_.find(txn_id) != all_txns_.end()) {
    LOG_ERROR("Duplicate transaction");
    PELOTON_ASSERT(false);

Contributor:
At the beginning, upon a log file read failure, StartRecovery writes a LOG_ERROR and returns. I am wondering whether it should behave the same way here: return after writing LOG_ERROR instead of aborting with PELOTON_ASSERT(false)?
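
For example, a sketch of that alternative (RegisterBegin and its return convention are hypothetical, not the actual Peloton API):

```cpp
#include <cstdint>
#include <cstdio>
#include <unordered_set>

// On a duplicate TRANSACTION_BEGIN, log an error and bail out of
// recovery instead of asserting; the caller stops, mirroring
// StartRecovery's handling of a bad log file.
bool RegisterBegin(std::unordered_set<uint64_t> &all_txns, uint64_t txn_id) {
  if (all_txns.count(txn_id) != 0) {
    std::fprintf(stderr, "Duplicate transaction %llu\n",
                 static_cast<unsigned long long>(txn_id));
    return false;
  }
  all_txns.insert(txn_id);
  return true;
}
```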

  buf_curr += (record_len + sizeof(record_len));
  buf_rem -= (record_len + sizeof(record_len));
} else {
  break;

Contributor:

It might be better to log an error and abort/return directly instead of using break.

LOG_INFO("Replaying TXN_COMMIT");
}

else if(record_type==LogRecordType::TRANSACTION_ABORT) {

Contributor:

This check might need to be moved to ParseFromDisk, with the error filed there.
If a TRANSACTION_ABORT record were found at this point, the previous changes in this transaction would already have been applied (which should not happen, if I understand correctly).
On the other hand, if the transaction reached here, it should also have a TRANSACTION_COMMIT record. I wonder whether it is possible for a transaction to have both records; if not, this check may be redundant.

}

// Pass 2
log_buffer_ = new char[log_buffer_size_];

Contributor:

I wonder whether recovery needs to read the log file twice from disk, since log_buffer_ is constructed here after the first pass.
If so, would it be better to populate log_buffer_ during the first pass and save the second disk read in the second pass?
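
Something like this, as a sketch (ReadLogOnce is a hypothetical helper):

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read the log file into memory once; both recovery passes then
// iterate over the returned buffer, avoiding the second disk read.
std::vector<char> ReadLogOnce(const std::string &path) {
  std::ifstream in(path, std::ios::binary);
  return std::vector<char>(std::istreambuf_iterator<char>(in),
                           std::istreambuf_iterator<char>());
}
```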


@nwang57 (Contributor) left a comment:

Logging looks good; recovery does not seem finished yet. Add some tests for the recovery part so that correctness can be easily verified.

@@ -862,7 +862,9 @@ void PostgresProtocolHandler::ExecExecuteMessageGetResult(ResultType status) {
}

void PostgresProtocolHandler::GetResult() {
  traffic_cop_->ExecuteStatementPlanGetResult();

  // traffic_cop_->ExecuteStatementPlanGetResult();

Contributor:

Why is this commented out?

current_txn->GetEpochId(), current_txn->GetTransactionId(),
current_txn->GetCommitId(), schema_oid);

record.SetOldItemPointer(location);

Contributor:

You already set the old item pointer in the log record constructor; why do you need to set it again explicitly?

LogRecordType::TUPLE_UPDATE, location, new_location, current_txn->GetEpochId(),
current_txn->GetTransactionId(), current_txn->GetCommitId(), schema_oid);

record.SetOldItemPointer(location);

Contributor:

You already set the old item pointer in the log record constructor; why do you need to set it again explicitly?

LogRecordType::TRANSACTION_COMMIT, current_txn->GetEpochId(),
current_txn->GetTransactionId(), current_txn->GetCommitId());

current_txn->GetLogBuffer()->WriteRecord(record);

Contributor:

Could the log_buffer exceed its threshold after this write?

LogRecordType::TRANSACTION_ABORT, current_txn->GetEpochId(),
current_txn->GetTransactionId(), current_txn->GetCommitId());

current_txn->GetLogBuffer()->WriteRecord(record);

Contributor:

Could the log_buffer exceed its threshold after this write?

stream->flush();

if (stream->fail()) {
  PELOTON_ASSERT(false);

Contributor:

Why not PELOTON_ASSERT(!stream->fail())? Do we really want to crash the system if the stream fails, or is there a mechanism to rewrite the log to the file?
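
A sketch of the two options (FlushLog is a hypothetical wrapper, not the PR's code):

```cpp
#include <fstream>

// Option 1 would assert the invariant directly:
//   PELOTON_ASSERT(!stream.fail());
// Option 2 reports failure so the caller can retry or rewrite the log.
bool FlushLog(std::ofstream &stream) {
  stream.flush();
  return !stream.fail();
}
```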

}

// Pass 2
log_buffer_ = new char[log_buffer_size_];

Contributor:

This memory may never be freed.
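
One possible fix, sketched with owned memory (RecoverPassTwo is a hypothetical stand-in for the recovery loop):

```cpp
#include <cstddef>
#include <memory>

// Own the pass-2 buffer with unique_ptr so it is released automatically,
// even on early return or exception.
void RecoverPassTwo(size_t log_buffer_size) {
  std::unique_ptr<char[]> log_buffer(new char[log_buffer_size]);
  // ... read records into log_buffer.get() and replay them ...
}
```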

if (it->second.first != LogRecordType::TRANSACTION_COMMIT)
  continue;

auto offset_pair = std::make_pair(curr_txn_offset, curr_txn_offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why make a pair of two identical size_t values?

@cmu-db cmu-db deleted a comment from nwang57 May 13, 2018