-
Notifications
You must be signed in to change notification settings - Fork 623
[15721] Blocking Schema Changes #1341
base: master
Are you sure you want to change the base?
Conversation
- added alter and rename plan and statements - modified optimizor to bind db name and make plans - modified parsenodes - added node to string support and vice versa - added nodetransform of alter table
- added alter and rename plan and statements - modified optimizor to bind db name and make plans - modified parsenodes - added node to string support and vice versa - added nodetransform of alter table
add alter executor
added change type, changed logic in alter_executor, varchar now has default length.
basic test for alterTable
added alter varchar length support, changed the plan to use schema
src/catalog/catalog.cpp
Outdated
// txn->RecordDrop(database_oid, old_table->GetOid(), INVALID_OID); | ||
|
||
// Final step of physical change should be moved to commit time | ||
database->ReplaceTableWithOid(table_oid, new_table); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[memory leak]
old table object is not explicitly freed.
database->ReplaceTableWithOid just replace the table pointer to new_table but old_table still exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out! We left a TODO here, and we are still looking for the correct way to handle this.
new_table->AddIndex(new_index); | ||
|
||
// reinsert record into pg_index | ||
pg_index->InsertIndex( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[race condition]
It is risky to manipulate index directly under multiple transactions, because it is not protected by any consistency control mechanism like locks. It is better to do such operations through an executor just like your step 4(use a SeqScanPlan). The index executor with lock mechanism will take care of consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually InsertIndex uses a InsertPlan to insert a new tuple into pg_attribute so we can get concurrency control for free.
for (oid_t old_column_id = 0; | ||
old_column_id < old_schema->GetColumnCount(); old_column_id++) { | ||
old_column_ids.push_back(old_column_id); | ||
for (oid_t new_column_id = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[inefficient implementation]
It is not efficient to get a matched name in a for loop. It is better to use a hashmap to do such a thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, thanks for your suggestion.
auto new_column_name = node.GetNewName(); | ||
auto old_column_name = node.GetOldName(); | ||
|
||
ResultType result = catalog::Catalog::GetInstance()->RenameColumn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[race condition]
If there are two transactions trying to rename a column in a table at the same time, it is not safe to do operations on catalog instance without locks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We planed to add lock logic here and in the alterTable, should we merge your lock manager and executor implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
Although our current implementation of lock manager is correct, we are still working on it to support more functions. Most interface may not change in future, but I suggest that you keep an eye on our future pull request to guarantee your codes can run on our latest version of lock manager and index executors.
If you find any bugs in our lock manager or index executor, please feel free to contact us. We are pleasure to fix those problems.
oldName(nullptr), | ||
newName(nullptr) { | ||
dropped_names = | ||
type == AlterTableType::RENAME ? nullptr : (new std::vector<char *>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[code style]
It seems better to bracket the judgement in the clause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, thanks for pointing out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General Questions:
- The codes seem to work. They modified postgres parser, planner, executor, and catelog, which is the typical information flow to change schema. Some places need to add locks to protect consistency under multiple transactions.
- All the codes are easily understood because of detailed comments and clear names.
- There are a few redundant codes like using for loop to find a match string, but most of codes are neat.
- The codes are modular and they re-use other modules properly.
- There exist some commented out codes, like some test codes, but I believe they retain those codes to debug in future.
- They use debug log functions properly and most of them are at TRACE level, which is suitable to avoid too many debug info.
Documentation Questions:
- Comments describe intents of codes correctly.
- All important functions are commented. Although some functions are not commented, their names are good enough suggestions.
- No unusual behavior observed.
- No other third-party libraries.
- Garbage collection for the old table is not completed.
Testing Questions:
- They have tests and they are comprehensive on single transaction operations, but they do not test concurrent situations.
- Their test is valid because they use JDBC to send SQL instructions and the parser in peloton will convert it into plans to execute.
- No hardcoded answers.
- They tests cover all their basic functions like rename, type change etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Major]: Major issues need to notice
[Minor]: Minor issues need to notice
[Trivial]: Suggestions
Improvements:
+ Much better use of comments
+ Much more test cases
+ Overall LGTM
Documentation:
+ Proper use of TODO
+ No incomplete code
+ Good comments
Tests:
+ Test exists and comprehensive
- Some tests still fails
Generally better coding style, keep it up!
public void test_RenameCol_Base() throws SQLException { | ||
conn.createStatement().execute(SQL_RENAME_COLUMN); | ||
ResultSet rs = conn.createStatement().executeQuery(SQL_SELECT_STAR); | ||
rs.next(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[suggestion] If rs doesn't have next, this will through an exception that is not easy to catch. Maybe add some more handling here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, we should use try-catch here.
conn.commit(); | ||
conn2.commit(); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[minor] Still commented out code, need to address this in the next PR
new int [] {5, 400}); | ||
assertNoMoreRows(rs); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[comment] great tests!
// ALTER TABLE | ||
//===--------------------------------------------------------------------===// | ||
|
||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[good] Good comments!
|
||
if (txn == nullptr) | ||
throw CatalogException("Alter table requires transaction"); | ||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[minor] Why double try catch block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, we'll fix this.
@@ -26,8 +26,8 @@ | |||
|
|||
// Fix for PRId64 (See https://stackoverflow.com/a/18719205) | |||
#if defined(__cplusplus) && !defined(__STDC_FORMAT_MACROS) | |||
#define __STDC_FORMAT_MACROS 1 // Not sure where to put this | |||
#endif | |||
#define __STDC_FORMAT_MACROS 1 // Not sure where to put this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[trivial] Is this a auto-formatting change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
|
||
class AlterExecutor : public AbstractExecutor { | ||
public: | ||
AlterExecutor(const AlterExecutor &) = delete; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[good] Nice use of delete
/** | ||
* @class AlterTableStatement | ||
* @brief Represents "ALTER TABLE add column COLUMN_NAME COLUMN_TYPE" | ||
* TODO: add implementation of AlterTableStatement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[trivial] maybe you forgot to remove todo...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out!
this->table_name.c_str(), | ||
this->database_name.c_str()); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[trivial] Maybe putting longer functions to .cpp file
if ((*parse_tree->added_columns)[i].get()->not_null) { | ||
catalog::Constraint constraint(ConstraintType::NOTNULL, | ||
"con_not_null"); | ||
column.AddConstraint(constraint); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[trivial] Easy-to-mixup variable names column and columns
conn2.commit(); | ||
} | ||
|
||
// The following tests are currently broken. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen when running these test cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently the system will crash due to infinite retry in the binder.
conn.createStatement().execute(sql); | ||
ResultSet rs = conn.createStatement().executeQuery(SQL_SELECT_STAR); | ||
rs.next(); | ||
checkRow(rs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this check can't ensure there is only one column left.
for (auto index_oid_pair : old_index_oids) { | ||
oid_t index_oid = index_oid_pair.first; | ||
// delete record in pg_index | ||
pg_index->DeleteIndex(index_oid, txn); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not delete index in pg_index only if not all indexed columns still exists?
src/catalog/catalog.cpp
Outdated
txn); | ||
column_offset++; | ||
} | ||
// TODO: Add gc logic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you guys finished this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're still working on this, sorry about that.
auto old_schema = old_table->GetSchema(); | ||
std::vector<oid_t> column_ids; | ||
|
||
// Step 1: remove drop columns from old schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good strategy to remove dropped columns.
bool missing_ok; /* skip error if table missing */ | ||
} AlterTableStmt; | ||
|
||
typedef enum AlterTableType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very comprehensive types.
const std::unique_ptr<catalog::Schema> &added_columns, | ||
const std::unique_ptr<catalog::Schema> &changed_type_columns, | ||
AlterType a_type); | ||
explicit AlterPlan(const std::string &database_name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you guys have changed rename columns from using vector to string, why still vector here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, will correct it. Thanks for pointing out.
Overview:
This PR contains all blocking schema changes implementation for Add/Drop column, rename column, and change column type. Alter table for Add/Drop column and change column type operations could be aggregated in one single SQL statement; rename column is separated alone.
Implementation Changes:
1.Functional code: We construct a new schema object for target table in parser, this object includes all changes related to this alter statement. Executor will execute the alter plan and invoke catalog to replace old table with new table using the new schema object. Catalog will use a scan to apply the changes to all tuples.
2.JUnit test: add unit test for add/drop column, change column type.
Known issues:
1.Multiple transactions bugs.
2.We don't handle drop column contains foreign key in this PR.