[Fix](checker) Fixed infinite loop after internal error in checker. (#44479)


When the checker encounters an internal error, such as a transaction
conflict, the check returns a value less than 0 and the function
returns immediately, so the related instance is never removed from the
map. Additionally, if do_check() returns a non-zero value, the inverted
check is not performed. This PR fixes both issues: the early return is
removed, and the inverted check now runs regardless of the do_check()
result.
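
The first issue can be illustrated with a minimal sketch. Everything here is hypothetical and only mirrors the general shape of the problem: `WorkerState`, `run_with_early_return`, and `run_with_cleanup` are illustrative names, not the actual Doris types or functions, and the map is a stand-in for whatever bookkeeping the checker keeps.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for the checker's bookkeeping (not the Doris types).
struct WorkerState {
    std::mutex mtx;
    std::map<std::string, int> working;  // instance_id -> "check in progress" marker
};

// Buggy shape: a negative result returns before the cleanup runs,
// so the instance stays registered as "in progress".
void run_with_early_return(WorkerState& st, const std::string& id,
                           const std::function<int()>& check) {
    assert(check);
    {
        std::lock_guard<std::mutex> lock(st.mtx);
        st.working[id] = 1;
    }
    int ret = check();
    if (ret < 0) return;  // skips the erase below
    std::lock_guard<std::mutex> lock(st.mtx);
    st.working.erase(id);
}

// Fixed shape: no early exit, so the cleanup always runs.
void run_with_cleanup(WorkerState& st, const std::string& id,
                      const std::function<int()>& check) {
    assert(check);
    {
        std::lock_guard<std::mutex> lock(st.mtx);
        st.working[id] = 1;
    }
    int ret = check();
    (void)ret;  // in real code the result would feed the job's success flag
    std::lock_guard<std::mutex> lock(st.mtx);
    st.working.erase(id);
}
```

With a check that returns -1, the first variant leaves the instance in `working` while the second leaves the map empty. Any scheduler that consults such a map before starting new work would keep seeing the stale entry.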
Yukang-Lian authored Nov 22, 2024
1 parent 047b324 commit 20d6710
Showing 1 changed file with 4 additions and 12 deletions.
16 changes: 4 additions & 12 deletions cloud/src/recycler/checker.cpp
@@ -168,25 +168,17 @@ int Checker::start() {
 auto ctime_ms =
         duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count();
 g_bvar_checker_enqueue_cost_s.put(instance_id, ctime_ms / 1000 - enqueue_time_s);
-ret = checker->do_check();
+int ret1 = checker->do_check();

+int ret2 = 0;
 if (config::enable_inverted_check) {
-    if (ret == 0) {
-        ret = checker->do_inverted_check();
-    }
-}
-
-if (ret < 0) {
-    // If ret < 0, it means that a temporary error occurred during the check process.
-    // The check job should not be considered finished, and the next round of check job
-    // should be retried as soon as possible.
-    return;
+    ret2 = checker->do_inverted_check();
 }

 // If instance checker has been aborted, don't finish this job
 if (!checker->stopped()) {
     finish_instance_recycle_job(txn_kv_.get(), check_job_key, instance.instance_id(),
-                                ip_port_, ret == 0, ctime_ms);
+                                ip_port_, ret1 == 0 && ret2 == 0, ctime_ms);
 }
 {
     std::lock_guard lock(mtx_);
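
The result handling after this change can be sketched in isolation. This is a simplified model under stated assumptions, not the code in checker.cpp: `CheckOutcome` and `run_checks` are hypothetical names, and the two callables stand in for `do_check()` and `do_inverted_check()`.

```cpp
#include <cassert>
#include <functional>

// Hypothetical result bundle; the real code passes a single bool to
// finish_instance_recycle_job instead.
struct CheckOutcome {
    int ret1;      // result of the forward check
    int ret2;      // result of the inverted check (0 if disabled)
    bool success;  // true only when both checks returned 0
};

CheckOutcome run_checks(const std::function<int()>& do_check,
                        const std::function<int()>& do_inverted_check,
                        bool enable_inverted_check) {
    assert(do_check && do_inverted_check);
    int ret1 = do_check();
    int ret2 = 0;
    if (enable_inverted_check) {
        // Runs regardless of ret1, unlike the pre-fix code.
        ret2 = do_inverted_check();
    }
    return {ret1, ret2, ret1 == 0 && ret2 == 0};
}
```

Keeping the two return values separate means a failure in the forward check no longer hides the inverted check, while the success flag still requires both to return 0.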
