From 7bfc6ca1e1986f6f97f84cbd1f6eb319fb734720 Mon Sep 17 00:00:00 2001
From: Nash Kaminski <36900518+gs-kamnas@users.noreply.github.com>
Date: Mon, 12 Feb 2024 11:06:43 -0600
Subject: [PATCH] Prevent potential busy loop in scheduler from jobs > nodes (#3060)

* Prevent potential busy loop in scheduler

This change remediates a potential busy-loop in the scheduler
which results from either next_adds_job allowing the job count to exceed
the node count or a retry after configuration fetch failure causing the
same invalid state.

* Update changelog

* Summarize / simplify the job class increment function
---
 CHANGELOG.md          |  1 +
 lib/oxidized/jobs.rb  | 12 +++++++++++-
 lib/oxidized/nodes.rb |  2 +-
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 1fc05e805..15561e511 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -53,6 +53,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 - Fixed login into Fortigate when post-login-baned ist enabled. Fixes #2021 (@chrisr0880, @sahdan, @dangoscomb and @robertcheramy)
 - Fixed pre_logout for BDCOM switches
 - Fix 'wpa passphrase' hashed secret for SonicOS devices with built-in wireless #3036 (@lazynooblet)
+- Fix potential busy wait when retries and/or next_adds_job is enabled (@gs-kamnas)
 
 ## [0.29.1 - 2023-04-24]
 
diff --git a/lib/oxidized/jobs.rb b/lib/oxidized/jobs.rb
index 6ea3baac8..c1008e8d2 100644
--- a/lib/oxidized/jobs.rb
+++ b/lib/oxidized/jobs.rb
@@ -44,6 +44,14 @@ def new_count
       @want = @max if @want > @max
     end
 
+    def increment
+      # Increments the job count if safe to do so, which means:
+      # a) less threads running than the total amount of nodes
+      # b) we want less than the max specified number of threads
+
+      want = [(@want + 1), @nodes.size, @max].min
+    end
+
     def work
       # if a) we want less or same amount of threads as we now running
       # and b) we want less threads running than the total amount of nodes
@@ -51,7 +59,9 @@ def work
       # then we want one more thread (rationale is to fix hanging thread causing HOLB)
       return unless @want <= size && @want < @nodes.size
 
-      @want += 1 if (Time.now.utc - @last) > MAX_INTER_JOB_GAP
+      return unless @want <= size
+
+      increment if (Time.now.utc - @last) > MAX_INTER_JOB_GAP
     end
   end
 end
diff --git a/lib/oxidized/nodes.rb b/lib/oxidized/nodes.rb
index 09fae5ad3..4c41972cf 100644
--- a/lib/oxidized/nodes.rb
+++ b/lib/oxidized/nodes.rb
@@ -80,7 +80,7 @@ def next(node, opt = {})
         # set last job to nil so that the node is picked for immediate update
         n.last = nil
         put n
-        jobs.want += 1 if Oxidized.config.next_adds_job?
+        jobs.increment if Oxidized.config.next_adds_job?
       end
     end
     alias top next
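
The clamping idea behind `increment` can be illustrated in isolation. The following is a minimal sketch, not the Oxidized implementation: `JobCounter`, `node_count`, and the keyword arguments are hypothetical names chosen for illustration. It shows how taking the minimum of the incremented value, the node count, and the configured maximum keeps the wanted job count from exceeding the number of nodes. The sketch assigns to `@want` explicitly, since a bare `want = ...` inside a Ruby method binds a local variable rather than calling an accessor.

```ruby
# Hypothetical stand-in for a scheduler's job counter (not the Oxidized class).
class JobCounter
  attr_reader :want

  def initialize(max:, node_count:, want: 1)
    @max = max               # configured upper bound on worker threads
    @node_count = node_count # assumed: number of nodes available to schedule
    @want = want             # current target number of jobs
  end

  # Unclamped increment: can push the target past the node count.
  def naive_increment
    @want += 1
  end

  # Clamped increment: never exceeds the node count or the maximum.
  def increment
    @want = [@want + 1, @node_count, @max].min
  end
end

naive = JobCounter.new(max: 10, node_count: 3, want: 3)
naive.naive_increment
puts naive.want   # 4, more jobs wanted than nodes exist

clamped = JobCounter.new(max: 10, node_count: 3, want: 3)
clamped.increment
puts clamped.want # 3, held at the node count
```

Funnelling every increase through one clamped method means each caller (a periodic worker check or an on-demand "next" request) gets the same bounds without repeating the comparison.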