Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plugin: keep jobs in PRIORITY after reprioritization #407

Merged
merged 1 commit into from
Jan 8, 2024

Conversation

cmoussa1
Copy link
Member

@cmoussa1 cmoussa1 commented Jan 8, 2024

Problem

As noted in #406, the priority plugin will raise an exception on a job if it is held in SCHED state while the plugin is reloaded (or Flux is restarted) and jobs are reprioritized without first loading flux-accounting data to this plugin. This behavior is not graceful and we should instead continue to hold a job in PRIORITY while the plugin waits to receive flux-accounting data.


This PR adds a check of the plugin's internal map to see if we are still waiting on flux-accounting data to be loaded in; if so, it will continue to hold the job while we wait for data.

I've also added a sharness test that reproduces the issue raised in #406 and ensures that jobs continue to be held after a reprioritization without loading flux-accounting data to the priority plugin.

Problem: the priority plugin will raise an exception on a job if it is
held in SCHED state while the plugin is reloaded (or Flux is restarted)
and jobs are reprioritized without first loading flux-accounting data to
this plugin. This behavior is not graceful and we should instead
continue to hold a job in PRIORITY while the plugin waits to receive
flux-accounting data.

Add a check of the plugin's internal map to see if we are still waiting
on flux-accounting data to be loaded in; if so, continue to hold the job
while we wait for data.

Add a sharness test that reproduces the issue raised in flux-framework#406 and ensure
that jobs continue to be held after a reprioritization without loading
flux-accounting data to the priority plugin.
@cmoussa1 cmoussa1 added bug-fix A proposal for something that isn't working plugin related to the multi-factor priority plugin labels Jan 8, 2024
@cmoussa1 cmoussa1 marked this pull request as ready for review January 8, 2024 21:37
@cmoussa1 cmoussa1 requested a review from grondo January 8, 2024 21:37
Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM @cmoussa1!

I wonder if we should generate a release with this fix to go out with the next TOSS release?

@cmoussa1
Copy link
Member Author

cmoussa1 commented Jan 8, 2024

Thanks @grondo! Sure, I can generate a release today and have a ticket submitted by COB tomorrow so that it can go in the next TOSS release. Will work on this now 👍 setting MWP here

Copy link

codecov bot commented Jan 8, 2024

Codecov Report

Merging #407 (aa3d928) into master (85ce780) will not change coverage.
Report is 1 commits behind head on master.
The diff coverage is 100.00%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #407   +/-   ##
=======================================
  Coverage   81.00%   81.00%           
=======================================
  Files          20       20           
  Lines        1437     1437           
=======================================
  Hits         1164     1164           
  Misses        273      273           
Files Coverage Δ
src/plugins/mf_priority.cpp 78.72% <100.00%> (ø)

@mergify mergify bot merged commit 53cac9d into flux-framework:master Jan 8, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug-fix A proposal for something that isn't working merge-when-passing plugin related to the multi-factor priority plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants