all pending jobs killed after Flux update #406

grondo · 2024-01-08T19:03:38Z

After a flux-core update to v0.58.0, all pending jobs on fluke were killed with an exception from the mf_priority plugin. For example:

[root@fluke108:~]# flux job eventlog -H f2m1t75LN7vo
[Jan05 18:00] submit userid=61494 urgency=16 flags=0 version=1
[  +0.018063] jobspec-update attributes.system.bank="guests"
[  +0.018097] jobspec-update attributes.system.bank="guests"
[  +0.018144] validate
[  +0.034171] depend
[  +0.034206] priority priority=625
[Jan08 08:20] flux-restart
[  +0.000034] exception type="mf_priority" severity=0 note="not a member of guests"
[  +0.000062] priority priority=16
[  +0.000083] clean

The user, testqe, is in the guests bank though:

$ flux account view-bank --users guests | grep testqe
testqe            61494             guests            1                 175899119.29370007  0.00625           100


cmoussa1 · 2024-01-08T19:30:20Z

Without trying to construct a reproducer (yet), I believe this type of exception message is raised when a job's user/bank information is being updated in job.state.priority and the plugin cannot find a valid user/bank entry for the user that this job is submitted under:

flux-accounting/src/plugins/mf_priority.cpp

Lines 676 to 679 in 1a84215

    if (bank_it == it->second.end ()) {
        flux_jobtap_raise_exception (p, FLUX_JOBTAP_CURRENT_JOB,
                                     "mf_priority", 0,
                                     "not a member of %s", bank);

Looking at the timestamps in the eventlog, the sequence appears to be:

1. The job was submitted successfully (and received a priority).
2. Flux was restarted.
3. The priority plugin was loaded.
4. Jobs were reprioritized before the plugin received any flux-accounting data, so it rejected this job and presumably all pending jobs.
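To make that sequence concrete, here is a small Python model of it. This is only a sketch: the real plugin is C++, and `ToyPriorityPlugin`, `NotAMember`, and `reprioritize` are hypothetical names invented for illustration. The only things taken from the report are the job ID, userid, bank name, and the "not a member of" message.

```python
class NotAMember(Exception):
    """Stands in for the job exception the plugin raises (hypothetical)."""


class ToyPriorityPlugin:
    """Hypothetical, simplified model of the plugin's user/bank lookup."""

    def __init__(self):
        # userid -> {bank name -> shares}; empty until accounting data arrives
        self.users = {}

    def load_accounting_data(self, rows):
        # Each row is (userid, bank, shares), loosely mirroring view-bank output
        for userid, bank, shares in rows:
            self.users.setdefault(userid, {})[bank] = shares

    def priority_callback(self, userid, bank):
        banks = self.users.get(userid, {})
        if bank not in banks:
            # An empty map looks exactly like "user is not in this bank"
            raise NotAMember(f"not a member of {bank}")
        return banks[bank]  # placeholder priority, not the real calculation


def reprioritize(plugin, pending_jobs):
    """Run the priority callback on every pending job and record the outcome."""
    results = {}
    for jobid, (userid, bank) in pending_jobs.items():
        try:
            results[jobid] = plugin.priority_callback(userid, bank)
        except NotAMember as exc:
            results[jobid] = f"exception: {exc}"
    return results


pending = {"f2m1t75LN7vo": (61494, "guests")}

plugin = ToyPriorityPlugin()             # freshly loaded, no accounting data yet
before = reprioritize(plugin, pending)   # reprioritization happens too early

plugin.load_accounting_data([(61494, "guests", 1)])
after = reprioritize(plugin, pending)    # same job, after data has been loaded

print(before)
print(after)
```

In this model the same job is rejected or accepted depending only on whether the data arrived before the reprioritization, which matches the ordering problem described above.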

I'll see if I can reproduce this behavior.

cmoussa1 · 2024-01-08T20:02:22Z

OK, I think I was able to reproduce this without restarting Flux, just by unloading and reloading the plugin. If I have a number of jobs in SCHED state (i.e., they've received a priority and are waiting to run), reload the plugin without updating it with flux-accounting information, and call reprioritize on all jobs:

(flux.Flux().rpc("job-manager.mf_priority.reprioritize"))

they will have an exception raised on them saying that the plugin cannot find a valid user/bank entry for those previously held jobs.
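For anyone repeating this, the reprioritize call can be wrapped in a small helper. This is a sketch, not part of flux-accounting: `reprioritize_all` is a hypothetical name, and only the topic string and the `rpc()` call come from the snippet above. The handle is passed in so the helper can be exercised without a running Flux instance.

```python
REPRIORITIZE_TOPIC = "job-manager.mf_priority.reprioritize"


def reprioritize_all(handle):
    """Send the plugin's reprioritize RPC on an existing Flux handle.

    `handle` is expected to behave like flux.Flux(); whatever rpc()
    returns is handed back to the caller unchanged.
    """
    return handle.rpc(REPRIORITIZE_TOPIC)
```

Against a live instance this would be called as `reprioritize_all(flux.Flux())` after `import flux`, which is equivalent to the one-liner above.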

Is there a process in restarting Flux where jobs could be reprioritized? My thinking is that might have been what caused this.

In any case, the plugin should probably handle this case more gracefully so a bunch of users' jobs don't get canceled if Flux gets restarted.

I'll need to test this, but off the top of my head, I think I can include a check in the callback for job.state.priority that checks the plugin's internal map for data before deciding what to do with the job going through reprioritization.

If the plugin's internal map is empty (i.e., it is waiting for flux-accounting information), it can continue to hold the job in PRIORITY until it loads some information. This would be similar to the behavior in the callback for job.validate.
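A rough Python model of that check, to show the intended decision order. This is not the plugin's code (which is C++): `Outcome`, `decide`, and the dictionary layout are assumptions made up for this sketch.

```python
from enum import Enum


class Outcome(Enum):
    HOLD = "hold job in PRIORITY"
    REJECT = "raise mf_priority exception"
    PRIORITIZE = "assign a priority"


def decide(users, userid, bank):
    """Decide what to do with a job going through job.state.priority.

    `users` models the plugin's internal map: userid -> {bank -> shares}.
    """
    if not users:
        # No flux-accounting data yet: membership cannot be judged, so wait
        return Outcome.HOLD
    if bank not in users.get(userid, {}):
        # Data is loaded and the user/bank pair really is missing
        return Outcome.REJECT
    return Outcome.PRIORITIZE


print(decide({}, 61494, "guests"))
print(decide({61494: {"guests": 1}}, 61494, "guests"))
print(decide({61494: {"guests": 1}}, 61494, "other"))
```

The point of the ordering is that the empty-map test comes first, so "no data yet" is never mistaken for "not a member".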

grondo · 2024-01-08T20:15:10Z

Nice job debugging @cmoussa1!

All jobs are prioritized any time a jobtap plugin is loaded, so I had assumed this would happen after mf_priority.so is loaded.

Problem: the priority plugin will raise an exception on a job if it is held in SCHED state while the plugin is reloaded (or Flux is restarted) and jobs are reprioritized without first loading flux-accounting data to this plugin. This behavior is not graceful and we should instead continue to hold a job in PRIORITY while the plugin waits to receive flux-accounting data.

Add a check of the plugin's internal map to see if we are still waiting on flux-accounting data to be loaded in; if so, continue to hold the job while we wait for data.

Add a sharness test that reproduces the issue raised in flux-framework#406 and ensure that jobs continue to be held after a reprioritization without loading flux-accounting data to the priority plugin.

grondo · 2024-01-09T16:12:16Z

Closed by #407?

cmoussa1 · 2024-01-09T16:35:10Z

Ah, yes, this should be closed by #407. Sorry that I didn't close this yesterday. Closing now.

cmoussa1 added the bug label Jan 8, 2024

cmoussa1 self-assigned this Jan 8, 2024

cmoussa1 mentioned this issue Jan 8, 2024

plugin: keep jobs in PRIORITY after reprioritization #407

Merged

grondo mentioned this issue Jan 10, 2024

priority plugin posting identical jobspec-update event twice flux-framework/flux-core#5671

Closed

cmoussa1 closed this as completed Jan 9, 2024
