Session seems to hang - how to troubleshoot? #126
-
When I start a session on a SLURM-enabled instance, I can see that the files import successfully. After the import, I can see jobs spawn for motion correction and CTF correction. In the nextPYP UI, under "jobs", I see: "Name: spr_session, Size: 5, Status: Running". However, if I click the "Logs" button under that line, there are no logs available.

I did snoop around the shared directory, and under 'nextpyp/shared/sessions/fVpcv8myM877fWdg/Test-fVjFIJrvWTxl2BMJ/Test_6-QLhZfIwQ/log' I can see logs for what look like motion correction and CTF correction. From what I can tell from those outputs, they ran successfully. However, the listed "spr_session" job never ends and no micrographs ever appear in the web UI.

How do I go about troubleshooting this issue? Which log files should I look at? (There are quite a few.) Thanks

EDIT: Btw, everything with this data seems to work fine in non-Session mode.
Replies: 4 comments 2 replies
-
The |
-
OK, after further investigation, the most obvious error I've found says:
I'm guessing this means that the aligned micrograph is not being written successfully. I've attached the log and will keep investigating what the cause might be.
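Since there are quite a few log files, here is a minimal sketch of how one could scan a session's log directory for error-like lines instead of opening each file by hand. The function name `find_error_lines` and the keyword list are my own assumptions, not anything nextPYP provides, and the keywords are only a guess at what a failure looks like in these logs.

```python
from pathlib import Path

# Hypothetical keyword list: a guess at what failures look like, adjust as needed.
KEYWORDS = ("error", "traceback", "no such file", "failed")


def find_error_lines(log_dir, keywords=KEYWORDS):
    """Return (file path, line number, text) for each line containing a keyword."""
    hits = []
    for path in sorted(Path(log_dir).rglob("*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                lowered = line.lower()
                if any(keyword in lowered for keyword in keywords):
                    hits.append((str(path), number, line.rstrip()))
    return hits


# Example use (the directory is whatever 'log' folder your session wrote to):
# for path, number, text in find_error_lines("path/to/session/log"):
#     print(f"{path}:{number}: {text}")
```

The matching is case-insensitive and deliberately broad, so expect some false positives.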
-
A bit more information. Here are the contents of the scratch dir:
When I try to download and examine the .webp file (which I assume should be the preview of the aligned frames), it's just black. Weirdly, ctffind4.mrc does seem to exist, but it appears to be the same size as the gain reference, which makes me think it's just the gain.
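To go beyond eyeballing the sizes, something like the sketch below could confirm whether two files are actually byte-for-byte identical. It uses only the Python standard library; `compare_files` is a name I made up, and the file paths in the usage comment are placeholders. Equal size alone is not conclusive, which is why the full content comparison is included.

```python
import filecmp
import os


def compare_files(path_a, path_b):
    """Report both file sizes and whether the contents are byte-identical."""
    size_a = os.path.getsize(path_a)
    size_b = os.path.getsize(path_b)
    # shallow=False forces a real content comparison instead of trusting
    # the os.stat() signature (size and modification time).
    identical = size_a == size_b and filecmp.cmp(path_a, path_b, shallow=False)
    return {"size_a": size_a, "size_b": size_b, "identical": identical}


# Example use (placeholder paths, substitute the files from the scratch dir):
# print(compare_files("ctffind4.mrc", "path/to/gain_reference.mrc"))
```

If `identical` comes back `True`, the file really is a copy of the gain reference rather than an aligned micrograph of the same dimensions.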
-
This seems to be related to linking instead of copying. When linking files, nextPYP hangs; when copying, things run OK.
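I have not confirmed why linking fails. One thing worth ruling out (my guess only) is that the links do not resolve from wherever the jobs actually run. A rough sketch of such a check, using only the standard library, with `check_links` being a hypothetical helper name:

```python
import os


def check_links(directory):
    """List each symlink in a directory with its resolved target and whether it exists."""
    report = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.islink(path):
            target = os.path.realpath(path)
            # os.path.exists follows the link, so it is False for a dangling link.
            report.append((name, target, os.path.exists(path)))
    return report


# Example use (run it on the node where the job runs, not only on the login node):
# for name, target, ok in check_links("path/to/session/raw"):
#     print(("OK     " if ok else "BROKEN ") + name + " -> " + target)
```

If everything reports OK on the compute node as well, the problem is probably elsewhere and copying remains the workaround.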