a0py (climatedt-vm) rebuild of the database #72
Comments
In GitLab by @dbeltrankyl on Mar 15, 2024, 16:17 Hello @mcastril, I recovered part of the database on my laptop. I used the last row of each TOTAL_STAT file and the .cmd files on the remote (because autosubmit inspect overwrites the local ones). You can visualize it with sqlitebrowser. It is missing some fields, like children and job_id. Some jobs didn't have a finished STAT file (this means that AS was not aware that the job was finished, I guess due to the set status issue). Python script: |
In GitLab by @kinow on Mar 18, 2024, 14:04 This script looks useful, Dani! Maybe we can include this in GitLab, somewhere like
The configuration should contain some of that, like ncpus, wallclock, qos, but it may have changed, right? Would it be possible to locate the missing information in some other logs? Like parsing the cmd file and getting the #SBATCH parameters? And get the start/finish/status/etc from the job file? And great job!!! |
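A minimal sketch of the idea of reading the `#SBATCH` parameters back out of a .cmd file. It assumes the job scripts use long-form Slurm directives (`--option=value` or `--option value`); the function name `parse_sbatch` and the sample script are hypothetical and are not taken from the script discussed in this thread.

```python
import re

# Matches lines such as "#SBATCH --nodes=4" or "#SBATCH --time 00:30:00".
# Short options (e.g. "-N 4") are deliberately not handled in this sketch.
SBATCH_RE = re.compile(r"^#SBATCH\s+--([A-Za-z0-9-]+)(?:[=\s]+(\S+))?\s*$")


def parse_sbatch(cmd_text: str) -> dict:
    """Collect long-form #SBATCH options from the text of a job script."""
    options = {}
    for line in cmd_text.splitlines():
        match = SBATCH_RE.match(line.strip())
        if match:
            key, value = match.groups()
            # Flags without a value (e.g. --exclusive) are stored as True.
            options[key] = value if value is not None else True
    return options


# Hypothetical .cmd content, for illustration only.
example = """#!/bin/bash
#SBATCH --nodes=4
#SBATCH --time 00:30:00
#SBATCH --qos=normal
#SBATCH --exclusive
echo running
"""

print(parse_sbatch(example))
```

Parsing the remote copy rather than the local one matters here, since the local .cmd files can be regenerated with different values.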
In GitLab by @dbeltrankyl on Mar 18, 2024, 14:15 Hello @kinow , thanks!
Yes, I parsed the remote .cmd files because of that. If you check the db, you can see that the SIM jobs have a different number of nodes and ncpus. The .cmd files on the local machine are not enough, because inspect may overwrite them.
My issue is with the TOTAL_STAT files. They are filled in by Autosubmit when it acknowledges that the job is ready, started, submitted, or finished, and some of them don't have the finished timestamp. I guess it happened due to the issues with all jobs going to waiting. |
In GitLab by @kinow on Mar 18, 2024, 14:17
Could we use one of those... |
In GitLab by @dbeltrankyl on Mar 18, 2024, 16:25 I've downloaded the remote _STAT files, filled in some of the missing data, and corrected timestamps like 19700101020000. Also, I realized that I stored the datetime when it should be the timestamp. Edit: still missing stuff:
|
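A small sketch of the datetime-versus-timestamp fix mentioned above. It assumes the STAT values are `YYYYMMDDHHMMSS` strings and, for the sake of a reproducible example, interprets them as UTC; the real files may use the local time of the platform. The helper names `to_epoch` and `looks_unset` are hypothetical.

```python
from datetime import datetime, timezone

# Assumed layout of the values, e.g. "19700101020000".
STAT_FORMAT = "%Y%m%d%H%M%S"


def to_epoch(value: str) -> int:
    """Convert a YYYYMMDDHHMMSS string to a Unix timestamp (assuming UTC)."""
    parsed = datetime.strptime(value, STAT_FORMAT).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def looks_unset(value: str) -> bool:
    """Flag values dated 1970, which most likely mean the time was never recorded."""
    return value.startswith("1970")


print(to_epoch("20240315161700"))
print(looks_unset("19700101020000"))
```

Rows flagged by `looks_unset` would still need a real value from another source, such as the remote _STAT files.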
In GitLab by @mcastril on Mar 18, 2024, 20:11 Thank you Dani, I was reviewing the data and I was going to tell you about the timestamps. The numbers make sense to me, but I don't see the OPA and APP jobs. I think they were missing, too. |
In GitLab by @dbeltrankyl on Mar 19, 2024, 09:26 Ah, I only looked at the SIM ones. I have an issue with the OPA and APP files. The permissions are weird:
I can't download some of them. Can someone add R for others (or change the group to autosubmit_users) on the cmd, TOTAL_STAT, and STAT files, or on all files under |
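For illustration, the "add R for others" request could be sketched as below. This is not the command that was eventually run; the function name `add_read_for_others`, the directory argument, and the `.cmd` / `_STAT` file name suffixes are assumptions.

```python
import os
import stat
from pathlib import Path


def add_read_for_others(root: str, suffixes=(".cmd", "_STAT")) -> list:
    """Add the read bit for "others" to matching files under root.

    The "_STAT" suffix also covers names ending in "_TOTAL_STAT".
    Returns the paths whose mode was changed.
    """
    changed = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.name.endswith(suffixes):
            mode = path.stat().st_mode
            if not mode & stat.S_IROTH:
                # Keep the existing bits and only add o+r.
                os.chmod(path, mode | stat.S_IROTH)
                changed.append(str(path))
    return changed
```

Other users also need execute (search) permission on every parent directory to reach the files, which this sketch does not touch.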
In GitLab by @kinow on Mar 19, 2024, 09:31
@dbeltrankyl, some days ago Kai mentioned |
In GitLab by @ainagaya on Mar 19, 2024, 09:37 On it! |
In GitLab by @ainagaya on Mar 19, 2024, 09:39
|
In GitLab by @dbeltrankyl on Mar 19, 2024, 09:40 Yes, thanks! |
In GitLab by @ainagaya on Mar 19, 2024, 09:43 Done :) (in the end I had to do |
In GitLab by @dbeltrankyl on Mar 19, 2024, 09:45 Yes, thanks Aina. Now I can download them |
In GitLab by @ainagaya on Mar 21, 2024, 09:35 Hi, I got this message from Kai:
Is this related to this issue? Sorry I'm a bit lost. |
In GitLab by @dbeltrankyl on Mar 21, 2024, 09:42 I think so. Are you using v4.1.2 yet? v4.1.2 should be able to store the running jobs (well, if they start and finish with 4.1.2). If not, I need to add this week's sim_chunks to the .fixed db. |
In GitLab by @ainagaya on Mar 21, 2024, 09:51 a0py still uses the dev-8 version. Do you think it is safe to migrate? |
In GitLab by @dbeltrankyl on Mar 21, 2024, 11:47 I think so. We talked about this in the AS meeting, and Miguel mentioned discussing it in the afternoon meeting. |
In GitLab by @mcastril on Apr 4, 2024, 14:43 Hi Dani, what's the current database status? |
In GitLab by @dbeltrankyl on Apr 4, 2024, 15:04 Hello, I did not touch the original one, but I have a local copy with OPA, APP, and SIM data. I can apply the fix at any moment (I need to download the latest info). Is the production experiment running (or is it finished?) with 4.1.2+? If so, I can apply the fix tomorrow. |
In GitLab by @dbeltrankyl on Apr 4, 2024, 15:11 I see that it is still using the dev one. In that case, I can apply the fix, but we will have the same issue for newer jobs. If that is okay, I can upload it anyway. |
In GitLab by @mcastril on Apr 24, 2024, 12:52 a0py has been using v4.1.2 for a few weeks now. |
In GitLab by @manuel-g-castro on Mar 14, 2024, 10:50
@dbeltrankyl and @LuiggiTenorioK to figure out how to repopulate the database of the production run of the ifs+nemo workflow on LUMI, a0py. All of this to get the sypd, chpy, etc.
fyi @ainagaya