Raw files at NERSC with .fits.grid extensions #428

Open
jchiang87 opened this issue Dec 2, 2021 · 3 comments

Comments

@jchiang87
Contributor

While ingesting the data in /global/cfs/cdirs/lsst/shared/DC2-prod/Run2.2i/sim/y4-wfd/ into a gen3 repo at NERSC, the ingest code encountered files with .fits.grid extensions in /global/cfs/cdirs/lsst/shared/DC2-prod/Run2.2i/sim/y4-wfd/00741720/. These appear to be copies of files in the same directory with .fits extensions. If they are copies, they should be deleted (or at least moved elsewhere) since their presence causes problems for the ingest. I'm about to ingest the y5-wfd data and will report here if I see similar issues.

@heather999
Collaborator

Just want to reach out to @airnandez for confirmation that the .fits.grid files are indeed just copies of the .fits. I also believe that is the case. If so, I don't see any reason to retain the .fits.grid files and we can remove them.

@jchiang87
Contributor Author

For the record, I didn't see any similar issues with the ingest of the y5-wfd data, so it appears that the y4-wfd/00741720 folder is the only instance of this.

@airnandez

I looked at the directory y4-wfd/00741720 at CC-IN2P3 and I see 12 files with the .fits.grid extension. Their contents are not identical to those of their .fits counterparts. For instance:

$ cd /sps/lsst/datasets/desc/DC2/Run2.2i/sim/y4-wfd/00741720

$ md5sum lsst_a_741720_R32_S01_u.fits.grid lsst_a_741720_R32_S01_u.fits
bc2c3bbbb71944767a720c838f7c2d53  lsst_a_741720_R32_S01_u.fits.grid
bd0a2fd67126cfd8b55547857d4ccdf1  lsst_a_741720_R32_S01_u.fits
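
To repeat this check for all of the .fits.grid files in the directory rather than a single pair, something along the following lines might work. This is only a sketch: the helper names `md5_of` and `compare_grid_files` are made up for illustration, and it assumes each .fits.grid file sits next to a counterpart with the same name minus the .grid suffix.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_grid_files(directory):
    """Return a list of (grid_file_name, status) for every *.fits.grid file.

    status is "identical", "different", or "no counterpart".
    """
    results = []
    for grid_path in sorted(Path(directory).glob("*.fits.grid")):
        # Dropping the last suffix turns "x.fits.grid" into "x.fits".
        fits_path = grid_path.with_suffix("")
        if not fits_path.exists():
            status = "no counterpart"
        elif md5_of(grid_path) == md5_of(fits_path):
            status = "identical"
        else:
            status = "different"
        results.append((grid_path.name, status))
    return results
```

Given the checksums above, the lsst_a_741720_R32_S01_u pair would be reported as "different".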

To my knowledge, those .grid files were not used at all for DC2. I observe that the .fits files are newer than the ones with the .fits.grid extension. If I remember correctly, some Y4 and Y5 simulation jobs run on the grid had problems, and the failed jobs had to be resubmitted even though they had already produced outputs that had been transferred to CC-IN2P3 for permanent storage. Once the issue was found, the simulation was either resubmitted to the grid or executed at NERSC.

I think this issue is related to this Slack conversation.

For the specific issue of ingesting raws into a butler gen3 registry, it is possible to specify *.fits as an argument of the butler ingest-raws ... command so that it does not pick up the .grid files.
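
If the ingest is driven from a script rather than the shell, the same filtering can be done before the ingest step is called. A minimal sketch, assuming the raws all sit directly in one visit directory; `select_raw_fits` is a hypothetical helper, not part of the butler API:

```python
from pathlib import Path


def select_raw_fits(directory):
    """Return sorted paths of files whose names end in exactly '.fits'.

    Files such as 'x.fits.grid' are skipped because their names do not
    end in '.fits'.
    """
    return sorted(
        p
        for p in Path(directory).iterdir()
        if p.is_file() and p.name.endswith(".fits")
    )
```

The resulting list can then be handed to whatever ingest step is used in place of the bare directory, which has the same effect as the *.fits glob on the command line.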
