Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prevent writing of completely empty files #323

Open
ntessore opened this issue Jun 1, 2021 · 20 comments
Open

prevent writing of completely empty files #323

ntessore opened this issue Jun 1, 2021 · 20 comments
Labels
feature request An issue requesting a new feature

Comments

@ntessore
Copy link
Contributor

ntessore commented Jun 1, 2021

Creating a new FITS file with mode rw and not doing anything with the file results in a corrupt FITS file:

>>> from fitsio import FITS
>>> fits = FITS('test.fits', mode='rw')
>>> fits.close()
>>> fits = FITS('test.fits', mode='rw')
Traceback (most recent call last):
...
This does not look like a FITS file.
@esheldon
Copy link
Owner

esheldon commented Jun 1, 2021

You need to write something. To create an empy first HDU, write None

import fitsio

fname = 'test.fits'
with fitsio.FITS(fname, mode='rw', clobber=True) as fits:
    fits.write(None)

with fitsio.FITS(fname) as fits:
    assert len(fits) > 0
    assert not fits[0].has_data()

@ntessore
Copy link
Contributor Author

ntessore commented Jun 1, 2021

While writing an empty PHDU at the beginning works around the issue, I still believe that opening a file and not doing anything to the object should either not create the file, or should create a file that is readable (e.g. add the empty PHDU automatically). I found it odd that one can so easily engineer a file using fitsio that fitsio itself then cannot read.

@esheldon esheldon added the feature request An issue requesting a new feature label Jun 1, 2021
@esheldon
Copy link
Owner

esheldon commented Jun 1, 2021

I converted this to a feature request

@ntessore ntessore changed the title new empty FITS files cannot be read prevent writing of completely empty files Jun 1, 2021
@esheldon
Copy link
Owner

esheldon commented Jun 1, 2021

There is something to discuss about this.

If we implemented this feature request then it would not be possible to update the file at a later date to contain data in the first HDU

There are probably two ways to do deal with this so it makes sens

  1. Automatically write an empty extension if the user never wrote anything.
  2. raise an exception if the user never wrote anything

My preference would be 2, but I'm happy to discuss

@ntessore
Copy link
Contributor Author

ntessore commented Jun 1, 2021

Is it not feasible to simply unlink the new file again if nothing was written to it? Using the open handle so there is no issue with something else having written over the file in the meantime

@esheldon
Copy link
Owner

esheldon commented Jun 1, 2021

That would not match expectations for python file opening; if you use open(fname, 'w') it will create the file

@beckermr
Copy link
Collaborator

beckermr commented Jun 1, 2021

I think we have to write the file with zero bytes and then on opening either delete it if it is zero length for rw or build a read-only emulation of an empty fits file in the python layers.

@ntessore
Copy link
Contributor Author

ntessore commented Jun 1, 2021

In which case I would say option 1 with the empty PHDU written just before closing an empty file. The user can never add data into the first HDU of an invalid file anyway.

@beckermr
Copy link
Collaborator

beckermr commented Jun 1, 2021

We could also raise for raise for read attempts on empty files

@esheldon
Copy link
Owner

esheldon commented Jun 1, 2021

The code does raise OSError on trying to open an empty file with mode read only

@esheldon
Copy link
Owner

esheldon commented Jun 1, 2021

I will argue that If you open a file with mode rw and don't write anything it is a bug

@beckermr
Copy link
Collaborator

beckermr commented Jun 1, 2021

Fair point on a that being a bug.

I guess the question is whether we should or should not match the python vs fits file semantics.

We should decide on that and then make the behavior in the code match.

@esheldon
Copy link
Owner

esheldon commented Jun 2, 2021

Note this is underlying cfitsio behavior not fitsio behavior; the same thing happens if you open a file with cfitsio and don't write anything, it leaves an empty file and it can't read it!

@ntessore
Copy link
Contributor Author

ntessore commented Jun 2, 2021

In that case I’m happy to close this as wontfix. I still find it odd, but it’s probably not worth it to work around cfitsio quirks.

@beckermr
Copy link
Collaborator

beckermr commented Jun 2, 2021

Ok. I agree then that adding an explicit error that is raised if an empty file is closed, while not pythonic, is the best way to match cfitsio in a user friendly way.

If we'd rather match python semantics, then we'd need to put in special handling for empty files.

@esheldon
Copy link
Owner

esheldon commented Jun 2, 2021

If I understand your meaning, I think that what we do now matches python semantics: If you open a file for write, and don't write anything, it just leaves an empty file, like open()

This matches cfitsio too

But this is structured data and that is invalid FITS. So we could take a stance and say "no you can't write an invalid file"

This will be a breaking change for someone out there.

@beckermr
Copy link
Collaborator

beckermr commented Jun 2, 2021

It is not fully python semantics currently IIUIC. Opening an empty file for rw is valid for a generic python file.

@beckermr
Copy link
Collaborator

beckermr commented Jun 2, 2021

Ack you are right that if we error on write this won't match cfitsio.

@ntessore
Copy link
Contributor Author

ntessore commented Jun 2, 2021

As I said, I didn't realise that cfitsio has this same behaviour, so I'm happy with no change here.

If change is desirable, then I think you only need to decide whether opening a FITS for writing and closing it again should be a NOP or create an empty FITS file. It it means the former then you can safely remove empty files, and If it means the latter then it should write the None extension of an empty FITS file, since a completely empty file does not map to anything in the FITS universe.

@esheldon
Copy link
Owner

esheldon commented Jun 2, 2021

detecting the empty file would require a bit of care, since cfitsio buffers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request An issue requesting a new feature
Projects
None yet
Development

No branches or pull requests

3 participants