Photometric Interpretation of the dataset is YBR_FULL_422 [...] You may need to change the Photometric Interpretation to the correct value #240
Comments
I'm not sure - I would test loading / interacting with the dataset first and seeing if it's an issue with pydicom or within deid. If it's within deid, then we should figure out what changes are being done to lead to the issue. My suggestion:
pinging @wetzelj for thoughts! |
I'm fairly new to pydicom at large, I will try loading and saving with only pydicom and see what my luck is. You're suggesting just the act of loading the files and writing them back out with minimal to no changes, correct? |
Yes! And then ping the pydicom maintainers in case they have an idea. Probably you'll wind up back here, in which case I'll need a dummy dataset to reproduce your error and work from. And no worries about being new to pydicom - welcome! |
Unfortunately I haven't seen this issue - and don't think I've dealt with YBR_FULL_422 images at all. That said, I noticed that in clean.py deid is switching the PhotometricInterpretation to RGB in order to obtain the pixel_array. This makes me wonder if we're doing something incorrectly when masking the pixels on a YBR_FULL_422 image, and the error is then produced by pydicom when the file is written. Ultimately, I think we're going to need a sample dataset in order to debug fully. |
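For anyone following along, here is a rough sketch of the arithmetic behind the pydicom warning. It is not deid or pydicom code; the names `BYTES_PER_TWO_PIXELS`, `expected_length` and `matching_interpretations` are made up for illustration, and it assumes 8-bit samples. YBR_FULL_422 shares one Cb and one Cr sample between every two pixels, so it needs 4 bytes per pixel pair, while RGB and YBR_FULL need 6. The byte counts reported in the warnings in this thread fit pixel data that is laid out as full 3-samples-per-pixel data while the header still says YBR_FULL_422, which would be consistent with the suspicion above, though that is not confirmed.

```python
# Bytes needed for every two 8-bit pixels under each Photometric Interpretation.
# YBR_FULL_422 stores Y, Y, Cb, Cr for a pair of pixels (chroma is shared).
BYTES_PER_TWO_PIXELS = {
    "RGB": 6,
    "YBR_FULL": 6,
    "YBR_FULL_422": 4,
}


def expected_length(n_pixels, photometric):
    """Uncompressed pixel data length in bytes (n_pixels covers all frames)."""
    return n_pixels * BYTES_PER_TWO_PIXELS[photometric] // 2


def matching_interpretations(actual_bytes, n_pixels):
    """Interpretations whose expected length equals the actual length."""
    return [
        name
        for name in BYTES_PER_TWO_PIXELS
        if expected_length(n_pixels, name) == actual_bytes
    ]


# Figures taken from the warning on the dicom-cookies image in this thread.
actual = 9437184        # length of the pixel data that was found
expected_422 = 6291456  # length pydicom expected for YBR_FULL_422
n_pixels = expected_422 // 2  # 2 bytes per pixel on average for 4:2:2

print(matching_interpretations(actual, n_pixels))
```

The same check works with the numbers from the original report (21823488 bytes found, 14548992 expected). It only shows which header values fit the byte count; it does not say where in the pipeline the mismatch is introduced.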
+1! I hope you can provide us with a dataset to reproduce @johnavitable. I'm suspicious of the same. |
So, as a sanity check, I ran over the dicom cookies dataset and encountered the same issue, so it would appear this is not specific to my own datasets. The dicom cookies are all YBR_FULL_422 as well. I've tried a variety of Python versions; the latest I tried with dicom cookies was 3.10.7. When running through dicom cookies, it throws the error on image1.dcm, and the rest hit the blacklist but are also rendered unviewable. I'm not sure if maybe something changed with numpy or with a new version of python? I was trying to spin up the docker container, but based on the Dockerfile, it didn't look like it was doing much that I was missing. Not sure what other info you may need given it's present in the sample data. Below is the code that I'm using to run the anonymization; I'm no expert, but I think this should be doing the trick:
|
Thanks @javitab! I'll add this to my list of TODO but it's quite chonky at the moment so I might not get to testing it out immediately. Thanks for figuring out a reproducing case! We did recently have changes to the cleaner (the coordinate system was off I think) so I wouldn't be surprised if there is a bug. |
Thanks, that sounds plausible to me. I think I saw an instance or two where there was an RGB image that was almost viewable, but the "green" layer was skewed, if that makes sense. |
Okay got a chance to run this, and on the dicom cookies as you suggested! I don't think I see the error block?
$ python test.py
Session ID: RbEPDAqbQyD8usnKtpDVb-8pCrExTJ6n5CbmZ_wCpvU
Please put files in input folder as named above (ENTER)
LOG Discovering DICOM files
###DICOM File: image5.dcm:cookie-47 - falling disk - M
YBR_FULL_422
###DICOM File: image3.dcm:cookie-47 - still salad - F
YBR_FULL_422
###DICOM File: image6.dcm:cookie-47 - noisy feather - M
YBR_FULL_422
###DICOM File: image7.dcm:cookie-47 - frosty paper - F
YBR_FULL_422
###DICOM File: image2.dcm:cookie-47 - billowing mode - F
YBR_FULL_422
###DICOM File: image1.dcm:cookie-47 - nameless waterfall - F
YBR_FULL_422
###DICOM File: image4.dcm:cookie-47 - flat glade - M
YBR_FULL_422
WARNING Problem loading stock.deid.dicom, skipping.
WARNING Problem loading stock.deid.dicom, skipping.
### Reading DICOM File 1/7: image5.dcm:cookie-47 - falling disk - M
{'flagged': True,
'results': [{'coordinates': [],
'group': 'blacklist',
'reason': ' ImageType missing or ImageType empty '}]}
Scrubbing data/RbEPDAqbQyD8usnKtpDVb-8pCrExTJ6n5CbmZ_wCpvU/input/image5.dcm.
/home/vanessa/Desktop/Code/deid/env/lib/python3.9/site-packages/pydicom/pixel_data_handlers/numpy_handler.py:341: UserWarning: The Photometric Interpretation of the dataset is YBR_FULL_422, however the length of the pixel data (9437184 bytes) is a third larger than expected (6291456 bytes) which indicates that this may be incorrect. You may need to change the Photometric Interpretation to the correct value.
warnings.warn(msg)
### Reading DICOM File 2/7: image3.dcm:cookie-47 - still salad - F
{'flagged': True,
'results': [{'coordinates': [],
'group': 'blacklist',
'reason': ' ImageType missing or ImageType empty '}]}
Scrubbing data/RbEPDAqbQyD8usnKtpDVb-8pCrExTJ6n5CbmZ_wCpvU/input/image3.dcm.
### Reading DICOM File 3/7: image6.dcm:cookie-47 - noisy feather - M
{'flagged': True,
'results': [{'coordinates': [],
'group': 'blacklist',
'reason': ' ImageType missing or ImageType empty '}]}
Scrubbing data/RbEPDAqbQyD8usnKtpDVb-8pCrExTJ6n5CbmZ_wCpvU/input/image6.dcm.
### Reading DICOM File 4/7: image7.dcm:cookie-47 - frosty paper - F
{'flagged': True,
'results': [{'coordinates': [],
'group': 'blacklist',
'reason': ' ImageType missing or ImageType empty '}]}
Scrubbing data/RbEPDAqbQyD8usnKtpDVb-8pCrExTJ6n5CbmZ_wCpvU/input/image7.dcm.
### Reading DICOM File 5/7: image2.dcm:cookie-47 - billowing mode - F
{'flagged': True,
'results': [{'coordinates': [],
'group': 'blacklist',
'reason': ' ImageType missing or ImageType empty '}]}
Scrubbing data/RbEPDAqbQyD8usnKtpDVb-8pCrExTJ6n5CbmZ_wCpvU/input/image2.dcm.
### Reading DICOM File 6/7: image1.dcm:cookie-47 - nameless waterfall - F
{'flagged': True,
'results': [{'coordinates': [],
'group': 'blacklist',
'reason': ' ImageType missing or ImageType empty '}]}
Scrubbing data/RbEPDAqbQyD8usnKtpDVb-8pCrExTJ6n5CbmZ_wCpvU/input/image1.dcm.
### Reading DICOM File 7/7: image4.dcm:cookie-47 - flat glade - M
{'flagged': True,
'results': [{'coordinates': [],
'group': 'blacklist',
'reason': ' ImageType missing or ImageType empty '}]}
Scrubbing data/RbEPDAqbQyD8usnKtpDVb-8pCrExTJ6n5CbmZ_wCpvU/input/image4.dcm.
LOG Getting current identifiers
LOG Removing identifiers
LOG Identifiers removed from 7 files.
LOG Anonymization complete. Files have been written to data/RbEPDAqbQyD8usnKtpDVb-8pCrExTJ6n5CbmZ_wCpvU/output/
But I'm not sure these are the best reproducing cases to test - these were fake images I made, mostly for the header parsing. Do we have a reproducing image that wasn't artificially made by me? |
I've been doing some more testing and have a few discoveries. I've attached some sample data and information from my environment. More specifically, I have noticed that, while all of the dicom-cookies samples get destroyed when I run through them, for the most part files that contain a static image succeed, while anything that includes a cine clip gets destroyed. Further, for the ones that do get destroyed (the cine clips), the file size grows by several orders of magnitude. I haven't included the output from my runs because it is far too large to upload here, but if it is needed, I'm happy to work on sharing that data. Lastly, regarding the difference between what @vsoch gets after running the same script and what I get: I noticed in your output that you're on python 3.9. I switched to 3.9 from 3.11, though that unfortunately didn't make any difference. I've also included the output of Thanks again! EDIT: I've also tried running all of this through the docker container with no difference. |
Thanks! I'm not sure what further help I can offer - this isn't my primary area of work for several years and I lack the expertise. If someone that has the expertise wants to take charge of investigating this and opening a PR it would be greatly appreciated. |
Would you by any chance be able to share the output of a pip freeze from where you ran the code earlier? At the very least, I'm confused as to why we get different outcomes on the dicom-cookies dataset. I was looking at a prior issue that's been logged here about RGB cine loops, so I'm wondering if my problem with dicom-cookies might point at anything else. |
The environment above is long gone - but here is my Python:
python --version
Python 3.9.12
I use anaconda and basically create a new venv, source it, then |
I think it's almost time for a python update - my pip freeze seems borked :(
|
I'm trying to use DicomCleaner. I do my detect, followed by the clean and save; however, I get the error message below, which appears to be coming from pydicom, so I'm not sure whether this issue belongs there instead. I am successful with, for example, a demographics page (RGB); however, any of the other dcm files that have a Photometric Interpretation of YBR_FULL_422 throw the error below, and when I try to load them in, for example, Micro Dicom Viewer, the image is a bunch of nonsense.
Do I really need to handle this myself or am I missing something? This seems to be happening the same with different manufacturers as well.
C:\git\dicomtools\venv\Lib\site-packages\pydicom\pixel_data_handlers\numpy_handler.py:250: UserWarning: The Photometric Interpretation of the dataset is YBR_FULL_422, however the length of the pixel data (21823488 bytes) is a third larger than expected (14548992 bytes) which indicates that this may be incorrect. You may need to change the Photometric Interpretation to the correct value.