-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-38546: Implement new CalibrateImageTask #802
Conversation
46d8a97
to
4620202
Compare
@@ -417,6 +417,9 @@ def run(self, exposure, sourceCat, expId=0): | |||
flux0 = 10**(0.4*r.zp) # Flux of mag=0 star | |||
flux0err = 0.4*math.log(10)*flux0*r.sigma # Error in flux0 | |||
photoCalib = makePhotoCalibFromCalibZeroPoint(flux0, flux0err) | |||
self.log.info("Photometric calibration factor (nJy/ADU): %f +/- %f", | |||
photoCalib.getCalibrationMean(), | |||
photoCalib.getCalibrationErr()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you roll this into the previous log message or set it to DEBUG? I don't think we usually want this task to be very verbose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or use self.log.verbose
if this is not really a debug message but also not something you want people to see all the time. Pipeline batch execution runs at VERBOSE level rather than INFO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PhotoCalTask is very much not chatty:
- Are color terms being applied (and if so, from where)?
- Report result as a magnitude zero point.
- Report result as calibration factor (this addition).
We still don't have clear guidance on what to output at what level, but two lines summarizing what the result is seems appropriate here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a very strong opinion. I just think people tend to look at task-level log messages in order to:
- help diagnose a problem
- get a quick sense of status or timing (there are better ways for both, but probably not easier ways)
and I don't see this one as helping with either.
|
||
astrometry_ref_cat = connectionTypes.PrerequisiteInput( | ||
doc="Reference catalog to use for astrometric calibration.", | ||
name="gaia_dr2_20200414", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep an eye on Clare's Gaia DR3 ticket in case it lands before this does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we'll have dr3 distributed across our various test datasets nor fully vetted soon enough. It's an easy change here once that happens.
storageClass="ExposureF", | ||
dimensions=("instrument", "visit", "detector"), | ||
) | ||
# TODO: persist a parquet version of this! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this ticket, I assume?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure: we'll need the SourceCatalog versions to store the footprints, but Eli's your vision is to only store the footprints, none of the other results in that SourceCatalog. I'm not sure that's actually practical here.
In order to get this merged, I guess I'll put this on the "do it later" list, but it should probably be near the top of that list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed as DM-40061.
stars = connectionTypes.Output( | ||
doc="Catalog of unresolved sources detected on the calibrated exposure; " | ||
"includes source footprints.", | ||
name="initial_stars", # TODO: what to name this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO should definitely be done on this ticket, and that includes thinking about what to name it now if we want the long-term name to be the parquet variant, because adding new dataset types is a gazillion times easier than changing the meaning of old ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initial_stars_footprints_detector
, in keeping with the "this is where you get the source footprints" thing, and also the "append _detector
so full-visit things don't need a suffix"? And then the parquet version would be initial_stars_detector
?
If instead we called this initial_stars_detector
, what would we call the parquet version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty happy with initial_stars_footprints_detector
for FITS and initial_stars_detector
for parquet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the _detector
on the AFW tables? They're never aggregated, as far as I'm aware. If we want _detector
on all detector-level catalogs, no matter their format, that's fine.
tests/test_calibrateImage.py
Outdated
# We don't have many sources, so have to fit simpler models. | ||
self.config.psf_detection.background.approxOrderX = 1 | ||
self.config.star_detection.background.approxOrderX = 1 | ||
# TODO: While debugging DM-32701, we're using PCA instead of psfex. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this TODO a relic now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, no. DM-32701 fixed the "junk PSF produced" problem into a "raise error", but we still have the same problem of it thinking we don't have enough PSF sources (even when we should for the given fitting order). We don't have a followup ticket to explore that further, so I'm not sure what to say here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is just about the test, not any real data, let's just remove the TODO.
PSFEx seeming inconsistent about how many stars it needs is something I'd consider a low-priority problem (to the point where I'm not sure a ticket or TODO is merited), because it's easy to imagine that happening because our thinking of its algorithm is an oversimplification and it genuinely has more degrees of freedom.
If the problem is that PSFEx just isn't robust enough on some important class of data, that's different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the problem is that we don't know which of those cases it is. This is basically the problem described on DM-40001. In this case, we can just use PCA since the tests don't really care, but I think we probably should investigate psfex's failure modes more fully here.
tests/test_calibrateImage.py
Outdated
idx, _, _ = fitted.match_to_catalog_sky(truth) | ||
# TODO: because the input variance image does not include contributions | ||
# from the sources, the measured instFluxes do not reflect the greater | ||
# uncertainty for faint sources; we can't use fluxErr as a bound on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean "greater uncertainty for bright sources"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reworded it.
4620202
to
43c1c4e
Compare
It's worth logging this in addition to the zero point logged above, as this is the value that is saved in the PhotoCalib object.
gen3 pipetasks call setRefObjLoader; users in notebooks and tests may still want to use the __init__ interface though.
This is necessary for writing the denormalized match catalog.
2fd8931
to
da196b1
Compare
I've also added in an "install simple PSF after initial PSF determination", per conversation on slack last week. |
@@ -417,6 +417,9 @@ def run(self, exposure, sourceCat, expId=0): | |||
flux0 = 10**(0.4*r.zp) # Flux of mag=0 star | |||
flux0err = 0.4*math.log(10)*flux0*r.sigma # Error in flux0 | |||
photoCalib = makePhotoCalibFromCalibZeroPoint(flux0, flux0err) | |||
self.log.info("Photometric calibration factor (nJy/ADU): %f +/- %f", | |||
photoCalib.getCalibrationMean(), | |||
photoCalib.getCalibrationErr()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a very strong opinion. I just think people tend to look at task-level log messages in order to:
- help diagnose a problem
- get a quick sense of status or timing (there are better ways for both, but probably not easier ways)
and I don't see this one as helping with either.
tests/test_calibrateImage.py
Outdated
# We don't have many sources, so have to fit simpler models. | ||
self.config.psf_detection.background.approxOrderX = 1 | ||
self.config.star_detection.background.approxOrderX = 1 | ||
# TODO: While debugging DM-32701, we're using PCA instead of psfex. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is just about the test, not any real data, let's just remove the TODO.
PSFEx seeming inconsistent about how many stars it needs is something I'd consider a low-priority problem (to the point where I'm not sure a ticket or TODO is merited), because it's easy to imagine that happening because our thinking of its algorithm is an oversimplification and it genuinely has more degrees of freedom.
If the problem is that PSFEx just isn't robust enough on some important class of data, that's different.
stars = connectionTypes.Output( | ||
doc="Catalog of unresolved sources detected on the calibrated exposure; " | ||
"includes source footprints.", | ||
name="initial_stars", # TODO: what to name this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty happy with initial_stars_footprints_detector
for FITS and initial_stars_detector
for parquet.
This task will replace CharacterizeImageTask and CalibrateTask.
ff3c5f9
to
db7a6e0
Compare
No description provided.