pc_align: improved sampling strategy, before/after stats reporting, and doc guidance on evaluating improvement #423
Comments
Tagging @ShashankBice, @rhugonnet and @adehecq for review, input and other ideas on potential improvements for pc_align.
Hi @dshean, @oleg-alexandrov, Hmmm, it's hard to generalize best practices for any case (hence the research on DEM coregistration I am doing right now). But I have thought a lot about this the past few years, and we've been in the process of adding similar things to xDEM. Here are the 3 main "generic" ideas that would work for any scenario:
A practical example: I recently reviewed a paper that tried to correct biases due to jitter or other spatial noise using terrain segmentation plus statistical fits on the residuals. The method worked quite well where there were a lot of static surfaces (judging visually from the maps of residuals) but, when it came to quantifying the improvement, the STD/NMAD decrease was barely visible at ~10%. However, when the authors used the mid- and long-range correlation in the residuals (which I proposed to them, based on discussions in my 2022 paper on how to use correlation metrics to evaluate corrections), the improvement was more than 60%, sometimes close to 100%! 🥳 It basically just compares the sill (correlated variance, Y axis) of the mid and long ranges found for the variogram model of the residuals, which can be derived like this: https://xdem.readthedocs.io/en/stable/basic_examples/plot_infer_spatial_correlation.html#sphx-glr-basic-examples-plot-infer-spatial-correlation-py.
Hope this helps!
Hi @dshean,
Thanks for all of your thoughts. Will follow up in more detail later and discuss options with Oleg. In the meantime, I created a little notebook to ingest and visualize the current pc_align output: https://github.com/dshean/demcoreg/blob/master/demcoreg/pc_align_output.ipynb It's not the best example for a number of reasons, but it's a start.
Is your feature request related to a problem? Please describe.
pc_align returns a lot of output to stdout, including some key metrics for evaluating the quality of the transformation. New users don't know how to interpret all of this, or how to evaluate whether the final transform actually improved the alignment between their input datasets. Many just run the tool and proceed with analysis, even though the transformation sometimes made the alignment between their datasets worse. There is some limited information on evaluation in the current doc, but we should offer improved guidelines or recommendations for evaluating the results.
https://stereopipeline.readthedocs.io/en/latest/tools/pc_align.html#interpreting-the-transform
https://stereopipeline.readthedocs.io/en/latest/tools/pc_align.html#error-metrics-and-outliers
We can also use more sophisticated sampling approaches to validate the improvement of the transformation.
Describe the solution you'd like
pc_align should report statistics for the "improvement" beyond just reporting the initial and final residuals.
There should be final lines of output summarizing stats on the difference between input and output residuals, computed on a point-by-point basis, and perhaps differences between the summary statistics...
We typically look at the difference in the median (50%) "before" and "after" numbers, plus the difference in the spread (so "84% minus 16% before" and "84% minus 16% after") of the distributions to evaluate improvement. These two numbers could be used as primary stats for success/improvement. pc_align should compute and display the spread before and after.
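As a rough illustration of the median and spread comparison described above, here is a minimal Python sketch. It is not pc_align code: the function names `summarize` and `improvement` are hypothetical, and the "before" and "after" arrays are synthetic stand-ins for the per-point residuals that would normally come from the tool's output.

```python
import numpy as np

def summarize(residuals):
    """Median and 84th-minus-16th percentile spread of a 1-D residual array."""
    p16, p50, p84 = np.percentile(residuals, [16, 50, 84])
    return {"median": float(p50), "spread": float(p84 - p16)}

def improvement(before, after):
    """Compare summary stats before and after alignment (negative change = better)."""
    b, a = summarize(before), summarize(after)
    return {
        "median_before": b["median"],
        "median_after": a["median"],
        "median_change": a["median"] - b["median"],
        "spread_before": b["spread"],
        "spread_after": a["spread"],
        "spread_change": a["spread"] - b["spread"],
    }

# Synthetic residuals (assumption): a ~2 m offset before, tighter and centered after.
rng = np.random.default_rng(0)
before = np.abs(rng.normal(2.0, 1.0, 10_000))
after = np.abs(rng.normal(0.0, 0.5, 10_000))

stats = improvement(before, after)
# Point-by-point view: fraction of points whose residual decreased.
fraction_improved = float(np.mean(after < before))

for name, value in stats.items():
    print(f"{name:>14s}: {value:8.3f}")
print(f"{'improved':>14s}: {fraction_improved:8.1%}")
```

A negative `median_change` together with a negative `spread_change` would suggest the transform helped. If either is positive, the result deserves a closer look before it is used.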
I recommend that we change the terms "error percentile of smallest errors (meters)" and "mean of smallest errors (meters)". I realize pc_align throws out 25%, which is why "smallest errors" is included in these terms, but I think we can be more descriptive. Really, we're talking about "point distance residuals", not necessarily "errors", as some of the residuals could be due to real changes in some parts of the surface (e.g., glacier melt, vegetation change).
I think we should report stats for the "inliers" used during the "calibration" as well as the full sample of difference values. I realize this is why two lines of output are provided, but I think we can improve how this is reported so it is easier for users to understand.
Personally, I would like to see a more sophisticated sampling approach that isolates random samples for calibration and validation. One way to do this would be to remove the initial 25% outliers, and then from the inliers, use a random subset (say, 80%) for the calibration and a random subset (say, 20%) for validation to independently check the result. Right now by default, we are using the same set of points for both calibration and validation, unless the user withholds samples before calling pc_align and then does their own validation independently of the tool.
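The sampling idea above could look something like the following sketch. This is only an illustration under stated assumptions: the helper `split_calibration_validation` is hypothetical, outliers are taken to be the points with the largest initial residuals, and the 25% / 80% / 20% fractions are the example values from the text, not pc_align defaults for this workflow.

```python
import numpy as np

def split_calibration_validation(residuals, outlier_fraction=0.25,
                                 validation_fraction=0.2, seed=0):
    """Return (calibration_idx, validation_idx) as disjoint index arrays.

    Points with the largest residuals are dropped as outliers first, then
    the remaining inliers are randomly split into the two subsets.
    """
    residuals = np.asarray(residuals)
    order = np.argsort(residuals)                      # ascending residuals
    n_inliers = int(round(len(residuals) * (1.0 - outlier_fraction)))
    inlier_idx = order[:n_inliers]                     # keep the smallest residuals

    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(inlier_idx)             # random order of inliers
    n_val = int(round(n_inliers * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]

# Synthetic initial residuals (assumption), 1000 points.
rng = np.random.default_rng(1)
initial = np.abs(rng.normal(0.0, 1.0, 1000))
cal_idx, val_idx = split_calibration_validation(initial)
print(len(cal_idx), len(val_idx))
```

The calibration indices would be used to solve for the transform, and the validation indices would only be used afterwards to check the residuals independently.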
We should at least include some newlines in pc_align output for improved readability, but I think it would be best to report these "improvement metrics" separately from (after) the main stdout stream (which includes runtimes, the transformation and other stuff that can be overwhelming for new users). Right now the numbers that matter are buried. Basically, make it easier for people to see the relevant information and determine whether things worked.
The documentation for pc_align should have a section dedicated to interpreting the stdout (beyond just describing the metrics and recommending people visualize the output). Right now there is only this...
As mentioned above, this is not what we typically use. I am open to other suggestions here on the best stats to use. Really, it might be best to compute signed (rather than absolute) residuals along local "down" direction (or normal to ellipsoid), as the absolute errors will potentially miss skewed distributions. This could be done as a final step for reporting, after minimizing absolute distances.
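To illustrate why signed residuals can carry information that absolute residuals miss, here is a small sketch using synthetic vertical differences (an assumption, not real pc_align output). Two datasets with opposite vertical biases produce identical absolute statistics, while the signed medians reveal the direction of the offset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic signed vertical residuals in meters (assumption).
signed_low = rng.normal(-0.5, 0.2, 5000)   # source sits below the reference
signed_high = -signed_low                  # mirror case: source sits above

abs_low = np.abs(signed_low)
abs_high = np.abs(signed_high)

median_signed_low = float(np.median(signed_low))
median_signed_high = float(np.median(signed_high))
median_abs_low = float(np.median(abs_low))
median_abs_high = float(np.median(abs_high))

print(f"signed medians:   {median_signed_low:+.3f} / {median_signed_high:+.3f}")
print(f"absolute medians: {median_abs_low:.3f} / {median_abs_high:.3f}")
```

The absolute medians are the same in both cases, so only the signed values show whether a remaining bias is positive or negative.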
I think the doc should also mention how to review the observed translation magnitude and evaluate whether it is appropriate given the expected geolocation accuracy of the two inputs. For example, if aligning a WV DEM with expected horizontal/vertical geolocation accuracy of ~3-5 m CE90/LE90 and ICESat-2 points with expected horizontal/vertical geolocation accuracy of ~3/0.1 m, the combined translation magnitude should be <10 m. If the resulting magnitude is 200 m, then something went wrong, and the output should not be used for analysis.
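The sanity check described above can be sketched as a simple comparison of the translation magnitude against a user-chosen limit. The function names here are hypothetical, the translation components are made-up example values, and the 10 m limit is just the figure from the WV DEM / ICESat-2 example; the right limit depends on the expected geolocation accuracy of the actual inputs.

```python
import math

def translation_magnitude(dx, dy, dz):
    """Euclidean length of a translation vector, in the same units as its inputs."""
    return math.sqrt(dx * dx + dy * dy + dz * dz)

def translation_is_plausible(translation, max_expected_m):
    """True if the translation magnitude does not exceed the expected limit."""
    return translation_magnitude(*translation) <= max_expected_m

# Example translations in meters (made-up values for illustration).
small_shift = (3.0, -2.0, 1.5)
large_shift = (150.0, 120.0, 40.0)

limit_m = 10.0  # assumption taken from the example in the text
small_ok = translation_is_plausible(small_shift, limit_m)
large_ok = translation_is_plausible(large_shift, limit_m)

print(f"{translation_magnitude(*small_shift):.1f} m plausible: {small_ok}")
print(f"{translation_magnitude(*large_shift):.1f} m plausible: {large_ok}")
```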
Describe alternatives you've considered
We currently do this type of evaluation with custom scripts that ingest the csv files and/or pc_align output log to compute/extract relevant numbers and plot with Python scripts. Seems much better to have pc_align report this directly.
Additional context