
Rework LevenbergMarquardt to use the vector function functionality #432

Merged
merged 37 commits into master, Jan 4, 2025

Conversation

kellertuer
Member

@kellertuer kellertuer commented Dec 27, 2024

This makes the interface for Levenberg-Marquardt a bit more flexible.

🛣️🗺️

  • rework vector functions to also provide a proper jacobian interface
  • store the cost in the nonlinear least squares objective as a VectorGradientFunction
  • rework get_jacobian (formerly “get-gradient-from-jacobian”) to pass through to the new inner vgf without smoothing
  • check that all new functions are properly documented
  • work on test coverage 📈

The main thing to figure out is the new Jacobian to return in case of smoothing, and how to adapt the regularisation parameter $\lambda_k$ for this case. Especially this last point I have not yet understood, neither from the Ceres documentation linked above nor from the nearly-identical phrasing at the end of Section 4.3 in Triggs et al.
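For context, a minimal sketch of where $\lambda_k$ enters in the classical (Euclidean) setting; on a manifold the step is computed in a basis of the tangent space and applied with a retraction instead of the addition (notation here is the standard one, not specific to this PR):

```math
\bigl(J_k^{\mathrm{T}} J_k + \lambda_k I\bigr)\, d_k = -J_k^{\mathrm{T}} F(p_k),
\qquad p_{k+1} = p_k + d_k,
```

where $F$ is the vector of residuals and $J_k$ its Jacobian at the current iterate $p_k$. The open question above is how this regularisation should change once a smoothing function is composed with the residuals.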

@kellertuer
Member Author

This was faster than I thought. Tests should be up and running again for the rework.

For LM this is even non-breaking, since what mainly changed is the internal representation of the vector function and its Jacobian as a VectorGradientFunction.
This even allows one to now provide, instead of a Jacobian, a function returning all gradients, or a vector of gradient functions, to LM.

Smoothing should be covered as well, though the $\lambda_k$ is probably still a “poor man's version” that should work in most cases but is not well covered by theory.

For general vector functions there is one small (breaking) change, though that was more of a bug, I think:
previously, specifying a vector function as InplaceEvaluation only made the Jacobian (and Hessian) mutating. The actual vectorial (cost) function was always assumed to be allocating. That was more by accident and also quite inconsistent.
Now the actual vector function has to be mutating as well. I might have to check that get_value! is consistent with this, too.
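To make the two call styles concrete, here is a minimal sketch; the residuals, the positional num_components argument, and the exact keyword names are my assumptions rather than verbatim from this PR, so check the docs for the precise signature:

```julia
using Manopt, Manifolds

M = Euclidean(2)

# residuals F: M → ℝ³ and their Jacobian as a 3×2 matrix
f(M, p) = [p[1] - 1.0, p[2] - 2.0, p[1] * p[2]]
jacobian_f(M, p) = [1.0 0.0; 0.0 1.0; p[2] p[1]]

p0 = [0.0, 0.0]
q1 = LevenbergMarquardt(M, f, jacobian_f, p0, 3)

# In-place variant: with InplaceEvaluation the cost itself now has to mutate too
function f!(M, v, p)
    v[1] = p[1] - 1.0
    v[2] = p[2] - 2.0
    v[3] = p[1] * p[2]
    return v
end
function jacobian_f!(M, J, p)
    J .= [1.0 0.0; 0.0 1.0; p[2] p[1]]
    return J
end
q2 = LevenbergMarquardt(M, f!, jacobian_f!, p0, 3; evaluation=InplaceEvaluation())
```

The same entry point should also accept a vector of per-component gradient functions instead of jacobian_f, which is the flexibility mentioned above.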

@kellertuer kellertuer added the enhancement and Ready-for-Review labels Dec 29, 2024

codecov bot commented Dec 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.88%. Comparing base (41d852d) to head (d3c6cdf).
Report is 1 commit behind head on master.

Additional details and impacted files
@@           Coverage Diff            @@
##           master     #432    +/-   ##
========================================
  Coverage   99.87%   99.88%            
========================================
  Files          78       78            
  Lines        8278     8384   +106     
========================================
+ Hits         8268     8374   +106     
  Misses         10       10            


@kellertuer kellertuer changed the title from “[WIP] Rework LevenbergMarquardt to use the vector function functionality and to be robust” to “Rework LevenbergMarquardt to use the vector function functionality and to be robust” Dec 29, 2024
@kellertuer
Member Author

There we are, everything has been tested. LM can now be used with much more diverse inputs for the Jacobian, for example as gradients of the components, and smoothing is available, though I am not yet 100% sure it converges. Do you, @Affie, have a concrete example you wanted to test this on?
The one thing that might not yet fully work (and actually needs further research) is the $\lambda_k$ regularisation parameter in the subproblem. But that is something for a later PR.

@mateuszbaran
Member

I've taken a look and I think we need to generalize robustification a bit. Both Ceres and Triggs consider a more general objective, where the smoothing function $\rho_i$ is applied to a group of observations instead of a single observation like in your PR. See for example Eq. (3) in Triggs. I think restricting each group to 1 element makes this extension much less useful because the least squares Jacobians are computed on a per-smoothing block basis. Moreover, it's even a performance regression for the non-smoothing case because cost and Jacobian computations are much harder to vectorize.

I think for simplicity we can assume that each group has the same size but we need to let it be arbitrarily large.
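Concretely, the generalized objective would look roughly like this (notation mine, following the spirit of Eq. (3) in Triggs et al.):

```math
f(p) = \sum_{i=1}^{m} \rho_i\bigl(\lVert F_i(p) \rVert_2^2\bigr),
\qquad F_i \colon \mathcal{M} \to \mathbb{R}^{b},
```

where each $F_i$ stacks a block of $b$ residuals and $\rho_i$ is the per-block robustifier; the version in this PR corresponds to $b = 1$.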

@kellertuer
Member Author

I thought about that a bit as well. My main problem is that I could do the current rework in about a week. It still got quite technical for the vectorial function, but it is nice to have Jacobians covered properly therein now.

But for arbitrary blocks the current structure of the smoothing machinery is useless; that would need a complete rewrite of this current work. I am not sure how that could be done or how much time it would need.

So if this is useless without blocks, we should just drop smoothing from this PR and only keep the vectorial rework.

@mateuszbaran
Member

OK, then I think it's better to do the vectorial rework here and add smoothing when we have time to work on blocks.

@kellertuer
Member Author

Sounds fair. Sad for the whole factory I built, but it is what it is.

We are also missing the proper $\lambda_k$ for smoothing anyway, so let's remove smoothing again and do that sometime later. I will add a comment about this on the original issue as well.

@mateuszbaran
Member

We can finish smoothing later; it's certainly not a wasted effort. Maybe the integration was a little premature, but finishing this will be a good next project IMO (after LieGroups.jl and GeometricKalman.jl).

@kellertuer
Member Author

Yeah, I will try to put it aside on another branch then.

@kellertuer kellertuer changed the title from “Rework LevenbergMarquardt to use the vector function functionality and to be robust” to “Rework LevenbergMarquardt to use the vector function functionality” Jan 1, 2025
@kellertuer
Member Author

I reworked this PR again. There is now a “start-smoothing” branch to keep the current approach.

There was something I did not like anyway: Ceres applies smoothing only after taking the absolute value and squaring the residuals. That way, “Huber” is more like “Huber on the sqrt”, to cancel the squaring.
When we redo this, we should be more flexible here and have $s_i(x) = \lVert x \rVert_2^2$ as the default modifier/outer function (on the blocks), and then do a proper Huber (and not one on the sqrt) to replace $s_i$; see the sketch below.

Currently these modifiers (smoothing) cannot have parameters, which is also something one could improve in a new PR.
I would then also prefer the term modifier, since smoothing is not so well-phrased.
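To illustrate the difference (notation mine; the first form is the Huber loss from the Ceres documentation with scale fixed to 1, applied to the squared residual $s = r^2$):

```math
\rho^{\text{Ceres}}(s) = \begin{cases} s, & s \le 1,\\ 2\sqrt{s} - 1, & s > 1, \end{cases}
\qquad
\rho^{\text{Huber}}(r) = \begin{cases} \tfrac{1}{2} r^2, & \lvert r \rvert \le \delta,\\ \delta \bigl(\lvert r \rvert - \tfrac{\delta}{2}\bigr), & \lvert r \rvert > \delta. \end{cases}
```

Since the Ceres form already receives the squared residual, composing it with the squaring is exactly what makes it a “Huber on the sqrt” rather than a Huber on the residual itself.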

@Affie
Contributor

Affie commented Jan 2, 2025

> There we are, everything has been tested. LM can now be used with much more diverse inputs for the Jacobian, for example as gradients of the components, and smoothing is available, though I am not yet 100% sure it converges. Do you, @Affie, have a concrete example you wanted to test this on?

Hi, I can create a toy example in RoME, but I'm not following all the technical discussion here. It looks like smoothing was removed from this PR; is it still needed to test anything?

@mateuszbaran
Member

Hi! We need to derive efficient update rules to add smoothing to Manopt and it appears to be more work than initially anticipated. We already have two big ongoing projects so this will take some time. The rework in the PR makes nonlinear least squares in Manopt a bit more flexible but it probably won't be useful to you until smoothing is added.

@kellertuer
Member Author

I would still be interested in the example, but the main problem for now is that the Ceres link in the issue, and the literature referenced there, do the rescaling/updates for non-smooth objectives only in the Euclidean case; no one has ever done that on manifolds.

As Mateusz said, this is something we both find interesting and want to derive, but we have no time estimate for when that will happen, so we omit it from this PR for now. Yes, then this PR “only” reworks the internals of LM, but that is still nice to have; just from a feature perspective, it is super boring.

@mateuszbaran
Member

LGTM up to the n/m inconsistency 👍

@Affie
Contributor

Affie commented Jan 3, 2025

Hi, here is a draft PR with a little toy example: triangulating a point on SE2 with bearing measurements to 4 points on TranslationGroup(2), with one outlier: JuliaRobotics/RoME.jl#767

@mateuszbaran
Member

Thanks! We will try that when working on robustness.

@kellertuer kellertuer merged commit 67b2d10 into master Jan 4, 2025
15 checks passed