Computation of covariance matrix in least_squares #784
Replies: 12 comments 2 replies
-
@nkanazawa1989 I think it's like that in the sense of "what is the simplest possible thing that could work" (and can be supported and explained easily, including to our future selves). If you are interested in exploring other inversion methods and have an approach that is more stable and robust, we would definitely be open to improving the current, simple method.
-
Fair enough. Here is a minimal example I found after a bit more investigation.

```python
from lmfit.models import ExpressionModel
import numpy as np
model = ExpressionModel(expr="amp / 2 * cos((d_theta + pi / 2) * x) + base", independent_vars=["x"])
x = np.array(
[
0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.,
13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25.,
26., 27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38.,
39.
]
)
y = np.array(
[
0.9950495 , 0.44059406, 0.04455446, 0.65841584, 0.93564356,
0.2029703 , 0.15346535, 0.87623762, 0.71782178, 0.0049505 ,
0.42079208, 0.98514851, 0.56930693, 0.0049505 , 0.65841584,
0.98514851, 0.23267327, 0.12376238, 0.90594059, 0.80693069,
0.09405941, 0.44059406, 0.9950495 , 0.55940594, 0.0049505 ,
0.63861386, 0.98514851, 0.26237624, 0.05445545, 0.81683168,
0.81683168, 0.06435644, 0.33168317, 0.9950495 , 0.56930693,
0.0049505 , 0.53960396, 0.9950495 , 0.25247525, 0.04455446
]
)
sigma = np.array(
[
0.00694939, 0.04915671, 0.02042906, 0.04695685, 0.02429691,
0.03982478, 0.03568843, 0.03260657, 0.04456254, 0.00694939,
0.04888222, 0.01197666, 0.04902946, 0.00694939, 0.04695685,
0.01197666, 0.04183727, 0.03260657, 0.02890353, 0.03908181,
0.02890353, 0.04915671, 0.00694939, 0.04915671, 0.00694939,
0.0475669 , 0.01197666, 0.04355917, 0.02246785, 0.03829935,
0.03829935, 0.02429691, 0.04661791, 0.00694939, 0.04902946,
0.00694939, 0.04935183, 0.00694939, 0.04301522, 0.02042906
]
)
model.set_param_hint("amp", value=1.0, min=0.0, max=2.0)
model.set_param_hint("d_theta", value=-1.5707963267948966, min=-2.5, max=2.5)
model.set_param_hint("base", value=0.5, min=-1.0, max=1.0)
params = model.make_params()
result = model.fit(data=y, x=x, params=params, weights=1/sigma, scale_covar=False, method="least_squares")
```

After it returned the output, I found the covariance matrix it reported unrealistic,
though I obtained a reasonable one with:

```python
from scipy import linalg as la
def cov(jac):
_, s, VT = la.svd(jac, full_matrices=False)
threshold = np.finfo(float).eps * max(jac.shape) * s[0]
s = s[s > threshold]
VT = VT[:s.size]
return np.dot(VT.T / s**2, VT)
cov(result.jac)
```

which returns a reasonable covariance matrix.
Interestingly, `least_squares` itself returns a reasonable covariance matrix with a slightly different initial value:

```python
model.set_param_hint("d_theta", value=-1.57, min=-2.5, max=2.5)
```

and the computed covariance matrix then looks sensible. This makes me think the problem is caused by the ignored higher-order term in the Hessian. Perhaps it is difficult to reproduce this in your environment; I can share my environment details if that helps.
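(As an aside, and as far as I understand the math: the covariance reported by the fit comes from the Gauss-Newton approximation of the Hessian, H ≈ JᵀJ, so cov ≈ (JᵀJ)⁻¹. The term dropped from the true Hessian is weighted by the residuals, which is why the approximation is most suspect when the fit is poor. The `cov()` helper above is just the Moore-Penrose pseudo-inverse of JᵀJ computed from the SVD of J: with J = U S Vᵀ, (JᵀJ)⁺ = V S⁻² Vᵀ, with near-zero singular values discarded.)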
Currently I'm testing the capability of this package as part of another open-source project.
-
@nkanazawa1989 Thanks. First, using SVD or some other method for inverting the curvature matrix is worth pursuing - I would expect that it would be especially helpful when one of the variables hits a boundary. But I don't think the example result is that meaningful.

**on the example**

For sure, there are several warning signs of not using lmfit well, and it turns out that your example sort of does all of those. We hope you'll avoid doing these. You also mention that giving a slightly different initial value produces a reasonable covariance matrix.

It also turns out that fitting sinusoidal functions can be challenging when the initial guess for the frequency is far off: the fit just won't converge to find the frequency. For reference, here is a fit with a better starting guess:

```python
f_guess = 2*np.pi*(10/37.)  # I count 10 full periods to x = 37 when I plotted the spectra.
d_theta_guess = f_guess - np.pi/2 # match what the model wants for frequency
params = model.make_params(amp=1, base=0.5, d_theta=d_theta_guess) # don't bother with bounds
result = model.fit(data=y, x=x, params=params, weights=1/sigma, method='least_squares')
print(result.fit_report())
```

This will print out a sensible result.
You can try other methods (leastsq, nelder-mead) -- they'll give the same results to at least 4 digits; the covariance is not unstable near the solution. Plots will show that the fits are good too.

**on the concept**

And, again, I think the place where SVD is more likely to be helpful is when one of the variables hits a boundary.
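(To spell out why the starting value matters here: the model's angular frequency is `d_theta + pi/2`, so the original start value `d_theta = -pi/2` corresponds to zero frequency -- a constant model -- while the guess above gives `d_theta ≈ 2π·10/37 − π/2 ≈ 0.13`.)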
-
@nkanazawa1989 wait, what?
Is that relevant to this discussion here? You did not mention this earlier.
Yes they do:
The uncertainties package is used by lmfit to calculate uncertainties in "constraint expressions" after the fit is done and correlations computed, and those calculations use both the uncertainties and the correlations.
I don't understand this. What is the "problem" in "this problem should be solved by avoiding computation of AffineScalarFunc values, rather than computing covariance"? I may not be parsing this correctly. Lmfit is not going to delete the covariance matrix; it calculates uncertainties and correlations from it. You need to describe the actual problem you are actually having. Again, other approaches to inverting the curvature matrix are worth exploring, but I kind of think maybe you are making a few assumptions about how lmfit works that just aren't so.
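For readers unfamiliar with constraint expressions, here is a minimal sketch (the model and the derived parameter `total` are made up for illustration, not taken from this thread) of how lmfit propagates uncertainties, including correlations, into a derived quantity after a fit:

```python
import numpy as np
from lmfit import Model

def decay(x, amp, tau, base):
    return amp * np.exp(-x / tau) + base

np.random.seed(0)
x = np.linspace(0, 1, 101)
y = 0.5 * np.exp(-x / 0.6) + 0.3 + 0.01 * np.random.randn(x.size)

model = Model(decay)
params = model.make_params(amp=0.5, tau=0.6, base=0.3)
# 'total' is a constraint expression: its value, stderr, and correlations
# are computed from the fitted parameters after the fit finishes.
params.add("total", expr="amp + base")

result = model.fit(y, params, x=x)
# The propagated stderr of 'total' uses both the standard errors and the
# correlation between 'amp' and 'base'.
print(result.params["total"].value, result.params["total"].stderr)
```

After a successful fit, `result.params["total"].stderr` is filled in automatically, so no manual use of the uncertainties package is needed for simple derived quantities.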
-
Sorry, it seems like I completely misunderstood how the LMFIT Parameter works. Then I can probably remove the usage of uncertainties from my code and delegate post-processing of fit parameters to LMFIT Parameters rather than converting them into AffineScalarFunc.
I had code something like:

```python
import numpy as np
import uncertainties
from scipy import linalg as la

try:
    params = uncertainties.correlated_values(list(result.params.values()), result.covar)
except Exception:
    jac = result.jac
    # fall back to computing the covariance with the SVD approach
    _, s, VT = la.svd(jac, full_matrices=False)
    threshold = np.finfo(float).eps * max(jac.shape) * s[0]
    s = s[s > threshold]
    VT = VT[:s.size]
    new_covar = np.dot(VT.T / s**2, VT)
    params = uncertainties.correlated_values(list(result.params.values()), new_covar)
```

but it doesn't make sense, because usually the fit is off when the covariance matrix involves negative diagonal elements. So instead of handling the exception, it would make more sense to just not compute the standard errors.
If I understand the code correctly, it doesn't compute uncertainties and correlations when the covariance matrix is not available, so it could do something like:

```python
if hasattr(self, "covar") and np.any(np.diag(self.covar) < 0):
    delattr(self, "covar")
```

Then it clearly indicates that the standard errors could not be estimated.
-
@nkanazawa1989 Neither uncertainties nor correlations are useful or really mathematically valid unless you are at (or at least "near") the solution. Covariances for obviously bad fits are just not interesting. First, you have to determine if the fit is good. Then (and only then) are the uncertainties or correlations meaningful.

It's fine to propose changing the way the covariance matrix inversion is done. But you kind of need to show a case where a) the current method fails, b) the proposed method succeeds, and c) make a good case that the successful calculation is close to right. Here, that means that you need an example of a fit that succeeds (at least for some variables) and gives a reasonably good fit, but for which error bars are not calculated. Do you have such a case? Without identifying the problem to be solved, it's hard to distinguish between "solution" and "change".

Yep, we set `errorbars` to reflect whether the uncertainties could be calculated.

Again, we do use the uncertainties package after the fit, with something like:

```python
for par in result.params.values():
    par.uval = uncertainties.ufloat(par.value, par.stderr)
```

Those do include the correlations and can be calculated for "constraint expressions" or derived quantities such as the "height" of a Gaussian. Maybe that would be sufficient for what you need? If not, please explain what you are trying to do. Proposing solutions comes after identifying problems. So far, you have proposed a solution but have not identified a problem.
-
Yes, I agree. But it is a bit confusing that an LMFIT fit outcome with `success=True` can still come back with `errorbars=False`.
No. I guess all the cases where I received an ill-conditioned covariance matrix so far were attributable to significant residuals, indicating a bad fit. Probably the current implementation is fine.
I cannot find the attribute `uval` on the fitted parameters.
The reason I need the covariance matrix is to recreate correlated ufloats from loaded fit data.
-
"success" means the fit thinks it succeeded. "errorbars" means that uncertainties could be calculated.
Hm. I highly recommend always reading the fit report and using data visualization tools.
You can just run that code yourself. I suppose we could add it, but most people will not use it and it is easy enough to do after the fact.
Not sure what you mean... I suggest reading the code.
Lmfit Parameters do have uncertainties that include correlations, and the constrained parameters are evaluated with these correlations. Parameters can be serialized.
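On serialization, a small sketch (assuming a recent lmfit; the parameter name `amp` is just illustrative) of round-tripping Parameters through JSON:

```python
# Hedged sketch: lmfit Parameters can be round-tripped through JSON.
# (Assumption: stderr/correl survive the round trip in recent lmfit versions.)
from lmfit import Parameters

saved = result.params.dumps()   # JSON string, e.g. written to a file

restored = Parameters()
restored.loads(saved)
print(restored["amp"].value, restored["amp"].stderr)
```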
-
I read the code, but I don't think the LMFIT Parameter supports proper error propagation. Indeed it overrides several dunder methods for math, but they all operate on the bare float value, so the standard error is dropped. For example:

(1) LMFIT parameter

```python
from lmfit import Parameter
p0 = Parameter(name="p0", value=1.0)
p0.stderr = 0.1
p0.correl = {"p1": 0.4}
p1 = Parameter(name="p1", value=2.0)
p1.stderr = 0.1
p1.correl = {"p0": 0.4}
p0 + p1  # this returns just 3.0
```

(2) Just taking value and stderr with ufloats

```python
import uncertainties
p0 = uncertainties.ufloat(1.0, 0.1)
p1 = uncertainties.ufloat(2.0, 0.1)
p0 + p1  # this returns 3.0+/-0.14142135623730953
```

Here the error is propagated, but no correlation is considered.

(3) Consider the covariance matrix with ufloats

```python
import numpy as np
import uncertainties
p0, p1 = uncertainties.correlated_values([1.0, 2.0], covariance_mat=np.array([[0.01, 0.004], [0.004, 0.01]]))
p0 + p1 # this returns 3.0+/-0.1673320053068151
p0, p1 = uncertainties.correlated_values([1.0, 2.0], covariance_mat=np.array([[0.01, -0.004], [-0.004, 0.01]]))
p0 + p1  # this returns 3.0+/-0.10954451150103321
```

Note that positive correlation enhances the standard deviation while negative correlation reduces it. This is important because when we fit data from real experiments we cannot always scan the experiment parameter over a sufficient range (due to limits on experiment time, hardware capability, etc.), and sometimes that yields unsatisfactory separation of parameters in the fit model.
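For the record, these numbers follow directly from Var(p0 + p1) = Var(p0) + Var(p1) + 2·Cov(p0, p1): with the covariance matrices above, sqrt(0.01 + 0.01 + 2·0.004) ≈ 0.167 and sqrt(0.01 + 0.01 − 2·0.004) ≈ 0.110.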
-
@nkanazawa1989 As above, I think you are not understanding me or lmfit. I have asked you multiple times to explain what you are trying to do and what problem case you are trying to solve. You continue not to provide code that shows a problem, and continue to change the subject. I am losing interest.

As I said above, you have to make ufloats yourself. You did not do that. Yeah, you would need to do something like this:

```python
import numpy as np
import uncertainties as un
from lmfit import Parameter
p0 = Parameter(name="p0", value=1.0)
p0.stderr = 0.1
p0.correl = {"p1": 0.4}
p1 = Parameter(name="p1", value=2.0)
p1.stderr = 0.1
p1.correl = {"p0": 0.4}
u0 = un.ufloat(p0.value, p0.stderr)
u1 = un.ufloat(p1.value, p1.stderr)
print("# With uncertainties, without correlations:")
print("u0, u1 =", u0, u1)
print("u0 + u1 =", u0 + u1)
print("# With uncertainties and with correlations:")
covar = np.array([[0.01, 0.004], [0.004, 0.01]])
v0, v1 = un.correlated_values([p0.value, p1.value], covar)
print("v0, v1 =", v0, v1)
print("correl =", covar[1,0]/(np.sqrt(covar[1,1]*covar[0,0])))
print("v0 + v1 =", v0+v1) Your toy example did not even do a fit -- you did not get a covariance array at all. You made one up. That is literally not using lmfit and definitely is completely avoiding the discussion topic that you started about the calculation of the covariance matrix. Here, we are definitely going to discuss lmfit. If you are going to suggest changes to how the correlation is calculated, we are going to expect that you understand how to use the library. If you want to continue this discussion, it is going to have to be about lmfit, and you are going to have to provide code (classic minimal working example) that actually shows an actual problem that you actually have. |
-
Thanks @newville. You gave me the answers to all of my questions; this is what I wanted to know/confirm.
Honestly, it is difficult to provide simple example code since I'm trying to use LMFIT as part of another, more complicated piece of software, but basically this is what I wanted to run with LMFIT (just a toy):

```python
import numpy as np
import matplotlib.pyplot as plt
from lmfit.models import ExpressionModel
amp = 0.5
tau = 0.6
base = 0.3
x = np.linspace(0, 1)
y = amp * np.exp(-x/tau) + base + 0.01 * np.random.randn(x.size)
model = ExpressionModel(expr="amp * exp(-x/tau) + base", independent_vars=["x"])
outcome = model.fit(amp=0.5, tau=0.6, base=0.3, x=x, data=y)
new_value = outcome.params["amp"] + outcome.params["base"]
```

I expected `new_value` to come with a propagated standard error. Something like:

```python
import uncertainties as un

uamp, ubase = un.correlated_values([outcome.params["amp"].value, outcome.params["base"].value], outcome.covar)
new_uvalue = uamp + ubase
```

gives what I am expecting. Sometimes I get a result where the covariance matrix is not available, so checking `errorbars` (or `success`) before building the ufloats seems to be the way to go.
With this check, LMFIT will work nicely with my codebase. For now, updating the computation of the covariance with the SVD approach seems to me like overengineering, since the quality of the fit is (expected to be) not good when I get the error.

Again, thank you so much for the quick response and the careful explanation of the behavior of the software. I think LMFIT is great software!
-
If you cannot provide example code, then how do you expect anyone to understand what you are trying to do? Like, honestly, you keep changing the subject and keep making incorrect assertions about lmfit.
If you had used the library as documented, this would be trivial to do.
-
Does anyone know why the covariance matrix in MinimizerResult for least_squares is computed like this?

lmfit-py/lmfit/minimizer.py, lines 1601 to 1602 in b930dde
I think it is known that this approach is not robust to numerical error, especially when the residual cannot be ignored, and it sometimes yields a covariance matrix with negative numbers on the diagonal.
https://jp.mathworks.com/matlabcentral/answers/465557-calculating-covariance-matrix-from-jacobian-using-lsqcurvefit
Indeed, scipy's curve_fit uses an SVD-based approach, which is likely more robust according to the discussion above:
https://github.com/scipy/scipy/blob/ce8d3d854eaf9b9c2f86a686e9c2526361c63309/scipy/optimize/_minpack_py.py#L860-L864
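For readers who do not have the two links open, a rough paraphrase (not the linked code verbatim) of the two approaches being contrasted:

```python
import numpy as np

def covar_plain(jac):
    # Roughly what is being questioned above: directly invert the
    # Gauss-Newton approximation of the Hessian, H ~ J^T J.
    return np.linalg.inv(jac.T @ jac)

def covar_svd(jac):
    # SVD-based pseudo-inverse of J^T J, discarding near-zero singular
    # values, in the spirit of scipy.optimize.curve_fit.
    _, s, VT = np.linalg.svd(jac, full_matrices=False)
    threshold = np.finfo(float).eps * max(jac.shape) * s[0]
    s = s[s > threshold]
    VT = VT[:s.size]
    return VT.T @ np.diag(1.0 / s**2) @ VT
```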
I sometimes run into a minimizer result with .errorbars=False due to such an ill-conditioned covariance matrix (I cannot provide a particular code example to reproduce it -- but I can investigate if needed).

Thank you!