-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
QR
decomposition method in GLM.
#507
Conversation
Adding PowerLink link function in GLM
Test was failing in Julia Nightly as: 1) GLM.linkinv(InverseLink(), 10) was 0.01 while GLM.linkinv(PowerLink(-1), 10) was 0.010000000000000002 2) GLM.linkinv(InverseSquareLink(), 10) was -10.01 while GLM.linkinv(PowerLink(-2), 10) was -0.010000000000000002 Rounding off to 2 digits should solve this. Note: These tests were passing in other versions of Julia.
…ed it for the docs
…(-2), 10 to 2 digits
Correcting order of PowerLink, ProbitLink, SqrtLink. Co-authored-by: Milan Bouchet-Valat <[email protected]>
Using ≈ instead of isapprox. Co-authored-by: Milan Bouchet-Valat <[email protected]>
Using alphabetical order in references. Co-authored-by: Milan Bouchet-Valat <[email protected]>
Writing the example description in plain text. Co-authored-by: Milan Bouchet-Valat <[email protected]>
Removing extra space to be consistent with the style. Co-authored-by: Milan Bouchet-Valat <[email protected]>
Using a better variable name for 1 by lambda. Co-authored-by: Milan Bouchet-Valat <[email protected]>
Using inline function. Co-authored-by: Milan Bouchet-Valat <[email protected]>
Making the doctest code concise. Co-authored-by: Milan Bouchet-Valat <[email protected]>
Removing the R code. Co-authored-by: Milan Bouchet-Valat <[email protected]>
Removing test of hashes. Co-authored-by: Milan Bouchet-Valat <[email protected]>
…few more metrics to test, also replaced all `=` by `≈` for real numbers
Since we are storing the link - the `GLM.Link` function can be defined uniformly for all link functions Co-authored-by: Milan Bouchet-Valat <[email protected]>
Changing the sentence to make it compact Co-authored-by: Milan Bouchet-Valat <[email protected]>
…er after d (distribution) and changed in corresponding constructor, remove updateμ! function for PowerLink and modified general updateμ! function.
@mousum-github - I can review this PR, but please note that I am not a maintainer of GLM.jl, so someone from the team working with this package should approve and merge this PR. |
Co-authored-by: Bogumił Kamiński <[email protected]>
backport of #507 --------- Co-authored-by: Mousum <[email protected]> Co-authored-by: ayushpatnaikgit <[email protected]> Co-authored-by: Milan Bouchet-Valat <[email protected]> Co-authored-by: Ayush Patnaik <[email protected]> Co-authored-by: harsharora21 <[email protected]> Co-authored-by: Bogumił Kamiński <[email protected]> (cherry picked from commit a8016bd)
* PowerLink refers to a class of techniques that use a power function * added a few more testcases * Fixing doctests for PowerLink * Making subheading smaller than heading. * Rounding off a test case result to 2 digits Test was failing in Julia Nightly as: 1) GLM.linkinv(InverseLink(), 10) was 0.01 while GLM.linkinv(PowerLink(-1), 10) was 0.010000000000000002 2) GLM.linkinv(InverseSquareLink(), 10) was -10.01 while GLM.linkinv(PowerLink(-2), 10) was -0.010000000000000002 Rounding off to 2 digits should solve this. Note: These tests were passing in other versions of Julia. * adding Optim dependancy to docs * No need to have optim as a dependancy for the entire package. Only need it for the docs * Rounding off GLM.linkinv(PowerLink(-1), 10) and GLM.linkinv(PowerLink(-2), 10 to 2 digits * Rounding off GLM.mueta(PowerLink(-1), 10) to make the test work on Julia Nightly * Using inexact equality comparison instead of rounding. * Apply suggestions from code review Correcting order of PowerLink, ProbitLink, SqrtLink. Co-authored-by: Milan Bouchet-Valat <[email protected]> * Apply suggestions from code review Using ≈ instead of isapprox. Co-authored-by: Milan Bouchet-Valat <[email protected]> * Apply suggestions from code review Using alphabetical order in references. Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update docs/src/examples.md Writing the example description in plain text. Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update docs/src/examples.md Removing extra space to be consistent with the style. Co-authored-by: Milan Bouchet-Valat <[email protected]> * Adding the missing comma. * Update src/glmtools.jl Using a better variable name for 1 by lambda. Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/glmtools.jl Using inline function. Co-authored-by: Milan Bouchet-Valat <[email protected]> * Apply suggestions from code review Making the doctest code concise. Co-authored-by: Milan Bouchet-Valat <[email protected]> * Follwing alphabetical order in references * Adding suggested changes by nalimilan * Update test/runtests.jl Removing the R code. Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update test/runtests.jl Removing test of hashes. Co-authored-by: Milan Bouchet-Valat <[email protected]> * replaced `isapprox` function by `≈` symbol whereever possible, added few more metrics to test, also replaced all `=` by `≈` for real numbers * Update src/glmfit.jl Since we are storing the link - the `GLM.Link` function can be defined uniformly for all link functions Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/glmfit.jl Changing the sentence to make it compact Co-authored-by: Milan Bouchet-Valat <[email protected]> * updated test cases, added a few more metrics to check, move link member after d (distribution) and changed in corresponding constructor, remove updateμ! function for PowerLink and modified general updateμ! function. * Rephrased the description of PowerLink * Update docs/src/examples.md Adding break line and removing full stop Co-authored-by: Milan Bouchet-Valat <[email protected]> * Removed `GLM.Link(mm::AbstractGLM) = mm.l`, redefine defination of PowerLink and updated test cases * update test case * Update src/glmtools.jl The suggestion is committed. Co-authored-by: Milan Bouchet-Valat <[email protected]> * changed the `inverselink` function for PowerLink as suggested in PR * Add files via upload NIST datasets for testing linear models * added qrpred * use qrpred instead of cholpred * removed tests which use chol * added qr decomp with weights * added dropcollinear for qr decomp * added pivot cholesky for pivot qr * added dof for qr * added chol for qr * updated test cases related to QR method * added float conversion * fixed test * updated QR decompositions * removed redundant functions * updating documention * fixed bug * refactored DensePredQR constructor * changed DensePredQR constructor to take Abstract types * removed unused functions * removed unused functions * updated doc string * conditional pivoting in QR * conditional QR pivoting * updated example of Lineam model with QR decomposotion * removed nasty.csv file and commented code * Update src/GLM.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/lm.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * some changes suggested by @nalimilan * Changed pivoted_qr to pivoted_qr! * Changed :stable to :qr and :fast to cholesky * Changed in predict function in lm for qr method, also added some testcases * re arranged all tests related to linear models (Cholesky and QR * An intermidiate commit after merging with master branch * without predict with qr test case. need to re-write predict function in lm * updated example.md file * Added test cases with predictions for QR method * Removed some comments * removed comments, replace Rsub\I by inv(Rsub) * Update test/runtests.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update test/runtests.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update test/runtests.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update test/runtests.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/GLM.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/lm.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/lm.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/lm.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/lm.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update test/runtests.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * updated test cases, qrpred method * removed extra lines in lm file * updated predict! function in lm in compact form * Optimizing the performnce for full rank matrix * changed outer constructor to inner * Incorporated QR decomposition method in glm related functions * NegativeBinomial Parameter estimation test is failing in a few system. Trying a different minstepfac value * NegativeBinomial Parameter estimation test is failing in a few system. Trying another smaller different minstepfac value * NegativeBinomial Parameter estimation test is failing in a few systemse * added one more example to show when QR method works better * update for doc test failure * Fixing doc test * Fixing doc test failure * Fixing doc test failure * Fixing for doc test failure * updated example for :qr method * Updated example for QR decomposition * Updated example for QR decomposition * Updated example for QR decomposition * Updated instead of generic error * Trying to resolve the conflict * Updated example to show QR performs better * Update src/GLM.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Update src/linpred.jl Co-authored-by: Milan Bouchet-Valat <[email protected]> * Updated test cases. Test cases with cholesky and qr methods are written in compact form * Updated test case due to different pivot in different systems * replaced equality check to egality check in glmfit * updated example and added doctest * updated example without doctest * Avoiding multiple copyies of design matrix * added one more example in qr vs cholesky * updated example, remove unused constructor and added one more test case * updated example to pass doctest * updated example to pass doctest * updated example to pass doctest * updated example to pass doctest * example * updated example with rdchem data * Update docs/src/examples.md Co-authored-by: Bogumił Kamiński <[email protected]> --------- Co-authored-by: Mousum <[email protected]> Co-authored-by: ayushpatnaikgit <[email protected]> Co-authored-by: Milan Bouchet-Valat <[email protected]> Co-authored-by: Ayush Patnaik <[email protected]> Co-authored-by: harsharora21 <[email protected]> Co-authored-by: Bogumił Kamiński <[email protected]> (cherry picked from commit a8016bd)
As a user who is getting wrong results where R and Stata get it right, I think this is a fantastic addition. |
For whatever it's worth, I think the use of |
It is well known that the
Cholesky
decomposition is fast but numerically unstable for ill-conditioned design/model matrix. Also, some users prefer theQR
decomposition method in the linear (or generalized linear) model.In this PR, we have added a keyword argument
method
in thelm
,glm
and related functions to control which decomposition method will be used. Themethod
argument takes two values:cholesky
(the default value and it usesCholesky
) and:qr
(it usesQR
).Using QR decomposition in
lm
:Issue #426
lm(model, data; method=:qr)
Coefficients:
Loglikelihood: -917.7449640609644
r2: 0.46431799718947153
adjr2: 0.454016420212346
Issue #375
lm(@formula(y ~ x), df; method=:qr, dropcollinear=false)
Coefficients:
Summary of additions/changes:
DensePredQR
- now supports both pivoted and unpivoted QR decompositioninvqr
- compute the inverse of X’WX from QR decompositioninverse
- a wrapper ofinvqr
andinvchol
, used inside thevcov
function.delbeta!(p::DensePredQR{T,<:QRCompactWY}, r::Vector{T}, wt::Vector{T})
delbeta!(p::DensePredQR{T,<:QRPivoted}, r::Vector{T})
delbeta!(p::DensePredQR{T,<:QRPivoted}, r::Vector{T}, wt::Vector{T})
dropcollinear
andmethod
keyword arguments innegbin
functionCholesky
andQR
decompositionsCholesky
andQR
decompositionsTo check and compare the performance of GLM with the QR method, we performed a logistic regression with 3,481,280 rows × 24 columns. Logistic regression took approximately (Intel i7 11th generation with 16GM RAM, Windows 11 OS),
Finally, @harsharora21 is one of the main contributors to this PR.