Please restore the "weights" property for the "classif.log_reg" learner! #265

Closed
FlorianPargent opened this issue Mar 7, 2023 · 1 comment · Fixed by #305

FlorianPargent commented Mar 7, 2023

In version 0.5.4 of mlr3learners, you removed the "weights" property of the "classif.log_reg" learner based on the reasoning in this thread: mlr-org/mlr3pipelines#177.
I would strongly argue that the reasoning in that thread is wrong: the "weights" argument of the glm function works exactly as expected, and the "weights" property should be restored for the "classif.log_reg" learner in mlr3learners as soon as possible!

One way to see that the glm weights work as expected is this blog article.
As shown in the blog, estimating a logistic regression model on unaggregated data (the usual case) gives exactly the same model (see the coefficient estimates and p-values) as estimating it on aggregated data while using the number of observations per condition as weights.
Note that in glm there is an important difference between "prior weights" (the weights argument of glm) and working weights (the weights value of a fitted glm object), as explained in the documentation of the glm function. In the case of mlr3learners, we are only concerned with the prior weights, which work exactly as expected.
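To make this concrete, here is a minimal self-contained R sketch (simulated data, not taken from the blog article) of the equivalence described above: fitting glm() on one row per observation and fitting it on aggregated counts passed through the weights argument yield the same coefficient estimates, standard errors, and p-values.

```r
# Minimal sketch with simulated data: glm() prior weights on aggregated
# counts reproduce the fit on the unaggregated data.
set.seed(1)

# Unaggregated data: one row per observation
x <- factor(sample(c("a", "b", "c"), 1000, replace = TRUE))
p <- c(a = 0.2, b = 0.5, c = 0.8)[as.character(x)]
y <- rbinom(1000, size = 1, prob = p)
fit_long <- glm(y ~ x, family = binomial())

# Aggregated data: one row per (x, y) combination, counts as prior weights
agg <- aggregate(list(n = y), by = list(x = x, y = y), FUN = length)
fit_agg <- glm(y ~ x, family = binomial(), data = agg, weights = n)

# The coefficient tables (estimates, standard errors, p-values) should agree
# up to numerical tolerance
all.equal(coef(summary(fit_long)), coef(summary(fit_agg)), tolerance = 1e-6)
```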

Another example that shows why restoring the weights functionality is important is my own preprint on using cost-sensitive learning with mlr3.
The benchmark in Table 3 was computed with an older version of mlr3learners in which the weights property of classif.log_reg was still available. We can clearly see that logistic regression with theoretical weighting (TW in Table 3: weights chosen theoretically based on the cost matrix) gives about the same predictive performance as with empirical weighting (EW in Table 3: weights tuned empirically). This similarity in predictive performance again demonstrates that the weights work as expected: I computed the theoretical weights in the usual way, and empirical weighting leads to almost the same weights.
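For readers who want to try this, here is a hypothetical sketch of deriving "theoretical" observation weights from a cost matrix; the exact convention used in the preprint may differ, and the class labels and costs below are made up.

```r
# Hypothetical illustration: with asymmetric misclassification costs,
# "theoretical" observation weights can be set to the cost of misclassifying
# each observation's true class.
costs <- matrix(c(0, 1,   # truth = "neg": cost of TN, cost of FP
                  5, 0),  # truth = "pos": cost of FN, cost of TP
                nrow = 2, byrow = TRUE,
                dimnames = list(truth = c("neg", "pos"),
                                response = c("neg", "pos")))

cost_fp <- costs["neg", "pos"]  # cost of a false positive
cost_fn <- costs["pos", "neg"]  # cost of a false negative

# Example outcome vector; each observation gets the cost of misclassifying it
y <- factor(c("neg", "pos", "pos", "neg"), levels = c("neg", "pos"))
w <- ifelse(y == "pos", cost_fn, cost_fp)
w  # 1 5 5 1
```

In mlr3, such a weight column would then be attached to the task as observation weights so that classif.log_reg can pass it on to glm() as prior weights; this is exactly the functionality that was removed.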

As demonstrated by our article (which has now been accepted by a journal), the weights property for classif.log_reg is extremely valuable and should be restored immediately! We noticed the removal of the weights property because we are currently preparing our manuscript for publication. At the moment we are forced to use an older version of mlr3learners, which is extremely inconvenient for any readers of our tutorial-style paper. We would greatly appreciate it if you could change this as soon as possible. If anybody ran our tutorial code with the current versions of mlr3learners and mlr3pipelines, they would get highly misleading results! See also our related issue: mlr-org/mlr3#931.

ck37 commented Apr 8, 2024

It seems that #177 is the correct link to where this was discussed and changed; it references https://stats.stackexchange.com/a/386913.

I agree with the OP - observation weights are very commonly used with logistic regression, including when downsampling a large EHR dataset for ML.
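As a hedged illustration of that downsampling use case (all names and rates below are made up): after keeping every case but only a fraction of the controls, inverse-probability-of-sampling weights passed to glm() recover approximately the same coefficients as the fit on the full data.

```r
# Sketch of downsampling a large rare-outcome dataset and correcting with
# prior weights (illustrative only; the sampling rate is arbitrary).
set.seed(42)
n <- 1e5
dat <- data.frame(x = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(-4 + 0.5 * dat$x))

# Keep all cases, but only 10% of the (much more numerous) controls
keep_prob <- ifelse(dat$y == 1, 1, 0.1)
sub <- dat[runif(n) < keep_prob, ]

# Weight each retained row by the inverse of its sampling probability
sub$w <- ifelse(sub$y == 1, 1, 1 / 0.1)

fit_full <- glm(y ~ x, family = binomial(), data = dat)
fit_sub  <- glm(y ~ x, family = binomial(), data = sub, weights = w)

# The weighted fit on the subsample approximates the full-data fit
# (note: valid standard errors would require e.g. a sandwich estimator)
rbind(full = coef(fit_full), downsampled_weighted = coef(fit_sub))
```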
