* Begin new RandALO API
* Adds examples with API design
* Adds the modeling layer
* Work on reductions
* Environment setup
* Better imports
* Begin ALO impl
* First pass at ALO impl
* Adds precommit
* Fixes examples
* starts work on jacobian
* Bug fixes for RandALO
* Adds first part of the Jacobian expressions
* Adds Jacobian operator
* Added truncnorm tests
* Added utils.py tests
* Basic randalo.py test
* Fixes `diag` property -> method
* More randalo.py tests
* Fixed test rng
* Updated to Ruff formatter
* Fixes up the Jacobian
* Loss functions now reduce themselves
* + cvxpylayers dep
* Begins sklearn->model impl
* scikit-learn integration for regression
* scikit-learn example
* Adds demo to README
* Added logistic regression support
* Added logistic regression example
* Fix ABC
* Cut at cleaning up the code
* Refactor
* Added generic Jacobian example to README
* Adds workflow file
* Misc finishing work

---------

Co-authored-by: Daniel LeJeune <[email protected]>
Showing 24 changed files with 2,531 additions and 182 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
name: Publish Python 🐍 distribution 📦 to PyPI | ||
|
||
on: push | ||
|
||
jobs: | ||
build: | ||
name: Build distribution 📦 | ||
runs-on: ubuntu-latest | ||
|
||
steps: | ||
- uses: actions/checkout@v4 | ||
- name: Set up Python | ||
uses: actions/setup-python@v5 | ||
with: | ||
python-version: "3.x" | ||
|
||
- name: Install pypa/build | ||
run: >- | ||
python3 -m | ||
pip install | ||
build | ||
--user | ||
- name: Build a binary wheel and a source tarball | ||
run: python3 -m build | ||
- name: Store the distribution packages | ||
uses: actions/upload-artifact@v4 | ||
with: | ||
name: python-package-distributions | ||
path: dist/ | ||
publish-to-pypi: | ||
name: >- | ||
Publish Python 🐍 distribution 📦 to PyPI | ||
if: startsWith(github.ref, 'refs/tags/') # only publish to PyPI on tag pushes | ||
needs: | ||
- build | ||
runs-on: ubuntu-latest | ||
environment: | ||
name: pypi | ||
url: https://pypi.org/p/randalo | ||
permissions: | ||
id-token: write # IMPORTANT: mandatory for trusted publishing | ||
steps: | ||
- name: Download all the dists | ||
uses: actions/download-artifact@v4 | ||
with: | ||
name: python-package-distributions | ||
path: dist/ | ||
- name: Publish distribution 📦 to PyPI | ||
uses: pypa/gh-action-pypi-publish@release/v1 |
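The workflow above runs on every push, but the `if:` guard on the `publish-to-pypi` job means only tag pushes reach the upload step. As a rough illustration (not part of the commit), the guard can be mimicked in Python. `should_publish` is a hypothetical helper name, and the case-insensitive comparison is an assumption based on how GitHub Actions documents `startsWith`.

```python
def should_publish(ref: str, prefix: str = "refs/tags/") -> bool:
    """Mimic the workflow expression startsWith(github.ref, 'refs/tags/').

    Hypothetical helper for illustration only. GitHub Actions documents
    startsWith as case-insensitive, so both sides are lowercased here.
    """
    return ref.lower().startswith(prefix.lower())


if __name__ == "__main__":
    # Example values of github.ref for a branch push, a tag push, and a PR merge ref.
    for ref in ("refs/heads/main", "refs/tags/v0.1.0", "refs/pull/12/merge"):
        action = "build + publish" if should_publish(ref) else "build only"
        print(f"{ref}: {action}")
```

With this gating, the `build` job still runs for ordinary branch pushes, so packaging errors surface before a release tag is ever created.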
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
repos: | ||
- repo: https://github.com/astral-sh/ruff-pre-commit | ||
# Ruff version. | ||
rev: 'v0.1.11' | ||
hooks: | ||
- id: ruff | ||
args: [--fix, --exit-non-zero-on-fix] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,20 +1,73 @@ | ||
# ALO Library | ||
# RandALO: fast randomized risk estimation for high-dimensional data | ||
|
||
This repository contains a software package implementing RandALO, a fast randomized method for risk estimation of machine learning models, as described in the paper, | ||
|
||
P. T. Nobel, D. LeJeune, E. J. Candès. RandALO: Out-of-sample risk estimation in no time flat. 2024. | ||
|
||
## Installation | ||
|
||
In a folder run the following: | ||
|
||
```bash | ||
git clone [email protected]:cvxgrp/randalo.git | ||
cd randalo | ||
|
||
# create a new environment with Python >= 3.10 (could also use venv or similar) | ||
conda create -n randalo python=3.12 | ||
|
||
# install requirements and randalo | ||
pip install -r requirements.txt | ||
``` | ||
git clone [email protected]:cvxgrp/alo.git | ||
cd alo | ||
|
||
# create a new environment with torch & friends (could also use conda or similar) | ||
python -m venv venv | ||
. venv/bin/activate | ||
## Usage | ||
|
||
pip install wheel | ||
pip install torch numpy scipy matplotlib | ||
### Scikit-learn | ||
|
||
pip install git+ssh://[email protected]/cvxgrp/SURE-CR.git@xtrace | ||
pip install git+ssh://[email protected]/cvxgrp/torch_linops.git | ||
pip install -e . | ||
The simplest way to use RandALO is with linear models from scikit-learn. For a longer demonstration, see the [example notebook](examples/scikit-learn.ipynb). | ||
|
||
```python | ||
from torch import nn | ||
from sklearn.linear_model import Lasso | ||
from randalo import RandALO | ||
|
||
X, y = ... # load data as np.ndarrays as usual | ||
|
||
model = Lasso(1.0).fit(X, y) # fit the model | ||
alo = RandALO.from_sklearn(model, X, y) # set up the Jacobian | ||
mse_estimate = alo.evaluate(nn.MSELoss()) # estimate risk | ||
``` | ||
|
||
We currently support the following models: | ||
|
||
- `LinearRegression` | ||
- `Ridge` | ||
- `Lasso` | ||
- `LassoLars` | ||
- `ElasticNet` | ||
- `LogisticRegression` | ||
|
||
### Linear models with any solver | ||
|
||
If you prefer to fit your models with a solver other than scikit-learn, or want to extend RandALO to models beyond those listed above, you can still use it by instantiating the Jacobian yourself. Just make sure to scale the regularizer correctly for your problem formulation. | ||
|
||
```python | ||
from torch import nn | ||
from sklearn.linear_model import Lasso | ||
from randalo import RandALO, MSELoss, L1Regularizer, Jacobian | ||
|
||
X, y = ... # load data as np.ndarrays as usual | ||
|
||
model = Lasso(1.0).fit(X, y) # fit the model | ||
|
||
# instantiate RandALO by creating a Jacobian object | ||
loss = MSELoss() | ||
reg = 2.0 * model.alpha * L1Regularizer() # scale the regularizer appropriately | ||
y_hat = model.predict(X) | ||
solution_func = lambda: model.coef_ | ||
jac = Jacobian(y, X, solution_func, loss, reg) | ||
alo = RandALO(loss, jac, y, y_hat) | ||
|
||
mse_estimate = alo.evaluate(nn.MSELoss()) # estimate risk | ||
``` | ||
|
||
Please refer to our [scikit-learn integration](randalo/sklearn_integration.py) source code for more examples. |
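The factor `2.0` in the Jacobian example above can be traced to how the two objectives are normalized. scikit-learn documents the `Lasso` objective as `(1 / (2 * n)) * ||y - Xw||^2 + alpha * ||w||_1`. Multiplying by 2 gives `(1 / n) * ||y - Xw||^2 + 2 * alpha * ||w||_1`, which has the same minimizer. The sketch below checks that identity numerically in plain Python. It assumes that RandALO's `MSELoss` is the mean squared error `(1 / n) * ||y - Xw||^2`; that assumption is consistent with the factor used in the README but is not verified against the library. The helper names and toy data are made up for illustration.

```python
import math


def predict(X, w):
    """Linear predictions X @ w, with X given as a list of rows."""
    return [sum(x_ij * w_j for x_ij, w_j in zip(row, w)) for row in X]


def residual_sum_of_squares(X, y, w):
    """||y - Xw||^2 for the given coefficients."""
    return sum((y_i - p_i) ** 2 for y_i, p_i in zip(y, predict(X, w)))


def sklearn_lasso_objective(X, y, w, alpha):
    """Objective documented for sklearn's Lasso:
    (1 / (2 * n)) * ||y - Xw||^2 + alpha * ||w||_1."""
    n = len(y)
    return residual_sum_of_squares(X, y, w) / (2 * n) + alpha * sum(abs(v) for v in w)


def mse_plus_l1_objective(X, y, w, lam):
    """Assumed RandALO-style form: mean squared error + lam * ||w||_1."""
    n = len(y)
    return residual_sum_of_squares(X, y, w) / n + lam * sum(abs(v) for v in w)


# Toy data, made up for illustration.
X = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [-2.0, 1.5]]
y = [1.0, -0.5, 2.0, 0.25]
alpha = 1.0

# For any coefficient vector, twice the sklearn objective equals
# MSE + (2 * alpha) * L1, so both problems share the same minimizer.
for w in ([0.0, 0.0], [0.3, -0.7], [1.5, 2.0]):
    lhs = 2.0 * sklearn_lasso_objective(X, y, w, alpha)
    rhs = mse_plus_l1_objective(X, y, w, lam=2.0 * alpha)
    assert math.isclose(lhs, rhs, rel_tol=1e-12)
```

The same bookkeeping applies to other solvers: write down the objective the solver actually minimizes, rescale it so the data-fit term matches the loss passed to RandALO, and use the resulting coefficient on the regularizer.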