Merging recent changes into main #89

Merged
merged 161 commits into from
Dec 14, 2023
161 commits
f3ab94f
Added a (yet untested) learnable TPE layer.
mmcdermott Jul 27, 2023
412ceb9
Merge branch 'dev' into learnable_frequency_ATEs
mmcdermott Jul 27, 2023
3826253
Added a unihpf style linearization / new HF dataset output prelim not…
mmcdermott Jul 31, 2023
99ad1dd
Adding flat summarization methods to dataset; yet untested.
mmcdermott Aug 1, 2023
def1994
tested FT baseline data generation.
mmcdermott Aug 1, 2023
fd79108
Some small bug fixes.
mmcdermott Aug 1, 2023
1c4851a
Merged dev.
mmcdermott Aug 1, 2023
83cbf99
Merge pull request #48 from mmcdermott/learnable_frequency_ATEs
mmcdermott Aug 1, 2023
05b9e5d
Merge branch 'dev' into fine_tuning_task_baselines
mmcdermott Aug 1, 2023
c32d0b6
Improved param revisions and fixed current loading method (though sav…
mmcdermott Aug 2, 2023
61491db
Added starter for modeling code (commented out for now)
mmcdermott Aug 2, 2023
83fc3c6
Merge branch 'fine_tuning_task_baselines' of github.com:mmcdermott/Ev…
mmcdermott Aug 2, 2023
8adf790
Add streaming flag to collect operations
juancq Aug 3, 2023
dfa9583
Add flag for using pyarrow when writing parquets
juancq Aug 3, 2023
ac6dfd0
Added flat representations and sklearn modeling examples.
mmcdermott Aug 3, 2023
ee13e85
Merge branch 'dev' into streaming
juancq Aug 4, 2023
86dff18
Merge branch 'dev' into write_parquet_pyarrow
juancq Aug 4, 2023
36c2302
Got flat data sklearn pipes working.
mmcdermott Aug 4, 2023
c50e932
Added some documentation to FT code.
mmcdermott Aug 6, 2023
c283ff1
Merge pull request #50 from juancq/streaming
mmcdermott Aug 6, 2023
92db042
Merge branch 'dev' into write_parquet_pyarrow
mmcdermott Aug 6, 2023
485a3a8
Merge pull request #51 from juancq/write_parquet_pyarrow
mmcdermott Aug 6, 2023
a08baa5
Add missing type to train_subset_size.
mmcdermott Aug 6, 2023
be49382
Added documentation on how to control max threadpool size to address #47
mmcdermott Aug 6, 2023
bb631bf
Support for measurements with '/'s in them.
mmcdermott Aug 7, 2023
17692ef
Added train set subsampling.
mmcdermott Aug 8, 2023
ab8d0ee
Partial work in progress.
mmcdermott Aug 10, 2023
e020a17
Added script.
mmcdermott Aug 10, 2023
5f20eaa
Correcting small typo.
mmcdermott Aug 11, 2023
636c289
Initial files; not yet working.
mmcdermott Aug 11, 2023
6ba6452
Re-structured fine-tuning config to better enable FT hyperparameter s…
mmcdermott Aug 11, 2023
5b70417
Fixing documentation error.
mmcdermott Aug 14, 2023
10c9517
Small change to ensure project is carried over.
mmcdermott Aug 14, 2023
2c7bc86
Merge branch 'dev' into hyperparameter_tune_FT_runs
mmcdermott Aug 14, 2023
7994a26
Fixed errors with FT changes.
mmcdermott Aug 14, 2023
799b8ef
Updated parameters configs.
mmcdermott Aug 14, 2023
1d71c17
Re-arranged num_dataloader_workers
mmcdermott Aug 14, 2023
4f1bbcf
Updated hyperparameter default search distribution.
mmcdermott Aug 15, 2023
f54a0f6
Made a bunch of small modifications so hyperparameter tuning works.
mmcdermott Aug 15, 2023
74b1c3b
Removed pre-built synthetic dataset as path errors preclude usage. Us…
mmcdermott Aug 15, 2023
1b5ef7b
Corrected small typo in example synthetic notebook.
mmcdermott Aug 15, 2023
7f0528e
Updated synthetic tutorial to be runnable from jupyter notebook and t…
mmcdermott Aug 15, 2023
8f288bf
Numpy doesn't work like that, so removed it.
mmcdermott Aug 16, 2023
9d218b9
Merge branch 'dev' into unihpf_support
mmcdermott Aug 16, 2023
051e8fb
Tentative new schema.
mmcdermott Aug 16, 2023
7e307b2
merged dev.
mmcdermott Aug 16, 2023
5dc2a7d
Merge branch 'dev' into hyperparameter_tune_FT_runs
mmcdermott Aug 16, 2023
7f5e08f
Formatted files.
mmcdermott Aug 16, 2023
d72fc7e
Added module docstring for embedding.py
mmcdermott Aug 16, 2023
d8d1965
Improved load flat rep to be faster / more memory efficient.
mmcdermott Aug 17, 2023
0c749ae
Shouldn't have overused n_samples.
mmcdermott Aug 17, 2023
a27254e
Added poetry build setup and installation dependencies. May be workin…
mmcdermott Aug 17, 2023
911acfa
Updated things so that after pip install and installation of jupyter …
mmcdermott Aug 17, 2023
7d93ef9
Adding support for a different approach.
mmcdermott Aug 18, 2023
818a626
Made wandb sklearn stuff runnable.
mmcdermott Aug 18, 2023
2c1df31
Merge branch 'hyperparameter_tune_FT_runs' into fine_tuning_task_base…
mmcdermott Aug 18, 2023
ec990e0
Merging.
mmcdermott Aug 18, 2023
4badadc
Starting to set-up for hyperparameter tuning.
mmcdermott Aug 18, 2023
11d0f29
Trying to get a wandb sweep working with sklearn models.
mmcdermott Aug 18, 2023
eef782b
Updated install instructions to use poetry, not conda.
mmcdermott Aug 18, 2023
71fe5b6
Attempt #2
mmcdermott Aug 18, 2023
93f28e0
Added pytest-cov
mmcdermott Aug 18, 2023
dc67a17
Attempt #3 to get workflows working again.
mmcdermott Aug 18, 2023
e5aeaf3
Added missing dependency.
mmcdermott Aug 18, 2023
aed6ab5
Updated lock.
mmcdermott Aug 18, 2023
b6ffbe5
one more time.
mmcdermott Aug 18, 2023
c68f58d
Attempting to set-up installation options for multiple versions.
mmcdermott Aug 18, 2023
faeff60
Merge pull request #55 from mmcdermott/hyperparameter_tune_FT_runs
mmcdermott Aug 18, 2023
f7dcdbc
Merge remote-tracking branch 'origin/dev' into simpler_install_enviro…
mmcdermott Aug 18, 2023
777ffa9
Trying a different approach.
mmcdermott Aug 18, 2023
e10c975
Added wandb
mmcdermott Aug 18, 2023
48bc7ad
Merge pull request #56 from mmcdermott/simpler_install_environment
mmcdermott Aug 18, 2023
3c85b0b
Updated install instructions.
mmcdermott Aug 18, 2023
8fe3e56
Got sklearn hyperparameter tuning working-ish
mmcdermott Aug 19, 2023
2fcf2cb
Merge branch 'dev' into fine_tuning_task_baselines
mmcdermott Aug 20, 2023
4429c92
Removed old erroneous script.
mmcdermott Aug 20, 2023
7b5b1d3
Removed old erroneous script and fixing pre-commit issues.
mmcdermott Aug 20, 2023
5268c4c
Added scipy and scikit-learn.
mmcdermott Aug 21, 2023
3a8e3c6
Added content to the tutorial about when data is dropped.
mmcdermott Aug 21, 2023
2d990d9
Extending pytorch batch documentation.
mmcdermott Aug 21, 2023
1862852
Trying to fix doc generation env.
mmcdermott Aug 21, 2023
dddedb6
Using bibliography for link for Learnable sinusoidal embeddings.
mmcdermott Aug 21, 2023
cbe2b08
Added more MIMIC FT examples.
mmcdermott Aug 21, 2023
114f75d
Added more MIMIC FT examples.
mmcdermott Aug 21, 2023
355e523
Fixed pre-commit.
mmcdermott Aug 21, 2023
10be263
Added some more tests and doctests.
mmcdermott Aug 22, 2023
ca157be
Attempting to test the running of the sample notebook.
mmcdermott Aug 22, 2023
d03e6ec
Merge pull request #57 from mmcdermott/fine_tuning_task_baselines
mmcdermott Aug 22, 2023
c49d749
Merge branch 'dev' into test_sample_notebook
mmcdermott Aug 22, 2023
eebc754
Merged with dev.
mmcdermott Aug 22, 2023
acae11a
Removed pre-built unihpf sample data.
mmcdermott Aug 22, 2023
a2af94d
Re-setting the labs sample data.
mmcdermott Aug 22, 2023
738f6e4
Added the ability to test sample notebook tutorial code.
mmcdermott Aug 22, 2023
6d5bfd8
Merge pull request #59 from mmcdermott/test_sample_notebook
mmcdermott Aug 22, 2023
bfe6eb2
Updating config to handle cases where outliers or normalizers are null.
mmcdermott Aug 22, 2023
772b06d
Modified finetuning so that it loads the best model (lowest tuning lo…
Aug 22, 2023
a605122
Update README.md
mmcdermott Aug 23, 2023
05c243b
Merge branch 'dev' into unihpf_support
mmcdermott Aug 23, 2023
63d209f
Fixed tests and lint errors.
mmcdermott Aug 23, 2023
7de8bc5
Fixed doctest error.
mmcdermott Aug 23, 2023
102d93f
Added more test cases for caching/uncaching measurement configs.
mmcdermott Aug 23, 2023
4853d99
fix valid_loss to tuning loss in model checkpointing and config path t…
Aug 23, 2023
a3f10b9
Added another collect-lazy step and removed dtype shrinking to elimin…
mmcdermott Aug 25, 2023
26f9adc
Fixed small bug in task baseline numbers.
mmcdermott Aug 25, 2023
7c2b9bd
Merge branch 'dev' of github.com:mmcdermott/EventStreamML into dev
mmcdermott Aug 25, 2023
ffa2c95
Added missing comma.
mmcdermott Aug 25, 2023
28cedd0
Adding some missing documentation.
mmcdermott Aug 25, 2023
b11f14f
Corrected small error in link syntax.
mmcdermott Aug 25, 2023
fe0f38b
Merge branch 'dev' into fine_tuning_task_baselines
mmcdermott Aug 26, 2023
93a05d3
Some small revisions.
mmcdermott Aug 26, 2023
45f295f
Adjusted one other thing in prepare pretrain subsets.
mmcdermott Aug 26, 2023
8d4e112
Made AUC compute on CPU, fixed do_compute_only_loss, made pre-trainin…
mmcdermott Aug 27, 2023
a9850e9
Added e2e tests through the script endpoints.
mmcdermott Aug 28, 2023
e0d5366
Updated gitignore.
mmcdermott Aug 28, 2023
d38265b
Merge pull request #63 from mmcdermott/e2e_tests
mmcdermott Aug 28, 2023
f4ed0cc
Adding package to workflow.
mmcdermott Aug 28, 2023
2ba4ff3
Merge branch 'e2e_tests' into fine_tuning_task_baselines
mmcdermott Aug 28, 2023
f0106f7
Merge pull request #62 from mmcdermott/fine_tuning_task_baselines
mmcdermott Aug 28, 2023
fc3c10b
Added more e2e tests.
mmcdermott Aug 30, 2023
3734bcd
Fixed some pre-commit issues.
mmcdermott Aug 30, 2023
2042d1f
Updated pre-commit action version.
mmcdermott Aug 30, 2023
bddfe2c
Updated pre-commit action version.
mmcdermott Aug 30, 2023
d1e4f3e
New pre-commit versions.
mmcdermott Aug 30, 2023
4ff8175
Adding explicit typing throughout flat representation.
mmcdermott Aug 30, 2023
747fcb9
Hopefully fixed test and pre-commit error.
mmcdermott Aug 30, 2023
2936612
Merge pull request #64 from mmcdermott/e2e_tests
mmcdermott Aug 30, 2023
dc812e1
Partial changes. Merging dev changes into this branch next.
mmcdermott Aug 30, 2023
2dd3283
Merge branch 'dev' into flat_representation_types
mmcdermott Aug 30, 2023
1a09bce
Re-worked the flat representation storage stuff a fair bit for better…
mmcdermott Aug 30, 2023
62d7c97
Fixed tests.
mmcdermott Aug 31, 2023
b1267d1
Merge pull request #65 from mmcdermott/flat_representation_types
mmcdermott Aug 31, 2023
c3a7697
Merged.
mmcdermott Aug 31, 2023
b5a236a
Corrected a test.
mmcdermott Aug 31, 2023
5553472
Merge pull request #58 from mmcdermott/unihpf_support
mmcdermott Aug 31, 2023
bb689ae
Added intermediate size to default PT hyperparameter search config.
mmcdermott Aug 31, 2023
7e8cdda
Made some small deviations to further speed things up. Is mostly work…
mmcdermott Sep 3, 2023
29c29f1
Address any settings where there are no static events for #66.
mmcdermott Sep 4, 2023
8b9ce7f
Fixed a small bug with compiling flat representations.
mmcdermott Sep 5, 2023
8d4071c
pre-commit lint changes.
mmcdermott Sep 6, 2023
2c09844
Merge branch 'dev' into faster_flat_reps
mmcdermott Sep 6, 2023
6ff0904
Fixed a small bug with feature existence in computing mean/var features.
mmcdermott Sep 6, 2023
d6b442a
Re-arranged some things, and added capability to cache task specific …
mmcdermott Sep 6, 2023
ed6e38e
Fixed a small bug with update rules for loading flat data and improve…
mmcdermott Sep 6, 2023
422e57a
Made cell not raise exception as nbmake was still failing the test.
mmcdermott Sep 6, 2023
d9a1b17
Merge pull request #69 from mmcdermott/cache_task_flat_reps
mmcdermott Sep 6, 2023
46649a9
Merge branch 'dev' into fix_static_data_bug
mmcdermott Sep 6, 2023
834c850
Forgot to uncomment some test lines.
mmcdermott Sep 6, 2023
e5e9892
Merge pull request #67 from mmcdermott/fix_static_data_bug
mmcdermott Sep 6, 2023
2137970
Updated docs theme to use sphinx immaterial.
mmcdermott Sep 6, 2023
661ac26
Allow the exception to be raised naturally; fixed the error with cell…
mmcdermott Sep 7, 2023
a3b8e7b
Changed dependency management structure to use pip for installing doc…
mmcdermott Sep 7, 2023
9b23fb9
Added missing build tool to readthedocs config.
mmcdermott Sep 7, 2023
900639e
Updated synthetic tutorial notebook.
mmcdermott Sep 18, 2023
f7056cb
Fixed mutable defaults.
mmcdermott Sep 20, 2023
8ca8cd0
Fixes #72 by correcting spelling of measurement
mmcdermott Oct 16, 2023
b10e741
Merge branch 'py311' into dev
mmcdermott Oct 16, 2023
340f8ac
Fixed bug causing the system to ignore cached files for task represen…
mmcdermott Oct 24, 2023
41481eb
Fixed one more error with the same thing.
mmcdermott Oct 24, 2023
60fcf41
... fixed a third bug associated with the second bug fix related to m…
mmcdermott Oct 24, 2023
e620eed
Fixed broken nbmake test. Not sure why the raises-exception was not w…
mmcdermott Oct 25, 2023
b938359
Added some basic model tutorial information and video about model arc…
mmcdermott Dec 14, 2023
6 changes: 3 additions & 3 deletions .github/workflows/code-quality-main.yaml
@@ -13,10 +13,10 @@ jobs:

steps:
- name: Checkout
uses: actions/checkout@v2
uses: actions/checkout@v3

- name: Set up Python
uses: actions/setup-python@v2
uses: actions/setup-python@v3

- name: Run pre-commits
uses: pre-commit/action@v2.0.3
uses: pre-commit/action@v3.0.0
6 changes: 3 additions & 3 deletions .github/workflows/code-quality-pr.yaml
@@ -16,10 +16,10 @@ jobs:

steps:
- name: Checkout
uses: actions/checkout@v2
uses: actions/checkout@v3

- name: Set up Python
uses: actions/setup-python@v2
uses: actions/setup-python@v3

- name: Find modified files
id: file_changes
@@ -31,6 +31,6 @@ jobs:
run: echo '${{ steps.file_changes.outputs.files}}'

- name: Run pre-commits
uses: pre-commit/action@v2.0.3
uses: pre-commit/action@v3.0.0
with:
extra_args: --files ${{ steps.file_changes.outputs.files}}
50 changes: 11 additions & 39 deletions .github/workflows/tests.yml
@@ -13,7 +13,7 @@ jobs:
strategy:
fail-fast: false

timeout-minutes: 20
timeout-minutes: 30

steps:
- name: Checkout
@@ -24,48 +24,20 @@ jobs:
with:
python-version: "3.10"

- name: Add conda to system path
- name: Install packages
run: |
echo $CONDA/bin >> $GITHUB_PATH
- name: Install dependencies
run: |
conda env update --file env_cpu.yml --name base
pip install pytest
pip install sh

- name: List dependencies
run: |
conda list

- name: Run pytest
run: |
pytest -v --doctest-modules --ignore docs/

# upload code coverage report
code-coverage:
runs-on: ubuntu-latest

steps:
- name: Checkout
uses: actions/checkout@v2

- name: Set up Python 3.10
uses: actions/setup-python@v2
with:
python-version: "3.10"

- name: Add conda to system path
run: |
echo $CONDA/bin >> $GITHUB_PATH
- name: Install dependencies
run: |
conda env update --file env_cpu.yml --name base
pip install -e .
pip install pytest
pip install pytest-cov[toml]
pip install sh
pip install nbmake
pip install rootutils

- name: Run tests and collect coverage
run: pytest --doctest-modules --cov EventStream --ignore docs/
#----------------------------------------------
# run test suite
#----------------------------------------------
- name: Run tests
run: |
pytest -v --doctest-modules --cov EventStream --ignore docs/ --nbmake

- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
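
For readers trying to picture the end state of `.github/workflows/tests.yml`, the fragment below is a rough sketch assembled only from lines visible in the diff above. The nesting and indentation are assumptions, since the flattened diff does not preserve them, and the fragment deliberately omits the `pytest`, `pytest-cov[toml]`, and `sh` install lines because the diff does not make clear which of them remain.

```yaml
# Sketch only: nesting and indentation are assumed, not taken from the diff.
    timeout-minutes: 30

    steps:
      # ... checkout and Python 3.10 setup steps are omitted here ...

      - name: Install packages
        run: |
          # Install the package itself with pip rather than updating a conda env.
          pip install -e .
          # nbmake provides the --nbmake pytest flag used below.
          pip install nbmake
          pip install rootutils

      - name: Run tests
        run: |
          pytest -v --doctest-modules --cov EventStream --ignore docs/ --nbmake

      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
```

The apparent effect is that the separate `code-coverage` job disappears and a single test run covers doctests, notebooks, and coverage collection.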
5 changes: 5 additions & 0 deletions .gitignore
@@ -140,3 +140,8 @@ logfile
tags

docs/_collections/*
docs/bin

sample_data/processed/*

outputs
27 changes: 13 additions & 14 deletions .pre-commit-config.yaml
@@ -1,7 +1,7 @@
default_language_version:
python: python3

exclude: "sample_data"
exclude: "sample_data|docs/MIMIC_IV_tutorial/wandb_reports"

repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
@@ -19,11 +19,10 @@ repos:
- id: check-case-conflict
- id: check-added-large-files
args: [--maxkb, "800"]
exclude: "sample_data"

# python code formatting
- repo: https://github.com/psf/black
rev: 23.1.0
rev: 23.7.0
hooks:
- id: black
args: [--line-length, "110"]
@@ -33,30 +32,30 @@ repos:
rev: 5.12.0
hooks:
- id: isort
args: ["--profile", "black", "--filter-files"]
args: ["--profile", "black", "--filter-files", "-o", "wandb"]

- repo: https://github.com/PyCQA/autoflake
rev: v2.1.1
rev: v2.2.0
hooks:
- id: autoflake

# python upgrading syntax to newer version
- repo: https://github.com/asottile/pyupgrade
rev: v3.3.1
rev: v3.10.1
hooks:
- id: pyupgrade
args: [--py310-plus]

# python docstring formatting
- repo: https://github.com/myint/docformatter
rev: v1.5.1
rev: v1.7.5
hooks:
- id: docformatter
args: [--in-place, --wrap-summaries=110, --wrap-descriptions=110]

# python check (PEP8), programming errors and code complexity
- repo: https://github.com/PyCQA/flake8
rev: 6.0.0
rev: 6.1.0
hooks:
- id: flake8
args:
@@ -73,21 +72,21 @@ repos:

# yaml formatting
- repo: https://github.com/pre-commit/mirrors-prettier
rev: v3.0.0-alpha.6
rev: v3.0.3
hooks:
- id: prettier
types: [yaml]
exclude: "environment.yaml"

# shell scripts linter
- repo: https://github.com/shellcheck-py/shellcheck-py
rev: v0.9.0.2
rev: v0.9.0.5
hooks:
- id: shellcheck

# md formatting
- repo: https://github.com/executablebooks/mdformat
rev: 0.7.16
rev: 0.7.17
hooks:
- id: mdformat
args: ["--number"]
@@ -102,11 +101,11 @@ repos:

# word spelling linter
- repo: https://github.com/codespell-project/codespell
rev: v2.2.4
rev: v2.2.5
hooks:
- id: codespell
args:
- --skip=logs/**,data/**,*.ipynb,*.bib,env.yml,env_cpu.yml,*.svg
- --skip=logs/**,data/**,*.ipynb,*.bib,env.yml,env_cpu.yml,*.svg,poetry.lock
- --ignore-words-list=ehr

# jupyter notebook cell output clearing
@@ -117,7 +116,7 @@ repos:

# jupyter notebook linting
- repo: https://github.com/nbQA-dev/nbQA
rev: 1.6.3
rev: 1.7.0
hooks:
- id: nbqa-black
args: ["--line-length=110"]
13 changes: 8 additions & 5 deletions .readthedocs.yaml
@@ -8,15 +8,18 @@ version: 2
build:
os: ubuntu-22.04
tools:
python: "mambaforge-4.10"
python: "3.10"

python:
install:
- method: pip
path: .
extra_requirements:
- docs

# Build documentation in the docs/ directory with Sphinx
sphinx:
configuration: docs/conf.py

# Optionally build your docs in additional formats such as PDF
# formats:
# - pdf

conda:
environment: env_cpu.yml
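
As a reading aid, the resulting `.readthedocs.yaml` might look roughly like the sketch below. The keys and values are the ones visible in the diff above; the indentation is an assumption, and the commented-out `formats` block is left out for brevity. The `extra_requirements` entry presumably relies on the package declaring a `docs` extra in its build configuration, which this diff does not show.

```yaml
# Sketch only: indentation is assumed; keys and values come from the diff above.
version: 2

build:
  os: ubuntu-22.04
  tools:
    python: "3.10" # plain CPython replaces the mambaforge toolchain

python:
  install:
    - method: pip
      path: .
      extra_requirements:
        - docs # assumes the package defines a "docs" extra

# Build documentation in the docs/ directory with Sphinx
sphinx:
  configuration: docs/conf.py
```

With the `conda:` block gone, the docs build no longer depends on `env_cpu.yml`.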