Changelog

v0.209.0 (2024-05-15)

Feature

v0.208.1 (2024-05-13)

Fix

  • Remove group_col_name (dw_ek_borger) in training data in split trainer (3b3ee4a)

v0.208.0 (2024-05-03)

Feature

  • Allow passing custom populate registry fn to hparam search (244231d)

v0.207.1 (2024-05-03)

Fix

  • Preprocess additional data in selective cv (f50d22a)

v0.207.0 (2024-05-03)

Feature

  • Add selective cross validator trainer (fd9b713)

Fix

v0.206.0 (2024-04-26)

Feature

v0.205.0 (2024-04-16)

Feature

Fix

v0.204.0 (2024-04-10)

Feature

Fix

v0.203.1 (2024-03-19)

Fix

v0.203.0 (2024-03-18)

Feature

Fix

v0.202.0 (2024-03-12)

Feature

  • Add runpath suggester and remove run path from projectinfo (faf7d2d)

v0.201.0 (2024-03-12)

Feature

  • Upload pred df to mlflow (af15191)

v0.200.0 (2024-03-11)

Feature

  • Unify naming classification model steps (31fc24a)

v0.199.1 (2024-03-08)

Fix

  • Study creation before parallelization (a4126af)

v0.199.0 (2024-03-07)

Feature

Fix

v0.198.0 (2024-02-29)

Feature

v0.197.0 (2024-02-29)

Feature

  • Refactor mlflow client interface (d346aa9)
  • Mlflow artifact downloader (859e60d)

v0.196.0 (2024-02-27)

Feature

  • #834: T2d add MCC and F1 to results table (a306f8b)
  • Create descriptive_stats_by_outcome.py (fe9d23e)

v0.195.0 (2024-02-27)

Feature

  • Generate features with tsflattener v2 (759fcc0)

v0.194.0 (2024-02-26)

Feature

v0.193.0 (2024-02-15)

Feature

  • Add pse keyword count embedding (6cbdb42)

v0.192.0 (2024-02-07)

Feature

  • Get best run from experiment (8036058)
  • Tool for extracting metrics from mlflow (ce51700)

v0.191.0 (2024-02-07)

Feature

  • Feature gen for sczbp text experiment (13a3622)

v0.190.0 (2024-02-06)

Feature

  • Lightgbm suggester (7b9569a)
  • Joblib based hyperparameter search (4df2dec)
  • #789: CVD, hyperparam tune for layer 2 (075313d)
  • Lightgbm suggester (feb6a68)

v0.189.0 (2024-02-06)

Feature

v0.188.0 (2024-02-06)

Feature

v0.187.0 (2024-02-06)

Feature

v0.186.0 (2024-01-31)

Feature

Fix

v0.185.1 (2024-01-30)

Performance

v0.185.0 (2024-01-29)

Feature

Fix

  • #780: Terminallogger pretty print should print non-flattened cfg (#783) (7abd94a)
  • #780: Terminallogger pretty print should print non-flattened cfg (356ca28)
  • Fix c/b plot (2594b54)

v0.184.0 (2024-01-29)

Feature

  • #772: Use rich for pretty printing in terminallogger (#777) (c924a95)
  • #772: Use rich for pretty printing in terminallogger (48cff23)

Fix

  • #775: Log outcome column as string, not Index (a4ff66b)

v0.183.0 (2024-01-26)

Feature

  • Do not init mlflow until first logging operation (9085833)

v0.182.1 (2024-01-24)

Fix

  • #748: Mlflow creates new logging file for each logging operation (#749) (46fa246)

v0.182.0 (2024-01-23)

Feature

v0.181.0 (2024-01-23)

Feature

  • Test effect of interval (4dbafb5)

Fix

  • Mlflow logger on overtaci (a0eef17)

v0.180.0 (2024-01-22)

Feature

  • #700: Pretrain and finetune a sequential model for T2D (470b0bc)

v0.179.0 (2024-01-18)

Feature

v0.178.0 (2024-01-18)

Feature

v0.177.1 (2024-01-18)

Fix

  • #717: Remove-PredictionTimeFilterer (#719) (a6e3a2d)
  • Remove PredictionTimeFilterer (bd3e3f0)
  • #717: Remove-PredictionTimeFilterer (0bcf8bd)

v0.177.0 (2024-01-17)

Feature

  • Cohortdefiners return validatedframes (#711) (978e84d)

Fix

v0.176.0 (2024-01-17)

Feature

v0.175.0 (2024-01-17)

Feature

  • Match suggester regex in hparam tuning (be443f4)
  • Hyperparameter tuning with optuna (679dffb)
  • #699: Optuna-hparam-optimization (7773099)

v0.174.0 (2024-01-17)

Feature

Fix

v0.173.0 (2024-01-17)

Feature

  • #702: Allow filtering when generating patient slices (5970ec0)

v0.172.0 (2024-01-17)

Feature

  • #709: Support temporary uuid in quarantine filter (1d3058a)

v0.171.1 (2024-01-16)

Fix

  • Correct types for id col in splitframe (df7750a)
  • Type ignore invoke 🥷 (8516981)

v0.171.0 (2024-01-15)

Feature

  • Add feature gen specs for text experiment (b4791b8)
  • Add script for embedding text (82713e5)

Fix

  • Correct splitting of text data for sentence transformers (7040115)

v0.170.5 (2024-01-15)

Documentation

Performance

v0.170.4 (2024-01-15)

Fix

v0.170.3 (2024-01-12)

Fix

  • #671: Improve-error-when-cannot-parse-registered-config (#672) (4aa5cff)
  • #671: Improve-error-when-cannot-parse-registered-config (15ca60b)

v0.170.2 (2024-01-12)

Fix

v0.170.1 (2024-01-12)

Fix

  • Literal type hint parsing. Fixes #667. (19dd234)

v0.170.0 (2024-01-12)

Feature

  • Pretty-annotation-extraction-in-confection-placeholder-configs (2ab73a6)

Fix

v0.169.0 (2024-01-11)

Feature

  • Update gain table to be more flexible (9e8cc4d)
  • Update perf by ppr (8d1084c)
  • Update perf by ppr table (5711773)
  • Update performance table (16183d5)
  • Add calibration curve (f5719cf)
  • Map vocab to tfidf indxs in gain table (279f404)
  • Save vocabs from text models (a084da1)

Fix

v0.168.0 (2024-01-10)

Feature

  • Use SplitIDs to generate splits in (issue #593) (590cce6)

Fix

v0.167.0 (2024-01-10)

Feature

  • Fill defaults from function signatures into .cfg (issue #426) (#638) (cae3021)
  • Fill defaults from function signatures into .cfg (issue #426) (a821915)
  • Fill defaults from function signatures into .cfg (issue #426) (f96b839)

v0.166.0 (2024-01-10)

Feature

  • #631: Allow_extra-for-ValidatedFrame (#641) (ade4688)

v0.165.0 (2024-01-10)

Feature

  • SplitLoaders should return SplitFrames (issue #628) (#632) (81983db)

v0.164.3 (2024-01-10)

Fix

  • RegionalFilter should not load at init (7fa1a34)

v0.164.2 (2024-01-10)

Fix

  • Registry testing project-specific registered functions (#637) (fa09dfe)
  • Only check registered functions in common (a27bfd8)
  • Move sczbp specific estimator steps to own registry (bb9981e)

v0.164.1 (2024-01-09)

Fix

v0.164.0 (2024-01-08)

Feature

v0.163.0 (2024-01-08)

Feature

  • Sanitise dict keys for mlflow (fc5eac9)
  • Remove unused extensions (a50d5e7)
  • Disable graphite user pager (dea1c75)

v0.162.0 (2024-01-05)

Feature

  • Joint interface for loading different split ids (b6b2be2)

v0.161.0 (2024-01-04)

Feature

  • Add cleanlab processing step (498403b)
  • Add missforest imputer (05100b3)
  • Synthetic data augmenter estimator step (8712caf)
  • Simple imputer estimator step (1f72094)
  • Standardscaler that infers numeric columns (4be5abc)
  • Imblearn pipeline and constructor (c81f98b)

Fix

v0.160.1 (2024-01-02)

Fix

  • Replace disallowed symbols in config before logging to mlflow (a49faa2)

v0.160.0 (2023-12-28)

Feature

  • Analysis of time from first contact to outcome (14380bd)

Fix

  • Add mlflow logger to registry (ca1a13c)
  • First visit to first contact (3beda39)

v0.159.0 (2023-12-21)

Feature

v0.158.0 (2023-12-20)

Feature

v0.157.0 (2023-12-19)

Feature

Fix

Documentation

v0.156.1 (2023-12-19)

Fix

  • Generate new checkpoint (558666e)

v0.156.0 (2023-12-18)

Feature

  • Convert patient_slice_getters to classes (d2771d4)
  • Convert patient_slice_getters to classes (ec5fe84)

v0.155.0 (2023-12-18)

Feature

  • Pretrain with mlflow (#563) (a0d21ba)
  • Install invoke (e5377a7)
  • Finetune trainer from pretrained checkpoint (#560) (eec5816)
  • Pretraining with mlflow (7033842)
  • Create finetuning-trainer from pretrained checkpoint (62dbb8d)

Fix

v0.154.0 (2023-12-14)

Feature

  • Filter based on insufficient lookahead (#541) (9624405)
  • Filter based on insufficient lookahead (21e545a)
  • Filter based on insufficient lookahead (ab51a6e)

Documentation

v0.153.0 (2023-12-14)

Feature

  • Add SplitDataset generic (d4b99ea)

v0.152.0 (2023-12-14)

Feature

  • Add support for mlflow experiment tracker (#432) (9922000)
  • Add support for mlflow experiment tracker (7048cb0)
  • Add support for mlflow experiment tracker (db8db5b)

v0.151.1 (2023-12-12)

Fix

  • Actually apply all filters (5aaa13c)

Documentation

  • Minor rename and documentation (428d2ff)

Performance

  • Speed up cohort definers (3a7bfab)
  • Use lazyframes for filter prediction times (b9972b2)
  • Speed up cohort definers (fac749a)

v0.151.0 (2023-12-11)

Feature

  • Add washout as an arg during feature gen (82ec387)

v0.150.3 (2023-12-05)

Fix

  • Overwrite log on new run (c60e5d5)

v0.150.2 (2023-12-04)

Fix

  • Correct saving and loading of cfg to disk (ac8531c)
  • Update type hint of BaselineLoggers (1bfc85a)

v0.150.1 (2023-12-04)

Fix

  • Add multilogger to registry (972b929)

v0.150.0 (2023-12-04)

Feature

v0.149.1 (2023-11-30)

Fix

v0.149.0 (2023-11-29)

Feature

v0.148.0 (2023-11-29)

Feature

Fix

v0.147.3 (2023-11-27)

Fix

  • Updated based on new pre-commit (e58adfb)

v0.147.2 (2023-11-27)

Fix

  • Drop entity id column during stratified crossval (#487) (400baa4)

v0.147.1 (2023-11-24)

Fix

  • Log text representation of config (#486) (5866cd2)
  • Log text representation of config (8fc4b91)
  • Log text representation of config (dc1585a)

v0.147.0 (2023-11-24)

Feature

Documentation

  • Small improvements to docstrings (f659c96)
  • Small improvements to docstrings (f2e019b)
  • Add relevant docstring (987f4d8)

v0.146.0 (2023-11-23)

Feature

v0.145.0 (2023-11-23)

Feature

  • Add RegexColumnBlacklist to common (#472) (a34066a)
  • Add RegexColumnBlacklist to common (12f30f9)
  • Add cfg (3fe31ed)
  • Add RegexColumnBlacklist to common (a67b1bc)

v0.144.0 (2023-11-23)

Feature

  • Add support for mlflow experiment tracker (c290c2b)
  • Add vertical concatenator to common (9379854)

v0.143.0 (2023-11-23)

Feature

  • Add TemporalColumnFilter to common (59d3a6d)
  • Add RegexColumnBlacklist to common (b7b66e7)

v0.142.0 (2023-11-21)

Feature

  • Better scaffolding cfg generation (349a5f4)

v0.141.0 (2023-11-20)

Feature

Documentation

v0.140.0 (2023-11-17)

Feature

v0.139.0 (2023-11-17)

Feature

Documentation

  • More info on config update. (5e546e9)

v0.138.0 (2023-11-15)

Feature

v0.137.2 (2023-11-15)

Fix

Documentation

v0.137.1 (2023-11-15)

Fix

  • Ran pre-commit (7ba17a4)
  • Update test to match changes in test patients (814b525)
  • Ran pre-commit (06d849d)
  • Ensure mapping data is only loaded once (389adca)
  • Ran pre-commit (9c7f7db)
  • Updated behrt embedder with filtering (9b3ca61)
  • Update output of create config to ensure correct type (c336a34)
  • Convert output of resolve config to configschema (2771693)

v0.137.0 (2023-11-14)

Feature

v0.136.0 (2023-11-10)

Feature

  • Ensure that evaluation can run with different outcome col names (#418) (c236d58)
  • Support spaces within column names (abfb810)
  • Add support for spaces in header titles (6109822)

v0.135.0 (2023-11-10)

Feature

  • Filter cols by lookbehind combination filter (92b6998)

Fix

v0.134.0 (2023-11-10)

Feature

  • Support spaces within column names (56e1019)
  • Add support for spaces in header titles (52140a1)

v0.133.4 (2023-11-09)

Fix

  • Change BinaryClassificationPipeline to take sklearn Pipeline instead of Sequence[ModelStep] (aa5615e)

v0.133.3 (2023-11-09)

Fix

  • Change BinaryClassificationPipeline to take sklearn Pipeline instead of Sequence[ModelStep] (b23e6e0)

v0.133.2 (2023-11-09)

Fix

  • Pass pred time uuid to binaryclassification task (078cb92)

v0.133.1 (2023-11-08)

Fix

v0.133.0 (2023-11-07)

Feature

v0.132.0 (2023-11-07)

Feature

v0.131.0 (2023-11-07)

Feature

v0.130.0 (2023-11-07)

Feature

  • Log calculated metric with SplitTrainer (#364) (9759f7c)

v0.129.1 (2023-11-06)

Fix

  • Collect lazyframe and return pl.series (ffca056)

v0.129.0 (2023-11-03)

Feature

  • Performance by lookahead (671f4e9)
  • Different lookaheads (d7538bf)
  • Add script for train test on diff lookaheads (41a29d8)
  • Add script for train test on diff lookaheads (ce8f59f)
  • Add shap (4029cd8)

v0.128.0 (2023-11-01)

Feature

Fix

  • Refactor tasks structure (12f2632)
  • Allow training from overtaci remote desktop (7e9b1d7)
  • Remove warnings (66c529e)
  • Added hotfix for wandb folder during debugging (52fb95e)
  • Error made by pl lightning when saving hp (e1f507d)
  • Added callbacks (983908a)
  • Removed hotfix for behrt embedder (7112773)
  • Fix based on pr comments (4e71238)
  • Undo edit (af04d12)
  • Removed todo comment (17c463a)

v0.127.2 (2023-11-01)

Fix

  • Allow list of data dirs for multirun (d77186e)
  • Update fa subset feature fns (7197efe)
  • Allow list of data dirs in cfg (064f8cb)
  • Remove redundant quotation marks (f61209f)

v0.127.1 (2023-10-26)

Fix

  • Pydantic requires types to be callable. Removed subscripting of pd.Series. (95d89a6)

v0.127.0 (2023-10-25)

Feature

  • Add cls token to behrt embedder (568224f)

v0.126.0 (2023-10-24)

Feature

Fix

v0.125.0 (2023-10-24)

Feature

Documentation

v0.124.0 (2023-10-23)

Feature

  • Actually use the sliced timeframes for finetuning (52950a5)
  • Add patientslice (77e86bc)

Documentation

v0.123.0 (2023-10-23)

Feature

v0.122.0 (2023-10-23)

Feature

  • Merge multiple feature sets (6526bdd)
  • Test xgboost assumption (5a3222a)
  • Add test of xgboost hyperparams assumption (0768f62)

Fix

v0.121.0 (2023-10-23)

Feature

  • Add devcontainer.json (b9230b4)
  • Allow levels of granularity in diagnosis mapping (a06fd75)
  • Add subsetting script (dcd10ee)

Fix

  • Update train val descriptive comp script (b1f3e72)

Documentation

v0.120.0 (2023-10-18)

Feature

  • Extract runs to functions, to avoid instantiation on import (afc94cb)

Fix

Documentation

  • How to install cuda enabled pytorch on overtaci (25608d2)

v0.119.0 (2023-10-18)

Feature

  • Create plot when training xgboost hba1c only (cd52ec8)

Fix

  • Change typehint for patient colnames (22d9317)
  • Do not import get_best_eval_pipeline unless main (d5da51f)
  • Fixed mutable default error in config (a2d8294)
  • Source subtype filtering works (1259203)

v0.118.0 (2023-10-12)

Feature

  • Add overwrite eval warnings (6d5657f)

v0.117.1 (2023-10-12)

Fix

v0.117.0 (2023-10-12)

Feature

  • Added fine-tuning script (bef7c88)

Fix

v0.116.0 (2023-10-11)

Feature

  • Filter diagnosis subtype in BEHRT (2c56baf)
  • Generate pred timestamps without washout (911076b)

Documentation

  • Added test documentation (448ce4d)

v0.115.0 (2023-10-10)

Feature

  • Add tasks.json (84546d1)
  • Add vscode dev task (8a349b8)
  • Create diagnosis mapping (icd10->caliber) (e27af96)

Fix

v0.114.0 (2023-10-09)

Feature

v0.113.0 (2023-10-06)

Feature

Fix

v0.112.0 (2023-10-03)

Feature

  • Gradient accumulation fix OutOfMemoryError? (7cfc47b)
  • Lr scheduler linear with warm-up (0f2d433)
  • Pretrain version (5e9091c)
  • Ready for training (1218794)
  • Expand test to cover model checkpointing (1f0bece)
  • Lightning module saves hyperparams (59e7ac6)
  • Adapt sequence training script to pytorch lightning (005cbdf)
  • Initial changes to pytorch lightning module (d919009)
  • Added training script for sequence model (7d2c2bb)

Fix

  • Configs should be initialised with factories (7ddbf9b)
  • Ruff (626e0fc)
  • Fully transitioned to pl (d3a8d32)
  • Replaced print with logging statements (769530f)
  • Make sure parameters are actually moved to the gpu (131e1f0)

Documentation

v0.111.0 (2023-10-03)

Feature

Fix

v0.110.0 (2023-09-26)

Feature

  • Time from first pos pred to next hba1c (b0d805d)
  • New script for retraining model with new cv (eba749f)
  • Add feature importance table (423cc23)
  • Add baseline table one (8695e29)
  • Adding eval plots (5b81f26)
  • Add new eval branch (5e8aea3)
  • Wip new eval structure (8728545)

Fix

v0.109.0 (2023-09-14)

Feature

  • Implement classifierchain (097cc0a)

Fix

  • Unpack dataframe to series in eval df (dcf9b93)

v0.108.0 (2023-09-12)

Feature

  • Main test passes 🥳 (622adf7)
  • Add missing methods from PSYCOPModule to BEHRTForMaskedLM (79f8a26)
  • Update trainer to match checkpoint savers (ab9a234)
  • Add wandb logger (7eef76a)
  • Flesh out training (40b1032)
  • Add dataclass-based vocab (25f7d3e)
  • Implemented masking task (b0ffbf4)
  • Embedder skeleton (81479c7)

Fix

  • Fix error from static type checks (13f5a38)
  • Updated format of the mask function to allow for testing (d49dadc)
  • Make sure that the tests test the outer masking_fn (679cd54)
  • Renamed PsycopModule -> TrainableModule (fadc9a1)
  • Remove testing assumption from Logger (861c162)
  • Updated logger to allow logging configs separately (604dc2e)
  • Moved logger interface to its own script (4c83ff4)
  • Added vocab_size (3789834)
  • Added type hints (370aebe)
  • Forward pass in embedding module works (5cd38d9)
  • Added patient dataset (2a00c32)
  • Added behrt embedder (a2bbd8b)

Documentation

  • Removed old comments (74ca7d3)
  • Removed unnecessary comment (6f8b175)

v0.107.0 (2023-09-04)

Feature

Fix

  • Remove use of hba1c in cvd filters (395487e)
  • Unneeded newline handling (311d15d)
  • Strip lines of whitespace before generating dataframes (628db49)

v0.106.0 (2023-08-31)

Feature

Fix

  • Possibly unbound variable (864f59b)

Documentation

  • Point to patient object tests (d727664)

v0.105.0 (2023-08-30)

Feature

Fix

  • Config of last model (250d854)
  • Naming (5bca53b)
  • Configurations for new tfidf feat set (fe33f0f)
  • Update configurations of model train and eval (dd0f83c)
  • Reconfigure text lookbehinds (f6151ad)
  • Text specs (a0ac8bf)

v0.104.0 (2023-08-24)

Feature

  • Parse date of birth to all patients (c469997)

v0.103.0 (2023-08-23)

Feature

  • Train new tfidf model and encode text (1fdd3ce)

v0.102.1 (2023-08-23)

Fix

  • Don't shadow python builtin (c05d307)

v0.102.0 (2023-08-23)

Feature

Fix

  • Rename from merge (38257e4)
  • Type checking block for circular imports (a33a8de)
  • Typo in shak codes (e2cd184)

v0.101.0 (2023-08-23)

Feature

  • Convert getters to properties (cbe130b)
  • Handle lookahead-based outcome resolution (489003f)
  • Remove patient_ids and fix downstream type consequences (2918452)
  • Misc. (bd3d6ea)
  • First working unpacker (c65219b)
  • First stab at unpacking to patient dfs (74ba12a)
  • Filter prediction sequences (402eee9)

Fix

Documentation

  • Add comments explaining eq (d99c0be)

v0.100.0 (2023-08-22)

Feature

  • Filename check earlier for feature-gen (17b3404)

v0.99.0 (2023-08-22)

Feature

  • Cohort creation for the cancer project (4a408b9)

Fix

  • Correct type hints for aggregation (ac9fc29)
  • Reconfigure lab tests (caebc33)
  • Replaced unused function (75f241c)

Documentation

v0.98.0 (2023-08-22)

Feature

  • Add sentence transformers features (8a91048)
  • Adding text specs (27b37ac)

Fix

v0.97.0 (2023-08-15)

Feature

  • Add new dir param and user prompt (fe46dbf)

Fix

  • Broken tests due to missing arg (74cd065)
  • Add arg to general function (6395109)
  • Update general function (81506ee)
  • Instructions in README.md (c2de501)

v0.96.0 (2023-08-11)

Feature

Fix

v0.95.0 (2023-08-10)

Feature

  • First stab at chunked feature gen (0a31a61)
  • Add loader for embedded text (40c8271)
  • Train sentence transformer code (055a572)
  • Sentence transformer embedding ready to train (40ce08a)
  • Sentence transformer embedding (9b998f8)
  • Vis qc (ec8ff15)

Fix

  • Misc (df49d9b)
  • Ignore old import errors (d6a2105)
  • Reinstate 'prefixes_to_describe' param (761c6f0)
  • Remove old param (4240440)
  • Minor changes and typos (2341abd)
  • Typo in requirements (d97aefe)
  • Text feat specs resolve mltp to mean (9dd1065)
  • Paths (b008a95)
  • Change chunking pipeline (781d442)
  • Updating scz_bp feature gen (8b1ee14)
  • Move chunk tests (df7834d)
  • Move chunk tests (295e4c0)
  • Don't modify prediction_times_df in PredictionTimeFilterer (e1eae8c)
  • Type hint for ColNames (0454b42)
  • Chunked feature gen (d8e2ac7)
  • Set wandb to offline during feature gen (5ce32be)
  • Print time taken for sentence embedding (ac4e8a2)

v0.94.0 (2023-08-03)

Feature

  • Multilabel classification (5285b2c)

v0.93.0 (2023-07-20)

Feature

Fix

v0.92.0 (2023-07-18)

Feature

  • To polars|pandas method for EvalDataset + fixed threshold (52fe185)
  • Add loader for first visit to psychiatry (9ee9727)

Fix

  • Set pythonpath for interactive session (ded552d)

v0.91.0 (2023-07-18)

Feature

  • Change prefix for supplementary outputs (9b2a15d)

v0.90.0 (2023-07-11)

Feature

  • Only print failed checks if there are any (ba645ec)
  • Only do feature description of columns matching prefix (dc45ab7)

v0.89.1 (2023-07-07)

Fix

  • Add birthdays as default (17bdc01)

v0.89.0 (2023-07-06)

Feature

  • Add loader for therapeutic leave (a9f95ec)

v0.88.0 (2023-07-03)

Feature

  • Freeze DataframeBundles (8a1f29b)
  • First stab at types and tests for sequence windower (a68afd7)

Fix

  • Correct types for aggregation funcs in t2d specify features (6796f0a)

Documentation

  • Add docs to eventcolumns (e6585a2)
  • Explain sequence columns (be1f89d)
  • Define behaviour if lookbehind is none (26a7f0c)
  • Add docstring (fe5fd2e)

v0.87.0 (2023-06-28)

Feature

v0.86.0 (2023-06-28)

Feature

Fix

v0.85.1 (2023-06-27)

Fix

  • Remove duplicate csv (10b98f7)
  • Merge over correct fa eval files (879e166)

v0.85.0 (2023-06-20)

Feature

  • Add plot code (8c0b347)
  • Remove name and build-system to avoid pip install -e . (63e871b)
  • Migrate to requirements.txt (cab38a5)

Fix

Documentation

v0.84.0 (2023-06-20)

Feature

Fix

Documentation

v0.83.0 (2023-06-19)

Feature

  • Eval pipeline works (a663436)
  • Add typehints to feature specs (bf42de0)
  • Turn wandb off for now in main feature_gen script (2ea9ef3)
  • Cancer project initial setup (de8f9cf)

Fix

v0.82.0 (2023-06-14)

Feature

v0.81.0 (2023-06-13)

Feature

  • Move markdown handling to common (bdfeafa)

v0.80.0 (2023-06-07)

Feature

Fix

  • Guard for newly optional configs (e9ff39e)

v0.79.1 (2023-06-06)

Fix

  • Remove project specific md code (945a0fd)

v0.79.0 (2023-06-02)

Feature

  • Simplify feature describer (a9f9f7b)

v0.78.1 (2023-06-01)

Fix

  • Patchwork grid of size 1 (b159a10)

v0.78.0 (2023-06-01)

Feature

  • Increase size of axis labels in t2d pn theme (71b8dd0)
  • Increase size of patchwork subpanel labels (384e06d)
  • Make HbA1c only configurable (d2854a8)
  • Adopt boolean dataset to featuremodifier (5188047)
  • Ignore static type checks on Overtaci (840c015)
  • Allow disabling of column name checks (ad519be)
  • Boolean cols in place (8b968e5)
  • Use native polars column selection (ef25f17)

Fix

Documentation

v0.77.1 (2023-05-30)

Fix

  • Correct lookbehind selection (3807a94)

v0.77.0 (2023-05-26)

Feature

  • Implement full supplementary generation (530d972)
  • Switch to TDD for md_object generation (35b4787)
  • Create required wandb folder when initialising wandb in WandbHandler (41037d9)
  • Misc. (0a54195)
  • Eval run on test_set (76644ee)

Fix

  • Align plot and table for median warning days (bda3eed)

v0.76.0 (2023-05-24)

Feature

  • Automatic robustness figure (07f9f2c)
  • Abstract robustness plots (514226b)

Fix

  • Ensure X_by_group returns a dataframe (57b3160)
  • Ensure X_by_group returns a dataframe (efab826)

v0.75.1 (2023-05-24)

Fix

  • Pin wandb version to avoid failing on tests (2a92dda)

v0.75.0 (2023-05-23)

Feature

  • Generate a publication-ready performance_by_ppr table (32c20ed)

v0.74.0 (2023-05-23)

Feature

  • Add thousand separator to conf matrix (fc9b6dc)
  • Add thousand separator to conf matrix (4f98c0b)
  • Add thousand separator to plotnine conf matrix (d27d0f7)
  • Add lines to sens by time to event (b739267)
  • First stab at sens by time to event plot (91571f9)
  • Add full performance figure (45c1d6e)

Fix

  • Do not check for venv for tests, conflicts with CI (672d43f)
  • Handle uneven number of plots in patchwork_grid (a833eb7)

v0.73.0 (2023-05-17)

Feature

  • Convert auroc to plotnine (80f5cbf)

Fix

v0.72.0 (2023-05-17)

Feature

  • Create plotnine confusion matrix (5045b67)

Fix

  • Handle trailing commas in str_to_df (71331cf)
  • Incorrect git import (629631d)

Documentation

v0.71.0 (2023-05-17)

Feature

  • Print a4 conversion factor (f6cde36)
  • Add patchwork grid functionality (4041220)

Fix

  • Autofix when creating pr (76470cd)

v0.70.0 (2023-05-16)

Feature

v0.69.0 (2023-05-16)

Feature

v0.68.0 (2023-05-16)

Feature

  • Split ci after bootstrap (f3e4f6f)

v0.67.1 (2023-05-15)

Fix

Documentation

v0.67.0 (2023-05-12)

Feature

  • Create pipeline and unified interface for evaluating the best run (d4fd7f3)

v0.66.0 (2023-05-11)

Feature

Documentation

  • Better explain utility func (46396a1)

v0.65.0 (2023-05-11)

Feature

  • Add ci to timedelta plots (ce8c63f)

v0.64.0 (2023-05-11)

Feature

  • Handle only one true class (5a90247)

v0.63.0 (2023-05-09)

Feature

  • Increase x-axis text size for base plots (b5ddf0b)

v0.62.1 (2023-05-05)

Fix

  • Missing polars requirement (8e277e1)

v0.62.0 (2023-05-05)

Feature

v0.61.0 (2023-05-03)

Feature

  • Allow custom splits for training (6e0bf71)

v0.60.1 (2023-05-03)

Fix

  • Get correct performance by ppr (09fa471)
  • Get correct performance by ppr (df468ea)

v0.60.0 (2023-05-03)

Feature

Fix

  • Do not support multiclass in calc_performance (781692b)
  • Assign sql cache if on local (2365b65)
  • Assign sql cache if on local (d57c9fd)

Documentation

v0.23.0 (2023-04-26)

Feature

  • Add logging and choose sfi types (d5f8e23)
  • Create example scripts (76e063a)
  • Initial text model pipelines (1934db0)
  • Add tests (d7a8bab)
  • Initial simple preprocessing pipeline for all sfis (f941a4d)
  • Add include_sfi_name in load_text_split (4605c88)
  • Include_sfi_name arg (58baf9a)
  • Fit and load tfidf, bow, and lda models (3d33d9b)

Fix

  • Preprocess to one regex (c716653)
  • Remove symbols again (1210b7e)
  • Based on HLasses comments (32da48f)
  • Insert model type in filename (1457387)
  • Add doc strings to preprocessing functions (4e27650)
  • Remove log.info and small fixes (84f3cc3)
  • Ruff fixes (ea9c564)
  • Return vectorizer and matrix + clean-up (e1c48a0)
  • Query string (cb7424c)
  • Naming and doc string update (141e52a)
  • General clean-up and change corpus in fit functions to list (22b6a9e)
  • Change ngram default and clean-up (387f845)
  • Small fixes to logging (c3a3f53)
  • Remove old comments (4b88514)
  • Change view name (a9bb0fc)
  • Move save_text_model_to_dir to utils (469df3b)
  • Move save_text_model_to_dir to utils (26a80d2)
  • Renaming in preprocessing (c381768)
  • Remove stop_words arg and return models (3d29012)
  • Change arg path to path_str (f781a74)
  • Enable multiple splits when loading data + add n_rows arg (8ae2d2e)
  • Remove Path from arg (29b442b)

v0.22.0 (2023-04-24)

Feature

  • Add feature descriptions for text features (84c696a)

Documentation

v0.21.4 (2023-04-04)

Fix

  • Remove unreasonably high or low bmi values (07f52c2)

v0.21.3 (2023-04-03)

Fix

  • Make sql query executable (e006490)
  • Str turned into list of characters instead of list of words (0fae478)

v0.21.2 (2023-03-27)

Fix

  • Add unpack args to skema 2 wo nutrition (95c35c8)

v0.21.1 (2023-03-22)

Fix

  • Only keep weights above 0.5 kg (8a5a104)
  • Do not load invalid weights (7be4653)

v0.21.0 (2023-03-22)

Feature

  • Support new pipe annotation (a1bde17)

Fix

v0.20.3 (2023-03-14)

Fix

  • Set unpack_to_intervals to default (64391ca)
  • Remove unintended space (9c6cd33)

v0.20.2 (2023-03-14)

Fix

  • Add skema_2_without_nutrition again (685c5cb)

v0.20.1 (2023-03-11)

Fix

  • Cruft github action (c8f6278)
  • Bug in cruft action (ec8267a)
  • Remove psycop-ml-utils, no longer exists (d8fbb65)

v0.20.0 (2023-03-09)

Feature

  • Add more glc loaders (b765e77)
  • Add type 1 diabetes loaders (b682984)
  • Make sql loader verbose (602f4f3)
  • Add caching to sql_load (a68c15d)
  • Ibid (46da732)
  • Add support for keeping code col when loading diagnoses (51ca63e)
  • Add t2d diagnosis loading (6b8231c)
  • Add ogtt (f6c07a9)
  • Update current blood sugar measurements (5e8051a)

Fix

  • Lacking prefix on loading glc (d9bdbcb)
  • Inappropriate matching (e2409ed)
  • Poetry formatted dependencies (125500a)

v0.19.2 (2023-03-06)

Fix

v0.19.1 (2023-03-06)

Fix

  • Drop rows with NaT (5a1d908)
  • Round timestamps to whole seconds before dropping duplicates (e503bf3)

v0.19.0 (2023-03-03)

Feature

  • Add option for which timestamp to get when loading physical visits (ef369b8)

Fix

  • Drop duplicates in the output_df (636cc48)
  • Don't load duplicate visits (5028b1d)
  • Physical visits should only load physical visits (b7c50cf)
  • Did not rename to timestamp before returning (f43522c)

v0.18.4 (2023-02-22)

Fix

  • Loader names still too long (3321b88)

v0.18.3 (2023-02-22)

Fix

  • Loader names too long for wandb (cc14da2)

v0.18.2 (2023-02-21)

Fix

v0.18.1 (2023-02-15)

Fix

  • Adjust function for saving integrity checks (de2577e)
  • Restructure overarching description func (54c24a2)

Documentation

  • Better function description (7eb9e54)

v0.18.0 (2023-02-14)

Feature

  • Add arg for choosing timestamp and add warning (159a176)

v0.17.2 (2023-02-13)

Fix

  • Make naming scheme consistent (c125b48)
  • Attempted rename of unspecified df (c266bd8)
  • Revert logic (ad110ee)
  • Quarantine_df and quarantine_days can be left as None (f130370)

v0.17.1 (2023-02-10)

Fix

  • Allowed types works again (dbe75ca)
  • All arg names now congruent, visit_types takes a list of visit types instead of string (e63e9d4)

v0.17.0 (2023-02-09)

Feature

v0.16.1 (2023-01-31)

Fix

  • Use acute outpatient visits as well (659af23)
  • Typo, and use newest data (bbbc8f5)
  • Use end dates for all contacts (d8940c1)
  • Use end times for all diagnosis loading (4d9e600)

v0.16.0 (2023-01-27)

Feature

  • Remove try/except to avoid debugger getting stuck on it (3884ab8)

Fix

  • Move all str operations into the if statement (91f9174)

v0.15.0 (2022-12-19)

Feature

  • Move logs next to their dataset (e0ed033)

Documentation

  • Improve quarantine docs (1b23f19)

v0.14.0 (2022-12-16)

Feature

  • Name wandb project_name-feature-generation (b601d80)

v0.13.0 (2022-12-16)

Feature

  • Improve logging in flatten_dataset (63f252f)
  • Enable minimum specifications (669e3ed)
  • Enable minimum specifications (523cfd1)
  • Log rows dropped by PredictionTimeFilterer (7e02d8e)
  • Add moves loader (0521dd0)
  • First stab at loader (f9048b8)

Fix

  • Add pred_time_uuid if not specified when filtering (acca5b9)

Performance

  • Avoid groupby in filter_prediction_times (a66e361)

v0.12.0 (2022-12-15)

Feature

  • Add rows dropped logging (33ba525)
  • Allow filtering based on quarantine dates (3deb052)
  • Improve logging - debug to file, info to stdout (aff10a9)
  • Move wandb init earlier so wandb_alerts can cover values_df loading (6c153b1)
  • Generate full feature set (9ba907a)
  • Wrap as much of main as possible in wandb exception (3b085af)
  • Allow timestamps only return from visit loaders for use as pred_times (f9534e0)
  • Migrate some loaders to logging. (f81fd92)
  • More explicit logging (7969210)
  • Init changes (f257daa)

Fix

  • Use lookbehind instead of interval days (7e14ad5)
  • Only one feature cache per project (cb0b8b0)
  • Unused input args (fa14461)
  • Wandb util was missing text kwarg (64c1729)

Performance

  • Infer CPU cores from logical cores (309e9d2)

v0.11.0 (2022-12-13)

Feature

  • Add wandb alert on exception (3ff6e37)

Documentation

v0.10.0 (2022-11-21)

Feature

  • Add n_hba1c_within_n_lookahead_days (e84b591)
  • Add outcome (cd39dd6)
  • Add birth year as a predictor (7b186d2)
  • Allow exclusion of specific atc codes (75619a1)

Fix

  • Date of birth col name should respect output prefix (6ec6535)
  • Incorrect column name when adding age as predictor (cdbf25c)
  • Errors in sql loaders after refactor (28c9f63)
  • Correct type hinting in load_diagnoses (f2d5c5b)

Documentation

  • Specify that n_rows = None returns all rows. (a4720a8)

Performance

  • Shuffle feature specs to even out compute vs. IO load (0db9f0f)
  • Tweak n_workers for more performance (3eeee4d)
  • Segment feature loading for more parallelisation (9ee5c87)
  • Rotate feature addition for debugging (76af9c7)
  • Parallelise temporal predictor loading (8d53f16)
  • Only create one subprocess per values loader (1a3e5de)
  • Parallelise groupspec combination creation (9ccba2a)

v0.9.0 (2022-11-18)

Feature

  • At groupspec init, iterate over values_loader and check that they exist in the loader registry (04dfd7e)

Fix

  • More explanation in error message (b784991)
  • Better valueerror message formatting (7b3b994)
  • Better valueerror message (d92f798)
  • Find invalid loaders (ba2d4c5)

v0.8.0 (2022-11-17)

Feature

  • Allow load_medications to concat a list of medications (d78f465)

Fix

  • Remove original functions (da59110)

Documentation

v0.7.0 (2022-11-16)

Feature

  • Full run (142212f)
  • Rename resolve_multiple registry keys to their previous one (3fd3f35)
  • Reimplement (c99585f)
  • Use lru cache decorator for values_df loading (4006818)
  • Add support for loader kwargs (127f821)
  • Move values_df resolution to AnySpec object (714e83f)
  • Make date of birth output prefix a param (0ed1198)
  • Ensure that dfs are sorted and of same length before concat (84a4d65)
  • Use pandas with set_index for concat (b93290a)
  • Use pandas with set_index for concat (995da41)
  • Speed up dask join by using index (3402281)
  • Require feature name for all features, ensures proper specification (6af454a)
  • First stab at adapting generate_main (7243130)
  • Add exclusion timestamp (b02de1a)
  • Improve dd.concat (429da34)
  • Handle strs for generate_feature_spec (7d54488)
  • Convert to dd before concat (06101d8)
  • Add n hba1c (3780d84)
  • Add n hba1c (614245e)

Fix

  • Coerce by default (60adb99)
  • Output_col_name_override applied at loading, not flattening (95a96ce)
  • Typo (01240ed)
  • Incorrect attribute addressing (a6e82b5)
  • Correctly resolve values_df (def67cd)
  • MinGroupSpec should take a sequence of names to permute over (f0c8140)
  • Typo (61c7241)
  • Remove resolve_multiple_fn_name (617d386)
  • Old concat resulted in wrong ordering of rows. (3759f71)
  • Set hba1c as eval (89fe6d2)
  • Typos (6eac440)
  • Correct col name inference for static predictors (dfe5dc7)
  • Misc. fixes (45f8348)
  • Generate the correct amount of combinations when creating specs (c472b3c)
  • Typo resulted in cache breaking (fdd47d7)
  • Correct col naming (bc74ae3)
  • Do not infer feature name from values_df (150569f)
  • Misc. errors found from tests (3a1b5db)
  • Revert flattened dataset to use specs (e4fada7)
  • Misc. errors after introducing feature specs (0308eca)
  • Correctly merge dataframes (a907885)
  • Cache error because of loss of UUID (89d7f6f)
  • New bugs in resolve_multiple (5714a39)
  • Rename outcomespec appropriately (41fa220)
  • Lookbehind_days must be iterable (cc879e9)

Documentation

Performance

  • Move pd->dd into subprocesses (dc5f38d)

v0.6.3 (2022-10-18)

Fix

  • Remove shak_code + operator check (f97aee8)

v0.6.2 (2022-10-17)

Fix

  • Ignore cat_features (2052505)
  • Failing test (f8190b4)
  • Incorrect 'latest' and handling of NaN in cache (dc33f7e)

v0.6.1 (2022-10-13)

Fix

  • Check for value column prediction_times_df (5356464)
  • Change variable name (990a848)
  • More flex loaders (bcad700)

v0.6.0 (2022-10-13)

Feature

  • Use wandb to monitor script errors (67ae9b9)

Fix

  • Duplicate loading when pre_loading dfs (7f864dc)

v0.5.2 (2022-10-12)

Fix

v0.5.1 (2022-10-10)

Fix

  • Change_per_day functions (d696389)
  • Change_per_day function (4c8c118)

v0.5.0 (2022-10-10)

Feature

  • Add variance to resolve multiple functions (8c471df)

Fix

  • Add variance resolve multiple (7a64c5b)

v0.4.4 (2022-10-10)

Fix

  • Deleted_irritating_blank_space (a4cdfc5)

v0.4.3 (2022-10-10)

Fix

  • Auto inferred cat features (ea0d946)
  • Auto inferred cat features error (f244715)
  • Resolves errors caused from auto cat features (667a905)

v0.4.2 (2022-10-06)

Fix

  • Incorrect function argument (33e0a3e)
  • Expanded test to include outcome, now passes locally (640e7ec)
  • Passing local tests (6ed4b2e)
  • First stab at bug fix (339d793)

v0.4.1 (2022-10-06)

Fix

  • Add parents to wandb dir init (5eefe3a)

v0.4.0 (2022-10-06)

Feature

Fix

  • Refactor feature spec generation (17e9f16)
  • Align arguments with colnames in SQL (09ae5f7)
  • Refactor feature specification (373b0f0)

v0.3.2 (2022-10-05)

Fix

v0.3.1 (2022-10-05)

Fix

  • Mismatched version in .toml (292979b)

v0.3.0 (2022-10-05)

Feature

Fix

  • Pass value_col only when necessary (dc1019f)
  • Pass value_col (4674e4a)
  • Don't remove NaNs, might be informative. (1ad5d81)
  • Remove parquet default argument except in top level functions (ec3a98b)
  • Align .toml and release version (80adbde)
  • Failing tests (b5e4321)
  • Incorrect feature sets path, linting (605ccb7)
  • Handle dicts for duplicate checking (34524c0)
  • Check for duplicates in feature combinations (63ad162)
  • Remove duplicate alat key which prevented file saving (f0c3e00)
  • Incorrect argument (b97d54b)
  • Linting (7406288)
  • Use suffix instead of string parsing (cfa96f0)
  • Refactor dataset loading into a separate function (bca8cbf)
  • More migration to parquet (f1bc2b7)
  • Mark hf embedding test as slow, only run if passing --runslow to pytest (0e03395)

v0.2.4 (2022-10-04)

Fix

  • Wandb not logging on overtaci. (3baab57)

v0.2.3 (2022-10-04)

Fix

  • Use dask for concatenation, increases perf (4235f5c)

v0.2.2 (2022-10-03)

Fix

  • Use pypi release of psycopmlutils (5283b05)

v0.2.1 (2022-10-03)

Fix

v0.2.0 (2022-09-30)

Feature

  • Add test for chunking logic (199ee6b)

Fix

v0.1.0 (2022-09-30)

Feature

Fix

  • Force dtype for windows (2e6e8bf)
  • Linting (5cdfcfa)
  • Pre code-split import statements need to be updated (a9e0639)
  • Misspecified python version in action (fdde2d2)

v0.0.1 (2023-03-30)

Fix