Basic Flagging system for CHC U071 System #1626

Draft. Wants to merge 10 commits into base: main
36 changes: 26 additions & 10 deletions _delphi_utils_python/delphi_utils/weekday.py
@@ -12,14 +12,15 @@ class Weekday:
"""Class to handle weekday effects."""

@staticmethod
def get_params(data, denominator_col, numerator_cols, date_col, scales, logger):
r"""Fit weekday correction for each col in numerator_cols.
def get_params(data, denominator_col, numerator_cols, date_col, scales, logger, lm=10):
"""Fit weekday correction for each col in numerator_cols.

Return a matrix of parameters: the entire vector of betas, for each time
series column in the data.

If you want to apply weekday correction to counts (and not ratios), set denominator_col to None.
"""
tmp = data.reset_index()
denoms = tmp.groupby(date_col).sum()[denominator_col]
nums = tmp.groupby(date_col).sum()[numerator_cols]

# Construct design matrix to have weekday indicator columns and then day
@@ -30,21 +31,27 @@ def get_params(data, denominator_col, numerator_cols, date_col, scales, logger):
X[np.where(nums.index.dayofweek == 6)[0], :6] = -1
X[:, 6:] = np.eye(X.shape[0])

npnums, npdenoms = np.array(nums), np.array(denoms)
npnums = np.array(nums)
params = np.zeros((nums.shape[1], X.shape[1]))

npdenoms = None
if denominator_col is not None:
denoms = tmp.groupby(date_col).sum()[denominator_col]
npdenoms = np.array(denoms)

# Loop over the available numerator columns and smooth each separately.
for i in range(nums.shape[1]):
result = Weekday._fit(X, scales, npnums[:, i], npdenoms)
result = Weekday._fit(X, scales, npnums[:, i], npdenoms, lm)
if result is None:
logger.error("Unable to calculate weekday correction")
params[i, :] = result
else:
params[i,:] = result

return params

@staticmethod
def _fit(X, scales, npnums, npdenoms):
def _fit(X, scales, npnums, npdenoms, lm):
r"""Correct a signal estimated as numerator/denominator for weekday effects.

The ordinary estimate would be numerator_t/denominator_t for each time point
@@ -78,16 +85,25 @@ def _fit(X, scales, npnums, npdenoms):

ll = (numerator * (X*b + log(denominator)) - sum(exp(X*b + log(denominator))))
/ num_days

If you want to apply weekday correction to counts (and not ratios), set npdenoms to None,
which drops the denominator terms from ll.
"""
b = cp.Variable((X.shape[1]))

lmbda = cp.Parameter(nonneg=True)
lmbda.value = 10 # Hard-coded for now, seems robust to changes
lmbda.value = lm # Penalty weight passed in by the caller (get_params defaults to 10); seems robust to changes

ll = (
cp.matmul(npnums, cp.matmul(X, b) + np.log(npdenoms)) -
cp.sum(cp.exp(cp.matmul(X, b) + np.log(npdenoms)))
) / X.shape[0]
cp.matmul(npnums, cp.matmul(X, b)) -
cp.sum(cp.exp(cp.matmul(X, b)))
) / X.shape[0]

if npdenoms is not None:
ll = (
cp.matmul(npnums, cp.matmul(X, b) + np.log(npdenoms)) -
cp.sum(cp.exp(cp.matmul(X, b) + np.log(npdenoms)))
) / X.shape[0]
# L-1 Norm of third differences, rewards smoothness
penalty = lmbda * cp.norm(cp.diff(b[6:], 3), 1) / (X.shape[0] - 2)
for scale in scales:
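Reviewer note: the objective that `_fit` builds above can be evaluated in plain NumPy to check the counts-versus-ratios branch by hand. The sketch below is not part of this PR. The function names `poisson_loglik` and `smoothness_penalty` are hypothetical, and it only evaluates the terms for a given `b`; it does not run the cvxpy optimization.

```python
import numpy as np


def poisson_loglik(X, b, nums, denoms=None):
    """Average Poisson log-likelihood, mirroring the `ll` expression in `_fit`.

    When `denoms` is None the log-denominator offset is dropped, which
    corresponds to fitting counts rather than ratios.
    """
    eta = X @ b
    if denoms is not None:
        eta = eta + np.log(denoms)
    return (nums @ eta - np.sum(np.exp(eta))) / X.shape[0]


def smoothness_penalty(b, lm, n_days):
    """L1 norm of third differences of the daily terms b[6:], scaled as in `_fit`."""
    return lm * np.sum(np.abs(np.diff(b[6:], 3))) / (n_days - 2)
```

Because the penalty acts on third differences, any daily trend that is quadratic in time incurs no penalty, so larger values of `lm` push the fitted trend toward piecewise-quadratic shapes.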
22 changes: 22 additions & 0 deletions chng_flags/.pylintrc
@@ -0,0 +1,22 @@

[MESSAGES CONTROL]

disable=logging-format-interpolation,
too-many-locals,
too-many-arguments,
# Allow pytest functions to be part of a class.
no-self-use,
# Allow pytest classes to have one test.
too-few-public-methods

[BASIC]

# Allow arbitrarily short-named variables.
variable-rgx=[a-z_][a-z0-9_]*
argument-rgx=[a-z_][a-z0-9_]*
attr-rgx=[a-z_][a-z0-9_]*

[DESIGN]

# Don't complain about pytest "unused" arguments.
ignored-argument-names=(_.*|run_as_module)
29 changes: 29 additions & 0 deletions chng_flags/Makefile
@@ -0,0 +1,29 @@
.PHONY: venv lint test clean

dir = $(shell find ./delphi_* -name __init__.py | grep -o 'delphi_[_[:alnum:]]*' | head -1)
venv:
python3.8 -m venv env

install: venv
. env/bin/activate; \
pip install wheel ; \
pip install -e ../_delphi_utils_python ;\
pip install -e .

install-ci: venv
. env/bin/activate; \
pip install wheel ; \
pip install ../_delphi_utils_python ;\
pip install .

lint:
. env/bin/activate; pylint $(dir)
. env/bin/activate; pydocstyle $(dir)

test:
. env/bin/activate ;\
(cd tests && ../env/bin/pytest --cov=$(dir) --cov-report=term-missing)

clean:
rm -rf env
rm -f params.json
62 changes: 62 additions & 0 deletions chng_flags/README.md
@@ -0,0 +1,62 @@
# Change Flagging System

## Running the System

The system is run by directly executing the Python module contained in this
directory. The safest way to do this is to create a virtual environment,
install the common DELPHI tools, and then install the module and its
dependencies. To do this, run the following code from this directory:

```
make install
```

This command will install the package in editable mode, so you can make changes that
will automatically propagate to the installed package.

All of the user-changeable parameters are stored in `params.json`. A template is
included as `params.json.template`.

Before running this project, make sure all your parameters are sensible and that the
input/output folders exist and contain the desired files. Then run the module:

```
env/bin/python -m delphi_chng_flags
```
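
The folder check mentioned above can be scripted before running the module. This is a minimal sketch, not part of the package: the helper name `missing_dirs` is hypothetical, and it assumes the folder paths are stored as top-level string values in `params.json`, which may not match the actual template layout.

```python
import json
from pathlib import Path


def missing_dirs(params_path, keys):
    """Return the paths stored under `keys` that are not existing directories."""
    params = json.loads(Path(params_path).read_text())
    return [params[k] for k in keys if not Path(params[k]).is_dir()]
```

An empty list means every listed folder exists. Pass whichever key names your own `params.json` uses for its input and output folders.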

If you want to enter the virtual environment in your shell,
you can run `source env/bin/activate`. Run `deactivate` to leave the virtual environment.

Once you are finished, you can remove the virtual environment and
params file with the following:

```
make clean
```

## Testing the code

To run static tests of the code style, run the following command:

```
make lint
```

Unit tests are also included in the module. To execute these, run the following
command from this directory:

```
make test
```

To run individual tests, run the following:

```
(cd tests && ../env/bin/pytest <your_test>.py --cov=delphi_chng_flags --cov-report=term-missing)
```

The output will show the number of unit tests that passed and failed, along
with the percentage of code covered by the tests.

None of the linting or unit tests should fail. The number of code lines not covered by unit tests should be small,
and those lines should not include critical sub-routines.
39 changes: 39 additions & 0 deletions chng_flags/REVIEW.md
@@ -0,0 +1,39 @@
## Code Review (Python)

A code review of this module should include a careful look at the code and the
output. To assist in the process, but certainly not in place of it, please
check the following items.

**Documentation**

- [x] the README.md file template is filled out and currently accurate; it is
possible to load and test the code using only the instructions given
- [x] minimal docstrings (one line describing what the function does) are
included for all functions; full docstrings describing the inputs and expected
outputs should be given for non-trivial functions

**Structure**

- [x] code should use 4 spaces for indentation; other style decisions are
flexible, but be consistent within a module
- [TODO] any required metadata files are checked into the repository and placed
within the directory `static`
- [TODO] any intermediate files that are created and stored by the module should
be placed in the directory `cache`
- [x] final expected output files to be uploaded to the API are placed in the
`receiving` directory; output files should not be committed to the repository
- [x] all options and API keys are passed through the file `params.json`
- [x] template parameter file (`params.json.template`) is checked into the
code; no personal (i.e., usernames) or private (i.e., API keys) information is
included in this template file

**Testing**

- [x] module can be installed in a new virtual environment
- [x] pylint with the default `.pylintrc` settings run over the module produces
minimal warnings; warnings that do exist have been confirmed as false positives
- [TODO] reasonably high level of unit test coverage covering all of the main logic
of the code (e.g., missing coverage for raised errors that do not currently seem
possible to reach are okay; missing coverage for options that will be needed are
not)
- [TODO] all unit tests run without errors
1 change: 1 addition & 0 deletions chng_flags/cache/.gitignore
@@ -0,0 +1 @@
*.csv
14 changes: 14 additions & 0 deletions chng_flags/delphi_chng_flags/__init__.py
@@ -0,0 +1,14 @@
# -*- coding: utf-8 -*-
"""Module to pull and clean from CHC source flagging.

This file defines the functions that are made public by the module. As the
module is intended to be executed through the main method, these are primarily
for testing.
"""

from __future__ import absolute_import

from . import pull
from . import run

__version__ = "0.0.0"
12 changes: 12 additions & 0 deletions chng_flags/delphi_chng_flags/__main__.py
@@ -0,0 +1,12 @@
# -*- coding: utf-8 -*-
"""Call the function run_module when executed.

This file indicates that calling the module (`python -m MODULE_NAME`) will
call the function `run_module` found within the run.py file. There should be
no need to change this template.
"""

from delphi_utils import read_params
from .run import run_module # pragma: no cover

run_module(read_params()) # pragma: no cover