Merge pull request #35 from datakind/develop
Prepare v0.1 release for publishing
bdewilde authored Dec 8, 2024
2 parents c2599a0 + 9169ba3 commit 1956432
Showing 91 changed files with 15,965 additions and 202 deletions.
6 changes: 6 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,6 @@
# ref: https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners

pdp/ @bdewilde @kaylawilding @vishpillai123
zogotech/ @anzhely
modeling/ @bdewilde
# todo: target bias code => who ??
14 changes: 9 additions & 5 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,7 +1,11 @@
-Fixes #
+<!--- Provide a brief description of your changes in the title above. -->

-## Proposed Changes
+## changes
+<!--- Describe your changes in detail, to guide reviewers through the git diff. -->

--
--
--
+## context
+<!--- Why are these change required? What problem does it solve? -->
+<!--- If this fixes an open issue / is ticketed, put the link(s) here! -->
+
+## questions
+<!--- Ask any specific questions that you'd like reviewers to address. -->
23 changes: 23 additions & 0 deletions .github/actions/setup-python-env/action.yml
@@ -0,0 +1,23 @@
name: "Set Up Python Environment"

inputs:
  python-version:
    description: "Python version to use"
    required: true

runs:
  using: composite
  steps:
    - name: Install python
      uses: actions/setup-python@v5
      with:
        python-version: ${{ inputs.python-version }}
    - name: Install uv
      uses: astral-sh/setup-uv@v3
      with:
        version: "0.5.4"
        enable-cache: true
        cache-dependency-glob: "uv.lock"
    - name: Install project and dependencies
      run: uv sync --frozen --dev
      shell: bash
52 changes: 0 additions & 52 deletions .github/workflows/ci.yml

This file was deleted.

26 changes: 26 additions & 0 deletions .github/workflows/lint.yml
@@ -0,0 +1,26 @@
name: lint

on:
  pull_request: # any pull request

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4
      - name: Set up Python environment
        uses: ./.github/actions/setup-python-env
        with:
          python-version: "3.10"
      - name: Get changed files
        id: changed-files
        uses: tj-actions/changed-files@v45
        with:
          files: |
            **.py
            **.ipynb
      - name: Check style
        if: steps.changed-files.outputs.any_changed == 'true'
        run: |
          uv tool run ruff check ${{ steps.changed-files.outputs.all_changed_files }}
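
The gating in this workflow (lint only the changed Python and notebook files, and skip the step when there are none) can be sketched outside of GitHub Actions. This is a minimal illustration, not the workflow's actual implementation. It assumes the `**.py` and `**.ipynb` patterns are meant to match those extensions at any depth, and the names `select_lintable` and `build_ruff_command` are hypothetical.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Extensions the workflow's `files` filter appears to target (assumption).
LINTABLE_SUFFIXES = {".py", ".ipynb"}


def select_lintable(changed_files: list[str]) -> list[str]:
    """Keep only the changed paths whose extension is .py or .ipynb."""
    return [
        path
        for path in changed_files
        if PurePosixPath(path).suffix in LINTABLE_SUFFIXES
    ]


def build_ruff_command(changed_files: list[str]) -> list[str] | None:
    """Return the ruff invocation for the changed files, or None to skip.

    Returning None mirrors the `any_changed == 'true'` condition: with no
    matching files, the style check is skipped instead of being run with
    an empty file list.
    """
    lintable = select_lintable(changed_files)
    if not lintable:
        return None
    return ["uv", "tool", "run", "ruff", "check", *lintable]


if __name__ == "__main__":
    print(build_ruff_command(["README.md", "src/pkg/mod.py"]))
```

Passing the file list as separate arguments, rather than one interpolated string, is a deliberate choice in this sketch: it avoids surprises with paths that contain spaces.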
24 changes: 24 additions & 0 deletions .github/workflows/pre-release.yml
@@ -0,0 +1,24 @@
name: pre-release

on:
  # pull request targeting main branch
  pull_request:
    branches: [main]

jobs:
  check-changelog:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: read
    steps:
      - name: Get changed files
        id: changed-files
        uses: tj-actions/changed-files@v45
        with:
          files: |
            CHANGELOG.md
      - name: Ensure changelog updated
        if: steps.changed-files.outputs.any_changed == 'false'
        run: |
          echo "CHANGELOG.md file must be updated with release notes"
          exit 1
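
The changelog gate above reduces to a membership test on the pull request's changed files. The sketch below restates that logic in Python for clarity. It is illustrative only, and the function names `changelog_updated` and `check_release_pr` are hypothetical rather than part of the repository.

```python
from __future__ import annotations

from typing import Iterable


def changelog_updated(changed_files: Iterable[str], changelog: str = "CHANGELOG.md") -> bool:
    """True if the changelog is among the files changed by the pull request."""
    return changelog in set(changed_files)


def check_release_pr(changed_files: Iterable[str]) -> int:
    """Return a process-style exit code: 0 if the gate passes, 1 otherwise."""
    if changelog_updated(changed_files):
        return 0
    # Same message the workflow echoes before `exit 1`.
    print("CHANGELOG.md file must be updated with release notes")
    return 1


if __name__ == "__main__":
    raise SystemExit(check_release_pr(["src/pkg/mod.py", "CHANGELOG.md"]))
```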
34 changes: 34 additions & 0 deletions .github/workflows/publish.yml
@@ -0,0 +1,34 @@
name: publish

on:
  release:
    types: [published]

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4
      - name: Set up Python environment
        uses: ./.github/actions/setup-python-env
        with:
          python-version: "3.10"
      - name: Build package
        run: |
          uv build --python "3.10"
      - name: Publish package to TestPyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          repository-url: https://test.pypi.org/legacy/
          user: __token__
          password: ${{ secrets.TEST_PYPI_API_TOKEN }}
          verify-metadata: true
          verbose: true
      - name: Publish package to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          user: __token__
          password: ${{ secrets.PYPI_API_TOKEN }}
          verify-metadata: true
          verbose: true
24 changes: 24 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,24 @@
name: test

on:
  pull_request: # any pull request
  schedule: # run weekly
    - cron: "0 12 * * 0"

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.10", "3.11"]
    steps:
      - name: Check out repository
        uses: actions/checkout@v4
      - name: Set up Python environment
        uses: ./.github/actions/setup-python-env
        with:
          python-version: ${{ matrix.python-version }}
      - name: Run tests
        run: |
          uv run python -m pytest
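
The cron expression `"0 12 * * 0"` reads as minute 0, hour 12, any day of month, any month, day-of-week 0 (Sunday). GitHub evaluates schedules in UTC, so the weekly run is due on Sundays at 12:00 UTC. As a rough sketch of that reading, the hypothetical helper below computes the next due time from a timezone-aware timestamp. It is not something the workflow uses, and actual scheduled runs may start later than the nominal time.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_run(now: datetime) -> datetime:
    """Return the first Sunday 12:00 UTC strictly after `now`.

    `now` is expected to be timezone-aware; it is converted to UTC first.
    """
    now_utc = now.astimezone(timezone.utc)
    # datetime.weekday(): Monday == 0 ... Sunday == 6
    days_until_sunday = (6 - now_utc.weekday()) % 7
    candidate = (now_utc + timedelta(days=days_until_sunday)).replace(
        hour=12, minute=0, second=0, microsecond=0
    )
    if candidate <= now_utc:
        # Already at or past this Sunday's slot: move to next week.
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    print(next_weekly_run(datetime.now(timezone.utc)).isoformat())
```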
25 changes: 25 additions & 0 deletions .github/workflows/type-check.yml
@@ -0,0 +1,25 @@
name: type-check

on:
  pull_request: # any pull request

jobs:
  type-check:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4
      - name: Set up Python environment
        uses: ./.github/actions/setup-python-env
        with:
          python-version: "3.10"
      - name: Get changed files
        id: changed-files
        uses: tj-actions/changed-files@v45
        with:
          files: |
            src/**/*.py
      - name: Check types
        if: steps.changed-files.outputs.any_changed == 'true'
        run: |
          uv run python -m mypy --install-types --non-interactive ${{ steps.changed-files.outputs.all_changed_files }}
23 changes: 3 additions & 20 deletions .gitignore
@@ -65,9 +65,6 @@ db.sqlite3-journal
 instance/
 .webassets-cache

-# Scrapy stuff:
-.scrapy
-
 # Sphinx documentation
 docs/_build/

@@ -84,23 +81,13 @@ ipython_config.py
 # pyenv
 .python-version

-# pipenv
-# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
-# However, in case of collaboration, if having platform-specific dependencies or dependencies
-# having no cross-platform support, pipenv may install dependencies that don't work, or not
-# install all needed dependencies.
-#Pipfile.lock
-
 # PEP 582; used by e.g. github.com/David-OConnor/pyflow
 __pypackages__/

 # Celery stuff
 celerybeat-schedule
 celerybeat.pid

-# SageMath parsed files
-*.sage.py
-
 # Environments
 .env
 .venv
@@ -110,13 +97,6 @@ ENV/
 env.bak/
 venv.bak/

-# Spyder project settings
-.spyderproject
-.spyproject
-
-# Rope project settings
-.ropeproject
-
 # mkdocs documentation
 /site

@@ -125,5 +105,8 @@ venv.bak/
 .dmypy.json
 dmypy.json

+# ruff
+.ruff_cache/
+
 # Pyre type checker
 .pyre/
2 changes: 0 additions & 2 deletions .isort.cfg

This file was deleted.

38 changes: 0 additions & 38 deletions .pre-commit-config.yaml

This file was deleted.

13 changes: 13 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,13 @@
# CHANGELOG

## 0.1.0 (2024-11)

- Ported school-agnostic code from private repo, with some refactoring of structure and modest code quality improvements (PR #1 #2 #3 #6 #10)
- Set up Python packaging with `uv` and updated CI workflows (PR #5 #8 #13 #17 #29 #30 #32)
- Extended and improved featurization functionality, including better course grade handling, term- and year-level features, "term diff" features over time (PR: #4 #7 #11 #12 #15 #20 #21 #22 #23)
- Extended and improved target variable functionality, including a new "failure to retain" target and higher-level `make_labeled_dataset()` entry points for each target for developer convenience (PR #24 #33)
- Refactored and better generalized PDP raw data schemas (PR #19 #28)
- Added functionality for generating synthetic PDP and "sample platform" data (PR #9)
- Added generalized "pairwise association" function for comparing variables of various data types (PR #31)
- Added template notebooks for the data assessment/EDA and modeling dataset prep steps of the SST process (PR #26)
- Various minor bugfixes
21 changes: 19 additions & 2 deletions README.md
@@ -1,7 +1,7 @@
 # DataKind's Student Success Tool (SST)
 Customized and easily actionable insights for data-assisted advising, at no cost

-Data-assisted advising helps advisors use their limited time to more efficiently identify and reach out to those most in need of help.
+Data-assisted advising helps advisors use their limited time to more efficiently identify and reach out to those most in need of help.
 Using the Student Success Tool to implement data-assisted advising, John Jay College has reported a 32% increase in senior graduation rates in two years via their CUSP program.
 Based on the success of this implementation, DataKind is supported by Google.org to develop this solution with additional postsecondary institutions, at no institutional cost.
 This repo is where the google.org fellows team will collaborate with DataKind to develop and ultimately share the open source components of the tool.
@@ -34,5 +34,22 @@ Current PDP pipeline code: to be built into an actual installable python package

 ## Contributing

-Please read the [CONTRIBUTING](CONTRIBUTING.md) to learn how to contribute to the tool development.
+Please read the [CONTRIBUTING](CONTRIBUTING.md) to learn how to contribute to the tool development.

+
+## Setup
+
+### local machine
+
+1. Install `uv` (instructions [here](https://docs.astral.sh/uv/getting-started/installation)).
+1. Install Python (instructions [here](https://docs.astral.sh/uv/guides/install-python)). When running on Databricks, we're constrained to PY3.10: `uv python install 3.10`
+1. Install this package: `uv pip install -e .`
+
+### databricks notebook
+
+1. Connect notebook to a cluster running Databricks Runtime [14.3 LTS](https://docs.databricks.com/en/release-notes/runtime/14.3lts.html) or [15.4 LTS](https://docs.databricks.com/en/release-notes/runtime/15.4lts.html).
+1. Run the `%pip` magic command, pointing it at one of three places:
+    - a local workspace directory: `%pip install ../../../student-success-tool/`
+    - a GitHub repo (for a specific branch): `%pip install git+https://github.com/datakind/student-success-tool.git@develop`
+    - public PyPI: `%pip install student-success-tool` (NOTE: THIS DOESN'T WORK YET)
+1. Restart Python, per usual: `dbutils.library.restartPython()`
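
After following either set of setup steps, a quick sanity check can confirm that the interpreter matches the Python 3.10 constraint and that the package is visible to it. The sketch below is a suggestion, not part of the README. It assumes the installed distribution is named `student-success-tool` (taken from the `%pip install` line above), and the helper names `python_matches` and `installed_version` are hypothetical.

```python
from __future__ import annotations

import sys
from importlib import metadata


def python_matches(
    required: tuple[int, int] = (3, 10),
    actual: tuple[int, int] | None = None,
) -> bool:
    """True if the (major, minor) interpreter version equals `required`."""
    if actual is None:
        actual = sys.version_info[:2]
    return tuple(actual) == tuple(required)


def installed_version(dist_name: str = "student-success-tool") -> str | None:
    """Return the installed version of `dist_name`, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python 3.10:", python_matches())
    print("student-success-tool version:", installed_version())
```

In a Databricks notebook, this would be run after `dbutils.library.restartPython()`, so that the check sees the freshly installed package.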