Estimating Time from Referral to Procurement using the Organ Retrieval and Collection of Health Information for Donation dataset from physionet.org. This project was built as part of the DataTalks.Club MLOps Zoomcamp 2024 course.
The goal of this project is to predict the time interval between hospital referral and organ procurement using a machine learning model. This helps healthcare professionals estimate the procurement timeline, potentially improving the efficiency and planning of organ transplants.
We selected the following features to train our model; an in-depth description is available in the project's HTML file and on physionet.org:
- Age: The age of the patient. (Numerical)
- Gender: The gender of the patient. (Categorical)
- Race: The race of the patient. (Categorical)
- HeightIn: The height of the patient in inches. (Numerical)
- WeightKg: The weight of the patient in kilograms. (Numerical)
- blood_type: A combination of the ABO Blood Type and the Rh factor (positive or negative). (Categorical)
- brain_death: A boolean indicating if brain death has occurred. (Categorical)
The target variable, time_to_procurement, is calculated as the difference between time_procured and time_referred, converted into hours. We used a Random Forest regression model for this task, chosen for its ability to handle complex, non-linear relationships and interactions among features. The model's performance was evaluated using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE), and feature importance was visualized to understand the contribution of each feature to the predictions.
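The modeling step can be pictured with a minimal sketch along these lines (assuming referrals.csv has been downloaded as described below and contains the time_referred and time_procured timestamps; the split and Random Forest hyperparameters shown here are illustrative, not necessarily the project's exact settings):

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Load the referrals table downloaded from physionet.org.
df = pd.read_csv("referrals.csv", parse_dates=["time_referred", "time_procured"])

# Target: hours between hospital referral and organ procurement.
df["time_to_procurement"] = (
    df["time_procured"] - df["time_referred"]
).dt.total_seconds() / 3600

# One-hot encode the categorical features; numerical features pass through as-is.
features = pd.get_dummies(
    df[["Age", "HeightIn", "WeightKg", "brain_death", "Gender", "Race", "blood_type"]],
    columns=["Gender", "Race", "blood_type"],
    drop_first=True,
)
target = df["time_to_procurement"]

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

preds = model.predict(X_test)
print("MAE: ", mean_absolute_error(y_test, preds))
print("RMSE:", np.sqrt(mean_squared_error(y_test, preds)))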
These are the steps to follow to set up credential access or cookies so the project runs properly. The project has two ways of loading data: the first uses a Cookie copied from a physionet.org download and set in the docker compose file, the second uses your PhysioNet credentials in the docker compose file. In both cases you need to sign up and generate credentials. Below I explain how to do it:
- go to physionet.org and click on "Account" in the upper-right corner of the screen.
- fill in the registration form with a valid email address.
- check your email inbox (or the junk/spam folder) for an email from [email protected]. Click on the link to activate your account.
- you will be asked to set a password. After you set it, click on the "Activate" button.
- sign in with your credentials and go to this page: [physionet.org](https://doi.org/10.13026/b1c0-3506)
- go to the bottom of the page and click on the link that says "sign the data use agreement for the project".
- click on "Agree".
- click on the link to go to the dataset, or go to physionet.org directly.
- go to the bottom of the page to the files section, where the dataset files (including referrals.csv) are listed.
The next steps depend on which method you use to download the data. There are two main methods: the first is collecting a Cookie from a download on the physionet website (for only the data used); the second is simply updating the credentials in the docker compose file in the root directory. The following steps are needed to run the pipelines and download the data successfully:
- in order to download the data and get the Cookie, press F12 (or right-click and choose "Inspect"), go to the "Network" panel, and download "referrals.csv".
- once the csv file is downloaded, right-click on the GET request that appears in the Network panel from the previous step, then click on "Copy as cURL".
- use any cURL client of your preference, such as Postman. I suggest converting the cURL command into a Python script (see the sketch after this list).
- copy the Cookie value and replace the old cookie in the docker compose environment variables.
- alternatively, open docker-compose.yml and update PHYSIONET_USERNAME and PHYSIONET_PASSWORD with your credentials. Do not include any additional characters.
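For the Cookie method, the copied cURL can be turned into a small Python script along these lines (a minimal sketch, not the project's actual loader: the referrals.csv URL and the PHYSIONET_COOKIE environment variable name are placeholders you fill in from the cURL copied in the previous steps):

import os

import requests

# Placeholders: take the URL and the Cookie header value from the cURL
# you copied in the browser's Network panel.
REFERRALS_URL = "https://physionet.org/..."  # URL of referrals.csv from the copied cURL
COOKIE = os.environ["PHYSIONET_COOKIE"]      # hypothetical env var, e.g. set in the docker compose env

response = requests.get(REFERRALS_URL, headers={"Cookie": COOKIE}, timeout=60)
response.raise_for_status()

with open("referrals.csv", "wb") as f:
    f.write(response.content)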
This project can also be deployed to the cloud using docker compose up with the docker-compose.yml file in the root directory. To run it locally you need Docker running on your machine; then run the following command:
docker-compose up --build
After a few minutes, port 4200 will be in use by the Prefect server, port 5000 by the MLflow server, ports 5232 and 8080 by PostgreSQL and Adminer respectively, and port 3000 by Grafana. Finally, port 9696 will be listening for prediction requests. An example curl call is below:
curl --location 'http://localhost:9696/predict' \
--header 'Content-Type: application/json' \
--data '{
"Age": 60,
"HeightIn": 68,
"WeightKg": 70,
"brain_death": 0,
"Gender_M": 1,
"Race_Hispanic": 0,
"Race_Other / Unknown": 0,
"Race_White / Caucasian": 1,
"blood_type_A-Negative ": 0,
"blood_type_A-Positive": 0,
"blood_type_A-Positive ": 0,
"blood_type_A1-Negative": 0,
"blood_type_A1-Negative ": 0,
"blood_type_A1-Positive": 0,
"blood_type_A1-Positive ": 0,
"blood_type_A1B-Negative": 0,
"blood_type_A1B-Negative ": 0,
"blood_type_A1B-Positive": 0,
"blood_type_A1B-Positive ": 0,
"blood_type_A2-Negative": 0,
"blood_type_A2-Negative ": 0,
"blood_type_A2-Positive": 0,
"blood_type_A2-Positive ": 0,
"blood_type_A2B-Negative": 0,
"blood_type_A2B-Positive": 1,
"blood_type_A2B-Positive ": 0,
"blood_type_AB-Negative": 0,
"blood_type_AB-Negative ": 0,
"blood_type_AB-Positive": 0,
"blood_type_AB-Positive ": 0,
"blood_type_B-Negative": 0,
"blood_type_B-Negative ": 0,
"blood_type_B-Positive": 0,
"blood_type_B-Positive ": 0,
"blood_type_O-Negative": 0,
"blood_type_O-Negative ": 0,
"blood_type_O-Positive": 0,
"blood_type_O-Positive ": 0
}'
With this, you will get the predicted time to procurement (time_to_procurement, in hours as defined above) for the given donor information.
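The same request can also be sent from Python; a small usage sketch (the payload is the one shown in the curl example above, abbreviated here, so fill in the remaining blood_type_* keys with 0 exactly as in that example):

import requests

# Same donor example as the curl call above, abbreviated.
payload = {
    "Age": 60,
    "HeightIn": 68,
    "WeightKg": 70,
    "brain_death": 0,
    "Gender_M": 1,
    "Race_Hispanic": 0,
    "Race_Other / Unknown": 0,
    "Race_White / Caucasian": 1,
    "blood_type_A2B-Positive": 1,
    # ... remaining blood_type_* columns set to 0, exactly as in the curl example
}

response = requests.post("http://localhost:9696/predict", json=payload, timeout=10)
response.raise_for_status()
print(response.json())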
In case you want to check formatting or linting, you can use the following commands:
pylint src/
black src/
isort src/
The code passes pylint with a score of 10/10, and black and isort do not suggest any changes.