Commit

Version for submission to PLOS CB
chrisbanks committed Oct 12, 2024
1 parent ddc82a3 commit 194086f
Showing 11 changed files with 9,567 additions and 0 deletions.
1,058 changes: 1,058 additions & 0 deletions Data_Curation.ipynb

Large diffs are not rendered by default.

425 changes: 425 additions & 0 deletions Data_Curation_UKFarmcare.ipynb

Large diffs are not rendered by default.

68 changes: 68 additions & 0 deletions Data_Curation_VetOnly.ipynb
@@ -0,0 +1,68 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "4fafef82-3964-415b-9979-7c96f76c73ad",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "53aa7adc-fc8f-4a34-9b00-80b78c5cda6b",
"metadata": {},
"outputs": [],
"source": [
"# Load dataset\n",
"data = pd.read_csv('/Data/TB_Diagnostics/inputVars.csv', low_memory=False)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "81e05216-f5d8-4c13-8997-53d5f7997ee0",
"metadata": {},
"outputs": [],
"source": [
"# Remove rows with no vet practice information\n",
"data_vet_only = data.dropna(subset=['vetPractice'])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1a2387f4-2943-4864-a5a5-ca4e6e87f9e4",
"metadata": {},
"outputs": [],
"source": [
"# Output\n",
"data_vet_only.to_csv('/Data/TB_Diagnostics/inputVars_VetOnly.csv', index=False)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
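The filtering step in the notebook above can be tried on a toy frame. This is a minimal sketch, not the authors' data: only the column name `vetPractice` comes from the notebook, while `herdId` and all values are made up for illustration.

```python
import pandas as pd

# Toy stand-in for inputVars.csv. Only 'vetPractice' comes from the notebook;
# 'herdId' and every value here are hypothetical.
data = pd.DataFrame({
    "herdId": [1, 2, 3, 4],
    "vetPractice": ["A", None, "B", float("nan")],
})

# Keep only rows where 'vetPractice' is present.
# Both None and NaN count as missing for dropna.
data_vet_only = data.dropna(subset=["vetPractice"])

n_dropped = len(data) - len(data_vet_only)
print(f"kept {len(data_vet_only)} rows, dropped {n_dropped}")
```

Note that `dropna` only removes values pandas treats as missing. With default settings, `read_csv` parses empty CSV fields as NaN, so they would be dropped, but a placeholder string such as `"unknown"` would survive the filter.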
13 changes: 13 additions & 0 deletions README
@@ -0,0 +1,13 @@
# bTB-diagnostics

This project trains a Histogram Boosted Regression Tree model on data from Bovine Tuberculosis (bTB) testing and cattle herd metadata to predict the risk of bTB outbreak.

This can be used to improve the herd-level sensitivity or specificity of the diagnostic test and also to analyse the risk factors involved in predicting bTB outbreaks.
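As a rough illustration of the sensitivity/specificity trade-off mentioned above, the sketch below flags herds whose predicted risk meets a threshold and reports both metrics. It is an assumption about how a risk score could be used, not the method in the notebooks; the function name, labels, and risk values are all hypothetical.

```python
def sensitivity_specificity(y_true, risk, threshold):
    """Return (sensitivity, specificity) when herds with risk >= threshold are flagged."""
    pairs = list(zip(y_true, risk))
    tp = sum(1 for y, r in pairs if y == 1 and r >= threshold)  # outbreaks flagged
    fn = sum(1 for y, r in pairs if y == 1 and r < threshold)   # outbreaks missed
    tn = sum(1 for y, r in pairs if y == 0 and r < threshold)   # clear herds passed
    fp = sum(1 for y, r in pairs if y == 0 and r >= threshold)  # clear herds flagged
    return tp / (tp + fn), tn / (tn + fp)

# Made-up labels (1 = outbreak) and made-up predicted risks.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
risk = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.8]

low = sensitivity_specificity(y_true, risk, 0.35)   # lenient threshold
high = sensitivity_specificity(y_true, risk, 0.65)  # strict threshold
print(low, high)
```

On this toy data the lenient threshold catches every outbreak at the cost of one false alarm, while the strict threshold removes the false alarm but misses one outbreak.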

The project consists of a number of Jupyter Notebooks:
(i) Data_Curation* -- processes the various input data into a matrix for model training.
(ii) bTB-Diagnostic_2020_v4_crossVal+tuning* -- code that trains the various models.
(iii) bTB-Diagnostic_2020_final_model* -- code that performs various analyses on the models.
(iv) Vet_Data_Analysis -- code that performs some extra analysis on the veterinary data.

Further details can be found in the preprint (paper in submission for peer review) at: https://arxiv.org/abs/2404.03678