squareleaf/bee-colony-loss-capstone


Summary:

Initial hypothesis: Causes of bee hive die-off differ enough between states that disease prevention tactics should be determined by region or sub-region.

Conclusion: Linear ridge regression and random forest regression, each fit with multiple combinations of independent variables as predictors, indicate that region and sub-region data do not have significant predictive power in determining causes of colony loss. More robust data collection on colony stressors is required for future predictive models.
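The kind of comparison behind this conclusion can be sketched as follows: fit the same model with and without region indicators and compare held-out R². This is a minimal illustration, not the code from the notebooks. The data is synthetic, and the region names, the single `stressor` column, and the helper functions are hypothetical. Only ridge regression is shown, written in closed form with the standard library so the block runs without extra packages; a library such as scikit-learn would be the more usual choice, and the random forest comparison is not reproduced here.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge: solve (Xc^T Xc + alpha I) w = Xc^T yc on centered data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    A = [[sum(r[i] * r[j] for r in Xc) + (alpha if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(p)]
    w = solve(A, b)
    intercept = y_mean - sum(wj * mj for wj, mj in zip(w, x_mean))
    return w, intercept


def r2_score(X, y, w, intercept):
    """Coefficient of determination of the fitted model on (X, y)."""
    preds = [intercept + sum(wj * xj for wj, xj in zip(w, row)) for row in X]
    y_mean = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, preds))
    ss_tot = sum((t - y_mean) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


# ASSUMPTION: synthetic data in which colony loss depends on a stressor
# measurement but not on region. None of these values come from the USDA data.
REGIONS = ["Northeast", "Midwest", "South", "West"]  # hypothetical labels
rng = random.Random(0)
rows = []
for _ in range(400):
    region = rng.choice(REGIONS)
    stressor = rng.uniform(0.0, 40.0)
    loss = 10.0 + 0.5 * stressor + rng.gauss(0.0, 2.0)
    rows.append((region, stressor, loss))


def features(row, use_region):
    """Build a feature vector: the stressor, plus optional one-hot region columns."""
    region, stressor, _ = row
    one_hot = [1.0 if region == name else 0.0 for name in REGIONS]
    return [stressor] + (one_hot if use_region else [])


# Rows were generated in random order, so a simple positional split is enough.
train, test = rows[:300], rows[300:]
scores = {}
for use_region in (False, True):
    X_train = [features(r, use_region) for r in train]
    X_test = [features(r, use_region) for r in test]
    w, intercept = ridge_fit(X_train, [r[2] for r in train], alpha=1.0)
    scores[use_region] = r2_score(X_test, [r[2] for r in test], w, intercept)

r2_stressor_only = scores[False]
r2_with_region = scores[True]
print(f"held-out R^2, stressor only: {r2_stressor_only:.3f}")
print(f"held-out R^2, stressor + region: {r2_with_region:.3f}")
```

If the two held-out scores are close, the region indicators add little beyond the other predictors, which is the pattern the conclusion describes. In the synthetic data this holds by construction, so the sketch shows only how the comparison is made, not what the real data contains.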

Data obtained from:

USDA data collected by Cornell University

Partially cleaned USDA data on Kaggle

Files:

Initial data: /raw_data

Project proposal: Unit 7 - Capstone Data project proposal (single project).pdf

Data wrangling and cleaning: Bee Colony Capstone - data cleaning.ipynb

Exploratory data analysis: Bee Colony Capstone - Exploratory Data Analysis.ipynb

Pre-processing: Bee Colony Capstone - Preprocessing.ipynb

Initial attempts at time series analysis: Bee Colony Capstone - first pass at modeling.ipynb

Regression models: Bee Colony Capstone - regression models.ipynb

Slide deck presentation: Capstone_presentation_bee_colony_data.pdf

Final report: Final report - bee colony capstone.pdf

The remaining files are CSV files of processed data saved between stages of analysis.

About

Capstone project for Springboard Data Science program
