Capstone2_Harvard_edX

In partial fulfillment of the requirements for the Harvard edX: Data Science Professional Certificate, this repository contains the following files:

R Markdown: Capstone_Two_Report_Haslam_2019_03_12.Rmd
R Script: Capstone_Two_Script.r
Report in PDF: Capstone_Two_Report_Haslam_2019_03_12.pdf
Extra: Capstone_Two_Report_Haslam_2019_03_12.html

It also contains the original Breast Cancer Wisconsin (Diagnostic) data set (WDBC), available from the UCI Machine Learning Repository, Center for Machine Learning and Intelligent Systems, University of California, Irvine.

The script (and RMD) import the data set from the UCI source, so there is no need to download it first.
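For readers who want to see what such an import might look like, here is a minimal sketch in base R. It is not taken from the script. The helper name read_wdbc and the column names are assumptions chosen for illustration, following the attribute order described in the data set's documentation (an ID, a diagnosis, then ten measurements each reported as mean, standard error, and worst value).

```r
# Hypothetical helper, not the author's code: read a WDBC-format file.
# The file has no header row and 32 comma-separated columns.
read_wdbc <- function(source) {
  # Ten measured features; names here are illustrative assumptions.
  features <- c("radius", "texture", "perimeter", "area", "smoothness",
                "compactness", "concavity", "concave_points", "symmetry",
                "fractal_dimension")
  col_names <- c("id", "diagnosis",
                 paste0(features, "_mean"),
                 paste0(features, "_se"),
                 paste0(features, "_worst"))
  wdbc <- read.csv(source, header = FALSE, col.names = col_names,
                   stringsAsFactors = FALSE)
  # Diagnosis is coded B (benign) or M (malignant).
  wdbc$diagnosis <- factor(wdbc$diagnosis, levels = c("B", "M"))
  wdbc
}
```

The source argument can be a local path or a URL. For example, read_wdbc("https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data") should work if the file is still served at that address, which is an assumption here, since this README does not state the exact URL the script uses.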

Please note:
The RMD will take a minimum of 40 minutes, and more likely over an hour, to run. It also requires that the user has installed a number of ML packages for R, consistent with those used for ensemble modelling in the Harvard edX course on Machine Learning.

The script largely runs silently (the output is captured). Any warnings or error messages may be safely ignored. Not every model works perfectly under every testing condition or variation, which is the point of testing the various models under similar controlled conditions.
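As an illustration of how a model run can be kept quiet while still tolerating failures, here is a minimal base R sketch. It is an assumption about one reasonable approach, not a description of how the script itself does it. The function name run_quietly is hypothetical.

```r
# Hypothetical wrapper, not the author's code: evaluate an expression
# while capturing printed output and suppressing warnings and messages.
# If the expression raises an error, the result is NULL.
run_quietly <- function(expr) {
  result <- NULL
  captured <- utils::capture.output(
    result <- tryCatch(
      suppressWarnings(suppressMessages(expr)),
      error = function(e) NULL
    )
  )
  list(result = result, output = captured)
}
```

A call such as run_quietly(glm(am ~ wt, data = mtcars, family = binomial)) returns the fitted model in the result element. A model that fails on a given condition yields NULL there, so a loop over many models can continue.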

Thank you,
Thom J. Haslam
March 12, 2019

Update: 2019-03-14

I thank the Harvard edX peer and staff reviewers for their encouraging and helpful comments. One suggestion was to change the loading procedure in the RMD from

library(tidyverse)
library(caret) # etc

to

if(!require(tidyverse)) install.packages("tidyverse", repos = "http://cran.us.r-project.org")
if(!require(caret)) install.packages("caret", repos = "http://cran.us.r-project.org") # etc

This change ensures that any missing packages are installed from CRAN, so that the RMD runs without terminating in an error. (Please see Packages_Required_Set_up.R.)
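The same idea can be applied to a whole vector of package names at once. The sketch below is illustrative only and is not the content of Packages_Required_Set_up.R. The function names missing_packages and ensure_packages are hypothetical. It also loads each package after installing it, since require() on a package that was missing returns FALSE and the subsequent install does not attach it.

```r
# Hypothetical helpers, not the author's code.

# Return the subset of pkgs that is not currently installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install whatever is missing from CRAN, then attach every package.
# Returns (invisibly) a named logical vector: TRUE where loading succeeded.
ensure_packages <- function(pkgs, repos = "http://cran.us.r-project.org") {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install, repos = repos)
  }
  invisible(vapply(pkgs, require, logical(1),
                   character.only = TRUE, quietly = TRUE))
}
```

For example, ensure_packages(c("tidyverse", "caret")) would cover the two packages shown above. The full list of packages the RMD needs is not reproduced here.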

This is an excellent suggestion, so I will update the script and the RMD (by 15 March 2019) for future use/reference. I will also take one last crack at fixing any typos or infelicities of expression in the report, even though the project has received full marks (50 out of 50) and for all practical purposes is done: certificate earned!

Otherwise, I will leave this Machine Learning project up as an archive, as part of what I hope will be a growing R for Data Science portfolio.
