Identifying-CRC-biomarkers

All of the tests I ran to support my hypothesis can be found in the "trials" folder. Each trial has its own preprocessing and ML scripts. Please refer to the wiki for a comprehensive overview of the code.

Requirements

All of the requirements for this project can be satisfied by the base Anaconda distribution.

Quick start

To work in an environment with all of the data already imported and preprocessed, I recommend starting from the "trials/aggregate_datasets/Europe + East Asia/european_eastasian_classifier.py" file. It contains plenty of pre-written code that can be uncommented and modified depending on what you would like to test. For example, to run a classification task that cross-validates within the Chinese dataset, do the following:

1. Uncomment the train-test split line.
2. Uncomment the classifier method you would like to use (randomForest() or logisticRegression()).
3. Pass the datasets into the method (for random forest, this would be: randomForest(X_train, X_test, Y_train, Y_test)).
4. Run the code.
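
The steps above can be sketched roughly as follows. This is only an illustration, not the code in european_eastasian_classifier.py: the data is synthetic (generated with scikit-learn, which ships with Anaconda), random_forest_sketch is a hypothetical stand-in for the repository's randomForest() method, and the split ratio, tree count, and accuracy metric are all assumptions. It uses a single hold-out split, as the steps describe.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def random_forest_sketch(X_train, X_test, Y_train, Y_test, seed=0):
    """Hypothetical stand-in for randomForest(); the real method may differ."""
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, Y_train)              # train on the training portion
    predictions = model.predict(X_test)      # predict labels for held-out samples
    return accuracy_score(Y_test, predictions)


# Synthetic placeholder for a preprocessed abundance table (assumption):
# 200 samples, 20 features, binary labels.
X, Y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)

# Step 1: train-test split (25% held out, stratified by label).
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, stratify=Y, random_state=0
)

# Steps 2-3: choose a classifier and pass the four datasets in.
accuracy = random_forest_sketch(X_train, X_test, Y_train, Y_test)
print(f"Held-out accuracy: {accuracy:.3f}")
```

In the actual script, X and Y would come from the already imported and preprocessed datasets rather than from make_classification.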

If you have any questions, feel free to PM me.

