data.zip: unzip it and drag the folder into "train_model/" to get all the data and files needed to train the model on the bacterial species Staphylococcus aureus (MSSA476).
all_proteins.pkl: embeddings of the proteins used in our analysis. We use the embeddings generated by DeepFRI directly, since we only fine-tune the output layers of the model.
conversion_dict: folder with files having the correspondence between locus_tag and protein_id.
deepfri_model.hdf5: weights of the DeepFRI model.
deepfri_terms_names.pkl: names of the GO terms originally predicted by DeepFRI.
deepfri_terms.pkl: GO terms originally predicted by DeepFRI.
expr-loc: folder with the expression location data. Each row corresponds to a protein, and rows follow the same order as in the conversion dictionary.
go.obo: the Gene Ontology DAG (directed acyclic graph) we used, in OBO format.
matrix_label: matrix of the labels. Its rows should follow the same order as in expr-loc and the conversion dictionary.
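
Since expr-loc, matrix_label and the conversion dictionary are all expected to share one row order, a quick sanity check before training can catch misaligned inputs. The sketch below is only an illustration: the function name `check_row_alignment` and its arguments are hypothetical and not part of this repository, and it assumes you have already loaded the identifiers and counted the rows yourself (the file formats are not specified here). Matching row counts are a necessary condition for alignment, not proof that the ordering is identical.

```python
from typing import Sequence


def check_row_alignment(
    conversion_ids: Sequence[str],
    n_expr_loc_rows: int,
    n_label_rows: int,
) -> int:
    """Check that the row-ordered inputs can be aligned one-to-one.

    Hypothetical helper: `conversion_ids` is the ordered list of protein
    identifiers from the conversion dictionary; the two integers are the
    row counts of the expr-loc data and of the label matrix.
    Returns the shared number of proteins, or raises ValueError.
    """
    n_proteins = len(conversion_ids)

    # A repeated identifier would make the row-to-protein mapping ambiguous.
    if len(set(conversion_ids)) != n_proteins:
        raise ValueError("conversion dictionary contains duplicate identifiers")

    # Every row-ordered input must have exactly one row per protein.
    if n_expr_loc_rows != n_proteins:
        raise ValueError(
            f"expr-loc has {n_expr_loc_rows} rows, expected {n_proteins}"
        )
    if n_label_rows != n_proteins:
        raise ValueError(
            f"matrix_label has {n_label_rows} rows, expected {n_proteins}"
        )
    return n_proteins
```

Calling it with, say, a list of three identifiers and row counts of 3 and 3 returns 3; any mismatch raises a ValueError naming the offending input.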