sandrasiby/ML_Project_1
The zip file contains the following files:

# IMPORTANT : When using logistic_regression, reg_logistic_regression, logistic_regression_newton, or reg_logistic_regression_newton, the initial weight vector must have shape (n,) and NOT (n,1).

*1* run.py : This is the main file to be run. As it is, it performs regularized logistic regression using Newton's method. The basic outline of the file is as follows:
- Read the training and test data
- Split both sets of data (test and training) by jet number (0 - 3). Split each jet further by valid or invalid DER_MASS_MMC. For each jet, remove the features that are either not calculated (-999) or deemed irrelevant (see [1]).
- Loop through each data set (8 of them, 2 for each jet: 1 with valid DER_MASS_MMC and the other with invalid DER_MASS_MMC)
  - Standardize the feature set after removing outliers above a threshold specified in list_sd_limit
  - Conduct a polynomial expansion of the features (degree 2)
  - Obtain the weights and the final loss by using the desired classification model (currently logistic regression with Newton's method and lambda = 1)
  - Standardize the test feature set using the mean and standard deviation of the training data set
  - Get the predictions for the test data set for the current jet and mass validity
  - Add the predictions to the overall prediction list
- Print the predictions to file, with the ids and corresponding predictions

*2* proj1_helpers.py : Helper functions
- Standardization and outlier removal
- Read, write data
- Obtain and verify predictions
- Split data by jet numbers + removal of unnecessary features
- Polynomial expansion to the specified degree
- Split data into test and training for cross validation

*3* implementations.py : All the regression models required
- Linear:
  - Normal equations
  - Ridge regression
  - Gradient Descent
  - Stochastic Gradient Descent
- Logistic:
  - Gradient Descent
  - Regularized Gradient Descent
  - Newton's
  - Regularized Newton's
- Other functions required to calculate gradients, the log function and the Hessian

* ----------------------------------------------------------------------------------------------------------------------- *

[1] Features removed for each jet number

Jet 0, valid DER_MASS_MMC   --> [4, 5, 6, 8, 12, 15, 18, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32]
Jet 1, valid DER_MASS_MMC   --> [4, 5, 6, 12, 15, 18, 20, 22, 25, 26, 27, 28, 29, 31, 32]
Jet 2, valid DER_MASS_MMC   --> [15, 18, 20, 22, 25, 28]
Jet 3, valid DER_MASS_MMC   --> [15, 18, 20, 22, 25, 27, 29]
Jet 0, invalid DER_MASS_MMC --> [0, 4, 5, 6, 8, 12, 15, 18, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32]
Jet 1, invalid DER_MASS_MMC --> [0, 4, 5, 6, 12, 15, 18, 20, 22, 25, 26, 27, 28, 29, 31, 32]
Jet 2, invalid DER_MASS_MMC --> [0, 15, 18, 20, 22, 25, 28]
Jet 3, invalid DER_MASS_MMC --> [0, 15, 18, 20, 22, 25, 27, 29]

Feature indices used above:

 0 DER_mass_MMC              15 PRI_tau_phi              30 CUSTOM_assymenergy
 4 DER_deltaeta_jet_jet      18 PRI_lep_phi              31 CUSTOM_delta_phi
 5 DER_mass_jet_jet          20 PRI_met_phi              32 CUSTOM_avg_phi
 6 DER_prodeta_jet_jet       22 PRI_jet_num              33 CUSTOM_special
 8 DER_pt_tot                23 PRI_jet_leading_pt
12 DER_lep_eta_centrality    24 PRI_jet_leading_eta
                             25 PRI_jet_leading_phi
                             26 PRI_jet_subleading_pt
                             27 PRI_jet_subleading_eta
                             28 PRI_jet_subleading_phi
                             29 PRI_jet_all_pt
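
The per-subset steps that run.py performs (standardize with training statistics, degree-2 polynomial expansion, regularized logistic regression with Newton's method) can be sketched roughly as follows. This is an illustrative sketch on synthetic data, not the code from this repository: the function names and signatures (standardize, build_poly, sigmoid, reg_logistic_newton), the 0/1 label encoding, the penalty convention 0.5 * lambda * ||w||^2 and the stopping rule are all assumptions, and the outlier removal controlled by list_sd_limit is left out.

```python
import numpy as np

def standardize(x, mean=None, std=None):
    """Standardize columns; pass the training mean/std to transform test data."""
    if mean is None:
        mean = x.mean(axis=0)
    if std is None:
        std = x.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant columns
    return (x - mean) / std, mean, std

def build_poly(x, degree):
    """Bias column followed by x, x**2, ..., x**degree (no cross terms)."""
    cols = [np.ones((x.shape[0], 1))]
    cols += [x ** d for d in range(1, degree + 1)]
    return np.hstack(cols)

def sigmoid(t):
    # Numerically stable form of 1 / (1 + exp(-t))
    return np.exp(-np.logaddexp(0.0, -t))

def reg_logistic_newton(y, tx, lambda_, initial_w, max_iters=50, tol=1e-8):
    """Newton's method for logistic regression with labels y in {0, 1}.

    Assumed penalty: 0.5 * lambda_ * ||w||^2 (bias included), which adds
    lambda_ * w to the gradient and lambda_ * I to the Hessian.
    """
    w = initial_w.copy()
    for _ in range(max_iters):
        p = sigmoid(tx @ w)
        grad = tx.T @ (p - y) + lambda_ * w
        s = p * (1.0 - p)  # diagonal of the weight matrix
        hess = tx.T @ (tx * s[:, None]) + lambda_ * np.eye(tx.shape[1])
        step = np.linalg.solve(hess, grad)
        w = w - step
        if np.linalg.norm(step) < tol:
            break
    z = tx @ w
    loss = np.sum(np.logaddexp(0.0, z) - y * z) + 0.5 * lambda_ * (w @ w)
    return w, loss

# Synthetic stand-in for one (jet, mass validity) subset
rng = np.random.default_rng(0)
x_train = rng.normal(size=(200, 3))
noise = rng.normal(size=200)
y_train = (x_train[:, 0] + 0.5 * x_train[:, 1] ** 2 + noise > 0.5).astype(float)
x_test = rng.normal(size=(50, 3))

x_train_std, mean, std = standardize(x_train)
x_test_std, _, _ = standardize(x_test, mean, std)  # reuse training statistics
tx_train = build_poly(x_train_std, degree=2)
tx_test = build_poly(x_test_std, degree=2)

initial_w = np.zeros(tx_train.shape[1])  # shape (n,), not (n, 1)
w, loss = reg_logistic_newton(y_train, tx_train, 1.0, initial_w)
pred = (sigmoid(tx_test @ w) >= 0.5).astype(int)
```

In this sketch the (n,) shape matters because y has shape (m,): with a weight vector of shape (n, 1), tx @ w has shape (m, 1), and NumPy broadcasts p - y to an (m, m) array instead of raising an error. That is presumably the kind of silent failure the IMPORTANT note guards against.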