Z(νν) H(bb) Step-by-Step Analysis
=================================
.. note:: All the scripts must be run from the base directory. First set up ROOT:

   .. code:: bash

      source /cvmfs/cms.cern.ch/slc5_amd64_gcc462/lcg/root/5.34.02-cms/bin/thisroot.sh
- Use ``maketier2list.py`` to transfer the Step 2 ntuples from Pisa to FNAL. Edit the directories in the script before running it:

  .. code:: bash

     python maketier2list.py
- Use ``Skim.C`` (or ``Skim_backup.C``) to skim the Step 2 ntuples with the baseline selection. Make sure ``HelperNtuples.h`` is up to date:

  - In ``inputstep2.ini``, Section ``[Skim]``, edit ``tagMC``, ``tagData``, ``baseline``, ``mettrigger``, ``metfilter``.
  - In ``inputstep2.ini``, Section ``[Stitch]``, edit the ``xxxLHECUT``'s.
  - Enable ``reader.write_HelperNtuples()`` in ``pyhelper.py`` and disable the rest.
  - Run ``python pyhelper.py`` and copy the printout to ``HelperNtuples.h``.

  Now run the skim:

  .. code:: bash

     source run_Skim.sh
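The ``[Skim]`` edits above are plain INI key-value pairs, so they can be read back with Python's ``configparser``. A minimal sketch (the section and key names come from the step above; the values are hypothetical placeholders, not the real analysis cuts):

```python
import configparser

# Hypothetical contents of inputstep2.ini, Section [Skim]; the real file
# lives in the analysis base directory and carries the actual selections.
ini_text = """
[Skim]
tagMC      = Step2_ZnH
tagData    = Step2_ZnH_data
baseline   = METtype1corr.et > 130
mettrigger = triggerFlags[40] || triggerFlags[41]
metfilter  = hbhe && ecalFlag
"""

cfg = configparser.ConfigParser()
cfg.read_string(ini_text)

# Each option is read back as a plain string, ready to be pasted into
# the skim's cut definitions.
for key in ("tagMC", "tagData", "baseline", "mettrigger", "metfilter"):
    print(key, "=", cfg["Skim"][key])
```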
- Use ``SkimRegression.C`` to skim the Step 2 ntuples for BDTG regression training. Make sure ``HelperNtuples.h`` is up to date:

  - In ``inputstep2.ini``, Section ``[Skim]``, edit ``regression`` and ``fjregression``.
  - Enable ``reader.write_HelperNtuples()`` in ``pyhelper.py`` and disable the rest.
  - Copy the printout to ``HelperNtuples.h``: ``python pyhelper.py > HelperNtuples.h``.

  Now run the skim for ak5 regression, then for filter jet regression:

  .. code:: bash

     source run_SkimRegression.sh
     source run_SkimRegressionFJ.sh
- Use ``TrainRegression.C`` and ``TrainRegressionFJ.C`` to produce the BDT regression ``.xml`` files. Make sure ``HelperTMVA.h`` is up to date:

  - In ``inputstep2.ini``, Section ``[BDT Regression Variable]``, edit the variables to use.
  - In ``inputstep2.ini``, Section ``[BDT Regression FJ Variable]``, edit the variables to use.
  - Enable ``reader.write_HelperTMVA()`` in ``pyhelper.py`` and disable the rest.
  - Copy the printout to ``HelperTMVA.h``: ``python pyhelper.py > HelperTMVA.h``.

  Now run the BDT regression:

  .. code:: bash

     python run_TrainRegression.py
     cp weights/TMVARegression_BDTG.weights.xml weights/TMVARegression_BDTG.testweights.xml
     cp TMVAReg.root testTMVAReg.root

  Then run the BDT regression for FJ:

  .. code:: bash

     python run_TrainRegressionFJ.py
     cp weights/TMVARegressionFJ_BDTG.weights.xml weights/TMVARegressionFJ_BDTG.testweights.xml
     cp TMVARegFJ.root testTMVARegFJ.root

  To check the regression performance, run:

  .. code:: bash

     python ComparePtResolution.py
     python ComparePtOffset.py
     python CompareMass_sig.py
- Update ``HelperNtuples.h`` to have all the correct numbers:

  - Enable ``skimmer.process()`` in ``skimmer.py`` and disable the rest; run ``python skimmer.py inputstep2.ini`` and copy the printout to ``inputstep2.ini`` Section ``[Process]``.
  - Enable ``skimmer.stitch()`` in ``skimmer.py`` and disable the rest; run ``python skimmer.py inputstep2.ini`` and copy the printout to ``inputstep2.ini`` Section ``[Stitch]``.
  - Enable ``reader.write_HelperNtuples()`` in ``pyhelper.py`` and disable the rest; copy the printout to ``HelperNtuples.h``: ``python pyhelper.py > HelperNtuples.h``.
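The ``python pyhelper.py > HelperNtuples.h`` step is just stdout redirection: the helper prints a C++ header and the shell writes it to the file. A toy sketch of a ``write_HelperNtuples``-style printout (the function body, guard name, and counts here are invented for illustration, not the real ``pyhelper.py`` code):

```python
# Toy stand-in for reader.write_HelperNtuples(): print a C++ header with
# per-sample event counts to stdout, so it can be redirected into a file.
def write_helper_ntuples(counts):
    print("#ifndef HELPER_NTUPLES_H")
    print("#define HELPER_NTUPLES_H")
    for sample, n in sorted(counts.items()):
        print("static const long n_%s = %d;" % (sample, n))
    print("#endif")

# Hypothetical processed-event counts.
write_helper_ntuples({"ZJetsHT200": 5001492, "TTbar": 6923750})
```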
- Use ``GrowTree.C`` to create the Step 3 ntuples:

  .. code:: bash

     python run_GrowTree.py
- Use ``SkimClassification.C`` to produce the files for BDT classification:

  .. code:: bash

     source run_SkimClassification.sh
- Use ``TrainBDT.C`` to produce the BDT classification ``.xml`` files:

  .. code:: bash

     source run_TrainBDT.sh
- Use ``TrimTree.C`` to create the Step 4 ntuples:

  - In ``inputstep2.ini``, Section ``[Weight]``, edit the MC event weights.
  - In ``inputstep2.ini``, Section ``[Trigger]``, edit the Data trigger.
  - In ``inputstep2.ini``, Section ``[Selection]``, edit the signal and control regions.
  - Enable ``reader.write_HelperTMVA()`` in ``pyhelper.py`` and disable the rest; copy the printout to ``HelperTMVA.h``: ``python pyhelper.py > HelperTMVA.h``.
  - In ``TrimTree.C``, edit the BDT ``.xml`` files to load.

  Then run:

  .. code:: bash

     python run_TrimTree.py

  Stitch the Step 4's (remember to edit ``$DIR`` in the script):

  .. code:: bash

     source run_stitch.sh
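The MC event weights edited in ``[Weight]`` are typically cross-section normalizations to the data luminosity. A minimal sketch of the standard formula (the numbers below are made-up placeholders, not the analysis values):

```python
# Per-event MC weight for normalizing a simulated sample to data:
#   w = sigma * L / N_processed
# sigma: process cross section [pb], L: integrated luminosity [pb^-1],
# N_processed: number of generated events before any skim.
def mc_event_weight(xsec_pb, lumi_pb, n_processed):
    return xsec_pb * lumi_pb / n_processed

# Hypothetical example: 100 pb process, 19 fb^-1 of data, one million events.
w = mc_event_weight(100.0, 19000.0, 1_000_000)
print(w)  # 1.9
```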
- For now, run:

  .. code:: bash

     cd macros
     root -l -b -q plotHistos_Znn_13TeV_BDT.C++
     source results_Znn.csh
Later we should do:

- Use ``ScaleFactorJ11.C`` and ``ScaleFactorWorkspaceJ11.C`` to fit the scale factors. Set the variable ``indir`` in the scripts, reset the ``fitresults`` arrays, and reset the iteration number in ``calc_scalefactors(...)``:

  .. code:: bash

     root -l -b -q ScaleFactorJ11.C++O
     root -l -b -q ScaleFactorWorkspaceJ11.C++O

  Use ``MaxLikelihoodFit`` to get the nuisances:

  .. code:: bash

     combineCards.py ZnunuHighPt_WjLF=vhbb_Znn_SF_J11_ZnunuHighPt_WjLF_8TeV.txt ZnunuHighPt_WjHF=vhbb_Znn_SF_J11_ZnunuHighPt_WjHF_8TeV.txt ZnunuHighPt_ZjLF=vhbb_Znn_SF_J11_ZnunuHighPt_ZjLF_8TeV.txt ZnunuHighPt_ZjHF=vhbb_Znn_SF_J11_ZnunuHighPt_ZjHF_8TeV.txt ZnunuHighPt_TT=vhbb_Znn_SF_J11_ZnunuHighPt_TT_8TeV.txt > vhbb_Znn_SF_J11_ZnunuHighPt_8TeV.txt
     combine -M MaxLikelihoodFit -m 125 --robustFit=1 --stepSize=0.05 --rMin=-5 --rMax=5 --saveNorm vhbb_Znn_SF_J11_ZnunuHighPt_8TeV.txt
     python printNuisances.py mlfit.root
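The long ``combineCards.py`` line above is mechanical, so it can be generated from the channel list instead of typed by hand. A small sketch (the card-naming pattern is taken from the command above; the generator script itself is hypothetical, not part of the analysis code):

```python
# Build the combineCards.py command for the five control-region datacards.
channels = ["WjLF", "WjHF", "ZjLF", "ZjHF", "TT"]
prefix = "ZnunuHighPt"
card = "vhbb_Znn_SF_J11_{pre}_{ch}_8TeV.txt"

args = " ".join(
    "{pre}_{ch}={card}".format(pre=prefix, ch=ch,
                               card=card.format(pre=prefix, ch=ch))
    for ch in channels
)
cmd = "combineCards.py {args} > vhbb_Znn_SF_J11_{pre}_8TeV.txt".format(
    args=args, pre=prefix)
print(cmd)
```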
- Use ``BDTShapeJ12.C`` and ``BDTShapeWorkspaceJ12.C`` to get the final limit and significance. Set the variable ``indir`` in the scripts and adjust the binning, then run:

  .. code:: bash

     root -l -b -q BDTShapeJ12.C++O
     root -l -b -q BDTShapeWorkspaceJ12.C++O

  Calculate the expected limit and significance:

  .. code:: bash

     combine -M Asymptotic -t -1 vhbb_Znn_J11_ZnunuHighPt_8TeV.txt
     combine -M ProfileLikelihood -m 125 --signif --pvalue -t -1 --toysFreq --expectSignal=1 vhbb_Znn_J11_ZnunuHighPt_8TeV.txt
     combine -M MaxLikelihoodFit -m 125 --robustFit=1 --stepSize=0.05 --rMin=-5 --rMax=5 --saveNorm -t -1 --toysFreq --expectSignal=1 vhbb_Znn_J11_ZnunuHighPt_8TeV.txt
     python diffNuisances.py -f html mlfit.root

  Calculate the observed limit and significance:

  .. code:: bash

     combine -M Asymptotic vhbb_Znn_J11_ZnunuHighPt_8TeV.txt
     combine -M ProfileLikelihood -m 125 --signif --pvalue vhbb_Znn_J11_ZnunuHighPt_8TeV.txt
     combine -M MaxLikelihoodFit -m 125 --robustFit=1 --stepSize=0.05 --rMin=-5 --rMax=5 --saveNorm vhbb_Znn_J11_ZnunuHighPt_8TeV.txt
     python diffNuisances.py -f html mlfit.root
Tuning
------
- In directory ``tuneBDT``, use ``prepare_tuneBDT.sh`` to create symbolic links. This step is only needed the first time. Edit the links if necessary:

  .. code:: bash

     source prepare_tuneBDT.sh
- Use ``BDTShapeJ12_reload.C.diff`` to patch ``BDTShapeJ12_reload.C``. The patch was created with ``diff -u BDTShapeJ12_reload.C.orig BDTShapeJ12_reload.C``, so it might only work with a specific version of ``BDTShapeJ12.C``:

  .. code:: bash

     patch -p0 < BDTShapeJ12_reload.C.diff
- Use ``run_tune.py`` to get the commands to train and reload. Edit ``r``, ``n``, ``w`` if necessary. First train, then reload:

  .. code:: bash

     python run_tune.py
- Use ``run_combine_tune.py`` to calculate the final limit and significance. Edit ``r``, ``n``, ``w``, ``logs`` if necessary:

  .. code:: bash

     python run_combine_tune.py
- Use ``retrieve_combine_tune.py`` to store the previous results into TTrees. Edit ``r``, ``w``, ``logs`` if necessary; it only does one channel at a time:

  .. code:: bash

     python retrieve_combine_tune.py

  Use ``hadd`` to combine the TTrees.
- Open the combined TTree (e.g. ``TMVA_ZnunuHighPt_new.root``) and scan for the best significance:

  .. code:: bash

     root [0] fomtree->Scan("NTrees:nEventsMin:MaxDepth:COMB_limit:COMB_signif:TMVA_kolS:TMVA_kolB:USER_psig","COMB_limit<3.6 && COMB_signif>0.68")
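The same selection can be sketched in plain Python over the tuning results, which is handy when ROOT is not at hand. The rows below are invented placeholders, not real scan output; only the cut values and branch names come from the ``Scan`` call above:

```python
# Hypothetical tuning points with the branches used in the Scan above.
rows = [
    dict(NTrees=200, nEventsMin=100, MaxDepth=3, COMB_limit=3.9, COMB_signif=0.60),
    dict(NTrees=400, nEventsMin=50,  MaxDepth=3, COMB_limit=3.5, COMB_signif=0.71),
    dict(NTrees=800, nEventsMin=25,  MaxDepth=4, COMB_limit=3.4, COMB_signif=0.75),
]

# Same cut as the TTree::Scan: COMB_limit < 3.6 && COMB_signif > 0.68,
# then rank the survivors by significance.
passing = [r for r in rows if r["COMB_limit"] < 3.6 and r["COMB_signif"] > 0.68]
best = max(passing, key=lambda r: r["COMB_signif"])
print(best["NTrees"])  # 800
```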
- BDT binning optimization: instructions pending.
- Find the yield uncertainties (only for the ZbbHinv cards). Edit ``nuisances``, ``processes``, ``soverb`` if necessary:

  .. code:: bash

     python systematica_ZbbHinv.py