Production of gg → H → 𝜏𝜏 samples with variable Higgs mass.
The events are generated with Pythia8 at leading order and interfaced with FastSIM for a fast simulation of the CMS detector. The workflow is managed with law.
The software environment is set up with the setup.sh script in the root directory of this project. Executing
source setup.sh
installs missing software requirements and sets the shell environment for the production workflow.
The starting point of the Monte Carlo production chain is the fragment, which contains information about the physics process at generator level. For the generation of gg → H → 𝜏𝜏 events with Pythia8, a modified version of the fragment GGToHtautau_13TeV_pythia8_cff.py from CMSSW is used. The original fragment is available in the cms-sw/cmssw GitHub repository.
Some amendments to the original fragment have been made:
- The tune CP5 is used instead of CUEP8M1.
- onIfMatch is used instead of onIfAny. This explicitly requires the Higgs boson to decay into a pair of opposite-sign 𝜏 leptons.
- The threshold mass mMin of the Higgs boson is reduced from 50 GeV to 25 GeV in order to allow for the production of samples with Higgs masses around that threshold.
The most important part of the fragment is the processParameters section:
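[...]
processParameters = cms.vstring(
    'HiggsSM:gg2H = on',        # enable Higgs production via gluon-gluon fusion
    '25:onMode = off',          # switch off all Higgs decay channels
    '25:onIfMatch = 15 -15',    # re-enable only the decay H -> tau- tau+
    '25:m0 = 125.0',            # nominal Higgs mass in GeV (peak of the Breit-Wigner)
    '25:mMin = 25.0',           # lower bound of the allowed mass range in GeV
),
[...]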
This set of parameters defines the hard process for event generation. Higgs bosons are produced via gluon-gluon fusion and decay exclusively into pairs of 𝜏 leptons. The parameter 25:m0 sets the position of the maximum of the Breit-Wigner curve that describes the mass spectrum. Especially for higher masses, 25:m0 must not be identified with the actual mass of each generated Higgs boson, because the decay width of the Higgs boson increases towards higher values of 25:m0 and the generated masses therefore scatter more widely around the peak.
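To illustrate why 25:m0 is not the mass of every generated Higgs boson, the following sketch evaluates a simple non-relativistic Breit-Wigner density. It is only an illustration: Pythia8 uses its own, more refined line shape, and the width values used here are hypothetical placeholders chosen to show the trend.

```python
import math


def breit_wigner(m, m0, width):
    """Non-relativistic Breit-Wigner density, normalised over the full real axis.

    Illustration only; this is not the line shape implemented in Pythia8.
    """
    half_width = width / 2.0
    return (half_width / math.pi) / ((m - m0) ** 2 + half_width ** 2)


def fraction_within(delta, width):
    """Fraction of the (untruncated) distribution with |m - m0| < delta."""
    return (2.0 / math.pi) * math.atan(2.0 * delta / width)


if __name__ == "__main__":
    # Hypothetical widths in GeV, chosen only to show the trend.
    for m0, width in [(125.0, 0.004), (500.0, 60.0)]:
        share = fraction_within(5.0, width)
        print(f"m0 = {m0} GeV, width = {width} GeV: "
              f"{share:.1%} of events within 5 GeV of m0")
```

With a narrow width almost all generated masses lie close to 25:m0, while for a broad resonance only a small fraction does.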
The fragments for all processes can be generated with the command:
law run FragmentGeneration
The generated files are placed in the src/Configuration/GenProduction/python directory of the CMSSW installation that is set up during the execution of the setup.sh script. Before setting up the configuration for the production, the CMSSW release has to be compiled. The compilation can be triggered with:
law run CompileCMSSW
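The fragments for the individual mass points can be thought of as the processParameters section shown earlier with a different value of 25:m0. The sketch below only shows this substitution and the file naming pattern used in this document; the helper names and the list of mass points are hypothetical, and the actual FragmentGeneration task may work differently.

```python
# Hypothetical mass points in GeV; the real values are defined in the workflow configuration.
HIGGS_MASSES = [50, 125, 250]


def process_parameters(higgs_mass, m_min=25.0):
    """Return the Pythia8 settings of the fragment for one Higgs mass point."""
    return [
        "HiggsSM:gg2H = on",
        "25:onMode = off",
        "25:onIfMatch = 15 -15",
        f"25:m0 = {float(higgs_mass)}",
        f"25:mMin = {float(m_min)}",
    ]


def fragment_name(higgs_mass):
    """File name following the GluGluHToTauTau_MH<mass>_pythia8_TuneCP5_cff.py pattern."""
    return f"GluGluHToTauTau_MH{higgs_mass}_pythia8_TuneCP5_cff.py"


if __name__ == "__main__":
    for mass in HIGGS_MASSES:
        print(fragment_name(mass))
        for line in process_parameters(mass):
            print("   ", line)
```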
Configuration files for event production in CMSSW are created with the cmsDriver.py tool.
When performing the detector simulation with FastSIM, separate steps like event generation, detector simulation, pileup premixing and reconstruction can be combined into one production step. The following command creates a configuration file for the production of an AODSIM file:
cmsDriver.py Configuration/GenProduction/python/GluGluHToTauTau_MH50_pythia8_TuneCP5_cff.py \
--python_filename GluGluHToTauTau_MH50_pythia8_TuneCP5_aodsim_0_cfg.py \
--fileout file:GluGluHToTauTau_MH50_pythia8_TuneCP5_aodsim_0.root \
--customise Configuration/DataProcessing/Utils.addMonitoring \
--customise_commands "from IOMC.RandomEngine.RandomServiceHelper import RandomNumberServiceHelper;randSvc = RandomNumberServiceHelper(process.RandomNumberGeneratorService);randSvc.populate()" \
--era Run2_2018_FastSim \
--conditions 106X_upgrade2018_realistic_v16_L1v1 \
--beamspot Realistic25ns13TeVEarly2018Collision \
--procModifiers premix_stage2 \
--step GEN,SIM,RECOBEFMIX,DIGI,DATAMIX,L1,DIGI2RAW,L1Reco,RECO \
--eventcontent AODSIM \
--datatier AODSIM \
--datamix PreMix \
--pileup_input dbs:/Neutrino_E-10_gun/RunIIFall17FSPrePremix-PUFSUL18CP5_106X_upgrade2018_realistic_v16-v1/PREMIX \
--fast \
--no_exec \
--mc \
--number 2000
The most important arguments are:
- Configuration/GenProduction/python/GluGluHToTauTau_MH50_pythia8_TuneCP5_cff.py: path to the fragment that contains information about the event generation, relative to the src directory of the CMSSW installation
- --python_filename GluGluHToTauTau_MH50_pythia8_TuneCP5_aodsim_0_cfg.py: name of the output configuration file
- --fileout file:GluGluHToTauTau_MH50_pythia8_TuneCP5_aodsim_0.root: name of the output file of the production
- --customise_commands "from IOMC.RandomEngine.RandomServiceHelper import RandomNumberServiceHelper;randSvc = RandomNumberServiceHelper(process.RandomNumberGeneratorService);randSvc.populate()": provides a random seed for the random number generators, which is especially important when the production of the Monte Carlo dataset is split into several files
- --era Run2_2018_FastSim, --conditions 106X_upgrade2018_realistic_v16_L1v1, --beamspot Realistic25ns13TeVEarly2018Collision: define the environment of the production regarding the detector setup, the alignment and calibration conditions, and the beam setup
- --step GEN,SIM,RECOBEFMIX,DIGI,DATAMIX,L1,DIGI2RAW,L1Reco,RECO: production steps; the chain that is defined here can be understood as follows: event generation → detector simulation → reconstruction before pileup mixing → digitization of detector signals → premixing of the pileup input → L1 trigger simulation → conversion of digitized signals into raw data → L1 trigger reconstruction → reconstruction
- --eventcontent AODSIM: specification of the event format in the output file
- --datamix PreMix, --pileup_input dbs:/Neutrino_E-10_gun/RunIIFall17FSPrePremix-PUFSUL18CP5_106X_upgrade2018_realistic_v16-v1/PREMIX: pileup dataset for premixing
- --fast: perform a fast simulation of the detector
- --mc: produce a Monte Carlo dataset
- --number 2000: number of produced events
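When such a command is assembled programmatically, passing the arguments as a list avoids shell quoting problems with the --customise_commands value, which contains semicolons and parentheses. The following is a minimal sketch under that assumption; the function name is hypothetical, and all option values are copied from the command above.

```python
import shlex


def aodsim_driver_command(higgs_mass, index, n_events=2000):
    """Build the cmsDriver.py argument list for one AODSIM configuration."""
    stem = f"GluGluHToTauTau_MH{higgs_mass}_pythia8_TuneCP5"
    seed_commands = (
        "from IOMC.RandomEngine.RandomServiceHelper import RandomNumberServiceHelper;"
        "randSvc = RandomNumberServiceHelper(process.RandomNumberGeneratorService);"
        "randSvc.populate()"
    )
    return [
        "cmsDriver.py",
        f"Configuration/GenProduction/python/{stem}_cff.py",
        "--python_filename", f"{stem}_aodsim_{index}_cfg.py",
        "--fileout", f"file:{stem}_aodsim_{index}.root",
        "--customise", "Configuration/DataProcessing/Utils.addMonitoring",
        "--customise_commands", seed_commands,
        "--era", "Run2_2018_FastSim",
        "--conditions", "106X_upgrade2018_realistic_v16_L1v1",
        "--beamspot", "Realistic25ns13TeVEarly2018Collision",
        "--procModifiers", "premix_stage2",
        "--step", "GEN,SIM,RECOBEFMIX,DIGI,DATAMIX,L1,DIGI2RAW,L1Reco,RECO",
        "--eventcontent", "AODSIM",
        "--datatier", "AODSIM",
        "--datamix", "PreMix",
        "--pileup_input",
        "dbs:/Neutrino_E-10_gun/RunIIFall17FSPrePremix-PUFSUL18CP5_106X_upgrade2018_realistic_v16-v1/PREMIX",
        "--fast", "--no_exec", "--mc",
        "--number", str(n_events),
    ]


if __name__ == "__main__":
    # shlex.join quotes the --customise_commands value so that the printed
    # line can be pasted into a shell as a single command.
    print(shlex.join(aodsim_driver_command(50, 0)))
```

The list can also be handed to subprocess.run directly inside a CMSSW environment, in which case no shell quoting is involved at all.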
The workflow creates a configuration template for each value of the Higgs mass that is defined in the configuration. The production of the dataset is split into several files so that events can be processed in parallel, which speeds up the production. Therefore, a configuration for the production of each file is generated from the configuration templates.
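The splitting can be pictured as a mapping from a running branch number to one output file of one mass point. The sketch below is an assumption about how such a mapping could look, not the workflow's actual implementation; the function name, the mass points and the number of files per mass are hypothetical.

```python
def branch_configs(higgs_masses, files_per_mass):
    """Map a running branch number to (mass, file index, configuration name)."""
    branches = {}
    branch = 0
    for mass in higgs_masses:
        for index in range(files_per_mass):
            name = f"GluGluHToTauTau_MH{mass}_pythia8_TuneCP5_aodsim_{index}_cfg.py"
            branches[branch] = (mass, index, name)
            branch += 1
    return branches


if __name__ == "__main__":
    # Hypothetical setup: two mass points with three files each.
    for branch, (mass, index, name) in branch_configs([50, 125], 3).items():
        print(branch, mass, index, name)
```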
When the generation of configuration files has finished, the production can be started. The CMSSW command for producing the file configured with the cmsDriver.py tool above would be:
cmsRun GluGluHToTauTau_MH50_pythia8_TuneCP5_aodsim_0_cfg.py
In the context of the workflow for this production, all of the tasks described above can be triggered with a single command:
law run AODSIMProduction
Configuration files for the production are generated on demand if they do not exist yet. Once all requirements are fulfilled, jobs for the production of AODSIM files are submitted to the ETP HTCondor batch system.
It might be appropriate not to submit all jobs at once. The set of submitted jobs can be reduced either by producing files for a selected range of branches,
law run AODSIMProduction --branches 0:400
or by explicitly setting the allowed number of parallel jobs:
law run AODSIMProduction --parallel-jobs 400
Before obtaining the final dataset files for the analysis, intermediate files in the MINIAOD format have to be produced. The cmsDriver.py command for the configuration of such a production looks like this:
cmsDriver.py --python_filename GluGluHToTauTau_MH50_pythia8_TuneCP5_miniaod_0_cfg.py \
--filein file:GluGluHToTauTau_MH50_pythia8_TuneCP5_aodsim_0.root \
--fileout file:GluGluHToTauTau_MH50_pythia8_TuneCP5_miniaod_0.root \
--customise Configuration/DataProcessing/Utils.addMonitoring \
--customise_commands "from IOMC.RandomEngine.RandomServiceHelper import RandomNumberServiceHelper;randSvc = RandomNumberServiceHelper(process.RandomNumberGeneratorService);randSvc.populate()" \
--era Run2_2018 \
--conditions 106X_upgrade2018_realistic_v16_L1v1 \
--beamspot Realistic25ns13TeVEarly2018Collision \
--procModifiers run2_miniAOD_UL \
--step PAT \
--eventcontent MINIAODSIM \
--datatier MINIAODSIM \
--geometry DB:Extended \
--runUnscheduled \
--fast \
--no_exec \
--mc \
--number -1
In addition to the output file specified with --fileout, the input AODSIM file has to be passed as the value of the argument --filein. Setting --number to -1 means that all events from the input file are processed and saved to the output file.
As for the AODSIM production, configurations are generated from configuration templates for each defined value of the Higgs mass. The production of MINIAOD files in jobs on the ETP HTCondor batch system is triggered with:
law run MINIAODProduction
Missing configuration files and AODSIM files are produced on demand.
Here, too, it might be appropriate not to submit all jobs at once, which can be achieved with the --branches or --parallel-jobs command line option.
For efficient use in analyses, files in the NANOAOD format have to be produced. The cmsDriver.py command for configuring such a production looks like this:
cmsDriver.py --python_filename GluGluHToTauTau_MH50_pythia8_TuneCP5_nanoaod_0_cfg.py \
--filein file:GluGluHToTauTau_MH50_pythia8_TuneCP5_miniaod_0.root \
--fileout file:GluGluHToTauTau_MH50_pythia8_TuneCP5_nanoaod_0.root \
--customise Configuration/DataProcessing/Utils.addMonitoring \
--customise_commands "from IOMC.RandomEngine.RandomServiceHelper import RandomNumberServiceHelper;randSvc = RandomNumberServiceHelper(process.RandomNumberGeneratorService);randSvc.populate()" \
--era Run2_2018,run2_nanoAOD_106Xv2 \
--conditions 106X_upgrade2018_realistic_v16_L1v1 \
--beamspot Realistic25ns13TeVEarly2018Collision \
--eventcontent NANOAODSIM \
--datatier NANOAODSIM \
--step NANO \
--fast \
--no_exec \
--mc \
--number -1
The production of NANOAOD files is triggered with:
law run NANOAODProduction
Here, too, missing configuration files as well as missing AODSIM and MINIAOD files are produced on demand before the production of NANOAOD files starts. NANOAOD files are produced in jobs on the ETP HTCondor batch system. The scope of the production can be limited with the --branches or the --parallel-jobs command line option.
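The on-demand behaviour described for the three production stages can be modelled as a walk through a dependency chain in which already finished stages are skipped. The sketch below is a toy model in plain Python and does not use the law API; the dependency chain is a simplification inferred from the order of the steps in this document, and the function name is hypothetical.

```python
# Each stage lists the stages whose outputs it needs (simplified, assumed chain).
REQUIRES = {
    "FragmentGeneration": [],
    "CompileCMSSW": ["FragmentGeneration"],
    "AODSIMProduction": ["CompileCMSSW"],
    "MINIAODProduction": ["AODSIMProduction"],
    "NANOAODProduction": ["MINIAODProduction"],
}


def run_on_demand(task, done, log=None):
    """Run `task` after all of its missing requirements; skip finished stages."""
    if log is None:
        log = []
    if task in done:
        return log
    for requirement in REQUIRES[task]:
        run_on_demand(requirement, done, log)
    log.append(task)  # stands in for actually producing the outputs of the stage
    done.add(task)
    return log


if __name__ == "__main__":
    # AODSIM files already exist, so only the last two stages are executed.
    finished = {"FragmentGeneration", "CompileCMSSW", "AODSIMProduction"}
    print(run_on_demand("NANOAODProduction", finished))
```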