When using an MLE result to obtain a post-fit model prediction, the parameters in the MLE result need to exactly match the parameters of the model for which the prediction is calculated. If a prediction is calculated for a model with fewer parameters (e.g. one channel less), the calculation can run without error but produce wrong results (scikit-hep/pyhf#1459). If a prediction is calculated for a model with more parameters (e.g. to also obtain post-fit distributions for validation regions), then the MLE result will be missing parameters (e.g. `staterror` modifiers for the validation regions) and the calculation will fail.
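As a minimal illustration of such a mismatch (the two-channel workspace below, its channel names, and its modifier names are made up for this sketch), pruning a channel changes the parameter list of the model, so a fit result obtained with the full model no longer lines up with the pruned one:

```python
import pyhf

# hypothetical two-channel workspace: a signal region and a validation region
spec = {
    "channels": [
        {
            "name": "signal_region",
            "samples": [
                {
                    "name": "signal",
                    "data": [10.0, 12.0],
                    "modifiers": [
                        {"name": "mu", "type": "normfactor", "data": None},
                        {"name": "stat_sr", "type": "staterror", "data": [1.0, 1.1]},
                    ],
                }
            ],
        },
        {
            "name": "validation_region",
            "samples": [
                {
                    "name": "signal",
                    "data": [8.0],
                    "modifiers": [
                        {"name": "mu", "type": "normfactor", "data": None},
                        {"name": "stat_vr", "type": "staterror", "data": [0.9]},
                    ],
                }
            ],
        },
    ],
    "measurements": [{"name": "meas", "config": {"poi": "mu", "parameters": []}}],
    "observations": [
        {"name": "signal_region", "data": [11.0, 13.0]},
        {"name": "validation_region", "data": [9.0]},
    ],
    "version": "1.0.0",
}

ws = pyhf.Workspace(spec)
model = ws.model()

# dropping the validation region removes its staterror parameters
pruned_model = ws.prune(channels=["validation_region"]).model()

# property in recent pyhf releases, a method (par_names()) in older ones
print(model.config.par_names)         # full model includes the validation region parameters
print(pruned_model.config.par_names)  # those parameters are absent in the pruned model
```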
A utility that can help with these cases (something like `model_utils.match_fit_results(model, fit_results)`) would be useful here. It should
- remove parameters from the `fit_results` object which are not in the target `model`,
- add parameters that are missing in `fit_results` but contained in the `model`, for example `staterror` modifiers for new bins,
- re-order parameters such that `fit_results.labels` matches `model.config.par_names`.
When adding parameters, the best-fit values / uncertainties / correlations for them cannot be known, so they should be set to sensible defaults such as `model_utils.asimov_parameters`, `model_utils.prefit_uncertainties`, and no correlation with other parameters.
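A possible sketch of such a utility is shown below. It is only a sketch under the assumption that the fit result container exposes `bestfit`, `uncertainty`, `labels`, `corr_mat`, and `best_twice_nll` fields in that order (the exact `FitResults` layout may differ between cabinetry versions, and `match_fit_results` itself is the proposed, not yet existing, function):

```python
import numpy as np

from cabinetry import model_utils
from cabinetry.fit import FitResults  # assumed layout: bestfit, uncertainty, labels, corr_mat, best_twice_nll


def match_fit_results(model, fit_results):
    """Sketch: map an existing fit result onto the parameters of ``model``."""
    # defaults for parameters that are not contained in fit_results
    bestfit = np.asarray(model_utils.asimov_parameters(model), dtype=float)
    uncertainty = np.asarray(model_utils.prefit_uncertainties(model), dtype=float)
    labels = list(model.config.par_names)  # par_names() in older pyhf releases

    # no correlation assumed for parameters that need to be added
    corr_mat = np.identity(len(labels))

    source_labels = list(fit_results.labels)
    # (target index, source index) pairs for parameters present in both models
    shared = [
        (i_target, source_labels.index(label))
        for i_target, label in enumerate(labels)
        if label in source_labels
    ]

    # copy best-fit values and uncertainties for shared parameters
    for i_target, i_source in shared:
        bestfit[i_target] = fit_results.bestfit[i_source]
        uncertainty[i_target] = fit_results.uncertainty[i_source]

    # copy correlations for pairs of shared parameters
    for i_target, i_source in shared:
        for j_target, j_source in shared:
            corr_mat[i_target][j_target] = fit_results.corr_mat[i_source][j_source]

    # best_twice_nll from the original fit is kept for reference only
    return FitResults(bestfit, uncertainty, labels, corr_mat, fit_results.best_twice_nll)
```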
A workflow could then look like the following:
```python
fit_results = cabinetry.fit.fit(model, data)
pred_postfit = cabinetry.model_utils.prediction(model, fit_results=fit_results)

# example: prune a sample from a model with some function, then calculate post-fit prediction again
new_model = custom_prune_model(model)
fit_results_for_new_model = model_utils.match_fit_results(new_model, fit_results)
pred_postfit_new = cabinetry.model_utils.prediction(new_model, fit_results=fit_results_for_new_model)
```
This could in principle also be done automatically within `model_utils.prediction`, but keeping it very explicit may make it clearer what is happening (and the assumptions going into it). This came up in discussions with @Daniel-Noel.