Utility to match fit results to modified model #284

Closed
alexander-held opened this issue Sep 22, 2021 · 0 comments · Fixed by #288
Labels
enhancement New feature or request

alexander-held commented Sep 22, 2021

When using an MLE result to obtain a post-fit model prediction

fit_results = cabinetry.fit.fit(model, data)
postfit_prediction = cabinetry.model_utils.prediction(model, fit_results=fit_results)

the parameters in the MLE result need to exactly match the parameters in the model for which the prediction is calculated. If a prediction is to be calculated for a model with fewer parameters (e.g. one channel less), the calculation can run without error but produce wrong results (scikit-hep/pyhf#1459). If a prediction is to be calculated for a model with more parameters (e.g. to also get post-fit distributions for validation regions), the MLE result will be missing parameters (e.g. staterror modifiers for the validation regions) and the calculation will fail.
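
As a hedged illustration of the first case, the snippet below removes one channel from the workspace to obtain a model with fewer parameters; the workspace spec and the channel name "validation_region" are assumptions for this example.

import cabinetry
import pyhf

# assumption: spec is the workspace specification that the original model was built from
ws = pyhf.Workspace(spec)
reduced_ws = ws.prune(channels=["validation_region"])  # drop one channel
reduced_model = reduced_ws.model()  # fewer parameters than the original model

# re-using the fit_results obtained with the original model may run without an
# error here, but parameters can end up applied to the wrong modifiers, so the
# prediction is silently wrong (scikit-hep/pyhf#1459)
prediction = cabinetry.model_utils.prediction(reduced_model, fit_results=fit_results)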

A utility to handle these cases (something like model_utils.match_fit_results(model, fit_results)) would be useful here. It should

  • remove parameters from the fit_results object which are not in the target model,
  • add parameters that are missing in fit_results but contained in the model, for example staterror modifiers for new bins,
  • re-order parameters such that fit_results.labels matches model.config.par_names.

When adding parameters, their best-fit values / uncertainties / correlations cannot be known, so they should be set to sensible defaults: the values from model_utils.asimov_parameters, the uncertainties from model_utils.prefit_uncertainties, and no correlation with other parameters.
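
A minimal sketch of what match_fit_results could do, assuming the FitResults container exposes bestfit, uncertainty, labels, corr_mat and best_twice_nll attributes (only labels is spelled out above; the other attribute names and the constructor signature are assumptions):

import numpy as np
from cabinetry import model_utils
from cabinetry.fit import FitResults

def match_fit_results(model, fit_results):
    """Map fit_results onto the parameter structure of (a possibly different) model."""
    # defaults for parameters the existing fit results know nothing about
    bestfit = np.asarray(model_utils.asimov_parameters(model), dtype=float)
    uncertainty = np.asarray(model_utils.prefit_uncertainties(model), dtype=float)
    labels = model.config.par_names  # parameter names of the target model (property in recent pyhf)
    corr_mat = np.identity(len(labels))  # no correlation for newly added parameters

    # indices of target parameters that are already present in the existing fit results
    common = [
        (i_new, fit_results.labels.index(label))
        for i_new, label in enumerate(labels)
        if label in fit_results.labels
    ]

    # copy over best-fit values and uncertainties for parameters found in both
    for i_new, i_old in common:
        bestfit[i_new] = fit_results.bestfit[i_old]
        uncertainty[i_new] = fit_results.uncertainty[i_old]

    # copy over correlations between parameters known to the existing fit results
    for i_new, i_old in common:
        for j_new, j_old in common:
            corr_mat[i_new, j_new] = fit_results.corr_mat[i_old, j_old]

    return FitResults(bestfit, uncertainty, labels, corr_mat, fit_results.best_twice_nll)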

A workflow could then look like the following:

fit_results = cabinetry.fit.fit(model, data)
pred_postfit = cabinetry.model_utils.prediction(model, fit_results=fit_results)

# example: prune a sample from a model with some function, then calculate post-fit prediction again
new_model = custom_prune_model(model)
fit_results_for_new_model = cabinetry.model_utils.match_fit_results(new_model, fit_results)
pred_postfit_new = cabinetry.model_utils.prediction(new_model, fit_results=fit_results_for_new_model)

This could in principle also be done automatically within model_utils.prediction, but keeping it explicit may make it clearer what is happening (and which assumptions enter).

This came up in discussions with @Daniel-Noel.
