Input validation for pyhf.pdf.Model.expected_data #1459

Open
alexander-held opened this issue May 17, 2021 · 0 comments · May be fixed by #1461
Labels
user request Request coming from a pyhf user

Comments

@alexander-held
Member

Question

What do you think about checking the length of the input to pyhf.pdf.Model.expected_data? That function feels like a high-level entry point (though model.main_model.expected_data may also be needed as an entry point for some users, see #1031). Currently an IndexError is raised if the argument has too few parameters, while an argument with too many parameters causes no errors or warnings: the first model.config.npars parameters are silently used for the calculation.

Example:

import pyhf

model = pyhf.simplemodels.uncorrelated_background(
    signal=[12.0, 11.0], bkg=[50.0, 52.0], bkg_uncertainty=[3.0, 7.0]
)
pars = model.config.suggested_init()
print(f"model has {model.config.npars} parameters")
p = [1.1, 1.2, 1.3]
p_long = p + [1.5] * 5
p_short = [1.1, 1.2]
print(f"prediction for {p}: {model.expected_data(p)}")
print(f"prediction for {p_long}: {model.expected_data(p_long)}")
print(f"prediction for {p_short}: {model.expected_data(p_short)}")

Output:

model has 3 parameters
prediction for [1.1, 1.2, 1.3]: [ 73.2         79.7        333.33333333  71.73877551]
prediction for [1.1, 1.2, 1.3, 1.5, 1.5, 1.5, 1.5, 1.5]: [ 73.2         79.7        333.33333333  71.73877551]
Traceback (most recent call last):
  File "test.py", line 13, in <module>
    print(f"prediction for {p_short}: {model.expected_data(p_short)}")
  File "[...]/pyhf/src/pyhf/pdf.py", line 744, in expected_data
    return self.make_pdf(pars).expected_data()
  File "[...]/pyhf/src/pyhf/pdf.py", line 788, in make_pdf
    mainpdf = self.main_model.make_pdf(pars)
  File "[...]/pyhf/src/pyhf/pdf.py", line 546, in make_pdf
    lambdas_data = self.expected_data(pars)
  File "[...]/pyhf/src/pyhf/pdf.py", line 608, in expected_data
    deltas, factors = self._modifications(pars)
  File "[...]/pyhf/src/pyhf/pdf.py", line 573, in _modifications
    [self.modifiers_appliers[k].apply(pars) for k in self._factor_mods],
  File "[...]/pyhf/src/pyhf/pdf.py", line 573, in <listcomp>
    [self.modifiers_appliers[k].apply(pars) for k in self._factor_mods],
  File "[...]/pyhf/src/pyhf/modifiers/shapesys.py", line 169, in apply
    shapefactors = tensorlib.gather(flat_pars, self.access_field)
  File "[...]/pyhf/src/pyhf/tensor/numpy_backend.py", line 177, in gather
    return tensor[indices]
IndexError: index 2 is out of bounds for axis 0 with size 2

The traceback leading to the IndexError is not extremely readable, but it is probably clear enough to find the cause quickly. Could this be caught in pyhf.pdf.Model.expected_data and reported with a more descriptive error message?

The current handling of arguments with too many parameters could be convenient in some cases. What do you think about adding a warning? When a single parameter is removed from a model, that parameter is not necessarily the last one, so if the user forgets to update the input to expected_data, they get no feedback about the mismatch and may silently calculate the wrong thing.
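
As an illustration only (not the actual pyhf implementation), a check along these lines could run at the start of expected_data. The helper name, exception type, warning category, and message wording below are assumptions:

import warnings


def validate_pars(model, pars):
    """Hypothetical helper: compare len(pars) against model.config.npars."""
    n_expected = model.config.npars
    n_given = len(pars)
    if n_given < n_expected:
        # too few parameters: fail early with a clear message instead of
        # letting an IndexError surface from deep inside the tensor backend
        raise ValueError(f"expected {n_expected} parameters, got {n_given}")
    if n_given > n_expected:
        # too many parameters: warn that only the first npars entries are used
        warnings.warn(
            f"expected {n_expected} parameters, got {n_given}; "
            f"only the first {n_expected} will be used"
        )

With the example model above, validate_pars(model, p_short) would raise immediately with a readable message, and validate_pars(model, p_long) would emit a warning while still allowing the current truncation behavior.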

Relevant Issues and Pull Requests

only marginally related: #1031
