feat: Add toy calculator, empirical distribution, and toy example notebook #790

lukasheinrich · 2020-03-04T21:38:35Z

Description

This adds the toy calculator.

Needs: #1147.

ReadTheDocs build:

Checklist Before Requesting Reviewer

Tests are passing
Verified notebook workflow tests are passing
"WIP" removed from the title of the pull request
Selected an Assignee for the PR to be responsible for the log summary

Before Merging

For the PR Assignees:

Summarize commit messages into a comprehensive review of the PR

* Add EmpiricalDistribution and ToyCalculator classes
* Move top level infer functions under 'infer.calculators'
* Add tests for EmpiricalDistribution and ToyCalculator
* Add EmpiricalDistribution and ToyCalculator to docs
* Add example notebook on how to use the ToyCalculator

Co-authored-by: Giordon Stark <[email protected]>
Co-authored-by: Matthew Feickert <[email protected]>
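
As a rough illustration of what an empirical distribution provides to a toy-based calculator, here is a minimal standard-library sketch. The class name `EmpiricalSketch` and its `pvalue` method are hypothetical stand-ins for illustration, not the `EmpiricalDistribution` implementation added in this PR; the sketch assumes a p-value is the fraction of sampled test-statistic values at or above the observed one.

```python
from bisect import bisect_left


class EmpiricalSketch:
    """Hypothetical stand-in for an empirical distribution of test statistics."""

    def __init__(self, samples):
        # Keep a sorted copy so tail counts can use binary search.
        self.samples = sorted(samples)

    def pvalue(self, value):
        """Return the fraction of samples at or above ``value`` (upper tail)."""
        n_total = len(self.samples)
        first_at_or_above = bisect_left(self.samples, value)
        return (n_total - first_at_or_above) / n_total


# Illustrative numbers only: 3 of the 5 samples are >= 1.0.
dist = EmpiricalSketch([0.0, 0.5, 1.0, 2.0, 4.0])
print(dist.pvalue(1.0))
```

The point of the design is that no analytic form is needed: the tail probability comes directly from counting samples.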

src/pyhf/infer/calculators.py

lgtm-com · 2020-03-05T16:04:51Z

This pull request introduces 1 alert when merging 74161b7 into 49ab2a4 - view on LGTM.com

new alerts:

1 for First argument to super() is not enclosing class
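
The thread does not show the flagged code, so the following is only a generic sketch of the pattern this kind of alert usually points at. All class names are hypothetical. Passing a class other than the enclosing one as the first argument to `super()` starts method lookup after that class, which can silently skip an implementation.

```python
class Base:
    def describe(self):
        return ["Base"]


class Middle(Base):
    def describe(self):
        return ["Middle"] + super().describe()


class FlaggedLeaf(Middle):
    def describe(self):
        # Pitfall: naming the parent class starts the lookup *after* Middle,
        # so Middle.describe is skipped.
        return ["FlaggedLeaf"] + super(Middle, self).describe()


class FixedLeaf(Middle):
    def describe(self):
        # Zero-argument super() always refers to the enclosing class.
        return ["FixedLeaf"] + super().describe()


print(FlaggedLeaf().describe())
print(FixedLeaf().describe())
```

Using the zero-argument form avoids the problem entirely and survives class renames.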

lgtm-com · 2020-03-05T16:23:25Z

This pull request introduces 1 alert when merging 7425f01 into 49ab2a4 - view on LGTM.com

new alerts:

1 for First argument to super() is not enclosing class

src/pyhf/infer/mixins.py

lgtm-com · 2020-03-17T18:32:35Z

This pull request introduces 1 alert when merging b24726f into ed483f4 - view on LGTM.com

new alerts:

1 for First argument to super() is not enclosing class

src/pyhf/infer/mixins.py

codecov · 2020-03-18T17:31:49Z

Codecov Report

Merging #790 into master will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #790      +/-   ##
==========================================
+ Coverage   97.14%   97.18%   +0.04%     
==========================================
  Files          62       63       +1     
  Lines        3611     3663      +52     
  Branches      521      523       +2     
==========================================
+ Hits         3508     3560      +52     
  Misses         64       64              
  Partials       39       39

Flag	Coverage Δ
#unittests	`97.18% <100.00%> (+0.04%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
src/pyhf/cli/infer.py	`97.64% <100.00%> (+0.02%)`	⬆️
src/pyhf/infer/__init__.py	`100.00% <100.00%> (ø)`
src/pyhf/infer/calculators.py	`100.00% <100.00%> (ø)`
src/pyhf/infer/utils.py	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 69a5388...2c9ca25. Read the comment docs.

matthewfeickert

I haven't reviewed the notebook yet, but things look good here. I've added nitpick comments RE: things rendering out correctly in the docs. The main thing that needs to get added is tests, as the coverage on the patch is quite low.

This will be really nice to have in, so thanks for making this.

src/pyhf/infer/calculators.py

src/pyhf/infer/utils.py

matthewfeickert · 2020-03-31T04:05:31Z

@kratsg I made the following edits to the notebook and have some questions. I'm happy to discuss or roll back any of them.

Instead of using pylab just import numpy and matplotlib. I'm not really a big fan of pylab and I'd prefer to be very explicit to users what is happening.
Mention that mu' is the signal strength of the hypothesis being considered
Compute the Poisson uncertainty so that if the model were to change it is updated correctly

background_uncertainty = int(np.sqrt(background))

@lukasheinrich I think you didn't like this originally though, correct? It seems like this makes it easier for someone to experiment with the notebook.

Once the pyhf model is introduced switch to code formatting for channels and signal strengths to make it clear that we're connecting to the model information printed directly above.
I'd suggest first using the word "pseudo-experiments", which is more generally known, and then say '(or "toys" as particle physicists would say)'. Also introduce the term "throwing toys" ala Glen and Kyle.
Set a parameter to control the number of samples (n_samples)
This means to sample N=1000 from the pdfs

Shouldn't this be "N=2000" given the numbers that show up in the notebook?

Given the API choice in the PR (introducing utils.create_calculator) need to change to using pyhf.infer.utils.create_calculator
For now, we will create a toy-based calculator and evaluate it for data expected if the background-only hypothesis were true ( 𝜇=1.0 )

Shouldn't this be "the nominal signal+background model were true"? This is q_1/\tilde{q}_1. Or have I confused myself with notation again?

Increased the font size of the plots

plt.rcParams.update({"font.size": 14})

Set the x-axes tick marks to match Figure 5 of the Asymptotics paper more.
Make the plots larger (was 12,5) but keep spacing

fig.set_size_inches(13.5,6)
fig.tight_layout(pad=2.0)

Add titles to the plots
Add vertical-axis labels
Ran Black over it

Let me know your thoughts and if you disagree with anything that I can fix up. I think this is a really nice notebook and I'm very excited that we finally have this made now.
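
For readers new to the terminology above, here is a minimal sketch of "throwing toys" for a single-bin counting experiment, using only the standard library. It is not the notebook's code and not pyhf's `ToyCalculator`: the yields (`background`, `signal`, `observed`) are made-up values, and the test statistic is simply the event count rather than the profile likelihood ratio that pyhf uses. Only `n_samples = 2000` is taken from the discussion.

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw one Poisson-distributed count using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def throw_toys(rng, mean, n_samples):
    """Generate n_samples pseudo-experiments ("toys") for a given expected yield."""
    return [sample_poisson(rng, mean) for _ in range(n_samples)]


def tail_fraction(toys, observed):
    """Fraction of toys with a count at or below the observed count."""
    return sum(1 for n in toys if n <= observed) / len(toys)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_samples = 2000
background, signal, observed = 50.0, 10.0, 52  # hypothetical values

toys_b = throw_toys(rng, background, n_samples)            # mu = 0
toys_sb = throw_toys(rng, background + signal, n_samples)  # mu = 1

cl_b = tail_fraction(toys_b, observed)
cl_sb = tail_fraction(toys_sb, observed)
cl_s = cl_sb / cl_b
print(f"CL_b={cl_b:.3f} CL_sb={cl_sb:.3f} CL_s={cl_s:.3f}")
```

Exposing `n_samples` as a single parameter, as suggested in the list above, makes the trade-off between run time and statistical precision of the tail fractions easy to explore.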

kratsg · 2020-03-31T04:34:08Z

Compute the Poisson uncertainty so that if the model were to change it is updated correctly
background_uncertainty = int(np.sqrt(background))

we want to set the uncertainty explicitly. Revert this.

Shouldn't this be "N=2000" given the numbers that show up in the notebook?

should be 2k.

Given the API choice in the PR (introducing utils.create_calculator) need to change to using pyhf.infer.utils.create_calculator

no. pyhf.infer.__init__ imports this correctly. utils are meant to be used internally (imo).

For now, we will create a toy-based calculator and evaluate it for data expected if the background-only hypothesis were true ( 𝜇=1.0 )

Shouldn't this be "the nominal signal+background model were true"? This is q_1/\tilde{q}_1. Or have I confused myself with notation again?

See chat with Lukas offline.
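
For reference while following this exchange, these are the definitions of the two test statistics named in the question, as given in the asymptotics paper the thread refers to. In that paper's convention $\mu = 0$ corresponds to the background-only hypothesis and $\mu = 1$ to the nominal signal-plus-background hypothesis. This restates standard definitions only; it does not settle how the notebook sentence should be worded.

```latex
q_\mu =
\begin{cases}
  -2 \ln \lambda(\mu) & \hat{\mu} \le \mu, \\
  0 & \hat{\mu} > \mu,
\end{cases}
\qquad
\lambda(\mu) = \frac{L\big(\mu, \hat{\hat{\theta}}(\mu)\big)}{L\big(\hat{\mu}, \hat{\theta}\big)}

\tilde{q}_\mu =
\begin{cases}
  -2 \ln \dfrac{L\big(\mu, \hat{\hat{\theta}}(\mu)\big)}{L\big(0, \hat{\hat{\theta}}(0)\big)} & \hat{\mu} < 0, \\[2ex]
  -2 \ln \dfrac{L\big(\mu, \hat{\hat{\theta}}(\mu)\big)}{L\big(\hat{\mu}, \hat{\theta}\big)} & 0 \le \hat{\mu} \le \mu, \\[2ex]
  0 & \hat{\mu} > \mu.
\end{cases}
```

Here $\hat{\mu}$ and $\hat{\theta}$ are the unconditional maximum-likelihood estimates and $\hat{\hat{\theta}}(\mu)$ is the conditional estimate of the nuisance parameters for fixed $\mu$.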

Make the plots larger (was 12,5) but keep spacing

you lose the square ratio. Should be at least 14,6 if you're doing that.

Increased the font size of the plots
plt.rcParams.update({"font.size": 14})

this is nitpicking for the notebook and should be removed.

Set the x-axes tick marks to match Figure 5 of the Asymptotics paper more.

nitpicking as well. should be removed.

fig.tight_layout(pad=2.0)

we don't need tight layout do we? the artists fit fine in the boundary box by default.

matthewfeickert · 2020-03-31T04:55:38Z

we want to set the uncertainty explicitly. Revert this.

Done

no. pyhf.infer.__init__ imports this correctly. utils are meant to be used internally (imo).

Seems we have a problem then

For now, we will create a toy-based calculator and evaluate it for data expected if the background-only hypothesis were true ( 𝜇=1.0 )

Shouldn't this be "the nominal signal+background model were true"? This is q_1/\tilde{q}_1. Or have I confused myself with notation again?

See chat with Lukas offline.

The Skype chat still doesn't make sense. I'll follow up again.

you lose the square ratio. Should be at least 14,6 if you're doing that.

Done. Changed to 14,6 .

Increased the font size of the plots
plt.rcParams.update({"font.size": 14})
this is nitpicking for the notebook and should be removed.

The font is pretty small without it. :/

Set the x-axes tick marks to match Figure 5 of the Asymptotics paper more.

nitpicking as well. should be removed.

Is the spacing too tight?

we don't need tight layout do we? the artists fit fine in the boundary box by default.

I think that these plots can still be confusing (even though there are the nice plot legends) without axis labels. If you don't set padding and apply a vertical-axis label then the right hand plot's label gets scrunched up against the left hand side. The padding solves this.

kratsg · 2020-03-31T17:48:20Z

Seems we have a problem then

I'm realizing that create_calculator shouldn't be exposed by default unless people want to do more advanced things. The interface via pyhf.infer.hypotest exists, so this is good.

kratsg · 2020-03-31T17:49:04Z

The font is pretty small without it. :/
Is the spacing too tight?

ok. I guess as long as it's defined at the top of the notebook, then it's fine.

matthewfeickert · 2020-03-31T17:59:56Z

This is more for me, but this is an nbviewer render of the toys notebook in this PR: (also weirdly you have to set flush_cache to false to get it to update).

src/pyhf/infer/calculators.py

matthewfeickert

@lukasheinrich @kratsg Unless there is anything else that you want to go in this, I think that once we get examples in the docstrings this PR can probably get reviewed a final time and go in. That is, unless you want to split off the notebook into its own PR at the end.

src/pyhf/infer/utils.py

tests/test_notebooks.py

matthewfeickert · 2020-10-28T18:42:25Z

@lukasheinrich @kratsg I think we can review this now. Also, as ReviewNB wasn't added to the repo before this PR was started, here's an nbviewer render of the toys notebook in this PR.

matthewfeickert

thank you @lukasheinrich for your work on this and a huge amount of patience.

matthewfeickert assigned lukasheinrich Mar 4, 2020

matthewfeickert added the feat/enhancement (New feature or request) and API (Changes the public API) labels Mar 4, 2020

lukasheinrich commented Mar 5, 2020

src/pyhf/infer/calculators.py

lukasheinrich commented Mar 5, 2020

src/pyhf/infer/calculators.py

kratsg force-pushed the toycalc branch from 47f5bba to 74161b7 Compare March 5, 2020 15:57

kratsg force-pushed the toycalc branch from 74161b7 to 7425f01 Compare March 5, 2020 16:16

lukasheinrich commented Mar 17, 2020

src/pyhf/infer/mixins.py (outdated)

lukasheinrich commented Mar 18, 2020

src/pyhf/infer/mixins.py (outdated)

kratsg force-pushed the toycalc branch 2 times, most recently from bac9c58 to 5479452 Compare March 18, 2020 15:39

lukasheinrich changed the title ~~[WIP] toy calculator~~ feat: toy calculator Mar 18, 2020

kratsg mentioned this pull request Mar 18, 2020

Parallelism of calculations in pyhf ala joblib (or similar) #807

Open

kratsg force-pushed the toycalc branch from 160072f to 3b87eab Compare March 30, 2020 13:44

matthewfeickert reviewed Mar 30, 2020

matthewfeickert reviewed Mar 31, 2020

src/pyhf/infer/calculators.py (outdated)

matthewfeickert added the tests (pytest) and docs (Documentation related) labels Apr 1, 2020

matthewfeickert reviewed Apr 1, 2020

matthewfeickert mentioned this pull request Apr 1, 2020

docs / naming: test statistics #559

Open

kratsg added 7 commits October 28, 2020 07:32

fix imports

e7662e7

fix up spacing

b937fbb

actually test toys notebook

356f82a

fix import

a522bd0

fix doctest

bbc08ad

add test for coverage in backends

6e29fbd

switch to tensorlib backend

6efc1b1

kratsg force-pushed the toycalc branch from 9dc255e to 6efc1b1 Compare October 28, 2020 11:32

qtilde=True default

92ad033

matthewfeickert reviewed Oct 28, 2020

src/pyhf/infer/utils.py (outdated)

matthewfeickert added 4 commits October 28, 2020 11:39

Use :obj:

507d1a8

Keep docstrings to 80 chars when possible

f5a683e

Use pyhf.infer.test_statistics.qmu_tilde API

61e6c12

Set qtilde for the two toy calculators

6fb6769

matthewfeickert reviewed Oct 28, 2020

src/pyhf/infer/utils.py (outdated)

matthewfeickert reviewed Oct 28, 2020

tests/test_notebooks.py (outdated)

matthewfeickert added 2 commits October 28, 2020 12:23

Explicitly give qtilde option in example

c5a2bd5

Add comment to relevant PR for adding percentile

6e3dca9

matthewfeickert requested a review from kratsg October 28, 2020 18:38

kratsg added 2 commits October 28, 2020 15:02

add :obj:

677603b

remove slow

2c9ca25

kratsg approved these changes Oct 28, 2020

matthewfeickert approved these changes Oct 28, 2020

kratsg merged commit 81c9adb into master Oct 28, 2020

kratsg deleted the toycalc branch October 28, 2020 20:36

matthewfeickert mentioned this pull request Oct 28, 2020

Enable different sigmas for signal and background #1155

Open

matthewfeickert mentioned this pull request Mar 1, 2022

Additional documentation needed on difference between calculator type test statistic return type #1792

Open

1 task
