Simplify TemplateSource, CombinedSource and SpectrumTemplateSource #69

Merged 25 commits on Aug 16, 2023

Changes from 9 commits

Commits (25)
ec0bc27
Simplify TemplateSource
dachengx Jul 31, 2023
cd0a594
Update settings of utils, add SR2 to demo CombinedSource
dachengx Jul 31, 2023
1f9251b
Minor change
dachengx Jul 31, 2023
789e3f2
Set numpy version to less than 1.23.0
dachengx Jul 31, 2023
e1aac2a
Update docstring to Google style
dachengx Jul 31, 2023
ef53ff8
Check needed weighted parameters of CombinedSource
dachengx Aug 1, 2023
459c93b
Update docstring
dachengx Aug 1, 2023
e1cf100
Merge remote-tracking branch 'origin/main' into simplify_source
dachengx Aug 1, 2023
698e0ba
No need to use multiple inheritance
dachengx Aug 1, 2023
c7252e8
Merge branch 'main' into simplify_source
dachengx Aug 1, 2023
462fb5b
Raise error when there is no key in the json file
dachengx Aug 2, 2023
20b53de
Merge remote-tracking branch 'origin/main' into simplify_source
dachengx Aug 4, 2023
de2044d
Recover logging, update docstring
dachengx Aug 4, 2023
07fb1c0
Kill cache of coverage
dachengx Aug 4, 2023
6fd9c19
Test SpectrumTemplateSource
dachengx Aug 6, 2023
1356abb
Demo future callable uncertainty
dachengx Aug 6, 2023
52c2681
Use latest numpy
dachengx Aug 8, 2023
bf29d49
Rename histogram_multiplier to histogram_scale_factor
dachengx Aug 10, 2023
c2b0cf3
Merge remote-tracking branch 'origin/main' into simplify_source
dachengx Aug 16, 2023
b3955f3
Test template source separately in another test module
dachengx Aug 16, 2023
b063d6e
Remove double quotes if not necessary
dachengx Aug 16, 2023
4f5508a
Forget to upload this file
dachengx Aug 16, 2023
68d2a2a
tiny typo
kdund Aug 16, 2023
72b4e17
tiny docstring change
kdund Aug 16, 2023
acbd7db
Fix little typo
dachengx Aug 16, 2023
2 changes: 1 addition & 1 deletion README.md
@@ -3,7 +3,7 @@ A tool to perform toyMC-based inference constructions

[![DOI](https://zenodo.org/badge/654100988.svg)](https://zenodo.org/badge/latestdoi/654100988)
[![Test package](https://github.com/XENONnT/alea/actions/workflows/pytest.yml/badge.svg?branch=main)](https://github.com/XENONnT/alea/actions/workflows/pytest.yml)
[![Coverage Status](https://coveralls.io/repos/github/XENONnT/alea/badge.svg?branch=main)](https://coveralls.io/github/XENONnT/alea?branch=main)
[![Coverage Status](https://coveralls.io/repos/github/XENONnT/alea/badge.svg?branch=main&kill_cache=1)](https://coveralls.io/github/XENONnT/alea?branch=main&kill_cache=1)
[![PyPI version shields.io](https://img.shields.io/pypi/v/alea-inference.svg)](https://pypi.python.org/pypi/alea-inference/)
[![Readthedocs Badge](https://readthedocs.org/projects/alea/badge/?version=latest)](https://alea.readthedocs.io/en/latest/?badge=latest)
[![CodeFactor](https://www.codefactor.io/repository/github/xenonnt/alea/badge)](https://www.codefactor.io/repository/github/xenonnt/alea)
10 changes: 10 additions & 0 deletions alea/examples/configs/cs1_spectrum.json
@@ -0,0 +1,10 @@
{
"coordinate_system": [
0.0,
100.0
],
"map": [
1.0,
1.0
]
}
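For context: "coordinate_system" lists the interpolation support points and "map" the spectrum values, here a flat unit spectrum on [0, 100]. A minimal sketch of how such a file is read, mirroring the `_get_json_spectrum` helper added to `alea/template_source.py` in this PR (path resolution via `get_file_path` is omitted):

```python
import json

from scipy.interpolate import interp1d

# Sketch: interpolate a bbf-style spectrum file as _get_json_spectrum does;
# outside the coordinate system the spectrum evaluates to 0.
with open("cs1_spectrum.json") as f:
    contents = json.load(f)

spectrum = interp1d(
    contents["coordinate_system"], contents["map"],
    bounds_error=False, fill_value=0.)

print(spectrum(50.0))   # 1.0, flat unit spectrum inside [0, 100]
print(spectrum(150.0))  # 0.0, outside the support
```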
53 changes: 45 additions & 8 deletions alea/examples/configs/unbinned_wimp_statistical_model.yaml
@@ -17,7 +17,12 @@ parameter_definition:
livetime_sr2:
nominal_value: 0.5
fittable: false
description: Livetime of SR1 in years
description: Livetime of SR2 in years

livetime_sr3:
nominal_value: 1.0
fittable: false
description: Livetime of SR3 in years

wimp_rate_multiplier:
nominal_value: 1.0
@@ -56,7 +61,7 @@ parameter_definition:
er_band_shift:
nominal_value: 0
ptype: shape
uncertainty: null # 'scipy.stats.uniform(loc=-1, scale=2)'
uncertainty: null # stats.uniform(loc=0, scale=2).logpdf
Review comment (Collaborator): I think making all our tests use the same large config is not optimal -- could we make smaller likelihoods for each aspect of the test?

# relative_uncertainty: false
fittable: true
blueice_anchors:
@@ -79,8 +84,8 @@ likelihood_config:
default_source_class: alea.template_source.TemplateSource
likelihood_type: blueice.likelihood.UnbinnedLogLikelihood
analysis_space:
- "cs1": 'np.arange(0, 102, 2)'
- "cs2": 'np.geomspace(100, 100000, 51)'
- cs1: np.linspace(0, 100, 51)
- cs2: np.geomspace(100, 100000, 51)
in_events_per_bin: true
livetime_parameter: livetime_sr0
slice_args: {}
@@ -110,8 +115,8 @@ likelihood_config:
default_source_class: alea.template_source.TemplateSource
likelihood_type: blueice.likelihood.UnbinnedLogLikelihood
analysis_space:
- "cs1": 'np.arange(0, 102, 2)'
- "cs2": 'np.geomspace(100, 100000, 51)'
- cs1: np.linspace(0, 100, 51)
- cs2: np.geomspace(100, 100000, 51)
in_events_per_bin: true
livetime_parameter: livetime_sr1
slice_args: {}
@@ -141,8 +146,8 @@ likelihood_config:
default_source_class: alea.template_source.TemplateSource
likelihood_type: blueice.likelihood.UnbinnedLogLikelihood
analysis_space:
- "cs1": 'np.arange(0, 102, 2)'
- "cs2": 'np.geomspace(100, 100000, 51)'
- cs1: np.linspace(0, 100, 51)
- cs2: np.geomspace(100, 100000, 51)
in_events_per_bin: true
livetime_parameter: livetime_sr2
slice_args: {}
@@ -170,3 +175,35 @@ likelihood_config:
template_filename: wimp50gev_template.h5
apply_efficiency: True
efficiency_name: signal_efficiency

# SR3
- name: sr3
default_source_class: alea.template_source.TemplateSource
likelihood_type: blueice.likelihood.UnbinnedLogLikelihood
analysis_space:
- cs1: np.linspace(0, 100, 51)
- cs2: np.geomspace(100, 100000, 51)
in_events_per_bin: true
livetime_parameter: livetime_sr3
slice_args: {}
sources:
- name: er
histname: er_template
parameters:
- er_rate_multiplier
- er_band_shift
named_parameters:
- er_band_shift
template_filename: er_template_{er_band_shift}.h5

- name: wimp
class: alea.template_source.SpectrumTemplateSource
histname: wimp_template
parameters:
- wimp_rate_multiplier
- wimp_mass
- signal_efficiency
template_filename: wimp50gev_template.h5
spectrum_name: cs1_spectrum.json
apply_efficiency: True
efficiency_name: signal_efficiency
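Note that the `analysis_space` entries above are strings, not arrays. A sketch of how such strings could be evaluated into bin edges; the helper below is hypothetical, since alea's actual config parsing is not part of this diff:

```python
import numpy as np

# Hypothetical helper: evaluate analysis_space strings such as
# "np.linspace(0, 100, 51)" into bin-edge arrays, exposing only numpy.
def parse_analysis_space(analysis_space):
    parsed = []
    for entry in analysis_space:
        for name, expression in entry.items():
            parsed.append((name, eval(expression, {"np": np})))
    return parsed

space = parse_analysis_space([
    {"cs1": "np.linspace(0, 100, 51)"},
    {"cs2": "np.geomspace(100, 100000, 51)"},
])
print(space[0][0], space[0][1].size)  # cs1 51 -> 51 edges, i.e. 50 bins
```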
21 changes: 11 additions & 10 deletions alea/parameters.py
@@ -1,7 +1,7 @@
import warnings
from typing import Any, Dict, List, Optional, Tuple

import scipy
from alea.utils import within_limits, clip_limits


class Parameter:
@@ -46,8 +46,8 @@ def __init__(
self.nominal_value = nominal_value
self.fittable = fittable
self.ptype = ptype
self.uncertainty = uncertainty
self.relative_uncertainty = relative_uncertainty
self.uncertainty = uncertainty
self.blueice_anchors = blueice_anchors
self.fit_limits = fit_limits
self.parameter_interval_bounds = parameter_interval_bounds
@@ -66,21 +66,22 @@ def __repr__(self) -> str:
def uncertainty(self) -> float or Any:
"""
Return the uncertainty of the parameter.
If the uncertainty is a string, it can be evaluated as a numpy or scipy function.
"""
if isinstance(self._uncertainty, str):
# Evaluate the uncertainty if it's a string starting with "scipy." or "numpy."
if self._uncertainty.startswith("scipy.") or self._uncertainty.startswith("numpy."):
return eval(self._uncertainty)
else:
raise ValueError(
f"Uncertainty string '{self._uncertainty}'"
" must start with 'scipy.' or 'numpy.'")
raise NotImplementedError(
"Only float uncertainties are supported at the moment.")

Review comment (Collaborator): What is the challenge in not including this functionality for now?
else:
return self._uncertainty

@uncertainty.setter
def uncertainty(self, value: float or str) -> None:
"""
Uncertainty can be a float or a string, but must be a float when relative_uncertainty is set.
"""
if self.relative_uncertainty and isinstance(value, str):
raise ValueError(
f"relative_uncertainty is not supported for "
f"string uncertainties of {self.name}.")
self._uncertainty = value

@property
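A minimal sketch of the string-uncertainty path kept in this hunk; the `scipy.stats.norm` expression is an illustrative value, not one from this PR:

```python
import scipy.stats  # binds the "scipy" name so the eval'ed expression resolves

# Strings starting with "scipy." or "numpy." are eval'ed into an object
# (e.g. a frozen distribution); anything else is rejected.
uncertainty = "scipy.stats.norm(loc=0, scale=1)"
if uncertainty.startswith("scipy.") or uncertainty.startswith("numpy."):
    prior = eval(uncertainty)
    print(prior.logpdf(0.5))  # approx. -1.044
else:
    raise NotImplementedError("Only float uncertainties are supported at the moment.")
```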
129 changes: 66 additions & 63 deletions alea/template_source.py
@@ -7,7 +7,9 @@
from blueice import HistogramPdfSource
from inference_interface import template_to_multihist

logging.basicConfig(level=logging.debug)
from alea.utils import get_file_path

logging.basicConfig(level=logging.INFO)
can_check_binning = True


@@ -122,49 +124,44 @@ def apply_slice_args(self, h, slice_args=None):
Args:
h (multihist.MultiHistBase): The histogram to apply the slice arguments to.
slice_args (dict): The slice arguments to apply.
The sum_axis, slice_axis, and slice_axis_limits are supported.
"""
if slice_args is None:
slice_args = self.config.get("slice_args", {})
if isinstance(slice_args, dict):
slice_args = [slice_args]

for sa in slice_args:
sum_axis = sa.get("sum_axis", None)
slice_axis = sa.get("slice_axis", None)
# Decide if you wish to sum the histogram into lower dimensions or not
sum_axis = sa.get("sum_axis", None)
collapse_axis = sa.get("collapse_axis", None)
collapse_slices = sa.get("collapse_slices", None)
if slice_axis is not None:
slice_axis_number = h.get_axis_number(slice_axis)
bes = h.bin_edges[slice_axis_number]
# When slice_axis is None, slice_axis_limits is not used
bin_edges = h.bin_edges[h.get_axis_number(slice_axis)]
slice_axis_limits = sa.get(
"slice_axis_limits", [bes[0], bes[-1]])
if sum_axis:
logging.debug(
f"Slice and sum over axis {slice_axis} from {slice_axis_limits[0]} "
f"to {slice_axis_limits[1]}")
axis_names = h.axis_names_without(slice_axis)
h = h.slicesum(
axis=slice_axis,
start=slice_axis_limits[0],
stop=slice_axis_limits[1])
h.axis_names = axis_names
else:
logging.debug(f"Normalization before slicing: {h.n}.")
logging.debug(
f"Slice over axis {slice_axis} from {slice_axis_limits[0]} "
f"to {slice_axis_limits[1]}")
h = h.slice(
axis=slice_axis,
start=slice_axis_limits[0],
stop=slice_axis_limits[1])
logging.debug(f"Normalization after slicing: {h.n}.")

if collapse_axis is not None:
if collapse_slices is None:
raise ValueError(
"To collapse you must supply collapse_slices")
h = h.collapse_bins(collapse_slices, axis=collapse_axis)
"slice_axis_limits", [bin_edges[0], bin_edges[-1]])
# sum and/or slice the histogram
if (sum_axis is not None) and (slice_axis is not None):
logging.debug(
f"Slice and sum over axis {slice_axis} from {slice_axis_limits[0]} "
f"to {slice_axis_limits[1]}")
h = h.slicesum(
start=slice_axis_limits[0],
stop=slice_axis_limits[1],
axis=slice_axis)
elif sum_axis is not None:
logging.debug(
f"Sum over axis {sum_axis}")
h = h.sum(axis=sum_axis)
elif slice_axis is not None:
logging.debug(
f"Slice over axis {slice_axis} from {slice_axis_limits[0]} "
f"to {slice_axis_limits[1]}")
logging.debug(f"Normalization before slicing: {h.n}.")
h = h.slice(
start=slice_axis_limits[0],
stop=slice_axis_limits[1],
axis=slice_axis)
logging.debug(f"Normalization after slicing: {h.n}.")
return h
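A usage sketch for the rewritten branches; the axis names and limits below are illustrative, not taken from this PR:

```python
# Illustrative slice_args for the rewritten apply_slice_args; each entry can
# combine slice_axis/slice_axis_limits with sum_axis.
slice_args = [
    # slice "cs1" to [0, 50] and sum that axis away in a single slicesum call
    {"slice_axis": "cs1", "slice_axis_limits": [0, 50], "sum_axis": True},
    # only slice "cs2", keeping the axis
    {"slice_axis": "cs2", "slice_axis_limits": [100, 10000]},
]
# h = source.apply_slice_args(h, slice_args=slice_args)
```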

def set_dtype(self):
@@ -176,7 +173,7 @@ def set_dtype(self):

def set_pdf_histogram(self, h):
"""Set the histogram of the probability density function of the source."""
self._bin_volumes = h.bin_volumes() # TODO: make alias
self._bin_volumes = h.bin_volumes()
# Should not be in HistogramSource... anyway
self._n_events_histogram = h.similar_blank_histogram()

Expand Down Expand Up @@ -216,18 +213,16 @@ def simulate(self, n_events):
numpy.ndarray: The simulated events.
"""
ret = np.zeros(n_events, dtype=self.dtype)
# t = self._pdf_histogram.get_random(n_events)
h = self._pdf_histogram * self._bin_volumes
t = h.get_random(n_events)
for i, (n, _) in enumerate(self.config["analysis_space"]):
ret[n] = t[:, i]
return ret
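The multiplication by `self._bin_volumes` turns the stored density into a probability mass per bin before sampling. A small numpy illustration of the idea (not alea code):

```python
import numpy as np

# Density times bin volume gives probability mass per bin; drawing bins
# proportionally to that mass is roughly what multihist's get_random does.
rng = np.random.default_rng(0)
bin_edges = np.linspace(0, 100, 51)
density = np.full(50, 1 / 100)       # flat pdf, in events per unit cs1
mass = density * np.diff(bin_edges)  # pdf * bin volumes, sums to 1
samples = rng.choice(50, size=1000, p=mass)
print(samples.min(), samples.max())  # bin indices within [0, 49]
```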


class CombinedSource(TemplateSource, HistogramPdfSource):
class CombinedSource(TemplateSource):
"""
Source that inherits structure from TH2DSource by Jelle,
but takes in lists of histogram names.
Source that combines a list of histograms specified by name.
The first histogram is the base histogram, and the rest are added to it with weights.
The weights can be set as shape parameters in the config.

@@ -243,7 +238,7 @@ def build_histogram(self):
if not self.config.get("in_events_per_bin", True):
raise ValueError(
"CombinedSource does not support in_events_per_bin=False")
# check if all the necessary parameters, like weights are specified
# Check if all the necessary parameters, like weights are specified
if "weight_names" not in self.config:
raise ValueError("weight_names must be specified")
find_weight = [weight_name in self.config for weight_name in self.config["weight_names"]]
@@ -302,8 +297,7 @@ def build_histogram(self):
hsliced = h_comp.get_axis_bin_index(bincs[0], j)
hsliceu = h_comp.get_axis_bin_index(bincs[-1], j) + 1
hslices.append(slice(hsliced, hsliceu, 1))
# TODO: check the normalization here.
h_comp[hslices] += h.histogram
h_comp[tuple(hslices)] += h.histogram
histograms[0] += h_comp * weights[i]
h = histograms[0]
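The combination rule here: the first histogram is the base, and every further histogram is aligned to the base binning and added with its weight. A numpy sketch with made-up values:

```python
import numpy as np

# histograms[0] += h_comp * weights[i], applied for each extra histogram
base = np.array([10.0, 20.0, 30.0])
extras = [np.array([1.0, 2.0, 3.0])]
weights = [0.5]  # values of the "weight_names" shape parameters
combined = base.copy()
for h_extra, w in zip(extras, weights):
    combined += w * h_extra
print(combined)  # [10.5 21.  31.5]
```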

@@ -334,32 +328,37 @@ def build_histogram(self):
self.set_pdf_histogram(h)


class SpectrumTemplateSource(TemplateSource, HistogramPdfSource):
class SpectrumTemplateSource(TemplateSource):
"""
Reweighted template source by energy spectrum.
The first axis of the template is assumed to be energy.
Template source reweighted by a 1D spectrum.
The reweighting is applied along the first axis of the template.

Args:
spectrum_name:
Name of bbf json-like spectrum _OR_ function that can be called
templatename #3D histogram (Etrue, S1, S2) to open
Name of bbf json-like spectrum file
"""

@staticmethod
def _get_json_spectrum(filename):
"""
Translates bbf-style JSON files to spectra.
units are keV and /kev*day*kg
Translates bbf-style JSON files to spectra

Args:
filename (str): Name of the JSON file.

Todo:
Define the format of the JSON file clearly.
"""
with open(filename, "r") as f:
with open(get_file_path(filename), "r") as f:
contents = json.load(f)
logging.debug(contents["description"])
esyst = contents["coordinate_system"][0][1]
if "description" in contents:
logging.debug(contents["description"])
if "coordinate_system" not in contents:
raise ValueError("Coordinate system not in JSON file.")
if "map" not in contents:
raise ValueError("Map not in JSON file.")
ret = interp1d(
np.linspace(*esyst), contents["map"],
contents["coordinate_system"], contents["map"],
bounds_error=False, fill_value=0.)
return ret

@@ -369,16 +368,20 @@ def build_histogram(self):
histname = self.config["histname"].format(**self.format_named_parameters)
h = template_to_multihist(templatename, histname)

spectrum = self.config["spectrum"]
if isinstance(spectrum, str):
spectrum = self._get_json_spectrum(spectrum.format(**self.format_named_parameters))

# Perform E-scaling, assume first axis is energy
ecenters = h.bin_centers[0]
ediffs = h.bin_volumes[0]
h.histogram = h.histogram * (spectrum(ecenters) * ediffs)[:, None, None]
h = h.sum(axis=0) # Remove energy-axis
logging.debug("source", "mu ", h.n)
if "spectrum_name" not in self.config:
raise ValueError("spectrum_name not in config")
spectrum = self._get_json_spectrum(
self.config["spectrum_name"].format(**self.format_named_parameters))

# Perform scaling, the first axis is assumed to be reweighted
# The spectrum is assumed to be probability density (in per the unit of first axis).
axis = 0
# h = h.normalize(axis=axis)
bin_edges = h.bin_edges[axis]
bin_centers = h.bin_centers(axis=axis)
slices = [None] * h.histogram.ndim
slices[axis] = slice(None)
h.histogram = h.histogram * (spectrum(bin_centers) * np.diff(bin_edges))[tuple(slices)]

if np.min(h.histogram) < 0:
raise AssertionError(
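The `slices` construction above is a dimension-agnostic broadcast: it reshapes the 1D weight `spectrum(bin_centers) * np.diff(bin_edges)` so it multiplies the first axis only. A standalone numpy sketch:

```python
import numpy as np

# Reweight axis 0 of an n-dimensional histogram without hard-coding ndim.
hist = np.ones((4, 3, 2))
weights = np.array([0.1, 0.2, 0.3, 0.4])  # spectrum(bin_centers) * bin widths
slices = [None] * hist.ndim
slices[0] = slice(None)                   # equivalent to weights[:, None, None]
hist = hist * weights[tuple(slices)]
print(hist[:, 0, 0])                      # [0.1 0.2 0.3 0.4]
```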