Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Michaelgemmell/bayes workflow #262

Merged
merged 166 commits into from
Aug 17, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
166 commits
Select commit Hold shift + click to select a range
b7425f1
initial commit
michael-gemmell Mar 28, 2023
aae4c47
magnetic recon tests
michael-gemmell Mar 28, 2023
3100262
use error attribute if it is given
michael-gemmell Mar 30, 2023
ba574ec
Added poisson noise and adjusted calibration factor
michael-gemmell Mar 30, 2023
d5fec79
changed name magnetic_recon -> equilibrium_reconstruction
michael-gemmell Mar 30, 2023
56c769a
diagnostic data with channel dim are treated as individual measurements
michael-gemmell Mar 30, 2023
91d7e66
added function of example LOS
michael-gemmell Mar 31, 2023
37cd733
refactoring
michael-gemmell Mar 31, 2023
6dbebc6
refactoring
michael-gemmell Mar 31, 2023
57c6a66
refactoring
michael-gemmell Mar 31, 2023
ac904b6
initial commit
michael-gemmell Mar 31, 2023
4fa0bb1
removed obsolete attributes
michael-gemmell Mar 31, 2023
4a5c5b3
Setting up testflow
michael-gemmell Mar 31, 2023
31f306b
more tests
michael-gemmell Mar 31, 2023
66e6b22
more test stubs
michael-gemmell Mar 31, 2023
b9e61c2
small fix
michael-gemmell Apr 12, 2023
1e51df8
Merge branch 'michaelgemmell/magnetic_recon_diagnostic' into michaelg…
michael-gemmell Apr 25, 2023
ccfe635
cosmetic changes
michael-gemmell Apr 25, 2023
c8cd11b
changed marker style
michael-gemmell Apr 25, 2023
b386b48
merged with st40
michael-gemmell Apr 25, 2023
dced6d5
Merge branch 'st40' into michaelgemmell/bayes_dev
michael-gemmell Apr 27, 2023
5655cef
small fix
michael-gemmell Apr 27, 2023
40e5cb5
adjusting alpha
michael-gemmell Apr 27, 2023
60f26f9
Merge branch 'st40' into michaelgemmell/bayes_dev
michael-gemmell May 3, 2023
210bf7c
black reformatting
michael-gemmell May 3, 2023
efe2281
fixing formatting
michael-gemmell May 3, 2023
7ac95c4
flake8 fix
michael-gemmell May 3, 2023
cea6161
flake8 fix
michael-gemmell May 3, 2023
5baea04
fixing merge conflicts
michael-gemmell May 3, 2023
0aac384
boolean indexing -> xarray.where in call
michael-gemmell May 3, 2023
0259f22
fixing tests
michael-gemmell May 3, 2023
8cf4567
removing import
michael-gemmell May 3, 2023
8170cdd
precommit fixes
michael-gemmell May 3, 2023
96f1f5c
stashing for quick result
michael-gemmell May 12, 2023
7e1fda1
added calc_impurity for generating profiles from concentrations to up…
michael-gemmell May 22, 2023
402cd2d
added conditional handling of diagnostics in data pipeline
michael-gemmell May 22, 2023
fb7704f
moved calc_impurity to plasma method
michael-gemmell May 22, 2023
4997194
tweaking calibration factor
michael-gemmell May 22, 2023
c8e821f
tweaking calibration factor
michael-gemmell May 22, 2023
5bb0b51
tweaking calibration factor
michael-gemmell May 22, 2023
457da6b
changed burn_in to burn_in_fraction
michael-gemmell May 23, 2023
4b88ffe
added violinplots
michael-gemmell May 23, 2023
86e3784
removed comment
michael-gemmell May 23, 2023
eb13c00
included dictionary logic for fitting impurities scaled by concentrat…
michael-gemmell May 23, 2023
8987311
changed name cxff_pi.ti_0d to cxff_pi.ti0
michael-gemmell May 23, 2023
a9b6b18
new option for initialising at maximum-likelihood estimate
michael-gemmell May 24, 2023
1fa8427
added xlim for violin plot
michael-gemmell May 24, 2023
db2d0e1
tweaking calibration
michael-gemmell May 31, 2023
ce73ee5
fixed neutral density units
michael-gemmell May 31, 2023
eda445a
added set plot functions
michael-gemmell May 31, 2023
2555605
added center mass sampling and Ti/Te y0_ref options
michael-gemmell May 31, 2023
20ae704
plot formatting changed
michael-gemmell May 31, 2023
e0b9c03
black formatting
michael-gemmell May 31, 2023
dfabdd0
deleted example bayes_opt
michael-gemmell May 31, 2023
b21b66d
moved sample with autocorr function to bayes_dev_workflow
michael-gemmell May 31, 2023
2f78d9d
renamed bayesworkflow
michael-gemmell May 31, 2023
9ed488e
stashing
michael-gemmell Jun 13, 2023
533a44a
multiplot formatting
michael-gemmell Jun 22, 2023
751ca81
generalised writing kinetic profiles to blobs
michael-gemmell Jun 27, 2023
59c8316
function autocorr plot added
michael-gemmell Jun 27, 2023
b2dee2d
ion density added to kinetic profiles saved
michael-gemmell Jun 27, 2023
8eee24e
inital commit of abstract workflow and example of usage
michael-gemmell Jun 27, 2023
408b619
doc strings
michael-gemmell Jun 27, 2023
de237dd
renaming workflow
michael-gemmell Jun 27, 2023
06c0746
Ti_ref as default
michael-gemmell Jun 30, 2023
5435769
abstract methods added
michael-gemmell Jun 30, 2023
ee3a62e
Changed to new MDSPlus structure
michael-gemmell Jul 4, 2023
a8ca5fa
Adding default methods
michael-gemmell Jul 4, 2023
6f1e330
moved sampling to it's own function in abstract class
michael-gemmell Jul 4, 2023
dd3abdf
moved sampling to it's own function in abstract class
michael-gemmell Jul 4, 2023
6f9d4a8
renamed abstract class to AbstractBayesWorkflow
michael-gemmell Jul 4, 2023
55e2f8f
comments
michael-gemmell Jul 4, 2023
d3bf315
added high density sampling as method
michael-gemmell Jul 4, 2023
430a44b
added attributes to __init__
michael-gemmell Jul 4, 2023
d633ae5
error handling __init__
michael-gemmell Jul 4, 2023
35b64cf
error handling __init__
michael-gemmell Jul 4, 2023
633b36c
moved read data to abstract class
michael-gemmell Jul 4, 2023
2ce1c58
initial commit
michael-gemmell Jul 7, 2023
08cf081
kwargs added
michael-gemmell Jul 7, 2023
82eb00b
xrcs wavelength units corrected
michael-gemmell Jul 7, 2023
666387b
background to int
michael-gemmell Jul 7, 2023
3690efb
stash
michael-gemmell Jul 11, 2023
3c23661
renamed impurities
michael-gemmell Jul 11, 2023
bf3bfb9
stashing
michael-gemmell Jul 12, 2023
fbfafe5
stashing
michael-gemmell Jul 13, 2023
9f62a78
stashing
michael-gemmell Jul 14, 2023
3828c49
fixing key names
michael-gemmell Jul 14, 2023
310a440
updating and testing example
michael-gemmell Jul 14, 2023
d8f9be5
added background as attribute of helike model
michael-gemmell Jul 14, 2023
1f0a0f9
fixed phantom methods
michael-gemmell Jul 14, 2023
7623a54
rearranged methods
michael-gemmell Jul 14, 2023
b3b7a26
moved background to call kwarg
michael-gemmell Aug 2, 2023
4a7e022
params and kwargs now are given as model_{var} and model prefix is re…
michael-gemmell Aug 2, 2023
e78f638
adding _phantom_data and _exp_data abstract methods
michael-gemmell Aug 2, 2023
82d6fee
renaming to BayesWorkflowExample
michael-gemmell Aug 2, 2023
2c1c48a
initial commit
michael-gemmell Aug 2, 2023
c7a2ca5
renamed phantom_params to profile_params
michael-gemmell Aug 2, 2023
db25e70
renamed bayesopt -> bayesmodel
michael-gemmell Aug 2, 2023
b1c5bd3
renamed bayesopt -> bayesmodel
michael-gemmell Aug 2, 2023
f78ce64
moved start point sampling to its own method _sample_start_points
michael-gemmell Aug 2, 2023
c3a418b
fixed variable name
michael-gemmell Aug 2, 2023
196eb60
fixed bug with start_points being overwritten
michael-gemmell Aug 2, 2023
31542b0
removed workflow_dev
michael-gemmell Aug 8, 2023
cc218b4
moving equilibrium to workflow object
michael-gemmell Aug 8, 2023
9cb0266
when reading raw data save transforms to their own attribute
michael-gemmell Aug 8, 2023
68c1872
transforms saved to workflow class
michael-gemmell Aug 8, 2023
f0f561f
fixed name
michael-gemmell Aug 8, 2023
3b2eaee
renamed example transform
michael-gemmell Aug 8, 2023
ae200d3
made example los function
michael-gemmell Aug 8, 2023
3d9d540
minor name fix
michael-gemmell Aug 8, 2023
cd04fb1
made read_test_data function
michael-gemmell Aug 8, 2023
dfdd791
made example los function
michael-gemmell Aug 8, 2023
4c6c1c2
minor type
michael-gemmell Aug 8, 2023
0ce85fa
added reader to read_test_data method
michael-gemmell Aug 8, 2023
fa5ac44
fixed copying workflow object states
michael-gemmell Aug 8, 2023
7d343a4
updated example to work with assign_profiles
michael-gemmell Aug 8, 2023
489d14a
can now run with pulse = None and synthetic transforms/equilibrium
michael-gemmell Aug 8, 2023
e04fe9b
formatting
michael-gemmell Aug 8, 2023
6fcff75
refactoring names
michael-gemmell Aug 8, 2023
eba8ca7
deleted old bayes_models tests
michael-gemmell Aug 8, 2023
b6ece95
reformatted _build_bckc for readability
michael-gemmell Aug 8, 2023
2bd93bf
moved percentage error to class attribute
michael-gemmell Aug 8, 2023
e46ac8a
fixing init percent error
michael-gemmell Aug 8, 2023
cc0aafd
black formatting
michael-gemmell Aug 8, 2023
acc50b5
renamed kin_prof -> plasma_profiles
michael-gemmell Aug 10, 2023
223a526
refactored window handling
michael-gemmell Aug 10, 2023
c9333ae
replaced doppler_broaden with physics.ev_doppler
michael-gemmell Aug 10, 2023
80bc391
removed methods from __init__
michael-gemmell Aug 10, 2023
3f5c089
stashing
michael-gemmell Aug 11, 2023
766825a
setup_plasma now takes kwargs
michael-gemmell Aug 15, 2023
cb93ae8
adding plasma to models now happens in setup_opt_data
michael-gemmell Aug 15, 2023
cf1c127
workflow broken up into methods
michael-gemmell Aug 15, 2023
3019e45
Merge branch 'st40' into michaelgemmell/bayes_workflow
michael-gemmell Aug 15, 2023
260a515
removed redundant kwargs
michael-gemmell Aug 15, 2023
0821b9d
fixed non_plasma call option
michael-gemmell Aug 15, 2023
78800c6
fixed plasma_initialisation
michael-gemmell Aug 15, 2023
62d6727
fixing violin plotting
michael-gemmell Aug 15, 2023
98190dd
units for xrcs.spectra.wavelength fixed
michael-gemmell Aug 15, 2023
f732da0
nchannels added to example_los
michael-gemmell Aug 17, 2023
ec1c9ca
adding print message for fake data reading
michael-gemmell Aug 17, 2023
b0fac37
adjusting example los
michael-gemmell Aug 17, 2023
03d68c7
bckc method not printing spectra/fit not available everytime
michael-gemmell Aug 17, 2023
72e6e7e
adjusting priors
michael-gemmell Aug 17, 2023
d5ebe08
moved kwargs to sample function
michael-gemmell Aug 17, 2023
056b7f5
nsamples kwarg added to sample_from_high_density_region
michael-gemmell Aug 17, 2023
e1b35c5
mocked bda_tree module
michael-gemmell Aug 17, 2023
3733bc6
fixed import
michael-gemmell Aug 17, 2023
b0f3fae
fixed import
michael-gemmell Aug 17, 2023
ad9400a
black formatting
michael-gemmell Aug 17, 2023
f0a1d65
black formatting
michael-gemmell Aug 17, 2023
3b1dcd8
fixed where mocked module is imported
michael-gemmell Aug 17, 2023
b8b6cee
black formatting
michael-gemmell Aug 17, 2023
d3e4e5c
mocking bda_tree import
michael-gemmell Aug 17, 2023
a7ec4cd
precommit
michael-gemmell Aug 17, 2023
b1babc4
precommit
michael-gemmell Aug 17, 2023
8e2f2d8
moving read_data tvector to kwargs
michael-gemmell Aug 17, 2023
519c4ed
precommit formatting
michael-gemmell Aug 17, 2023
f7623ce
precommit formatting
michael-gemmell Aug 17, 2023
67bd35c
precommit types
michael-gemmell Aug 17, 2023
d1afc87
precommit types
michael-gemmell Aug 17, 2023
1f38f27
precommit types
michael-gemmell Aug 17, 2023
c555f6b
precommit types
michael-gemmell Aug 17, 2023
b5d5840
precommit types
michael-gemmell Aug 17, 2023
9d12492
precommit types
michael-gemmell Aug 17, 2023
15c4aac
precommit types
michael-gemmell Aug 17, 2023
4362cd5
moving kwargs/args around
michael-gemmell Aug 17, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
148 changes: 108 additions & 40 deletions indica/bayesmodels.py
Original file line number Diff line number Diff line change
@@ -1,15 +1,25 @@
from copy import deepcopy
import warnings

import flatdict
import numpy as np
from scipy.stats import uniform

np.seterr(all="ignore")

warnings.simplefilter("ignore", category=FutureWarning)


PROFILES = [
"electron_temperature",
"electron_density",
"ion_temperature",
"ion_density",
"impurity_density",
"fast_density",
"neutral_density",
"zeff",
]


def gaussian(x, mean, sigma):
michael-gemmell marked this conversation as resolved.
Show resolved Hide resolved
return 1 / (sigma * np.sqrt(2 * np.pi)) * np.exp(-1 / 2 * ((x - mean) / sigma) ** 2)

Expand Down Expand Up @@ -44,46 +54,85 @@ def __init__(
quant_to_optimise: list = [],
priors: dict = {},
diagnostic_models: list = [],
percent_error: float = 0.10,
):
self.plasma = plasma
self.data = data
self.quant_to_optimise = quant_to_optimise
self.diagnostic_models = diagnostic_models
self.priors = priors
self.percent_error = percent_error

for diag_model in self.diagnostic_models:
diag_model.plasma = self.plasma

missing_data = list(set(quant_to_optimise).difference(data.keys()))
if missing_data: # list of keys in quant_to_optimise but not data
if missing_data: # gives list of keys in quant_to_optimise but not data
raise ValueError(f"{missing_data} not found in data given")

def _build_bckc(self, params: dict, **kwargs):
# TODO: consider how to handle if models have overlapping kwargs
# Params is a dictionary which is updated by optimiser,
# kwargs is constant i.e. settings for models
def _build_bckc(self, params, **kwargs):
"""
Parameters
----------
params - dictionary which is updated by optimiser
kwargs - passed to model i.e. settings

Returns
-------
bckc of results
"""
self.bckc: dict = {}
for model in self.diagnostic_models:
self.bckc = dict(
self.bckc, **{model.name: {**model(**{**params, **kwargs})}}
)
self.bckc = flatdict.FlatDict(self.bckc, delimiter=".")
# removes "model.name." from params and kwargs then passes them to model
# e.g. xrcs.background -> background
_nuisance_params = {
param_name.replace(model.name + ".", ""): param_value
for param_name, param_value in params.items()
if model.name in param_name
}
_model_settings = {
kwarg_name.replace(model.name + ".", ""): kwarg_value
for kwarg_name, kwarg_value in kwargs.items()
if model.name in kwarg_name
}

_model_kwargs = {
**_nuisance_params,
**_model_settings,
} # combine dictionaries
_bckc = model(**_model_kwargs)
_model_bckc = {
f"{model.name}.{value_name}": value
for value_name, value in _bckc.items()
} # prepend model name to bckc
self.bckc = dict(self.bckc, **_model_bckc)
return

def _ln_likelihood(self):
ln_likelihood = 0
for key in self.quant_to_optimise:
# TODO: What to use as error? Assume percentage error if none given...
# Float128 is used since rounding of small numbers causes
# problems when initial results are bad fits
model_data = self.bckc[key].values.astype("float64")
# Float128 since rounding of small numbers causes problems
# when initial results are bad fits
model_data = self.bckc[key].astype("float128")
exp_data = (
self.data[key]
.sel(t=self.plasma.time_to_calculate)
.values.astype("float64")
self.data[key].sel(t=self.plasma.time_to_calculate).astype("float128")
)
_ln_likelihood = np.log(gaussian(model_data, exp_data, exp_data * 0.10))
ln_likelihood += np.nanmean(_ln_likelihood)
exp_error = (
exp_data * self.percent_error
) # Assume percentage error if none given.
if hasattr(self.data[key], "error"):
if (
self.data[key].error != 0
).any(): # TODO: Some models have an error of 0 given
exp_error = self.data[key].error.sel(
t=self.plasma.time_to_calculate
)

_ln_likelihood = np.log(gaussian(model_data, exp_data, exp_error))
# treat channel as key dim which isn't averaged like other dims
if "channel" in _ln_likelihood.dims:
_ln_likelihood = _ln_likelihood.sum(dim="channel", skipna=True)
ln_likelihood += _ln_likelihood.mean(skipna=True).values
return ln_likelihood

def _ln_prior(self, parameters: dict):
Expand Down Expand Up @@ -144,6 +193,34 @@ def sample_from_priors(self, param_names, size=10):
samples = samples[:, 0:size]
return samples.transpose()

def sample_from_high_density_region(
self, param_names: list, sampler, nwalkers: int, nsamples=100
):
start_points = self.sample_from_priors(param_names, size=nsamples)

ln_prob, _ = sampler.compute_log_prob(start_points)
num_best_points = int(nsamples * 0.05)
index_best_start = np.argsort(ln_prob)[-num_best_points:]
best_start_points = start_points[index_best_start, :]
best_points_std = np.std(best_start_points, axis=0)

# Passing samples through ln_prior and redrawing if they fail
samples = np.empty((param_names.__len__(), 0))
while samples.size < param_names.__len__() * nwalkers:
sample = np.random.normal(
np.mean(best_start_points, axis=0),
best_points_std * 2,
size=(nwalkers * 5, len(param_names)),
)
start = {name: sample[:, idx] for idx, name in enumerate(param_names)}
ln_prior = self._ln_prior(start)
# Convert from dictionary of arrays -> array,
# then filtering out where ln_prior is -infinity
accepted_samples = np.array(list(start.values()))[:, ln_prior != -np.inf]
samples = np.append(samples, accepted_samples, axis=1)
start_points = samples[:, 0:nwalkers].transpose()
return start_points

def ln_posterior(self, parameters: dict, **kwargs):
"""
Posterior probability given to optimisers
Expand All @@ -164,31 +241,22 @@ def ln_posterior(self, parameters: dict, **kwargs):
"""

ln_prior = self._ln_prior(parameters)
if ln_prior == -np.inf: # Don't call model if outside priors
if ln_prior == -np.inf: # Don't call models if outside priors
return -np.inf, {}

self.plasma.update_profiles(parameters)
self._build_bckc(parameters, **kwargs) # model calls
ln_likelihood = self._ln_likelihood() # compare results to data
ln_posterior = ln_likelihood + ln_prior

kin_profs = {
"electron_density": self.plasma.electron_density.sel(
t=self.plasma.time_to_calculate
),
"electron_temperature": self.plasma.electron_temperature.sel(
t=self.plasma.time_to_calculate
),
"ion_temperature": self.plasma.ion_temperature.sel(
t=self.plasma.time_to_calculate
),
"impurity_density": self.plasma.impurity_density.sel(
t=self.plasma.time_to_calculate
),
"zeff": self.plasma.zeff.sum("element").sel(
t=self.plasma.time_to_calculate
),
# TODO: add Nh
}
blob = deepcopy({**self.bckc, **kin_profs})
plasma_profiles = {}
for profile_key in PROFILES:
if hasattr(self.plasma, profile_key):
plasma_profiles[profile_key] = getattr(self.plasma, profile_key).sel(
t=self.plasma.time_to_calculate
)
else:
raise ValueError(f"plasma does not have attribute {profile_key}")

blob = deepcopy({**self.bckc, **plasma_profiles})
return ln_posterior, blob
36 changes: 22 additions & 14 deletions indica/models/charge_exchange.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,6 @@ def __init__(
element: str = "c",
instrument_method="get_charge_exchange",
):

self.name = name
self.element = element
self.instrument_method = instrument_method
Expand All @@ -43,6 +42,12 @@ def _build_bckc_dictionary(self):
self.bckc[quantity] = self.Ti_at_channels
long_name = "Ion temperature"
units = "eV"
elif quant == "spectra":
# Placeholder
continue
elif quant == "fit":
# Placeholder
continue
else:
print(f"{quant} not available in model for {self.instrument_method}")
continue
Expand Down Expand Up @@ -115,31 +120,34 @@ def __call__(
return self.bckc


def pi_transform_example(nchannels: int):
x_positions = np.linspace(0.2, 0.8, nchannels)
y_positions = np.linspace(0.0, 0.0, nchannels)
z_positions = np.linspace(0.0, 0.0, nchannels)

transect_transform = TransectCoordinates(
x_positions,
y_positions,
z_positions,
"pi",
machine_dimensions=((0.15, 0.95), (-0.7, 0.7)),
)
return transect_transform


def example_run(
pulse: int = None,
diagnostic_name: str = "cxrs",
plasma=None,
plot=False,
):

# TODO: LOS sometimes crossing bad EFIT reconstruction

if plasma is None:
plasma = example_plasma(pulse=pulse)

# Create new interferometers diagnostics
nchannels = 5
x_positions = np.linspace(0.2, 0.8, nchannels)
y_positions = np.linspace(0.0, 0.0, nchannels)
z_positions = np.linspace(0.0, 0.0, nchannels)

transect_transform = TransectCoordinates(
x_positions,
y_positions,
z_positions,
diagnostic_name,
machine_dimensions=plasma.machine_dimensions,
)
transect_transform = pi_transform_example(5)
transect_transform.set_equilibrium(plasma.equilibrium)
model = ChargeExchange(
diagnostic_name,
Expand Down
Loading
Loading