Skip to content

Commit

Permalink
BDA (#310)
Browse files Browse the repository at this point in the history
* function autocorr plot added

* ion density added to kinetic profiles saved

* inital commit of abstract workflow and example of usage

* doc strings

* renaming workflow

* Ti_ref as default

* abstract methods added

* Changed to new MDSPlus structure

* Adding default methods

* moved sampling to it's own function in abstract class

* moved sampling to it's own function in abstract class

* renamed abstract class to AbstractBayesWorkflow

* comments

* added high density sampling as method

* added attributes to __init__

* error handling __init__

* error handling __init__

* moved read data to abstract class

* initial commit

* kwargs added

* xrcs wavelength units corrected

* background to int

* stash

* renamed impurities

* stashing

* stashing

* stashing

* fixing key names

* updating and testing example

* added background as attribute of helike model

* fixed phantom methods

* rearranged methods

* moved background to call kwarg

* params and kwargs now are given as model_{var} and model prefix is removed before model call

* adding _phantom_data and _exp_data abstract methods

* renaming to BayesWorkflowExample

* initial commit

* renamed phantom_params to profile_params

* renamed bayesopt -> bayesmodel

* renamed bayesopt -> bayesmodel

* moved start point sampling to its own method _sample_start_points

* fixed variable name

* fixed bug with start_points being overwritten

* removed workflow_dev

* moving equilibrium to workflow object

* when reading raw data save transforms to their own attribute

* transforms saved to workflow class

* fixed name

* renamed example transform

* made example los function

* minor name fix

* made read_test_data function

* made example los function

* minor type

* added reader to read_test_data method

* fixed copying workflow object states

* updated example to work with assign_profiles

* can now run with pulse = None and synthetic transforms/equilibrium

* formatting

* refactoring names

* deleted old bayes_models tests

* reformatted _build_bckc for readability

* moved percentage error to class attribute

* fixing init percent error

* black formatting

* renamed kin_prof -> plasma_profiles

* refactored window handling

* replaced doppler_broaden with physics.ev_doppler

* removed methods from __init__

* stashing

* setup_plasma now takes kwargs

* adding plasma to models now happens in setup_opt_data

* workflow broken up into methods

* removed redundant kwargs

* fixed non_plasma call option

* fixed plasma_initialisation

* fixing violin plotting

* units for xrcs.spectra.wavelength fixed

* nchannels added to example_los

* adding print message for fake data reading

* adjusting example los

* bckc method not printing spectra/fit not available everytime

* adjusting priors

* moved kwargs to sample function

* nsamples kwarg added to sample_from_high_density_region

* mocked bda_tree module

* fixed import

* fixed import

* black formatting

* black formatting

* fixed where mocked module is imported

* black formatting

* mocking bda_tree import

* precommit

* precommit

* moving read_data tvector to kwargs

* precommit formatting

* precommit formatting

* precommit types

* precommit types

* precommit types

* precommit types

* precommit types

* precommit types

* precommit types

* moving kwargs/args around

* adding mode/run to main call

* pressure/stored energy are now saved/written

* removing for loops from pressure/ion density/zeff calculations

* removing for loops from ion density calculations

* defaulting to using cached calculations -> factor 2 speed upgit add indica/models/plasma.py

* example transform moved to own function

* t = time_to_calculate and bckc don't print if chi2 not included

* Thompson Scattering added

* error checking for missing data

* in _make_spectra moved sorting wavelengths to end of method and fixed wavelength co-ordinates that were added incorrectly

* integrating spectra was removing nans so added them back post integration

* pixel offset added as __call__ option

* time vector in INPUTS added

* adjusting calibration factor

* max-min ranges now 0.5-99.5

* high_density_sampling now uses 3 best points

* Gelmin-Rubin diagnostic added to optimisation node

* Gelman-Rubin diagnostic added to tree

* stashing

* now accepts variable number of input parameters and fills missing values with default settings

* priors now more constrained for wped/wcenter

* priors now more constrained for wped/wcenter

* fast particles from ASTRA included

* ASTRA options added to BDA

* chers added

* chers added

* Extra ASTRA options implemented

* prior for ne.wped increased to 30

* not automatically adding equil to transforms now

* Fixed plotting for multiple time points

* save_pickle now takes dictionary as argument

* stashing

* stashing

* Rough batch script for running BDA

* sampler stopping condition based on moments added

* cleaned up and added virtual observations

* normalising TS profiles before fitting

* background can be class attribute or call argument

* placeholder for filtering methods

* gutted workflow class / methods removed to context objects

* gutted workflow class / methods removed to context objects

* moved ln_prior to stand alone func

* Initialisation fixed

* fixed formatting/plotting of optimiser results and adjusted moments stopping criteria to normalise for chain length

* TS dimension bug fix added

* formatting for results dict fixed

* Now plots for all time points

* renamed some nodes and removed TIME_BINS

* renamed bayes_workflow_example -> bayes_workflow

* missing self.

* missing self.

* removed debug printing from sample_with_moments

* renamed batch_bda -> bda_run

* debug printing for stopping condition is now an optional input

* moved back to flat delta moment stopping condition

* default to plotting to test

* comments

* moved sample_from_priors to external function

* moved checking of prior keys from sample_from_priors to BayesSettings dataclass

* _build_bckc now creates nested dictionary instead of flat

* filtering now works on spectra with t dim

* renamed bayes_settings -> blackbox_settings

* renamed bayes_settings -> blackbox_settings

* renamed bayes_settings -> blackbox_settings

* renamed bayes_settings -> blackbox_settings

* added mock transforms to default_factory

* cxfi_tws_c example transform added

* ModelSettings dataclass created to handle model initialisation and calling settings

* missing .self

* stopping criteria sampling rate added to OptimiserSettings

* typo

* added moments tests

* adjusted nimp.y0 prior

* example run updated

* more tests for walkers

* typo

* 11089 pulse added

* temporarily added channel filters for TS / CXFF_TWS_C

* 11089 run added

* expanded wped prior for fitting the pedestal

* Test for walker moves

* testing DE-MCMC

* Full 11089 run with DEMOVES

* testing without emcee move.DESnookerMove

* Testing without DESnookerMove

* fixed rho_poloidal -> rhop

* removed wrapper of kernels

* dt added to read_ts

* using binned data for fits instead of raw

* hack to prevent dtype interpolation error

* closing plots after use

* set_ts_profiles added to plasma_context

* update_profiles now only updates profiles which match parameters provided

* __main__ params adjusted

* reformatting __main__ to run mock example

* fixed error in zeff calculatio

* removed reading ts and added post_process_ts

* reorganising post_processing args

* R_midplane added to profiles

* gitcommit and user added

* R_midplane added

* R_midplane co-ordinate transform added

* violin plots take error as input

* example workflow corrected

* error handling for empty data dicts

* updated to include set_ts option

* python 3.8 -> poetry 3.9

* python >= 3.9 required

* pint added for standard_utility

* default to 43000000

* bda_nodes function is now a global var

* check for tree and use NEW or EDIT mode

* check for tree and use NEW or EDIT mode

* renaming Nimp -> niz1

* Plasma updating now works with Niz1/2_prof instead of Nimp_prof

* Changing default use case

* Nimp_prof -> Niz1_prof

* Option for fitting rho profiles as if symmetric in -rho

* Option for fitting rho profiles as if symmetric in -rho

* Alsu's 11XXX runs added

* default values adjusted for Nimp

* priors adjusted for Nimp

* 11032 added

* fixed missing self.

* fixed missing self.

* initial commit

* mocking opt_data keys

* speeding up run by limiting iterations

* reverting temporary bugfix

* removing comment

* if neither mds_write or plot is true -> don't save pickle

* gitpython added

* commenting out failing test due to helike model using sel instead of interp in call

* pre-commit

* mypy fixes

* Reinstated set_equilibrium with default == False; removed additional raw_data_ dictionary

* Ran: poetry update

---------

Co-authored-by: marcosertoli <[email protected]>
Co-authored-by: Marco Sertoli <[email protected]>
  • Loading branch information
3 people authored Feb 7, 2024
1 parent 543290e commit 99d415c
Show file tree
Hide file tree
Showing 22 changed files with 4,342 additions and 2,740 deletions.
147 changes: 147 additions & 0 deletions indica/bayesblackbox.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,147 @@
from copy import deepcopy
import warnings

from flatdict import FlatDict
import numpy as np
from scipy.stats import uniform

np.seterr(all="ignore")
warnings.simplefilter("ignore", category=FutureWarning)


def gaussian(x, mean, sigma):
return 1 / (sigma * np.sqrt(2 * np.pi)) * np.exp(-1 / 2 * ((x - mean) / sigma) ** 2)


def get_uniform(lower, upper):
# Less confusing parameterisation of scipy.stats uniform
return uniform(loc=lower, scale=upper - lower)


def ln_prior(priors: dict, parameters: dict):
ln_prior = 0
for prior_name, prior_func in priors.items():
param_names_in_prior = [x for x in parameters.keys() if x in prior_name]
if param_names_in_prior.__len__() == 0:
# if prior assigned but no parameter then skip
continue
param_values = [parameters[x] for x in param_names_in_prior]
if hasattr(prior_func, "pdf"):
# for scipy.stats objects use pdf / for lambda functions just call
ln_prior += np.log(prior_func.pdf(*param_values))
else:
# if lambda prior with 2+ args is defined when only 1 of
# its parameters is given ignore it
if prior_func.__code__.co_argcount != param_values.__len__():
continue
else:
# Sorting to make sure args are given in the same order
# as the prior_name string
name_index = [
prior_name.find(param_name_in_prior)
for param_name_in_prior in param_names_in_prior
]
sorted_name_index, sorted_param_values = (
list(x) for x in zip(*sorted(zip(name_index, param_values)))
)
ln_prior += np.log(prior_func(*sorted_param_values))
return ln_prior


class BayesBlackBox:
"""
Bayesian black box model that creates _ln_posterior function
from plasma and diagnostic model objects
Parameters
----------
data
processed diagnostic data of format [diagnostic].[quantity]
plasma_context
plasma context has methods for using plasma object
model_handler
model_handler object to be called by ln_posterior
quant_to_optimise
quantity from data which will be optimised with bckc from diagnostic_models
priors
prior functions to apply to parameters e.g. scipy.stats.rv_continuous objects
"""

def __init__(
self,
data: dict,
quant_to_optimise: list,
priors: dict,
plasma_context=None,
model_context=None,
percent_error: float = 0.10,
):
self.data = data
self.quant_to_optimise = quant_to_optimise
self.priors = priors

self.plasma_context = plasma_context
self.model_context = model_context

self.percent_error = percent_error

missing_data = list(set(quant_to_optimise).difference(data.keys()))
if missing_data: # gives list of keys in quant_to_optimise but not data
raise ValueError(f"{missing_data} not found in data given")

def ln_likelihood(self):
ln_likelihood = 0
for key in self.quant_to_optimise:
time_coord = self.plasma_context.plasma.time_to_calculate
model_data = self.bckc[key]
exp_data = self.data[key].sel(t=time_coord)
exp_error = exp_data * self.percent_error

if hasattr(self.data[key], "error"):
if (self.data[key].error != 0).any():
# TODO: Some models have an error of 0 given
exp_error = self.data[key].error.sel(t=time_coord)

_ln_likelihood = np.log(gaussian(model_data, exp_data, exp_error))
# treat channel as key dim which isn't averaged like other dims
if "channel" in _ln_likelihood.dims:
_ln_likelihood = _ln_likelihood.sum(dim="channel", skipna=True)
ln_likelihood += _ln_likelihood.mean(skipna=True).values
return ln_likelihood

def ln_posterior(self, parameters: dict, **kwargs):
"""
Posterior probability given to optimisers
Parameters
----------
parameters
inputs to optimise
kwargs
kwargs for models
Returns
-------
ln_posterior
log of posterior probability
blob
model outputs from bckc and kinetic profiles
"""

_ln_prior = ln_prior(self.priors, parameters)
if _ln_prior == -np.inf: # Don't call models if outside priors
return -np.inf, {}

self.plasma_context.update_profiles(parameters)
plasma_attributes = self.plasma_context.return_plasma_attrs()

self.bckc = FlatDict(
self.model_context._build_bckc(parameters, **kwargs), "."
) # model calls

_ln_likelihood = self.ln_likelihood() # compare results to data
ln_posterior = _ln_likelihood + _ln_prior

blob = deepcopy({**self.bckc, **plasma_attributes})
return ln_posterior, blob
Loading

0 comments on commit 99d415c

Please sign in to comment.