Delta function passes full set of experiment_data although it should only operate on conditions #81

Open
musslick opened this issue Aug 2, 2024 · 1 comment
Labels
bug Something isn't working.

Comments

@musslick (Contributor) commented Aug 2, 2024

The issue can be investigated here.

In the following script, a separate validation state is created to contain the validation data. However, this leads to a problem: passing validation_experiment_data instead of validation_conditions to

    benchmark_state = random_sample_on_state(benchmark_state,
                                              all_conditions=validation_conditions.conditions,
                                              num_samples=num_conditions_per_cycle)

will result in benchmark_state obtaining all of the experiment_data from validation_experiment_data, which is undesired behavior. I assume this happens because of the Delta (validation_experiment_data holds the full set of experiment conditions, whereas benchmark_state holds none).
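
One quick way to see the behavior is to compare the size of experiment_data before and after the sampling call (a sketch, using the states defined in the script below and taking all_conditions from the validation_experiment_data state):

n_before = len(benchmark_state.experiment_data)
benchmark_state = random_sample_on_state(benchmark_state,
                                         all_conditions=validation_experiment_data.conditions,
                                         num_samples=num_conditions_per_cycle)
n_after = len(benchmark_state.experiment_data)
# if the reported behavior occurs, n_after jumps to the size of the full validation set
print(n_before, n_after)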

def run_simulation(num_cycles, num_conditions_per_cycle, num_initial_conditions, bms_epochs, experiment_runner, sim=0):

  # VALIDATION STATE
  # at every step of our discovery process, we will evaluate the performance
  # of the theorist against the ground truth. Here, we will define the ground
  # truth as a grid of data points sampled across the domain of the experimental
  # design space. We will store this validation set in a separate validation state

  # create AutoRA state for validation purposes
  validation_conditions = CustomState(variables=experiment_runner.variables)
  validation_experiment_data = CustomState(variables=experiment_runner.variables)

  # our validation set will consist of a grid of experiment conditions
  # across the entire experimental design domain
  validation_conditions = grid_pool_on_state(validation_conditions) 
  validation_experiment_data = grid_pool_on_state(validation_experiment_data) 
  validation_experiment_data = run_experiment_on_state(validation_experiment_data, experiment_runner=experiment_runner)


  benchmark_MSE_log = list()
  working_MSE_log = list()

  # INITIAL STATE
  # We begin our discovery experiment with a randomly sampled data set for 10
  # conditions. We will use the same state for each experimentalist method.

  # create initial AutoRA state which we will use for our discovery experiments
  initial_state = CustomState(variables=experiment_runner.variables)

  # we will initiate our discovery process with 10 randomly sampled experiment conditions
  initial_state = random_pool_on_state(initial_state, 
                                      num_samples=num_initial_conditions,
                                      random_state = sim)

  # we obtain the corresponding experiment data
  initial_state = run_experiment_on_state(initial_state, experiment_runner=experiment_runner)

  # initialize benchmark state for random experimentalist
  benchmark_state = CustomState(**initial_state.__dict__) 

  # initialize working state for your custom experimentalist
  working_state = CustomState(**initial_state.__dict__)

  # for each discovery cycle
  for cycle in range(num_cycles):

    print("SIMULATION " + str(sim)  + " / DISCOVERY CYCLE " + str(cycle))

    # first, we fit a model to the data
    print("Fitting models on benchmark state...")
    benchmark_state = theorists_on_state(benchmark_state, bms_epochs=bms_epochs)
    print("Fitting models on working state...")
    working_state = theorists_on_state(working_state, bms_epochs=bms_epochs)

    # now we can determine how well the models do on the validation set
    benchmark_MSE = get_validation_MSE(validation_experiment_data, benchmark_state)
    benchmark_MSE_log.append(benchmark_MSE)

    working_MSE = get_validation_MSE(validation_experiment_data, working_state)
    working_MSE_log.append(working_MSE)

    # then we determine the next experiment condition
    print("Sampling new experiment conditions...")
    benchmark_state = random_sample_on_state(benchmark_state,
                                              all_conditions=validation_conditions.conditions,
                                              num_samples=num_conditions_per_cycle)
    working_state = custom_sample_on_state(working_state,
                                           all_conditions=validation_conditions.conditions,
                                           num_samples=num_conditions_per_cycle)
    
    print("Obtaining observations...")
    # we obtain the corresponding experiment data
    benchmark_state = run_experiment_on_state(benchmark_state, experiment_runner=experiment_runner)
    working_state = run_experiment_on_state(working_state, experiment_runner=experiment_runner)

  return benchmark_MSE_log, working_MSE_log, benchmark_state, working_state
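
get_validation_MSE and theorists_on_state are omitted here. For reference, a minimal sketch of what the MSE helper might look like in this setup (the column handling and the use of the latest BMS model are assumptions, details may differ):

from sklearn.metrics import mean_squared_error

def get_validation_MSE(validation_state, state):
  # score the most recent BMS model on the held-out validation grid
  iv_names = [iv.name for iv in validation_state.variables.independent_variables]
  dv_names = [dv.name for dv in validation_state.variables.dependent_variables]
  X = validation_state.experiment_data[iv_names]
  y_true = validation_state.experiment_data[dv_names]
  y_pred = state.models_bms[-1].predict(X)
  return mean_squared_error(y_true, y_pred)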

The wrappers are below...

# state wrapper for grid pooler experimentalist (generates a grid of experiment conditions)
@on_state()
def grid_pool_on_state(variables):
  return Delta(conditions=grid_pool(variables))

# state wrapper for random pooler experimentalist (generates a pool of experiment conditions)
@on_state()
def random_pool_on_state(variables, num_samples, random_state=None):
  return Delta(conditions=random_pool(variables, num_samples, random_state))

# state wrapper for random experimentalist (samples experiment conditions from a set of conditions)
@on_state()
def random_sample_on_state(conditions, all_conditions, num_samples, random_state=None):
  return Delta(conditions=random_sample(all_conditions, num_samples, random_state))

# **** STATE WRAPPER FOR YOUR EXPERIMENTALIST ***
@on_state()
def custom_sample_on_state(experiment_data,  
                           models_bms, 
                           models_lr, 
                           models_polyr,
                           all_conditions,
                           num_samples=1, 
                           random_state=None):
  
  # this is just an example where we integrate the model disagreement sampler
  # into the wrapper
  conditions = model_disagreement_sample(
          all_conditions,
          models = [models_bms[-1], models_lr[-1]],
          num_samples = num_samples
      )

  return Delta(conditions=conditions)
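
CustomState is not included above; it follows the standard AutoRA pattern with extra model fields for the wrappers, roughly along these lines (a sketch assuming it extends autora.state.StandardState; details may differ):

from dataclasses import dataclass, field
from typing import List
from autora.state import StandardState

@dataclass(frozen=True)
class CustomState(StandardState):
  # model lists used by the theorist and experimentalist wrappers;
  # "extend" tells the Delta mechanism to append to these lists rather than replace them
  models_bms: List = field(default_factory=list, metadata={"delta": "extend"})
  models_lr: List = field(default_factory=list, metadata={"delta": "extend"})
  models_polyr: List = field(default_factory=list, metadata={"delta": "extend"})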
@musslick added the bug label on Aug 2, 2024
@younesStrittmatter (Contributor) commented

I cannot reproduce this error. Could you show a minimal example? The following code works as intended: the experiment_data of benchmark_state_2 does not get altered and stays the same as in the initial state:

experiment_runner = exp_learning()

validation_conditions = CustomState(variables=experiment_runner.variables)
validation_experiment_data = CustomState(variables=experiment_runner.variables)

# our validation set will consist of a grid of experiment conditions
# across the entire experimental design domain
validation_conditions = grid_pool_on_state(validation_conditions) 
validation_experiment_data = grid_pool_on_state(validation_experiment_data) 
validation_experiment_data = run_experiment_on_state(validation_experiment_data, experiment_runner=experiment_runner)


# INITIAL STATE
# We begin our discovery experiment with a randomly sampled data set for 10
# conditions. We will use the same state for each experimentalist method.

# create initial AutoRA state which we will use for our discovery experiments
initial_state = CustomState(variables=experiment_runner.variables)

# we will initiate our discovery process with 10 randomly sampled experiment conditions
initial_state = random_pool_on_state(initial_state, 
                                      num_samples=1,
                                      random_state = 1)

# we obtain the corresponding experiment data
initial_state = run_experiment_on_state(initial_state, experiment_runner=experiment_runner)

# initialize benchmark state for random experimentalist
benchmark_state = CustomState(**initial_state.__dict__)
benchmark_state_2 = CustomState(**initial_state.__dict__)

print(initial_state.experiment_data)
print(benchmark_state.experiment_data)

working_state = CustomState(**initial_state.__dict__)

for _ in range(2):

  # benchmark_state = theorists_on_state(benchmark_state, bms_epochs=2)
  # working_state = theorists_on_state(working_state, bms_epochs=2)

  # working_state = custom_sample_on_state(working_state,
  #                                        all_conditions=validation_conditions.conditions,
  #                                        num_samples=1)

  benchmark_state = random_sample_on_state(benchmark_state,
                                           all_conditions=validation_conditions.conditions,
                                           num_samples=1)
  benchmark_state_2 = random_sample_on_state(benchmark_state_2,
                                             all_conditions=validation_experiment_data.conditions,
                                             num_samples=1)
  
  benchmark_state = run_experiment_on_state(benchmark_state, experiment_runner=experiment_runner)
  # working_state = run_experiment_on_state(working_state, experiment_runner=experiment_runner)

  # print(initial_state.experiment_data)
  print('***')
  print(benchmark_state.experiment_data)
  
  print(benchmark_state_2.experiment_data)
