Fixing issue "Samples are outside the support for DiscreteUniform dist…" #1835

Open · wants to merge 8 commits into master
Conversation

@Deathn0t (Author)

This fixes issue #1834, where MixedHMC sampling with a DiscreteUniform distribution produced samples outside the support because the proposals were not mapped through enumerate_support.

    lambda idx, support: support[idx],
    z_discrete,
    self._support_enumerates,
)
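For context, a minimal sketch of the index-to-value mapping this excerpt performs, assuming z_discrete holds per-site indices into the enumerated supports (the names and values below are illustrative):

    import jax.numpy as jnp
    from jax import tree_util

    # Hypothetical data for one scalar site: z_discrete stores an index into
    # the enumerated support rather than an actual support value.
    z_discrete = {"x0": jnp.array(2)}
    support_enumerates = {"x0": jnp.array([10, 11, 12])}  # DiscreteUniform(10, 12)

    # Map each index to its in-support value, as in the excerpt above.
    z_values = tree_util.tree_map(
        lambda idx, support: support[idx], z_discrete, support_enumerates
    )
    print(z_values)  # {'x0': Array(12, dtype=int32)}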
@fehiepsi (Member)

Doing this might return in-support values, but I worry that the algorithm would be wrong: to compute the potential energy correctly, we need to work with in-support values throughout. I think you can pass support_enumerates into self._discrete_proposal_fn and change the proposal logic there:

    proposal = random.randint(rng_proposal, (), minval=0, maxval=support_size)
    # z_new_flat = z_discrete_flat.at[idx].set(proposal)
    z_new_flat = z_discrete_flat.at[idx].set(support_enumerate[proposal])

or, for the modified RW proposal:

    i = random.randint(rng_proposal, (), minval=0, maxval=support_size - 1)
    # proposal = jnp.where(i >= z_discrete_flat[idx], i + 1, i)
    # proposal = jnp.where(random.bernoulli(rng_stay, stay_prob), idx, proposal)
    proposal_index = jnp.where(support_size[i] == z_discrete_flat[idx], support_size - 1, i)
    proposal = jnp.where(random.bernoulli(rng_stay, stay_prob), idx, support_size[proposal_index])
    z_new_flat = z_discrete_flat.at[idx].set(proposal)

or, at the discrete Gibbs proposal:

    proposal_index = jnp.where(support_enumerate[i] == z_init_flat[idx], support_size - 1, i)
    z_new_flat = z_init_flat.at[idx].set(support_enumerate[proposal_index])

@Deathn0t (Author)

Ok, thank you for the feedback. I will try this.

@Deathn0t (Author)

@fehiepsi how do you debug in numpyro? I tried jax.debug. but nothing happens.

@fehiepsi (Member)

I use print most of the time. When actual values are needed, I sometimes use jax.disable_jit().
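For example, a minimal sketch of the jax.disable_jit() approach (the function and values are made up):

    import jax
    import jax.numpy as jnp

    def step(x):
        print("x =", x)  # under jit this prints a tracer, not a concrete value
        return x + 1

    # Run eagerly so the print shows concrete values; jit is a no-op here.
    with jax.disable_jit():
        jax.jit(step)(jnp.array(1))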

@Deathn0t (Author)

@fehiepsi I have issues with passing the enumerated supports around as traced values, since the support arrays can have different sizes. I was thinking maybe to just pass the "lower bound of the support" as an offset; combined with support_sizes, that should do the trick. Are there discrete variables whose support is not a simple integer range with step 1 between values?
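A minimal sketch of the offset idea, assuming every support is a unit-spaced integer range (the numbers are illustrative):

    import jax.numpy as jnp

    # Reconstruct the support of DiscreteUniform(10, 12) from its lower bound
    # ("offset") and its size, assuming consecutive integer values.
    low, support_size = 10, 3
    support = low + jnp.arange(support_size)
    print(support)  # [10 11 12]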

@Deathn0t (Author)

For modified_rw_proposal I think you used support_size in place of support_enumerate; shouldn't it be:

    i = random.randint(rng_proposal, (), minval=0, maxval=support_size - 1)
    # proposal = jnp.where(i >= z_discrete_flat[idx], i + 1, i)
    # proposal = jnp.where(random.bernoulli(rng_stay, stay_prob), idx, proposal)
    proposal_index = jnp.where(support_enumerate[i] == z_discrete_flat[idx], support_size - 1, i)
    proposal = jnp.where(random.bernoulli(rng_stay, stay_prob), idx, support_enumerate[proposal_index])
    z_new_flat = z_discrete_flat.at[idx].set(proposal)

@fehiepsi (Member)

Thanks! Your solutions are super cool! I hadn't thought of different support sizes previously.

    self._support_enumerates = np.zeros(
        (len(self._support_sizes), max_length_support_enumerates), dtype=int
    )
    for i, (name, site) in enumerate(self._prototype_trace.items()):
@fehiepsi (Member)

fehiepsi commented Jul 27, 2024

Great solution! I just have a couple of comments:

  • it might be better to loop over the names in support_sizes and get the site via site = self._prototype_trace[name]
  • we use ravel_pytree to flatten support_sizes, so we might want to keep the same behavior here. I don't have a great solution for this; maybe:
    support_enumerates = {}
    for name, support_size in self._support_sizes.items():
        site = self._prototype_trace[name]
        enumerate_support = site["fn"].enumerate_support(False)
        padded_enumerate_support = np.pad(
            enumerate_support,
            (0, max_length_support_enumerates - enumerate_support.shape[0]),
        )
        padded_enumerate_support = np.broadcast_to(
            padded_enumerate_support,
            support_size.shape + (max_length_support_enumerates,),
        )
        support_enumerates[name] = padded_enumerate_support

    self._support_enumerates = jax.vmap(
        lambda x: ravel_pytree(x)[0], in_axes=1, out_axes=1
    )(support_enumerates)

@Deathn0t (Author)

@fehiepsi it worked fine with ravel_pytree as well; I just had to adapt it to in_axes=0.

@fehiepsi (Member)

I think we need to ravel along the first axis. The second axis (which corresponds to max_length_support_enumerates) is the batch dimension. The current code might run, but I guess things are mixed up.

    for site in self._prototype_trace.values()
    if site["type"] == "sample"
    and site["fn"].has_enumerate_support
    and not site["is_observed"]
@fehiepsi (Member)

nit: it is better to loop over support_sizes: for name, site in self._prototype_trace.items() if name in support_sizes

@Deathn0t (Author)

> I think we need to ravel along the first axis. The second axis (which corresponds to max_length_support_enumerates) is the batch dimension. The current code might run, but I guess things are mixed up.

the first axis is in_axes=0?

@fehiepsi (Member)

fehiepsi commented Jul 29, 2024

we vmap over the batch axis, which is the second axis, i.e. in_axes=1
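A minimal sketch of that vmapping, with made-up sites already padded to a common max support length along the second axis (assuming each entry carries a leading flattened-site dimension):

    import jax
    import jax.numpy as jnp
    from jax.flatten_util import ravel_pytree

    # Two scalar sites, each of shape (1, 4): flattened-site dim first,
    # enumerate axis (padded to max length 4) second.
    support_enumerates = {
        "x0": jnp.array([[10, 11, 12, 0]]),
        "x1": jnp.array([[0, 1, 2, 3]]),
    }
    # vmap over the second (enumerate) axis, raveling the per-site leaves
    # at each position into one flat vector.
    flat = jax.vmap(lambda x: ravel_pytree(x)[0], in_axes=1, out_axes=1)(
        support_enumerates
    )
    print(flat.shape)  # (2, 4): flattened latents x max support length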

@fehiepsi (Member)

Could you also add a simple test (as in the issue) for this? You can run make lint and make format to fix lint issues.

@Deathn0t (Author)

Deathn0t commented Jul 30, 2024

I applied the lint/format and I added a test.

> we vmap over the batch axis, which is the second axis, i.e. in_axes=1

OK, but the support_size values have shape ().

So the following line:

    support_size.shape + (max_length_support_enumerates,),

is just equivalent to (max_length_support_enumerates,), and therefore in_axes=1 fails. What format should support_size and enumerate_support have?

Maybe you have an example where the support_size values have a shape different from a scalar ()? I can't think of one.
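As model_4 further down shows, a batched site does give non-scalar entries. A sketch, assuming support_sizes broadcasts each site's support size to the shape of the site value (mirroring the expression at hmc_gibbs.py line 467 in the traceback below; not necessarily how NumPyro builds it):

    import numpy as np
    import jax
    import jax.numpy as jnp
    import numpyro.distributions as dist

    # A batched Categorical site: batch shape (3,), support size 4.
    fn = dist.Categorical(0.25 * jnp.ones((3, 4)))
    value = fn.sample(jax.random.PRNGKey(0))  # shape (3,)
    support_size = np.broadcast_to(
        fn.enumerate_support(False).shape[0], jnp.shape(value)
    )
    print(support_size)  # [4 4 4]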

@fehiepsi (Member)

fehiepsi commented Jul 30, 2024

That is a good point. I thought the support sizes contained flattened arrays. Sorry for the confusion. I guess we need to move the enumerate dimension to the first axis before vmapping, like you did:

    support_enumerates[name] = np.moveaxis(padded_enumerate_support, -1, 0)

@Deathn0t (Author)

Deathn0t commented Jul 30, 2024

I tried the following direction:

        max_length_support_enumerates = np.max(
            [size for size in self._support_sizes.values()]
        )

        support_enumerates = {}
        for name, support_size in self._support_sizes.items():
            site = self._prototype_trace[name]
            enumerate_support = site["fn"].enumerate_support(True).T
            # Only the last dimension that corresponds to support size is padded
            pad_width = [(0, 0) for _ in range(len(enumerate_support.shape) - 1)] + [
                (0, max_length_support_enumerates - enumerate_support.shape[-1])
            ]
            padded_enumerate_support = np.pad(enumerate_support, pad_width)

            support_enumerates[name] = padded_enumerate_support

        self._support_enumerates = jax.vmap(
            lambda x: ravel_pytree(x)[0], in_axes=len(support_size.shape), out_axes=1
        )(support_enumerates)

which works with the following cases:

    def model_1():
        numpyro.sample("x0", dist.DiscreteUniform(10, 12))
        numpyro.sample("x1", dist.Categorical(np.asarray([0.25, 0.25, 0.25, 0.25])))

    def model_2():
        numpyro.sample("x0", dist.Categorical(0.25 * jnp.ones((4,))))
        numpyro.sample("x1", dist.Categorical(0.1 * jnp.ones((10,))))

    def model_3():
        numpyro.sample("x0", dist.Categorical(0.25 * jnp.ones((3, 4))))
        numpyro.sample("x1", dist.Categorical(0.1 * jnp.ones((3, 10))))

But it fails when I try to batch DiscreteUniform:

    def model_4():
        numpyro.sample("x1", dist.DiscreteUniform(10 * jnp.ones((3,)), 19 * jnp.ones((3,))))

with the following exception, which is raised before the code I added (when self._support_sizes is created):

Traceback (most recent call last):
  File "/Users/romainegele/Documents/Argonne/numpyro/test/test_distributions.py", line 3512, in <module>
    test_discrete_uniform_with_mixedhmc()
  File "/Users/romainegele/Documents/Argonne/numpyro/test/test_distributions.py", line 3501, in test_discrete_uniform_with_mixedhmc
    samples = sample_mixedhmc(model_4, num_samples, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/romainegele/Documents/Argonne/numpyro/test/test_distributions.py", line 3438, in sample_mixedhmc
    mcmc.run(key)
  File "/Users/romainegele/Documents/Argonne/numpyro/numpyro/infer/mcmc.py", line 682, in run
    states_flat, last_state = partial_map_fn(map_args)
                              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/romainegele/Documents/Argonne/numpyro/numpyro/infer/mcmc.py", line 443, in _single_chain_mcmc
    new_init_state = self.sampler.init(
                     ^^^^^^^^^^^^^^^^^^
  File "/Users/romainegele/Documents/Argonne/numpyro/numpyro/infer/mixed_hmc.py", line 88, in init
    state = super().init(rng_key, num_warmup, init_params, model_args, model_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/romainegele/Documents/Argonne/numpyro/numpyro/infer/hmc_gibbs.py", line 467, in init
    site["fn"].enumerate_support(False).shape[0], jnp.shape(site["value"])
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/romainegele/Documents/Argonne/numpyro/numpyro/distributions/discrete.py", line 472, in enumerate_support
    values = (self.low + jnp.arange(np.amax(self.high - self.low) + 1)).reshape(
              ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  File "/Users/romainegele/miniforge3/envs/dh-3.12-240724/lib/python3.12/site-packages/jax/_src/numpy/array_methods.py", line 265, in deferring_binary_op
    return binary_op(*args)
           ^^^^^^^^^^^^^^^^
  File "/Users/romainegele/miniforge3/envs/dh-3.12-240724/lib/python3.12/site-packages/jax/_src/numpy/ufuncs.py", line 102, in fn
    return lax_fn(x1, x2) if x1.dtype != np.bool_ else bool_lax_fn(x1, x2)
           ^^^^^^^^^^^^^^
TypeError: add got incompatible shapes for broadcasting: (3,), (10,).

@fehiepsi (Member)

The in_axes=len(support_size.shape) might not be the same across different latent variables. I think you can move the batch dimension to the front like in my last comment.

By the way, maybe we need to use size.reshape(-1)[0] instead of size in:

    max_length_support_enumerates = np.max(
        [size for size in self._support_sizes.values()]
    )
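Presumably the point is that the support_sizes entries are broadcast arrays of differing shapes, so np.max over the ragged list can fail; a sketch of the suggested fix with made-up entries:

    import numpy as np

    # Entries are broadcast to each site's value shape, so the list is ragged.
    support_sizes = {"x0": np.broadcast_to(4, (3,)), "x1": np.broadcast_to(10, ())}

    # Collapse each entry to a scalar before taking the max.
    max_length = np.max([np.reshape(size, -1)[0] for size in support_sizes.values()])
    print(max_length)  # 10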

@fehiepsi (Member)

Hmm, there seems to be a bug in DiscreteUniform.enumerate_support: self.low should be jnp.reshape(self.low, -1)[0].
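A minimal sketch of why the batched case breaks and what the suggested change does (the actual fix landed in #1859):

    import numpy as np
    import jax.numpy as jnp

    # Batched DiscreteUniform(10 * ones(3), 19 * ones(3)): low/high have shape (3,).
    low = 10 * jnp.ones((3,), dtype=jnp.int32)
    high = 19 * jnp.ones((3,), dtype=jnp.int32)

    # The old code added low (shape (3,)) to a length-10 arange, which fails to
    # broadcast. The suggested change collapses low to a scalar first:
    values = jnp.reshape(low, -1)[0] + jnp.arange(np.amax(high - low) + 1)
    print(values)  # [10 11 12 13 14 15 16 17 18 19]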

@fehiepsi (Member)

fehiepsi commented Sep 5, 2024

@Deathn0t The fix is in #1859. Could you test whether the change works now?

@Deathn0t (Author)

Deathn0t commented Sep 5, 2024

@fehiepsi sorry for the delay... other things came up and I couldn't follow up. Yes, let me test this now!

@Deathn0t (Author)

Deathn0t commented Sep 5, 2024

@fehiepsi the 4 cases I put in the test are now passing, assuming changes from #1859 are used!
