
Bayesian GPLVM using a specific latent input prior #1100

Open
Soham6298 opened this issue Oct 21, 2024 · 4 comments
Labels: need more info (If an issue or PR needs further information by the issuer)
@Soham6298
I have a question regarding implementing a specific setup using the Bayesian GPLVM framework.

Setup:

I want to estimate latent X such that

Y = f(X) + error

where Y is N×D (multi-output) and X is latent. I have a prior on X such that X* ~ N(X, s). Assuming s = 0.1, I would like to use the GPLVM framework to recover the posterior latent X. Since the setup is part of a simulation study, I have the true X available to compute the RMSE of the recovery.

To that end, I am using:

import GPy
import numpy as np
import scipy as scp
import scipy.stats  # so that scp.stats resolves

def gpyfit(output, input_prior):
    Q = 1  # latent input dimensionality
    m_gplvm = GPy.models.bayesian_gplvm_minibatch.BayesianGPLVMMiniBatch(output, Q, num_inducing=12, kernel=GPy.kern.RBF(Q))
    # Prior on the latent inputs: X* ~ N(input_prior, 0.1)
    m_gplvm.X.set_prior = np.random.normal(input_prior, 0.1)
    # Hyperparameters drawn from half-normal distributions
    m_gplvm.kern.lengthscale = scp.stats.halfnorm.rvs()
    m_gplvm.kern.variance = scp.stats.halfnorm.rvs()
    m_gplvm.likelihood.variance = scp.stats.halfnorm.rvs()
    m_gplvm.optimize(messages=1, max_iters=5e4)
    return m_gplvm

As is apparent, I am setting custom priors for the covariance-function hyperparameters as well as the error variance.

With this setup, the RMSE is absurdly high, which makes me think I am making a mistake somewhere. It would be helpful to know whether someone has already tried a similar problem scenario, or whether I am making an obvious mistake.
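One detail I'm unsure about is whether `set_prior` should be called rather than assigned to. A minimal Python illustration of the difference (no GPy needed; `Param` here is just a stand-in, not the real GPy class):

```python
class Param:
    """Stand-in for a parameter object with a set_prior method."""
    def set_prior(self, prior):
        self.prior = prior

p = Param()
p.set_prior = "N(X, 0.1)"   # assignment merely shadows the method...
print(hasattr(p, "prior"))  # ...so no prior is actually stored: prints False

q = Param()
q.set_prior("N(X, 0.1)")    # calling the method does store the prior
print(q.prior)              # prints N(X, 0.1)
```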

Thanks!

@MartinBubel MartinBubel self-assigned this Oct 24, 2024
@MartinBubel (Contributor)

Hi @Soham6298,
I will take a look at this.
When testing with a simple example, I did not see large losses. Could you share a bit more of your code, ideally in a way that lets me work on data similar to yours so the results are comparable?

@MartinBubel MartinBubel added the need more info If an issue or PR needs further information by the issuer label Oct 24, 2024
@Soham6298 (Author)

Hello @MartinBubel ,

Thanks a lot for looking into this.

Since I am running extensive simulation studies as part of a larger model comparison, I have set up a small notebook that tests GPy on 50 simulated datasets. The datasets were generated from an exact squared-exponential GP, with the true parameters (lengthscale, marginal variance and error variance) sampled from the same distributions as the priors in the GPy model.

I compute the posterior RMSE between the latent inputs recovered by the GPy model and the true X from my simulated data. I also compute a naive RMSE, i.e. the RMSE of the prior latent inputs against the ground truth. I get the following result:

Naive RMSE: 0.1349896448175324
Posterior RMSE: 1.0165256882560296
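One caveat I have considered (an assumption on my part, not verified for this model): GPLVM latents are typically only identified up to sign and scale when Q = 1, so a raw RMSE against the true X can be inflated even when the latent structure is recovered well. A least-squares alignment before computing RMSE would look like this, with a toy sign-flipped example rather than my actual results:

```python
import numpy as np

def aligned_rmse(x_true, x_post):
    """RMSE after fitting x_true ~ a * x_post + b (absorbs sign/scale flips)."""
    A = np.column_stack([x_post, np.ones_like(x_post)])
    coef, *_ = np.linalg.lstsq(A, x_true, rcond=None)
    return np.sqrt(np.mean((x_true - A @ coef) ** 2))

# Toy check: a sign-flipped, rescaled recovery has a large raw RMSE
# but near-zero RMSE after alignment.
rng = np.random.default_rng(1)
x = rng.normal(size=100)
x_hat = -2.0 * x + 0.5
raw = np.sqrt(np.mean((x - x_hat) ** 2))
print(raw, aligned_rmse(x, x_hat))
```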

I am attaching the data and my notebook so that you can run it for yourself.
GPyTest.tar.gz

@MartinBubel (Contributor)

Hi @Soham6298

thanks for uploading an example!
I have looked at it, but I'm afraid I need some more time before I can give a proper answer. Sorry! I hope this is not urgent.

Best, Martin

@Soham6298 (Author)

Hi @MartinBubel

Of course! Let me know if you need additional input from my side. Just to mention: I also ran a Pyro GPLVM on the same datasets, and the Pyro model's RMSE likewise exceeds the naive RMSE shown in the example above.

Best,
Soham
