First of all, thanks for this great GPy package! I've recently been using GPy to implement sparse GP regression and have run into some issues I haven't been able to resolve, owing to my limited knowledge of GPs. I have around 60,000 two-dimensional inputs and the corresponding one-dimensional outputs. When fitting a sparse GP model, I tried two different approaches to selecting the inducing points: 1) fix them to a subset of the raw input/output data; 2) treat them as free parameters and find the optimal inducing points through the optimization process.
With the first approach the results were pretty good: the log-likelihood converged as long as the number of inducing points was large (e.g., more than 900). However, the second approach, which essentially lets the GP find the optimal inducing points itself, returns very poor log-likelihoods. When I print out those 'optimal' inducing points, I notice that most of them lie outside the range of my training data. So one thing I could try is to bound the inducing points during optimization, but I haven't been able to find such a function in the current model (or maybe it is impossible, or it exists somewhere I haven't found?)
Anyway, I just wanted to ask whether there is already a function somewhere (that I haven't been able to find) to do this, or whether it's simply mathematically impossible. Any feedback would be greatly appreciated!
Thanks and Regards!