Ordered simplex #38
Thanks. It's really remarkable how the ordering constraint works to give you order statistics. I still don't quite have my head around why that's true, but I've verified it empirically. That is, we get the same distribution on `alpha` from

```stan
ordered[N] alpha;
...
alpha ~ foo(theta);
```

and from

```stan
vector[N] alpha_unordered;
...
alpha_unordered ~ foo(theta);
alpha = sort_asc(alpha_unordered);
```

Is the application to order statistics? We're moving the language toward allowing composition of constraints. So rather than an ordered simplex, I think we can just compose the ordered transform with the simplex transform. I really want to get this into Stan to deal with the case of non-centered parameterizations of lognormal distributions.
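(The standard order-statistics argument for why these agree: the density of the vector of order statistics of $N$ i.i.d. draws from $f$ is

$$p(\alpha_{(1)}, \ldots, \alpha_{(N)}) = N! \prod_{i=1}^{N} f(\alpha_{(i)}), \qquad \alpha_{(1)} \le \cdots \le \alpha_{(N)},$$

while declaring `ordered[N] alpha` with `alpha ~ foo(theta)` restricts the i.i.d. density to the same ordered cone, which only changes the normalizing constant by the factor $N!$. Both programs therefore define the same distribution on the cone.)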
The order statistics stuff is really cool. It's useful for modeling information given as high/low values or quartiles. Lots of the data used in my line of work is being aggregated due to privacy laws. Personally, I'm in favor, and it's an interesting modeling problem.
I don't see how we can combine the ordered and simplex transforms to get this transform and have it be uniform, because the constraints are coupled: the ordering impacts the simplex at each iteration of the loop below. Whenever I attempted to compose the two transforms one after the other, the result was not a "uniform" ordered simplex (approximately uniform spacing between values); one of the end points is typically much too large. I wonder how TFP manages to compose these two transforms, and whether it's also uniform on the ordered simplex?

```stan
data {
  int<lower=0> N;
  vector<lower=0>[N] alpha;
}
transformed data {
  int Nm1 = N - 1;
}
parameters {
  vector[Nm1] y;
}
transformed parameters {
  simplex[N] x;
  real log_abs_det = 0;
  {
    real remaining = 1; // Remaining amount to be distributed
    real lb = 0;        // The minimum for the next element
    real ub = 1;
    for (i in 1:Nm1) {
      int N_prime = Nm1 + 2 - i; // Number of remaining elements
      // First constrain to [0, 1 / N_prime]
      ub = inv(N_prime);
      real xcons = ub * inv_logit(y[i]);
      log_abs_det += log(ub) + log_inv_logit(y[i]) + log1m_inv_logit(y[i]);
      // Add the lowest element's log density
      log_abs_det += log(N_prime - 1) + log(N_prime) + (N_prime - 2) * log1m(N_prime * xcons);
      x[i] = lb + remaining * xcons;
      lb = x[i];
      // We added remaining * xcons to each of the N_prime elements yet to be processed
      remaining -= remaining * xcons * N_prime;
    }
    x[N] = lb + remaining;
  }
}
model {
  target += log_abs_det;
  // target += target_density_lp(x, alpha);
}
```
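For reference, the "lowest element log density" term in the loop above is the log density of the minimum of a uniform simplex with $N'$ components, a standard result:

$$P\big(\min_i D_i > z\big) = (1 - N'z)^{N'-1}, \quad z \in [0, 1/N'], \qquad\text{so}\qquad f(z) = N'(N'-1)(1 - N'z)^{N'-2},$$

whose log is exactly `log(N_prime - 1) + log(N_prime) + (N_prime - 2) * log1m(N_prime * xcons)` with $z =$ `xcons`.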
Ah, that makes sense. We assume that uniform on the unconstrained scale leads to uniform simplexes, but nothing says that uniform on the ordered scale leads to uniform ordered simplexes. Also, it's not clear where we want a vector of 0 values (the mean initialization value for Stan) to land. If I put all 0s through the ordered transform, I get $\begin{bmatrix} 0 & 1 & \cdots & N-1 \end{bmatrix}$.
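This is just Stan's ordered transform evaluated at the center of the default inits:

$$x_1 = y_1, \qquad x_k = x_{k-1} + \exp(y_k) \quad (k > 1),$$

so $y = \mathbf{0}$ gives unit increments starting from zero.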
Something neat about this is that the average value within each iteration is available in closed form. The cdf of the (relative) lowest element when $N'$ elements remain is $F(z) = 1 - (1 - N'z)^{N'-1}$ on $[0, 1/N']$; the ccdf is just 1 minus this, and we can use that to calculate the mean value:

$$\mathbb{E}[z] = \int_0^{1/N'} (1 - N'z)^{N'-1} \, dz = \frac{1}{N'^2},$$

where I've substituted $N'$ for the number of elements yet to be processed. I can calculate an offset at each iteration s.t. the offset is equal to this average value. Is that what you mean?

```stan
data {
  int<lower=0> N;
  vector<lower=0>[N] alpha;
}
transformed data {
  real half_logN = 0.5 * log(N); // note: unused below
  int Nm1 = N - 1;
}
parameters {
  vector[Nm1] y;
}
transformed parameters {
  simplex[N] x;
  real log_abs_det = 0;
  {
    real remaining = 1; // Remaining amount to be distributed
    real lb = 0;        // The minimum for the next element
    real ub = 1;
    for (i in 1:Nm1) {
      int N_prime = Nm1 + 2 - i; // Number of remaining elements
      // Offset so that inv_logit(off_set) equals the mean 1 / N_prime^2
      real off_set = logit(1.0 / N_prime^2);
      // First constrain to [0, 1 / N_prime]
      ub = inv(N_prime);
      real xcons = ub * inv_logit(off_set + y[i]);
      log_abs_det += log(ub) + log_inv_logit(off_set + y[i]) + log1m_inv_logit(off_set + y[i]);
      // Add the lowest element's log density
      log_abs_det += log(N_prime - 1) + log(N_prime) + (N_prime - 2) * log1m(N_prime * xcons);
      x[i] = lb + remaining * xcons;
      lb = x[i];
      // We added remaining * xcons to each of the N_prime elements yet to be processed
      remaining -= remaining * xcons * N_prime;
    }
    x[N] = lb + remaining;
  }
}
model {
  target += log_abs_det;
  // target += target_density_lp(x, alpha);
}
```
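As a quick numeric check of the offset: with $N' = 4$ elements remaining, the offset is $\operatorname{logit}(1/16) \approx -2.71$, so at $y_i = 0$ we recover $\operatorname{inv\_logit}(-2.71) = 1/16 = 1/N'^2$, the average value derived above.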
Another way to achieve the same thing is the "augmented" approach. The proof is at https://mathoverflow.net/a/312594.

```stan
data {
  int<lower=0> N;
  // vector<lower=0>[N] alpha;
}
transformed data {
  // lsv[j] = 1 / (N - j + 1), the reciprocal of the number of
  // elements from position j onward
  vector[N] lsv = 1 ./ reverse(linspaced_vector(N, 1, N));
}
parameters {
  vector<lower=0>[N] a;
}
transformed parameters {
  vector[N] x;
  real a_sum = sum(a);
  // The k-th smallest value is sum_{j <= k} a[j] / (N - j + 1),
  // normalized by the total below
  for (k in 1:N) {
    x[k] = dot_product(a[1:k], lsv[1:k]);
  }
  x /= a_sum;
}
model {
  target += -a_sum; // a ~ exponential(1)
}
generated quantities {
  real x_sum = sum(x); // should always equal 1
}
```
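The identity behind this (the Rényi representation of uniform order statistics; see the MathOverflow proof linked above): if $a_1, \ldots, a_N$ are i.i.d. $\text{Exponential}(1)$ and $S = \sum_j a_j$, then

$$x_{(k)} = \frac{1}{S} \sum_{j=1}^{k} \frac{a_j}{N - j + 1}, \qquad k = 1, \ldots, N,$$

is jointly distributed as the ordered components of a uniform simplex. Since $\mathbb{E}[a_j / S] = 1/N$ by symmetry, the expected components are $\mathbb{E}\big[x_{(k)}\big] = \frac{1}{N} \sum_{j=1}^{k} \frac{1}{N - j + 1}$, which the next comment checks numerically.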
My notes: the expected values of the ordered uniform simplex are then

$$\mathbb{E}\big[x_{(k)}\big] = \sum_{j=1}^{k} \frac{1}{N (N - j + 1)}.$$

Let's see if it works. In R:

```r
spacings <- function(N) {
  nums <- 0:(N - 1)
  s <- 1 / (N * (N - nums))
  data.frame(vals = cumsum(s), spacings = s)
}
```

```
> spacings(5)
       vals   spacings
1 0.0400000 0.04000000
2 0.0900000 0.05000000
3 0.1566667 0.06666667
4 0.2566667 0.10000000
5 0.4566667 0.20000000
```
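In Stan, a minimal sketch of the same closed-form check (an illustrative translation of the R function above; `vals` and `spacings` mirror the R names):

```stan
data {
  int<lower=1> N;
}
generated quantities {
  // Closed-form expected values of the ordered uniform simplex
  vector[N] spacings;
  vector[N] vals;
  for (j in 1:N) {
    spacings[j] = 1.0 / (N * (N - j + 1));
  }
  vals = cumulative_sum(spacings);
}
```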
See https://discourse.mc-stan.org/t/ordered-simplex-constraint-transform/24102/17?u=spinkney
I'll add the Stan code for this. Martin helped a lot, and this paper made it clear for me: https://arxiv.org/pdf/0708.0176.pdf