-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Computation of Full Tensorial Bernstein Form (general discussion) #3
Comments
We saw that a "loop-based" approach such that: (1) uses Julia's column-major ordering (see this performance tip), i.e. you want the inner loops to run through the left-most indices, looks like the following, in the particular case of dimension 8: function fauxc(a, b, c, d, e, f, g, h)
A = Array{Float64}(length(a), length(b), length(c), length(d), length(e), length(f), length(g), length(h))
@inbounds begin
for i8 in eachindex(h)
for i7 in eachindex(g)
aux_i7_i8 = g[i7] * h[i8]
for i6 in eachindex(f)
aux_i6_i7_i8 = f[i6] * aux_i7_i8
for i5 in eachindex(e)
aux_i5_i6_i7_i8 = e[i5] * aux_i6_i7_i8
for i4 in eachindex(d)
aux_i4_i5_i6_i7_i8 = d[i4] * aux_i5_i6_i7_i8
for i3 in eachindex(c)
aux_i3_i4_i5_i6_i7_i8 = c[i3] * aux_i4_i5_i6_i7_i8
for i2 in eachindex(b)
aux_i2_i3_i4_i5_i6_i7_i8 = b[i2] * aux_i3_i4_i5_i6_i7_i8
for i1 in eachindex(a)
A[i1, i2, i3, i4, i5, i6, i7, i8] = a[i1] * aux_i2_i3_i4_i5_i6_i7_i8
end
end
end
end
end
end
end
end
end
return A
end Notes: (1) if we go for this approach, it should be possible to generalize with the |
Here is a general version that should do the same but work for an arbitrary number of vectors. However, it is a lot slower due to heavy allocations (which I do not fully understand). The sorting is not implemented, but I expect it to boost performance additionally for vectors of different length. function kron_nested(vectors::Vector{Vector{Float64}}, A=Array{Float64}(ntuple(i -> length(vectors[i]), length(vectors))))
n = length(vectors)
@assert n > 1 "need at least two vectors"
# sort by length, in descending order (i.e., smallest last)
# vectors = sort(TODO)
# precompute first multiplications
cache = Vector{Float64}(n-1)
cache[n-1] = vectors[n][1]
i = n-1
while i > 1
cache[i-1] = vectors[i][1] * cache[i]
i -= 1
end
# allocate result (commented and moved to input argument)
# A = Array{Float64}(ntuple(i -> length(vectors[i]), n))
# fill result
indices = ones(Int, n)
v = vectors[1] # leftmost vector
length_v = length(v) # length of the leftmost vector
lengths = [length(v) for v in vectors]
while true
# loop where result is written (only one multiplication missing)
indices[1] = 1
while indices[1] <= length_v
A[indices...] = v[indices[1]] * cache[1]
indices[1] += 1
end
# reset indices that reached their bounds
i = 2
while i <= n && indices[i] == lengths[i]
indices[i] = 1
i += 1
end
# termination criterion: last index reached its bounds
if i == n+1
break
end
# increment next index
indices[i] += 1
# update cache for all indices that have changed
if i == n
# special case: next index is the last index
cache[i-1] = vectors[i][indices[i]]
else
# normal case
cache[i-1] = vectors[i][indices[i]] * cache[i]
end
i -= 1
while i > 1
cache[i-1] = vectors[i][1] * cache[i]
i -= 1
end
end
return A
end |
The generalization of I copy the approach here that uses the dimensions of the preallocated output array @generated function fauxc(A::AbstractArray{T,n}, v::VectorOfArray) where {T,n}
quote
cache = Vector{Float64}($n-2)
@inbounds for $(Symbol(:i_, n)) in eachindex(v[$n])
@nloops $(n-1) i i -> eachindex(v[i]) d -> d == 1 ? nothing : d == $n-1 ? cache[d-1] = v[i_{d}, d] * v[i_{d+1}, d+1] : cache[d-1] = cache[d] * v[i_{d}, d] begin (@nref $n A i) = v[i_1, 1] * cache[1] end
end
end
end The arguments to
is just the content of the inner-most loop. |
Great ! (You did my job again Marcelo :P)
Next steps:
I will open an issue for each of these sub-problems. Let keep this issue for the general discussions. |
Thanks for the clear write-up. While reading Smith's thesis, i noticed he pays special attention to the computation of rigorous error bounds using interval arithmetic. While we don't support this yet, it shouldn't be hard to setup here, extending our interface and using I also liked his approach for an API dealing with either the Bernstein tensor form and the implicit Bernstein form (see Appendix B towards the end of the thesis), they are fields of some bigger data type that contains the polynomial and the (box) domain. I think we should follow this idea, using a type parametric in a concrete subtype of an I plan to send a PR in these directions that set the ground for your point (2) in "Next Steps" above. Points (1) and (3) are quite independent and can be done separately. |
@mforets Yes. Implicit Bernstein form is important when one wants to compute a given subset of the Bernstein coefficients (for example when bounding a polynomial). The computation of the full set of Bernstein coefficients is necessary when one wants to provide a precise convex enclosure of the polynomial function. To summarize:
In my previous work, I only used the Bernstein expansion for the min/max application. However, in a library point of view, I think it is important to give both possibilities as one may need the full Bernstein expansion. About, the interval arithmetic: Bernstein coefficients of monomials are rational. So rational arithmetic is enough currently. Interval arithmetic is important to allow us to handle cases where the coefficients of the polynomial are not rational (ex: sqrt(2), pi, etc...). |
Idea
Let
M
be a multivariate monomial.Given four variables
x_1,x_2,x_3,x_4
one example would beM = (x_1^2)*(x_2^2)*(x_4)
.In this example,
M
is constituted by the product of 4 univariate monomialsM_i
:M_1 = (x_1^2), M_2 = (x_2^2), M_3 = (x_3^0) = 1, M_4 = (x_4)
Assuming
B_i
is the Bernstein expansion ofM_i
, one can obtain the Bernstein expansionB
ofM
by computing the Kronecker products (that we note here(x)
) of its constituting univariate monomial Bernstein expansionsB_i
.B = ((B_1(x)B_2) (x)B_3) (x)B_4
Please note the ordering of the operations is important, so commutativity does not hold.
Current implementation
Points to improve
berncoeffs
array is always re-allocated, copied etc...kron
using GPU implementation ? (ex:@cuda
macro) Only when dealing with float32 ?... Worker 1:
B12 = (B_1(x)B_2)
... Worker 2:
B34 = (B_3(x)B_4)
... Syncro
... Worker 1
B = (B12 (x) B34)
Dictionaries can be used to keep the Bernstein expansion of often used multivariate polynomials.
Key = [M_degrees, expansion_degree] (in the example:
M_degrees = 2,2,0,1
)Data = B
Bernstein expansions of multivariate monomials when the
expansion_degree = M_degrees
and on the[0,1]^n
box are sparse. Currently, we do not take advantage of this.The text was updated successfully, but these errors were encountered: