-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use libCEED for element-wise error estimate integrals #219
Conversation
e952666
to
9db564b
Compare
a845072
to
3b875d4
Compare
9db564b
to
48be81b
Compare
3b875d4
to
2eb4ebe
Compare
48be81b
to
5c8863f
Compare
77f473a
to
b03405f
Compare
b03405f
to
f6a30b7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Some minor suggestions on function signatures in integrator.cpp
, pushed to hughcars/estimator-libceed-dev
if of interest. There's some libceed function/files I don't think needed.
palace/fem/libceed/integrator.cpp
Outdated
if (input) | ||
{ | ||
PalaceCeedCall(ceed, | ||
CeedOperatorSetField(op, name.c_str(), restr, CEED_BASIS_NONE, v)); | ||
} | ||
else | ||
{ | ||
PalaceCeedCall(ceed, | ||
CeedOperatorSetField(op, name.c_str(), restr, CEED_BASIS_NONE, v)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each branch of these input
if is the same, so I think the branch can be removed. The only changing on input
seems to occur on the name
above. The same thing can be done with the true/false
argument on the other method, with a double dispatch, and dropping the input
bool
from AddOperatorActiveFields
:
void AddOperatorActiveInputFields(unsigned int ops, bool input, Ceed ceed,
CeedElemRestriction restr, CeedBasis basis, CeedOperator op,
std::string name = "u", CeedVector v = CEED_VECTOR_ACTIVE)
{
AddOperatorActiveFields(ops, ceed, restr, basis, op, name, v);
}
void AddOperatorActiveOutputFields(unsigned int ops, bool input, Ceed ceed,
CeedElemRestriction restr, CeedBasis basis, CeedOperator op,
std::string name = "v", CeedVector v = CEED_VECTOR_ACTIVE)
{
AddOperatorActiveFields(ops, ceed, restr, basis, op, name, v);
}
the call sites then drop the true/false
similarly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adopted these in 70aabf6.
palace/fem/libceed/integrator.cpp
Outdated
void AddQFunctionActiveInputsOutputs(unsigned int ops, bool input, Ceed ceed, | ||
CeedBasis basis, CeedQFunction qf, | ||
std::string name = "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every call for these I can see has a compile time known input
value, so this could be split to avoid the nested if statements. Then the call sites are more explicit about what's going on, without needing to check the signature for what the true/false is doing.
void AddQFunctionActiveInputs(unsigned int ops, Ceed ceed, CeedBasis basis,
CeedQFunction qf, std::string name = "u")
{
// Add inputs or outputs with evaluation modes for the active vector of a QFunction.
CeedInt num_comp;
PalaceCeedCall(ceed, CeedBasisGetNumComponents(basis, &num_comp));
if (ops & EvalMode::None)
{
PalaceCeedCall(ceed, CeedQFunctionAddInput(qf, name.c_str(), num_comp, CEED_EVAL_NONE));
}
if (ops & EvalMode::Interp)
{
CeedInt q_comp;
PalaceCeedCall(ceed,
CeedBasisGetNumQuadratureComponents(basis, CEED_EVAL_INTERP, &q_comp));
PalaceCeedCall(
ceed, CeedQFunctionAddInput(qf, name.c_str(), num_comp * q_comp, CEED_EVAL_INTERP));
}
if (ops & EvalMode::Grad)
{
CeedInt q_comp;
PalaceCeedCall(ceed,
CeedBasisGetNumQuadratureComponents(basis, CEED_EVAL_GRAD, &q_comp));
PalaceCeedCall(ceed, CeedQFunctionAddInput(qf, (std::string("grad_") + name).c_str(),
num_comp * q_comp, CEED_EVAL_GRAD));
}
if (ops & EvalMode::Div)
{
CeedInt q_comp;
PalaceCeedCall(ceed,
CeedBasisGetNumQuadratureComponents(basis, CEED_EVAL_DIV, &q_comp));
PalaceCeedCall(ceed, CeedQFunctionAddInput(qf, (std::string("div_") + name).c_str(),
num_comp * q_comp, CEED_EVAL_DIV));
}
if (ops & EvalMode::Curl)
{
CeedInt q_comp;
PalaceCeedCall(ceed,
CeedBasisGetNumQuadratureComponents(basis, CEED_EVAL_CURL, &q_comp));
PalaceCeedCall(ceed, CeedQFunctionAddInput(qf, (std::string("curl_") + name).c_str(),
num_comp * q_comp, CEED_EVAL_CURL));
}
}
void AddQFunctionActiveOutputs(unsigned int ops, Ceed ceed, CeedBasis basis,
CeedQFunction qf, std::string name = "v")
{
// Add inputs or outputs with evaluation modes for the active vector of a QFunction.
CeedInt num_comp;
PalaceCeedCall(ceed, CeedBasisGetNumComponents(basis, &num_comp));
if (ops & EvalMode::None)
{
PalaceCeedCall(ceed,
CeedQFunctionAddOutput(qf, name.c_str(), num_comp, CEED_EVAL_NONE));
}
if (ops & EvalMode::Interp)
{
CeedInt q_comp;
PalaceCeedCall(ceed,
CeedBasisGetNumQuadratureComponents(basis, CEED_EVAL_INTERP, &q_comp));
PalaceCeedCall(ceed, CeedQFunctionAddOutput(qf, name.c_str(), num_comp * q_comp,
CEED_EVAL_INTERP));
}
if (ops & EvalMode::Grad)
{
CeedInt q_comp;
PalaceCeedCall(ceed,
CeedBasisGetNumQuadratureComponents(basis, CEED_EVAL_GRAD, &q_comp));
PalaceCeedCall(ceed, CeedQFunctionAddOutput(qf, (std::string("grad_") + name).c_str(),
num_comp * q_comp, CEED_EVAL_GRAD));
}
if (ops & EvalMode::Div)
{
CeedInt q_comp;
PalaceCeedCall(ceed,
CeedBasisGetNumQuadratureComponents(basis, CEED_EVAL_DIV, &q_comp));
PalaceCeedCall(ceed, CeedQFunctionAddOutput(qf, (std::string("div_") + name).c_str(),
num_comp * q_comp, CEED_EVAL_DIV));
}
if (ops & EvalMode::Curl)
{
CeedInt q_comp;
PalaceCeedCall(ceed,
CeedBasisGetNumQuadratureComponents(basis, CEED_EVAL_CURL, &q_comp));
PalaceCeedCall(ceed, CeedQFunctionAddOutput(qf, (std::string("curl_") + name).c_str(),
num_comp * q_comp, CEED_EVAL_CURL));
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adopted these in 70aabf6.
} | ||
{ | ||
const CeedScalar u2_loc[3] = {u2[i + Q * 0], u2[i + Q * 1], u2[i + Q * 2]}; | ||
CeedScalar coeff[6], J_loc[9]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can coeff[6]
be hoisted too? Provided the MultBAx33
calls are done before each load.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By "hoisted" do you mean put into the outer scope? I would think any compiler would not stack allocate two coeff[6]
objects here since only one is in scope at a time. Is that not true? I hadn't thought about it but certainly moving it out could reduce local stack/register usage if that's not true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a sane compiler would notice, you're right with the scope. I was being dumb and thinking of a heap variable. Should be fine as is.
CEED_QFUNCTION(f_apply_hdivhcurl_error_33)(void *__restrict__ ctx, CeedInt Q, | ||
const CeedScalar *const *in, | ||
CeedScalar *const *out) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these two functions in here, given the coefficients are allowed to be discontinuous, I think the domain spaces are both L2d rather than hdiv and hcurl, for CurlFlux and GradFlux respectively. Judging by how these functions are all written however, I don't believe this matters? I.e. the discontinuity is coming from the coefficient which is evaluated after the basis at the quadrature points. Am I understanding that correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming is addressing not the space of integration but just the space of the input functions for the interpolation/differentiation operations. That is, interpolate into physical space (requires the right Piola transformation), and only then multiply by the coefficient. Does that make sense? So the naming addresses the u
space, not the space of C * u
.
CEED_QFUNCTION(f_apply_hcurlh1d_error_33)(void *__restrict__ ctx, CeedInt Q, | ||
const CeedScalar *const *in, | ||
CeedScalar *const *out) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe these functions are holdovers from when the grad flux estimation was h1d
yes? Not sure they're still needed given they'd only be applicable if there was only a constant material property coefficient. I don't think specializing the estimation class on that assumption seems worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They aren't needed but I figured they might be worth keeping in case down the line this type of element-wise integration is desired. There are a few other QFunctions that are also currently unused but left just for helpfulness later on.
CeedVector estimates_vec; | ||
ceed::InitCeedVector(estimates, ceed, &estimates_vec); | ||
|
||
// Do the integration (both input vectors are passive). For the complex case, add sum of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passive here means that neither is modified by the operations correct? so there's no possibility of race conditions/can be easily vectorized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly, inputs are always read-only anyway. We have an operator d = f(u, v)
which isn't nicely representable with a single active input vector. So both are just labeled as "passive" inputs.
889860f
to
ebc2326
Compare
…on to assemble CeedOperator for element-wise integration
…es full error estimation on GPU Currently only for electrostatics.
…or error estimates
…every error estimate calculation
…into sjg/estimator-libceed-dev
ebc2326
to
3678622
Compare
Resolves #213. Full error estimate calculation now takes place on GPU without host-device transfer.
Performance comparison for estimation time (the loop over mesh elements) on a particularly large example problem (on CPU):
main
: 1209.6 s