-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Strings and Rationals to the Princess backend #391
Conversation
Please apply 'ant format-diff' before commiting, or the extended version 'ant format-source'.
Scala is not upwards and not downwards compatible. The upcoming Ostrich library is only available for Scala 2.12, so we downgrade Scala for Princess.
This is an initial step and the usage is still unclear. This is not tested.
… theory and arrays. Princess intenally converts all real-arrays to int-arrays and thus there is some casting and quantification in SMT.
Princess needs non-rational UF arguments to be explicitly casted to rational.
We're now evaluating formulas directly on the stack as there are several issues in Princess that block us from using the partial model with rational numbers. Once those issues have been resolved we can revert this change and move back to using the partial model.
… would unregister themselves from the prover while being closed by it.
Princess/Ostrich has released new versions of their binaries that include the latest bug fixes for this branch. @kfriedberger Could you handle uploading the files to our repository? We'll need version |
@baierd Could you have another look at this branch to see if there are any remaining issues? |
Hi @daniel-raffler . PS: (*) except in very rare and really critical cases 😄 |
This fixes a compilation error when ant tries to find JavaCup 11a. We already provide JavaCup 11b explicitly.
build/ivysettings.xml
Outdated
<ivy pattern="https://repo1.maven.org/maven2/[organisation]/[module]/ivy-[revision].xml"/> | ||
<artifact pattern="https://repo1.maven.org/maven2/[organisation]/[module]/[revision]/[artifact]-[revision].[ext]"/> | ||
</url> | ||
</chain> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this change or is this just for development?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it was for development only. Once everything is in our repository it can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've restored the original version now.
* updated. | ||
*/ | ||
protected final List<IFormula> evaluatedTerms = new ArrayList<>(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be a bigger issue for Princess.
Can we add a plain test for Princess (without this code) to see where Princess fails to be consistant?
Can we ask the authors to provide a consistant model?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be a bigger issue for Princess. Can we add a plain test for Princess (without this code) to see where Princess fails to be consistant? Can we ask the authors to provide a consistant model?
I'm afraid in this case the problem here is really with the work-around, and not with Princess. We effectively emulate (get-values (<expr>))
with a sequence of commands:
(push 1)
(assert (= <new_variable> <expr>))
(check-sat)
(get-model)
// Now get the value from the entry for <new_variable>
(pop 1)
The problem is than the sat check causes the solver to forget the old model, and because of this it can choose a different assignment for the variables. For instance, if the only assertion on the stack is x < y
then calling evaluate(x)
twice may return different values as the solver simply chooses a different model each time. You can see this by replacing the code in PrincessAbstractProver.getEvaluatedTerms()
with ImmutableList.of()
. This should cause tests in ModelTest
and SolverAllSatTest
to fail as the model is not kept consistent across multiply calls to evaluate()
.
EDIT:
I did some testing earlier and we may not actually need the call to prover.addEvaluatedTerm()
in PrincessModel.evalImpl()
. Just adding the assignments from the model in PrincessAbstractProver.isUnsat()
seems to be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of PrincessAbstractProver.isUnsat()
should not depend on earlier model evaluations, because it can produce a completely different model. The current implementation seems to work well, even if its runtime is not optimal. We can improve this in the future.
// Extending the partial model does not seem to work in Princess if the formula uses rational | ||
// variables. To work around this issue we (temporarily) add the formula to the assertion | ||
// stack and then repeat the sat-check to get the value. | ||
if (expr instanceof ITerm) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How expensive is this workaround?
Can we ask the Princess developers for a proper evaluation? I would assume that Princess could implement this internally, including caching for repeated calls, memoization of assignments comparing across different evaluations, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How expensive is this workaround?
I've just run bmc
on the sv-comp
task set as a benchmark:
The brown line uses Princess with the main
, and the green(?) line is for the updated version from this branch. Since we're not using floating point or string formulas, most of the difference between the two should be down to the changes in evalImpl
(and possibly some changes in Princess itself). Note that the old version (from main
) actually crashes on quite a number of the tasks due to #308. The workaround has the side-effect of also fixing this issue. If we filter for just the tasks that return the correct result with the Princess version from main
the graph looks like this:
From those results it doesn't look like the work-around is adding too much overhead, especially as the problem size increases. It might still be worth looking into other benchmarks that don't use bmc
and see if there is any significant difference there.
Can we ask the Princess developers for a proper evaluation? I would assume that Princess could implement this internally, including caching for repeated calls, memoization of assignments comparing across different evaluations, etc.
That would of course be ideal. We already have two Princess tickets (uuverifiers/princess#7 and uuverifiers/princess#8) about issues with the evaluation of rational formulas, and another one (uuverifiers/princess#17) for the crashes on integer formulas that I mentioned earlier (see #308). The first two are already quite old and I don't know what the current status is. Maybe we could just ask for an update? For the last one I was told that they will need more time for a fix as the issues doesn't seem to be entirely trivial. Because of that I would prefer to merge this branch with the workaround now and then come back later for a proper fix.
This will revert the changes made in 262bb96 to simplify development.
Great work! Thanks for your help.
Thanks for the tip! I'll have to look into this and see if I actually have write-access to the SVN repository. |
and change import of Map-Entry for consistency.
} | ||
} | ||
return builder.toString(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a comment, not blocking the merge:
Does escaping and unescaping with Strings align with all operations like String comparison and String inclusions? This feature seems to be quite complex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that's a good question..
I wrote this method since many/most of the tests in StringFormulaManagerTest
use (SMTLIB) escaped Unicode, which Princess does not seem to support. It should be fine for that purpose, but I'm not sure how well escaping/unescaping will work in general. This shouldn't be a problem for this pull request as we're currently only unescaping input Strings, but don't try to escape the output.
However, we might still want to have a closer look at this. To be honest I'm not even quite sure what the format of value
in StringFormulaManager.makeString(String value)
is supposed to be. Several of the tests use SMTLIB escape sequences here, but the argument is really just a regular Java String. Do we want allow Unicode characters in it or not? When I try makeString("\\u{39E}")
different solver return different results: CVC5 and Princess recognize the escape sequence and treat it as "Ξ"
, whereas CVC4 and Z3 do not and simply return \u{39E}
again. On the other hand makeString("Ξ")
will throw and exception for CVC4/5 (because of the invalid character), however it passes for Princess, and Z3 will read the String as \u{ce}\u{9e}
, which is the UTF-8 encoding of "Ξ"
and doesn't seem quite right either.
Maybe it would be best to figure this out as part of another issue and then come up with a common definition? In the mean time I may also ask the Ostrich developers to add support for escape Unicode characters as they are part of the SMTLIB standard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened a new issue for that: #412
Feel free to add more details there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR looks good to me.
There are some bugs or missing features in Princess.
We can improve this in the future.
@daniel-raffler Feel free to merge this PR.
Hello everyone,
the goal of this pull request is to add two more theories for strings and rational numbers to the Princess backend. It is a continuation of the work that went into #257. Some of the issues that blocked us from merging it back then have since been resolved, and at least the rational formula manager is actually starting to look pretty good now. Here is a comparison between Princess and some other solvers when running BMC on the
ReachSafety-Floats
task with optioncpa.predicate.encodeFloatAs=RATIONAL
set to test the rational formula manager:It's a bit harder to evaluate how well the String manager is working now. The Ostrich team has been very helpful and they quickly fixed a couple of bugs that we've reported. However, since CPAchecker doesn't use String theory there is no easy way of trying it out.
Linked issues:
Princess (Rationals)
Ostrich (Strings)
I've added (partial) workarounds for the two Princess issues, but some work may still be needed. For Ostrich the biggest limitation is that the solver can only handle constraints where (at least) one of the sides is a singleton (= constant) string. Something like "a str.<= b" where both
a
andb
are variables is already impossible. This is a known issue and they're planning a redesign for the solver that will allow it to handle such constraints. In the mean time we might want to add some additional error messages to make sure that the user knows why their program is failing.