Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Strings and Rationals to the Princess backend #391

Merged
merged 117 commits into from
Nov 24, 2024

Conversation

daniel-raffler
Copy link
Contributor

@daniel-raffler daniel-raffler commented Sep 13, 2024

Hello everyone,
the goal of this pull request is to add two more theories for strings and rational numbers to the Princess backend. It is a continuation of the work that went into #257. Some of the issues that blocked us from merging it back then have since been resolved, and at least the rational formula manager is actually starting to look pretty good now. Here is a comparison between Princess and some other solvers when running BMC on the ReachSafety-Floats task with option cpa.predicate.encodeFloatAs=RATIONAL set to test the rational formula manager:

Screenshot from 2024-10-17 14-14-36

It's a bit harder to evaluate how well the String manager is working now. The Ostrich team has been very helpful and they quickly fixed a couple of bugs that we've reported. However, since CPAchecker doesn't use String theory there is no easy way of trying it out.

Linked issues:

Princess (Rationals)

Ostrich (Strings)

I've added (partial) workarounds for the two Princess issues, but some work may still be needed. For Ostrich the biggest limitation is that the solver can only handle constraints where (at least) one of the sides is a singleton (= constant) string. Something like "a str.<= b" where both a and b are variables is already impossible. This is a known issue and they're planning a redesign for the solver that will allow it to handle such constraints. In the mean time we might want to add some additional error messages to make sure that the user knows why their program is failing.

serras and others added 30 commits December 6, 2021 11:34
Please apply 'ant format-diff' before commiting,
or the extended version 'ant format-source'.
Scala is not upwards and not downwards compatible.
The upcoming Ostrich library is only available for Scala 2.12,
so we downgrade Scala for Princess.
This is an initial step and the usage is still unclear. This is not tested.
… theory and arrays.

Princess intenally converts all real-arrays to int-arrays
and thus there is some casting and quantification in SMT.
Princess needs non-rational UF arguments to be explicitly casted to rational.
We're now evaluating formulas directly on the stack as there are several issues in Princess that block us from using the partial model with rational numbers. Once those issues have been resolved we can revert this change and move back to using the partial model.
… would unregister themselves from the prover while being closed by it.
@daniel-raffler daniel-raffler marked this pull request as ready for review October 29, 2024 14:57
@daniel-raffler
Copy link
Contributor Author

Princess/Ostrich has released new versions of their binaries that include the latest bug fixes for this branch.

@kfriedberger Could you handle uploading the files to our repository? We'll need version 2024-11-08 for Princess (maven) and 1.4.1 for Ostrich (maven). I believe some of the dependencies will also have to be updated and uploaded to our mirror.

@daniel-raffler
Copy link
Contributor Author

@baierd Could you have another look at this branch to see if there are any remaining issues?

@kfriedberger
Copy link
Member

kfriedberger commented Nov 16, 2024

Hi @daniel-raffler .
I uploaded Princess 2024-11-08 and Ostrich 1.4.1 to the Sosy-Lab Ivy repository.

PS:
You can also do the Ivy publication on your own. The requirement is a Sosy-Lab SVN account with write-access to the SVN repository at https://svn.sosy-lab.org/software/ivy/ . The command lines are documented in Libraries.txt.
Please note that that repository must be used in "append-only" style, i.e. you must never ever (*) modify or remove an existing library or XML file.

(*) except in very rare and really critical cases 😄

This fixes a compilation error when ant tries to find JavaCup 11a. We already provide JavaCup 11b explicitly.
<ivy pattern="https://repo1.maven.org/maven2/[organisation]/[module]/ivy-[revision].xml"/>
<artifact pattern="https://repo1.maven.org/maven2/[organisation]/[module]/[revision]/[artifact]-[revision].[ext]"/>
</url>
</chain>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this change or is this just for development?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it was for development only. Once everything is in our repository it can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've restored the original version now.

* updated.
*/
protected final List<IFormula> evaluatedTerms = new ArrayList<>();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a bigger issue for Princess.
Can we add a plain test for Princess (without this code) to see where Princess fails to be consistant?
Can we ask the authors to provide a consistant model?

Copy link
Contributor Author

@daniel-raffler daniel-raffler Nov 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a bigger issue for Princess. Can we add a plain test for Princess (without this code) to see where Princess fails to be consistant? Can we ask the authors to provide a consistant model?

I'm afraid in this case the problem here is really with the work-around, and not with Princess. We effectively emulate (get-values (<expr>)) with a sequence of commands:

(push 1)
(assert (= <new_variable> <expr>))
(check-sat)
(get-model) 
// Now get the value from the entry for <new_variable>
(pop 1)

The problem is than the sat check causes the solver to forget the old model, and because of this it can choose a different assignment for the variables. For instance, if the only assertion on the stack is x < y then calling evaluate(x) twice may return different values as the solver simply chooses a different model each time. You can see this by replacing the code in PrincessAbstractProver.getEvaluatedTerms() with ImmutableList.of(). This should cause tests in ModelTest and SolverAllSatTest to fail as the model is not kept consistent across multiply calls to evaluate().

EDIT:
I did some testing earlier and we may not actually need the call to prover.addEvaluatedTerm() in PrincessModel.evalImpl(). Just adding the assignments from the model in PrincessAbstractProver.isUnsat() seems to be enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of PrincessAbstractProver.isUnsat() should not depend on earlier model evaluations, because it can produce a completely different model. The current implementation seems to work well, even if its runtime is not optimal. We can improve this in the future.

// Extending the partial model does not seem to work in Princess if the formula uses rational
// variables. To work around this issue we (temporarily) add the formula to the assertion
// stack and then repeat the sat-check to get the value.
if (expr instanceof ITerm) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How expensive is this workaround?

Can we ask the Princess developers for a proper evaluation? I would assume that Princess could implement this internally, including caching for repeated calls, memoization of assignments comparing across different evaluations, etc.

Copy link
Contributor Author

@daniel-raffler daniel-raffler Nov 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How expensive is this workaround?

I've just run bmc on the sv-comp task set as a benchmark:

Screenshot From 2024-11-17 17-31-55
The brown line uses Princess with the main, and the green(?) line is for the updated version from this branch. Since we're not using floating point or string formulas, most of the difference between the two should be down to the changes in evalImpl (and possibly some changes in Princess itself). Note that the old version (from main) actually crashes on quite a number of the tasks due to #308. The workaround has the side-effect of also fixing this issue. If we filter for just the tasks that return the correct result with the Princess version from main the graph looks like this:

Screenshot From 2024-11-17 17-32-45
From those results it doesn't look like the work-around is adding too much overhead, especially as the problem size increases. It might still be worth looking into other benchmarks that don't use bmc and see if there is any significant difference there.

Can we ask the Princess developers for a proper evaluation? I would assume that Princess could implement this internally, including caching for repeated calls, memoization of assignments comparing across different evaluations, etc.

That would of course be ideal. We already have two Princess tickets (uuverifiers/princess#7 and uuverifiers/princess#8) about issues with the evaluation of rational formulas, and another one (uuverifiers/princess#17) for the crashes on integer formulas that I mentioned earlier (see #308). The first two are already quite old and I don't know what the current status is. Maybe we could just ask for an update? For the last one I was told that they will need more time for a fix as the issues doesn't seem to be entirely trivial. Because of that I would prefer to merge this branch with the workaround now and then come back later for a proper fix.

This will revert the changes made in 262bb96 to simplify development.
@daniel-raffler
Copy link
Contributor Author

Hi @daniel-raffler . I uploaded Princess 2024-11-08 and Ostrich 1.4.1 to the Sosy-Lab Ivy repository.

Great work! Thanks for your help.

PS: You can also do the Ivy publication on your own. The requirement is a Sosy-Lab SVN account with write-access to the SVN repository at https://svn.sosy-lab.org/software/ivy/ . The command lines are documented in Libraries.txt. Please note that that repository must be used in "append-only" style, i.e. you must never ever (*) modify or remove an existing library or XML file.

(*) except in very rare and really critical cases 😄

Thanks for the tip! I'll have to look into this and see if I actually have write-access to the SVN repository.

}
}
return builder.toString();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a comment, not blocking the merge:
Does escaping and unescaping with Strings align with all operations like String comparison and String inclusions? This feature seems to be quite complex.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's a good question..

I wrote this method since many/most of the tests in StringFormulaManagerTest use (SMTLIB) escaped Unicode, which Princess does not seem to support. It should be fine for that purpose, but I'm not sure how well escaping/unescaping will work in general. This shouldn't be a problem for this pull request as we're currently only unescaping input Strings, but don't try to escape the output.

However, we might still want to have a closer look at this. To be honest I'm not even quite sure what the format of value in StringFormulaManager.makeString(String value) is supposed to be. Several of the tests use SMTLIB escape sequences here, but the argument is really just a regular Java String. Do we want allow Unicode characters in it or not? When I try makeString("\\u{39E}") different solver return different results: CVC5 and Princess recognize the escape sequence and treat it as "Ξ", whereas CVC4 and Z3 do not and simply return \u{39E} again. On the other hand makeString("Ξ") will throw and exception for CVC4/5 (because of the invalid character), however it passes for Princess, and Z3 will read the String as \u{ce}\u{9e}, which is the UTF-8 encoding of "Ξ" and doesn't seem quite right either.

Maybe it would be best to figure this out as part of another issue and then come up with a common definition? In the mean time I may also ask the Ostrich developers to add support for escape Unicode characters as they are part of the SMTLIB standard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened a new issue for that: #412

Feel free to add more details there.

Copy link
Member

@kfriedberger kfriedberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks good to me.
There are some bugs or missing features in Princess.
We can improve this in the future.

@daniel-raffler Feel free to merge this PR.

@kfriedberger kfriedberger merged commit ad84a68 into master Nov 24, 2024
3 checks passed
@kfriedberger kfriedberger deleted the 257-more-theories-for-princess branch November 24, 2024 13:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Princess: No StringFormulaManager
4 participants