Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Counterexamples As Assumptions #5013

Merged
merged 86 commits into from
Apr 9, 2024

Conversation

Dargones
Copy link
Collaborator

@Dargones Dargones commented Jan 25, 2024

Problem

Right now Dafny reports counterexamples using special syntax that may be difficult to understand. For example, consider the following Dafny program that defines a standard List datatype with a function View that maps the list to a sequence of integers:

datatype Node = Cons(next:Node, value:int) | Nil {
  function View():seq<int> 
    ensures Cons? ==> |View()| > 0 && View()[0] == value && View()[1..] == next.View()
  {
    if Nil? then [] else [value] + next.View()
  }
}

Suppose we were to (incorrectly) assert that the list cannot correspond to the sequence of integers [1, 2, 3], like so:

method m(list:Node) {
  assert list.View() != [1, 2, 3];
}

Currently, Dafny would return the following counterexample:

At "method m(list:Node) {" (file.dfy:20):
    list:Problem.Node = Cons(next := @0, value := 1)
    @0:Problem.Node = Cons(next := @2, value := 2)
    @2:Problem.Node = Cons(next := @4, value := 3)
    @4:Problem.Node = Nil

This counterexample is confusing because it does not explain what the meaning of @0, @1, etc. is. The notation is different from the Dafny syntax, and it also does not capture some of the information that is actually contained within the counterexample because there is no way to express this information using this custom syntax. In particular, the counterexample constrains the value returned by calling View() on the list variable. This constrain might be redundant in this particular example but in general, we would want to capture all the information contained in the counterexample.

Solution

This PR redesigns the counterexample generation functionality so that counterexamples are represented internally and can be printed as Dafny assumptions. For example, for the program above, the counterexample will now look like this:

assume Node.Cons(Node.Cons(Node.Cons(Node.Nil, 3), 2), 1) == list 
    && Node.Cons(Node.Cons(Node.Cons(Node.Nil, 3), 2), 1).View.requires() 
    && [1, 2, 3] == Node.Cons(Node.Cons(Node.Cons(Node.Nil, 3), 2), 1).View() 
    && Node.Cons(Node.Cons(Node.Nil, 3), 2).View.requires() 
    && [2, 3] == Node.Cons(Node.Cons(Node.Nil, 3), 2).View() 
    && Node.Cons(Node.Nil, 3).View.requires() 
    && [3] == Node.Cons(Node.Nil, 3).View() 
    && Node.Nil.View.requires() 
    && [] == Node.Nil.View();

While admittedly more verbose because it explores the return values of functions, this counterexample precisely constrains the argument that leads to the assertion violation. In particular, if you were to insert this assumption into the code and revert the assertion, Dafny would verify that this counterexample is correct.

At a high level, this makes the following changes:

  1. Represent counterexamples as Dafny statements that the user can insert directly into their code to debug the failing assertion.
  2. Make sure the counterexamples are wellformed, i.e. the constraints are ordered in such a way that the resulting assumption verifies.
  3. Support counterexample constraints over function return values (as an example of something that can really only be done using native Dafny syntax)
  4. Automatically fix counterexamples that are internally inconsistent, when such an inconsistency can be easily detected. For instance, a counterexample will never describe negative indices of an array or sequence, call a set empty if it contains elements, etc. All of these are possible in the counterexample model returned by Boogie but we prune such inconsistencies before reporting the counterexample, when possible.

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

@Dargones
Copy link
Collaborator Author

Dargones commented Mar 20, 2024

Edit: right now, one can only get a counterexample in the form of assumptions. To enable test generation behavior, the counterexample model would need to be cross-referenced with the source code, so this is something that could be left as a future PR.

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Mar 21, 2024

Edit: right now, one can only get a counterexample in the form of assumptions. To enable test generation behavior, the counterexample model would need to be cross-referenced with the source code, so this is something that could be left as a future PR.

I think that would be valuable, because if it can be generated, and it does cause an assertion to fail at runtime, it provides a guaranteed counter-example, instead of a counter-example guess (the model produced after an unknown result). It seems like the only way we're aware of to determine if a counter-example guess is correct.

@Dargones
Copy link
Collaborator Author

Have just pushed a commit that adds a warning to notify the user that the counterexample may not be valid due to the solver returning UNKOWN. I would prefer to leave counterexample execution for future work, since it will be a substantial change/addition to the PR. Executing a counterexample is, I think, indeed the only certain way to determine that the counterexample is valid but it is also not clear how applicable such a feature would be (a lot of Dafny code is not intended for execution and code that is intended for execution can fail verification for reasons that would not come up at runtime).

Note that Danfy cannot guarantee that the counterexample
it reports provably violates the assertion or that the assumptions are not
mutually inconsistent (see [^smt-encoding]), so this output should be
expected manually and treated as a hint.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected => inspected

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Mar 22, 2024

Have just pushed a commit that adds a warning to notify the user that the counterexample may not be valid due to the solver returning UNKOWN. I would prefer to leave counterexample execution for future work, since it will be a substantial change/addition to the PR. Executing a counterexample is, I think, indeed the only certain way to determine that the counterexample is valid but it is also not clear how applicable such a feature would be (a lot of Dafny code is not intended for execution and code that is intended for execution can fail verification for reasons that would not come up at runtime).

Sounds good, thanks. I'm not sure what the best terminology is for these potential counterexamples, but we can iterate on that.

where to write the counterexample, as well as the
`-proverOpt:O:model_compress=false` and
`-proverOpt:O:model.completion=true` options.
* `-extractCounterexample` - if verification fails, report a possible
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

subjective nitpick: possible => potential

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@atomb atomb added this pull request to the merge queue Apr 8, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Apr 8, 2024
@atomb atomb added this pull request to the merge queue Apr 8, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Apr 8, 2024
@Dargones Dargones mentioned this pull request Apr 9, 2024
@atomb atomb enabled auto-merge (squash) April 9, 2024 15:28
@keyboardDrummer
Copy link
Member

keyboardDrummer commented Oct 21, 2024

@Dargones This PR changes the counterexample API for dafny server. Do you know if it would be possible to get back the old API ? I think for the IDE an API that only shows either concrete values or holes, would be ideal.

I imagine an API like;

record CounterExampleItem(Position Position, IDictionary<string, Value> Variables, IDictionary<int, Value> Heap);
interface Value;
record Primitive(string content) : Value;
record Unknown() : Value;
record Reference(int number) : Value;
record StructuredValue(string description, IDictionary<string, Value> children) : Value; // description could be the name of the type, or just "object"

At some point I think counterexamples in the IDE should be shown using the debugger UI, which would mean we'd have to implement the debug adapter protocol, which has an API not too different from the one above:

https://microsoft.github.io/debug-adapter-protocol/specification#Types_Variable

@Dargones
Copy link
Collaborator Author

HI, @keyboardDrummer! Here is a quick PR I made that should fix this: #5847 I agree that in the long term it would be great to use the debugger UI, though this might take one/two weeks of full time work...This PR (as in the one I am leaving the comment on) does a lot of heavy lifting towards that goal, I think.

@keyboardDrummer
Copy link
Member

HI, @keyboardDrummer! Here is a quick PR I made that should fix this: #5847 I agree that in the long term it would be great to use the debugger UI, though this might take one/two weeks of full time work...This PR (as in the one I am leaving the comment on) does a lot of heavy lifting towards that goal, I think.

Amazing, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants