Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make meet in AddressDomain more precise #1468

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

michael-schwarz
Copy link
Member

@michael-schwarz michael-schwarz commented May 15, 2024

Currently, in the address domain inside the buckets, for the meets we use ProjectiveSetPairwiseMeet

let meet m1 m2 =
let meet_buckets b1 b2 acc =
B.fold (fun e1 acc ->
B.fold (fun e2 acc ->
if B.may_be_equal e1 e2 then
add e1 (add e2 acc)
else
acc
) b2 acc
) b1 acc
in
fold_buckets (fun _ b1 acc ->
fold_buckets (fun _ b2 acc ->
meet_buckets b1 b2 acc
) m2 acc
) m1 (empty ())

where B.may_be_equal delegates to

let may_be_equal a b = Option.value (Addr.semantic_equal a b) ~default:true

Addr.semantic_equal given by

let semantic_equal x y = match x, y with
| Addr x, Addr y -> Mval.semantic_equal x y
| StrPtr s1, StrPtr s2 -> SD.semantic_equal s1 s2
| NullPtr, NullPtr -> Some true
| UnknownPtr, UnknownPtr
| UnknownPtr, Addr _
| Addr _, UnknownPtr
| UnknownPtr, StrPtr _
| StrPtr _, UnknownPtr -> None
| _, _ -> Some false

which calls SD.sematic_equal

let semantic_equal x y =
match x, y with
| None, _
| _, None -> Some true
| Some a, Some b -> if a = b then None else Some false

this is needed to handle all sorts of different offsets that may be semantically equal, i.e., evaluate to the same physical address. In this case it keeps both elements.

After some discussion with @jerhard and @hseidl, we reached the following insights:

This behavior is overly conservative, as the mess of the address lattice contains some sub-lattices where the meet operation is semantically correct.

This now checks via a new predicate amenable_to_meet whether for the two addresses a and b this relationship between values and the meet holds. If this is the case, a meet is performed and the element added to the resulting set only if it is distinct from bot.

Currently, we answer amenable_to_meetby true whenever both arguments are strings, or when the both are addresses that differ in the numeric(!) offsets only.

This PR also adds two tests where this yields additional precision.

Closes #1467

@michael-schwarz michael-schwarz changed the title Make meet in AddressDomain more Precise Make meet in AddressDomain more precise May 15, 2024
@michael-schwarz michael-schwarz marked this pull request as ready for review May 15, 2024 14:17
Comment on lines 184 to +185
val may_be_equal: elt -> elt -> bool
val amenable_to_meet: elt -> elt -> bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if the two predicates could be combined into one more meaningful thing. Because I guess now there's supposed to be some kind of relationship between them, given that we only check them in certain order.

Would this be the correct description?

val same_sublattice: elt -> elt -> bool option
(** @return [Some true] if definitely in same sublattice, [Some false] if definitely not, [None] if unknown. *)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm skeptical here: may_be_equal is a strange semantic property for which there is no real definition in terms of lattices.

So I think using lattice terminology here will be only confusing. For example what is difference between Some false and None in your proposed definition above:

  • Even if they are definitely not from the same sublattice, they may still be equal or not
  • Same if I can't determine whether they are in the same sublattice.

So it once again boils down to the two properties above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  type eqinfo = Maybe | No | Meet
  val may_be_equal: elt -> elt -> eqinfo

One could do something like this but I don't think that is particularly nice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm skeptical here: may_be_equal is a strange semantic property for which there is no real definition in terms of lattices.

I agree and that's why I'm wondering if there is a less ad-hoc way to view these things. same_sublattice might not be the right thing, but possibly something else.

There are now two predicates who have 4 possible combined results for a pair of arguments, but only three cases are distinguished where the predicates are used.
Suppose some addresses x and y had may_be_equal x y = false before. Then neither was added to the result.
Now if amenable_to_meet x y = true, then meet x y is added to the result, which makes the meet of the entire sets larger (i.e. less precise) than before.

This makes me wonder about the order of these checks. If the previous implementation was sound, then checking may_be_equal first (to add nothing) would still be sound and cannot become less precise due to amenable_to_meet, which could then be checked second to determine whether meet x y is added or both x and y. This would be more precise than the previous implementation by construction.

Or if the case may_be_equal x y = false and amenable_to_meet x y = true should be impossible, then why? That would indicate there being some implicit connection between the two predicates (a semantic and a lattice-theoretic) after all.

@@ -110,6 +110,11 @@ struct
| StrPtr _, UnknownPtr -> None
| _, _ -> Some false

let amenable_to_meet x y = match x,y with
| StrPtr _, StrPtr _ -> true
| Addr x, Addr y when Mval.equal (Mval.top_indices x) (Mval.top_indices y) -> true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't the address set buckets already guarantee this to only be called in such cases because Mval.top_indices are the representatives?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if the buckets guarantee this already, this function can be called with two arbitrary addresses, so we can not just return true.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, although it could just be an implementation detail (like the buckets themselves are), which isn't exposed to the outside by the interface. Then such misuse would be impossible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Meet of string pointers in address domain imprecise
2 participants