Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot understand why certain expressions are not instrumented/reported as visited #423

Closed
nojb opened this issue Jul 15, 2023 · 14 comments
Closed

Comments

@nojb
Copy link

nojb commented Jul 15, 2023

Hello, thank you for maintaining this nice tool!

I have been playing with it and am a bit puzzled why the HTML report does not show certain lines as being visited, eg 293, 297, 303 below. I looked at the code briefly but did not find anything that could explain the omission. For 293 at least the instrumentation is present:

image

Any ideas?

image

Another example in 313, 318 below, but in this case, the instrumentation on 313 at least is not present:

image

Any ideas?

image

Thanks!

@aantron
Copy link
Owner

aantron commented Jul 16, 2023

I'm actually not immediately sure why some of those lines are instrumented to begin with. I'm not completely sure what the expectation is -- do you expect all lines to be marked?

Bisect doesn't mark lines, but only essentially out-edges in the control-flow graph that MIGHT not be visited, and assigns each one some meaningful place in the source file to visually represent it. In the case of field assignment expression, IIRC it can't fail, so Bisect does not instrument it, because if the machine begins executing it (visit it in the CFG) it can't fail to proceed to the out-edge.

@aantron
Copy link
Owner

aantron commented Jul 16, 2023

The main complication here is that the presence of exceptions in the language generates additional out-edges, which can cause what would otherwise appear as straight-line code to have non-trivial structure in the CFG, but I don't think (IIRC) this applies to these field assignments.

@aantron
Copy link
Owner

aantron commented Jul 16, 2023

Actually, what is .!?

@aantron
Copy link
Owner

aantron commented Jul 16, 2023

What's happening is that .! is a ternary function, and Bisect can't guarantee that its application evaluates to a value. So it instruments the out-edge of the application at line 303 by placing a point visually on the start of line 304 (the in-edge of that "block"), and likewise, the out-edge of the application at line 304 is visually marked at line 305. The out-edge of the application at line 305 coincides with the start of each loop iteration. Is line 309 in the next image from the same file? If so, there might be a bug there in the instrumentation of for loops, because the in-edge on line 310 might have to be instrumented.

@aantron
Copy link
Owner

aantron commented Jul 16, 2023

Bisect has a list of special function symbols that it assumes are bound to functions whose application evaluates to a value, like +, to avoid generating too much of this kind of instrumentation at every function application. Of course, these assumptions can be violated by user-defined functions that shadow the standard ones.

@nojb
Copy link
Author

nojb commented Jul 17, 2023

Bisect doesn't mark lines, but only essentially out-edges in the control-flow graph that MIGHT not be visited, and assigns each one some meaningful place in the source file to visually represent it.

Thanks, I hadn't grasped this, which explains many of the issues here.

Actually, what is .!?

Sorry, I should have clarified that. .!() is an alias for Float.Array.get, and .!()<- an alias for Float.Array.set.

Is line 309 in the next image from the same file? If so, there might be a bug there in the instrumentation of for loops, because the in-edge on line 310 might have to be instrumented.

Yes, it is from the same file.

@nojb
Copy link
Author

nojb commented Jul 17, 2023

Is line 309 in the next image from the same file? If so, there might be a bug there in the instrumentation of for loops, because the in-edge on line 310 might have to be instrumented.

Yes, it is from the same file.

I think it is actually fine, the for loop is under a lambda:

let f _ =
  ...
  for ... do
  done
in
(* line 310 *)

so it is OK for line 310 not to be instrumented.

@nojb
Copy link
Author

nojb commented Jul 17, 2023

I think most of my questions have been cleared up. Thanks!

@nojb nojb closed this as completed Jul 17, 2023
@aantron
Copy link
Owner

aantron commented Jul 18, 2023

Great! Indeed, then I think Bisect is instrumenting everything correctly here. Float.Array.get and Float.Array.set can raise exceptions (in the general case, absent a proof for particular call site).

The application expression at the end of a for loop at the end of lambda can, I think, be instrumented, though not at line 310. I think Bisect should insert post-application instrumentation at line 305. I'll have to take a look at that. Bisect tries to instrument after an application expression, but not when the application expression is in tail position, because that would ruin tail call optimization and change the stack semantics of programs. However, this application expression is not in tail position, because it is inside a for loop. I'll create a separate issue for that later.

Thank you!

@nojb
Copy link
Author

nojb commented Jul 18, 2023

Bisect tries to instrument after an application expression, but not when the application expression is in tail position, because that would ruin tail call optimization and change the stack semantics of programs.

Related to that, it may be a good idea to give the same treatment to TRMC calls as the instrumentation currently inhibits the transformation (and triggers a warning 71).

@aantron
Copy link
Owner

aantron commented Jul 21, 2023

Thank you! I was not aware this was added to the compiler. Created #424 to track it.

@nojb
Copy link
Author

nojb commented Jul 26, 2023

Coming back to this after a chat with a colleague: can you explain why the current strategy (marking out-edges of the control flow graph) is used instead of a more naïve one (marking every executed expression)? There's probably a simple answer to this, but the current strategy seems a bit less intuitive and harder to explain, so I would like to understand the reason for this strategy better. Thanks!

@aantron
Copy link
Owner

aantron commented Jul 27, 2023

Marking every expression...

(1) Makes the report difficult to present and read, because OCaml programs usually have a relatively large number of expressions on every line, nested within each other. Many of these expressions are executed if and only if their "predecessor" expressions are executed, so the points would be redundant anyway.

(2) There is a small, but eventually significant, performance impact to instrumentation. Given that marking every expression would be redundant as suggested in (1), not marking every expression is a helpful performance improvement.

@nojb
Copy link
Author

nojb commented Jul 27, 2023

Marking every expression...

Right, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants