Add spec of the Disruptor concurrency library. #150

nicholassm · 2024-09-13T20:39:23Z

The Disruptor is a concurrency library originally developed and open sourced by LMAX Exchange for low latency communication via a ring buffer between producer and consumer threads.

This PR adds a spec of the Disruptor lib and verifies that data races do not occur.

Signed-off-by: Nicholas Schultz-Møller <[email protected]>

ahelwer · 2024-09-13T21:02:44Z

Nice spec! Thanks for the contribution.

nicholassm · 2024-09-13T21:08:16Z

Thanks, you're welcome. :-)

nicholassm · 2024-09-13T22:13:13Z

I can see a lot of checks failing because the number of states, distinct states, etc. differs from what I get when running the Toolbox. Could it be a TLC version issue? Or do symmetry vs. non-symmetry sets for model values differ somehow?

Kind regards
Nicholas

ahelwer · 2024-09-14T01:54:09Z

Those JSON fields are optional, although useful as a regression test. I will take a look tomorrow.

nicholassm · 2024-09-14T10:43:27Z

It looks like it's not my model that fails but a module called MCtcp that results in an exit code 1... What gives?

ahelwer · 2024-09-14T14:51:08Z

Sorry, the spec checking has debug output enabled so it's very verbose. You want to search for the text ERROR:root in the raw log. @lemmy should we keep debug logging enabled for model-checking in the CI? It makes it difficult to sift through the output.
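A minimal sketch of that search, assuming the raw log has been downloaded as text. The function name `error_lines` is illustrative and is not part of the CI scripts:

```python
def error_lines(raw_log, marker="ERROR:root"):
    """Return only the log lines that contain the given marker."""
    return [line for line in raw_log.splitlines() if marker in line]


if __name__ == "__main__":
    # Hypothetical excerpt in the same shape as the CI output.
    sample = (
        "INFO:root:specifications/Disruptor/Disruptor_MPMC.cfg\n"
        "ERROR:root:Model specifications/Disruptor/Disruptor_MPMC.cfg "
        "expected result success but got 255\n"
        "TLC2 Version 2.20\n"
    )
    for line in error_lines(sample):
        print(line)
```

This only narrows the output down to the failing models; the TLC message explaining each failure still has to be read from the lines that follow each match.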

Anyway here is the result:

2024-09-13T21:10:08.0204980Z INFO:root:specifications/Disruptor/Disruptor_MPMC.cfg
2024-09-13T21:10:08.2670270Z INFO:root:specifications/Disruptor/Disruptor_MPMC.cfg in 0.3s vs. 10s expected
2024-09-13T21:10:08.2671430Z ERROR:root:Model specifications/Disruptor/Disruptor_MPMC.cfg expected result success but got 255
2024-09-13T21:10:08.2674970Z ERROR:root:java -enableassertions -Dtlc2.TLC.ide=Github -Dutil.ExecutionStatisticsCollector.id=abcdef60f238424fa70d124d0c77ffff -XX:+UseParallelGC -cp deps/tools/tla2tools.jar:deps/apalache/bin/apalache-mc/lib/apalache.jar:deps/community/modules.jar:deps/tlapm/library tlc2.TLC specifications/Disruptor/Disruptor_MPMC.tla -config specifications/Disruptor/Disruptor_MPMC.cfg -workers auto -lncheck final -cleanup
2024-09-13T21:10:08.2677880Z TLC2 Version 2.20 of Day Month 20?? (rev: caf0c33)
2024-09-13T21:10:08.2680380Z Running breadth-first search Model-Checking with fp 31 and seed -471696565524114047 with 3 workers on 3 cores with 1593MB heap and 64MB offheap memory [pid: 5082] (Mac OS X 14.6.1 aarch64, Eclipse Adoptium 17.0.12 x86_64, MSBDiskFPSet, DiskStateQueue).
2024-09-13T21:10:08.2682070Z Error: TLC threw an unexpected exception.
2024-09-13T21:10:08.2682650Z This was probably caused by an error in the spec or model.
2024-09-13T21:10:08.2683300Z See the User Output or TLC Console for clues to what happened.
2024-09-13T21:10:08.2683890Z The exception was a tlc2.tool.ConfigFileException
2024-09-13T21:10:08.2684460Z : TLC found an error in the configuration file at line 5
2024-09-13T21:10:08.2685000Z It was expecting }, but did not find it.
2024-09-13T21:10:08.2685510Z Finished in 00s at (2024-09-13 21:10:08)
2024-09-13T21:10:08.2685800Z 
2024-09-13T21:10:08.2686070Z INFO:root:specifications/Disruptor/Disruptor_MPMC_liveliness.cfg
2024-09-13T21:10:08.5685730Z INFO:root:specifications/Disruptor/Disruptor_MPMC_liveliness.cfg in 0.3s vs. 10s expected
2024-09-13T21:10:08.5687240Z ERROR:root:Model specifications/Disruptor/Disruptor_MPMC_liveliness.cfg expected result success but got 255
2024-09-13T21:10:08.5691920Z ERROR:root:java -enableassertions -Dtlc2.TLC.ide=Github -Dutil.ExecutionStatisticsCollector.id=abcdef60f238424fa70d124d0c77ffff -XX:+UseParallelGC -cp deps/tools/tla2tools.jar:deps/apalache/bin/apalache-mc/lib/apalache.jar:deps/community/modules.jar:deps/tlapm/library tlc2.TLC specifications/Disruptor/Disruptor_MPMC.tla -config specifications/Disruptor/Disruptor_MPMC_liveliness.cfg -workers auto -lncheck final -cleanup
2024-09-13T21:10:08.5695900Z TLC2 Version 2.20 of Day Month 20?? (rev: caf0c33)
2024-09-13T21:10:08.5698330Z Running breadth-first search Model-Checking with fp 130 and seed -5831432232768886009 with 3 workers on 3 cores with 1593MB heap and 64MB offheap memory [pid: 5083] (Mac OS X 14.6.1 aarch64, Eclipse Adoptium 17.0.12 x86_64, MSBDiskFPSet, DiskStateQueue).
2024-09-13T21:10:08.5700840Z Error: TLC threw an unexpected exception.
2024-09-13T21:10:08.5701540Z This was probably caused by an error in the spec or model.
2024-09-13T21:10:08.5702440Z See the User Output or TLC Console for clues to what happened.
2024-09-13T21:10:08.5703240Z The exception was a tlc2.tool.ConfigFileException
2024-09-13T21:10:08.5704050Z : TLC found an error in the configuration file at line 5
2024-09-13T21:10:08.5704850Z It was expecting }, but did not find it.
2024-09-13T21:10:08.5705510Z Finished in 00s at (2024-09-13 21:10:08)
2024-09-13T21:10:08.5705920Z 
2024-09-13T21:10:08.5706210Z INFO:root:specifications/Disruptor/Disruptor_SPMC.cfg
2024-09-13T21:10:08.8707050Z INFO:root:specifications/Disruptor/Disruptor_SPMC.cfg in 0.3s vs. 10s expected
2024-09-13T21:10:08.8708400Z ERROR:root:Model specifications/Disruptor/Disruptor_SPMC.cfg expected result success but got 255
2024-09-13T21:10:08.8712860Z ERROR:root:java -enableassertions -Dtlc2.TLC.ide=Github -Dutil.ExecutionStatisticsCollector.id=abcdef60f238424fa70d124d0c77ffff -XX:+UseParallelGC -cp deps/tools/tla2tools.jar:deps/apalache/bin/apalache-mc/lib/apalache.jar:deps/community/modules.jar:deps/tlapm/library tlc2.TLC specifications/Disruptor/Disruptor_SPMC.tla -config specifications/Disruptor/Disruptor_SPMC.cfg -workers auto -lncheck final -cleanup
2024-09-13T21:10:08.8716690Z TLC2 Version 2.20 of Day Month 20?? (rev: caf0c33)
2024-09-13T21:10:08.8719090Z Running breadth-first search Model-Checking with fp 9 and seed 5823395496300575269 with 3 workers on 3 cores with 1593MB heap and 64MB offheap memory [pid: 5084] (Mac OS X 14.6.1 aarch64, Eclipse Adoptium 17.0.12 x86_64, MSBDiskFPSet, DiskStateQueue).
2024-09-13T21:10:08.8721110Z Error: TLC threw an unexpected exception.
2024-09-13T21:10:08.8721820Z This was probably caused by an error in the spec or model.
2024-09-13T21:10:08.8722650Z See the User Output or TLC Console for clues to what happened.
2024-09-13T21:10:08.8723490Z The exception was a tlc2.tool.ConfigFileException
2024-09-13T21:10:08.8724220Z : TLC found an error in the configuration file at line 6
2024-09-13T21:10:08.8724960Z It was expecting }, but did not find it.
2024-09-13T21:10:08.8726340Z Finished in 00s at (2024-09-13 21:10:08)
2024-09-13T21:10:08.8726770Z

I believe the reason for the failure is that in your model files you have many sets defined as:

Writers = { w1 w2 }

but this needs to be comma-delimited like:

Writers = { w1, w2 }

I suggest running TLC locally against these model files (will probably need to be done using the command line) to save time on the debug loop; debug-via-CI-run is very time-consuming!
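As a quick pre-check before running TLC, the missing-comma pattern can be spotted with a small heuristic. This is only a sketch under stated assumptions: it looks at set literals written on a single line, treats any whitespace inside an element as suspicious (so string values containing spaces would be flagged too), and the helper name `find_undelimited_sets` is made up for illustration:

```python
import re

# Matches a brace-delimited literal that sits on one line, e.g. "{ w1, w2 }".
SET_LITERAL = re.compile(r"\{([^{}]*)\}")


def find_undelimited_sets(cfg_text):
    """Return (line_number, literal) pairs for set literals that appear
    to separate their elements with spaces instead of commas."""
    problems = []
    for lineno, line in enumerate(cfg_text.splitlines(), start=1):
        for match in SET_LITERAL.finditer(line):
            for element in match.group(1).split(","):
                # More than one token between commas suggests a missing comma.
                if len(element.split()) > 1:
                    problems.append((lineno, match.group(0)))
                    break
    return problems


if __name__ == "__main__":
    cfg = "CONSTANTS\n  Writers = { w1 w2 }\n  Readers = { r1, r2 }\n"
    print(find_undelimited_sets(cfg))
```

It does not replace running TLC against the model files; it just catches this one class of mistake without waiting for a CI run.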

muenchnerkindl · 2024-09-14

Thank you for the nice contribution. You'll find a few comments in the individual files (comments for the multi-writer version are analogous to those for the single-writer version). I am looking forward to seeing this spec added to the collection!

muenchnerkindl · 2024-09-14T14:41:21Z

specifications/Disruptor/Disruptor_SPMC.tla

+CONSTANTS
+  Writers,      (* Writer/publisher thread ids.    *)
+  Readers,      (* Reader/consumer  thread ids.    *)
+  MaxPublished, (* Max number of published events. *)


I presume this constant is only relevant for model checking? It would be cleaner to separate the logical spec from the bounds imposed for model checking and either add a state constraint such as published < MaxPublished in the cfg file or write a MC version of the spec that adds extra guards to actions.

nicholassm · 2024-09-15

Yes and no. I've investigated, and the constant does bound the model, so I agree that it would be better to have it as a state constraint. But then I can't model check the liveness property (that all consumers eventually always read all published events), since liveness cannot be verified when state or action constraints are specified.
Is there a "clean" third option that you know of?

lemmy · 2024-09-16

Beware that "hardcoding" MaxPublished doesn't change anything semantically with respect to liveness checking, except that TLC no longer prints the warning about action and state constraints.

nicholassm · 2024-09-16

I'm not sure I understand. If I replace MaxPublished with a state constraint, I cannot make my liveness property fail (by adding an error to the spec).
As I read the relevant pages in Lamport's book, this is because with a state constraint the action a is still enabled, so WF_x(a) is false, which in turn means the liveness property never fails.
However, if I add the check against the MaxPublished constant as part of an action, then the action is not enabled and the liveness property can fail (if it's wrong).

muenchnerkindl · 2024-09-19

Apologies for the late reply: now I am confused. First, I would rather have defined the liveness property as

Liveliness == \A r \in Readers : \A i \in 1 .. MaxPublished : <>[](i \in 1 .. published => Len(consumed[r]) >= i /\ consumed[r][i] = i-1)

(with the second conjunct on the right-hand side being optional). The reason is that the property you assert will not hold if writers are allowed to continue publishing even after reaching the (artificial) bound, since readers would then be able to update their consumed sequence as well, whereas the above property should always hold. (More generally, the bound on i could be Nat, but that would require overriding Nat during model checking so that TLC doesn't complain about an infinite quantifier bound, and that would be a little heavy.)
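To make the state predicate inside <>[] concrete, here is a rough Python sketch that evaluates it on a single state. It is not a substitute for TLC: it checks one state only and says nothing about "eventually always". The function name and the dictionary representation of consumed are assumptions for illustration, and Python lists are 0-indexed whereas TLA+ sequences are 1-indexed:

```python
def readers_caught_up(published, consumed, readers, max_published):
    """Evaluate, for one state, the body of the proposed property:
    for every reader r and every i in 1..max_published,
    i in 1..published implies Len(consumed[r]) >= i and consumed[r][i] = i-1."""
    for r in readers:
        seq = consumed[r]
        for i in range(1, max_published + 1):
            if 1 <= i <= published:
                # TLA+ consumed[r][i] corresponds to seq[i - 1] in Python.
                if not (len(seq) >= i and seq[i - 1] == i - 1):
                    return False
    return True


if __name__ == "__main__":
    readers = ["r1", "r2"]
    # r2 has not yet consumed the second event, so the predicate is false.
    print(readers_caught_up(2, {"r1": [0, 1], "r2": [0]}, readers, 4))
    # Both readers have consumed events 0 and 1, so it is true.
    print(readers_caught_up(2, {"r1": [0, 1], "r2": [0, 1]}, readers, 4))
```

The implication is vacuously true for every i above published, which is why the quantifier can range over all of 1 .. MaxPublished.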

I then tried commenting out the guard next < MaxPublished in BeginWrite and adding the constraint

StateConstraint == published <= MaxPublished

for model checking. This appears to work just fine, and when I comment out the fairness condition on BeginRead, TLC gives me the expected failure of the temporal property.

I would even suggest removing the fairness condition on BeginWrite from the spec, since why should writers be required to publish items forever? The above liveness property (but of course not the original one) continues to hold for the modified spec.

Of course, it's up to you to decide what you intend as your specification.

lemmy · 2024-09-20

Stephan's Liveliness property would make it possible to model consumed as a counter instead of a sequence (for each reader). This observation also led me to realize that the SPMC specification seems to describe the multicast configuration, since all readers are consuming every value. It might be helpful to include this in a comment for clarity.

lemmy · 2024-09-20

If consumed remains a sequence, you could add the following "action property" (with IsPrefix defined at https://github.com/tlaplus/CommunityModules/blob/master/modules/SequencesExt.tla#L219-L225):

\* We only ever append to the history variable consumed.
Increments == [][\A r \in Readers: IsPrefix(consumed[r], consumed'[r])]_vars
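For intuition, the same append-only check can be sketched in Python over a recorded trace of consumed values. The names `is_prefix` and `only_appends` and the list-of-dictionaries trace format are assumptions for illustration; the CommunityModules definition linked above is the authoritative one for the spec:

```python
def is_prefix(s, t):
    """True if list s is a prefix of list t (s == t counts as a prefix)."""
    return len(s) <= len(t) and t[:len(s)] == s


def only_appends(trace, readers):
    """Check that, in every step of the trace, each reader's consumed
    sequence is only ever extended. Steps that leave it unchanged pass,
    mirroring the stuttering steps allowed by [A]_vars."""
    return all(
        is_prefix(before[r], after[r])
        for before, after in zip(trace, trace[1:])
        for r in readers
    )


if __name__ == "__main__":
    good = [{"r1": []}, {"r1": [0]}, {"r1": [0]}, {"r1": [0, 1]}]
    bad = [{"r1": [0, 1]}, {"r1": [1]}]
    print(only_appends(good, ["r1"]))
    print(only_appends(bad, ["r1"]))
```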

nicholassm · 2024-09-21

Hi @lemmy, I wrote at the top that it's a Single Producer Multiple Consumer Disruptor; that's where I thought I had made it clear which configuration of the Disruptor it is. But perhaps I am assuming that everyone knows all consumers read all events (i.e. "multicast" behaviour), and that is not common knowledge?

nicholassm · 2024-09-22

I made the changes (see a new PR) and I quite like the result: getting rid of the artificial model constraint in the BeginWrite action, removing the requirement that producers publish forever, and moving the bounding of the model to a state constraint. Elegant and a better model.

I still have some work to do for the MPMC model, as it's more complex to write the liveness property: there are multiple producers, and hence it is more difficult to express which sequence number a consumer can actually read. (More details will follow.)

muenchnerkindl · 2024-09-23

Glad to hear that it worked out! I didn't look very closely at the MPMC model; please let me know if you want me to.

nicholassm · 2024-09-14T20:13:04Z

Hi @muenchnerkindl and @ahelwer - super nice with all the feedback - much appreciated.
I'll get busy incorporating your suggestions.

nicholassm · 2024-09-16T19:29:19Z

Hi @muenchnerkindl, @ahelwer and @lemmy, I've fixed all issues related to running the model-checking and I think I've added all your good suggestions. Let me know if you think I can improve the contribution any further. :-)

nicholassm · 2024-09-18T18:31:24Z

Hi guys, anything else I can do to improve the contribution? While it's fresh in my head :-)
Many thanks.

lemmy · 2024-09-20T00:45:29Z

specifications/Disruptor/Disruptor_SPMC.tla

+  published,    (* Write cursor. One for the producer.               *)
+  read,         (* Read cursors. One per consumer.                   *)
+  consumed,     (* Sequence of all read events by the Readers.       *)
+  pc            (* Program Counter of each Writer/Reader.            *)


Nit: Since the value for each process only alternates between Advance and Access, I would consider renaming the variable to something more descriptive, like hasAccess.

Add spec of the Disruptor concurrency library.

3624b6e

Signed-off-by: Nicholas Schultz-Møller <[email protected]>

nicholassm force-pushed the master branch from eb57a7d to 3624b6e Compare September 13, 2024 20:47

lemmy added the enhancement label Sep 13, 2024

Fix model value in .cfg file for SPMC scenario.

956d0cd

Signed-off-by: Nicholas Schultz-Møller <[email protected]>

muenchnerkindl reviewed Sep 14, 2024

nicholassm added 5 commits September 14, 2024 22:23

Fix syntax errors, typos and don't check for deadlock.

ba9b07f

Signed-off-by: Nicholas Schultz-Møller <[email protected]>

WIP: Parameterize values in the slots, add ASSUME clauses and add com…

2df1ee1

…ments. Signed-off-by: Nicholas Schultz-Møller <[email protected]>

Improve documentation.

f6168b2

Signed-off-by: Nicholas Schultz-Møller <[email protected]>

Add ASSUME statements on Writers and Readers.

770d459

Signed-off-by: Nicholas Schultz-Møller <[email protected]>

Fix missing parenthesis.

a88e35b

Signed-off-by: Nicholas Schultz-Møller <[email protected]>

nicholassm force-pushed the master branch from 36406d2 to a88e35b Compare September 16, 2024 19:24

ahelwer merged commit 7ebd914 into tlaplus:master Sep 18, 2024
7 checks passed

lemmy reviewed Sep 20, 2024
