Rough draft of a database library, for discussion #76

qaphla · 2024-10-15T15:14:49Z

This is a first attempt at a database library, which is not intended to work as is (and includes correspondingly many assumes, and a few admits), but rather as a starting point for discussion of what such a library should look like. There are a few main design decisions that I think are reasonable, but which are open for discussion, and also several questions where I'm not sure what the best thing to do is at this point, which are more relevant for discussion.

Design

A database consists of a main state, containing a list of state IDs, each of which points to a state containing a single "row" of the database.
Rows may have an arbitrary (user-specified) type, and corresponding local predicates specifying what is a valid row and how rows may be updated (not yet implementable), along with the usual later and knowable lemmas.
The database as a whole can take in a global predicate, which will be used to build the predicate for the main state, specifying some property over the list of rows pointed to by the main state. Similar to the later and knowable lemmas, this comes with a pair of properties, making it possible to prove the global predicate for empty databases and when updating a database.
The interface allows us to:
1. Initialize a database (as empty)
2. Add new rows to an existing database
3. Update an existing row in a database (selected by a unique identifier, currently just its state ID)
4. Find a row satisfying a predicate in a database (if one exists). Currently, this finds the first such row, but could be altered to give all such rows.
Open questions
1. Currently, the database-global predicate is very general. This is, broadly, a good thing, but may overcomplicate common cases. Most of the time, I believe that the core property we want is that some portions of each database row are unique and immutable, and therefore can reliably be used as identifiers for rows, so that if we extract (via find, say) two rows from the database that agree on a unique field, then those two rows are the same, up to updates (e.g., they agree on all immutable fields, and the older is related to the newer by the (transitive closure of?) the row update predicate. It is also useful to have immutable portions of rows, but this can be expressed purely locally, as part of the row update predicate.
2. Currently, of course, we do not have any conditions on how states may be updated, so the row update predicate is not implementable. One question that comes up with this predicate is what it should have access to --- of course, the old and new state, along with the principal (and potentially the state ID), as well as the current trace at time of update. Does it also need the trace at the time that the old state was set? This may simplify expressing some properties, but it is unclear if they are actually useful properties to express --- e.g., state 1 may be updated to state 2 only if there is an event ev with a particular form between when state 1 was set and when state 2 is set.
3. How should we best get information about the global database predicate out when retrieving rows? It feels like the current form of the predicate is perhaps too general to get useful information from. This relates to 1 --- the more restricted the global predicate is, and the more aware of its form the database library is, the more we can encapsulate getting information out of it, but it may also make specifying the parameters of the database more unwieldy.
4. In principle, we could store additional information in the main database, instead of just a list of state IDs --- for instance, we could use a different unique identifier for database rows, so that the state IDs used by the rows are not revealed to the user, but then this doesn't actually give any better guarantees than just using state IDs --- the state could be arbitrarily updated according to its predicates regardless.

src/lib/state/DY.Lib.State.Database.fst

qaphla · 2024-10-15T15:17:33Z

src/lib/state/DY.Lib.State.Database.fst

+  #row_t:Type0 -> {|db_types row_t|} ->
+  {|protocol_invariants|} -> db_pred:db_predicate row_t -> prop
+let has_db_invariants #row_t #db_t #invs db_pred =
+  let db_local_state : local_state db = local_state_db row_t in


I've tried to avoid needing this, but for some reason, can't get typeclass inference to work property to find an instance of local_state db, probably because db doesn't depend on the row type. Maybe the db_types need to be split into two parts, one that depends on the row type and one that doesn't?

qaphla · 2024-10-15T15:21:48Z

Another general question is if we want to include support for deletion in some form from databases. My impression is that in most cases, this causes additional problems without much benefit, because uniqueness becomes much less useful in the presence of deletions, but I would be curious to hear if someone has applications for it. Of course, we can "fake" deletion by having a "deleted" flag as part of a row, and always query only for rows that are not deleted, which may be sufficient for any such applications, without needing a change to the library itself.

TWal · 2024-10-15T16:34:49Z

I will have a deeper look later, but overall it seems that this could replace DY.Lib.State.Map as it is more general? The corruption model of the database is more fine-grained than DY.Lib.State.Map which seems good to have.

qaphla · 2024-10-15T17:44:37Z

I will have a deeper look later, but overall it seems that this could replace DY.Lib.State.Map as it is more general? The corruption model of the database is more fine-grained than DY.Lib.State.Map which seems good to have.

Yes, I believe that this is (will be, once it works) a strict generalization of DY.Lib.State.Map in its current form, particularly with regard to corruption, but also with more general search options. That said, it might be worth building a thin layer over it for the specific use case of a key-value map (or maybe a multi-key-value map, or similar), so that the user can, for instance, avoid needing to manually write down the global predicate that specifies that keys are unique, and other such simplifications.

…tar-extrinsic into qaphla/database_lib

Draft (extremely rough) of a database library

3d1a5bf

qaphla commented Oct 15, 2024

View reviewed changes

src/lib/state/DY.Lib.State.Database.fst Outdated Show resolved Hide resolved

qaphla commented Oct 15, 2024

View reviewed changes

src/lib/state/DY.Lib.State.Database.fst Outdated Show resolved Hide resolved

qaphla commented Oct 15, 2024

View reviewed changes

Merge branch 'main' into qaphla/database_lib

5d5bf41

Klaas Pruiksma added 3 commits October 16, 2024 09:36

Cleanup of a few comments

2e450ae

Merge branch 'qaphla/database_lib' of github.com:REPROSEC/dolev-yao-s…

b10c950

…tar-extrinsic into qaphla/database_lib

Small cleanup

53a172d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rough draft of a database library, for discussion #76

Rough draft of a database library, for discussion #76

qaphla commented Oct 15, 2024

qaphla Oct 15, 2024

qaphla commented Oct 15, 2024

TWal commented Oct 15, 2024

qaphla commented Oct 15, 2024

Rough draft of a database library, for discussion #76

Are you sure you want to change the base?

Rough draft of a database library, for discussion #76

Conversation

qaphla commented Oct 15, 2024

Design

Open questions

qaphla Oct 15, 2024

Choose a reason for hiding this comment

qaphla commented Oct 15, 2024

TWal commented Oct 15, 2024

qaphla commented Oct 15, 2024