Transient indices might lead to data corruption #373
Comments
Unfortunately my hoped-for workaround didn't help: these records are being corrupted even when they have no unique indices attached to them, and so far I do not see anything that distinguishes them. There are only 1000-2000 entities of this type created in the transaction. Other record types create more in some cases without trouble. The total is about 5000 entities in one of the larger transactions that I'm looking at. Any change I make to the application changes the set of specific impacted entities, but if I make no changes and re-run, the result set is identical, including errors.

Here is a "small" sample that I hope shows what's happening a bit better. I'll try to explain it.

This is the code I'm using to identify bad records. In my system, every entity should have a :type attribute. The script steps the program one step at a time, and after each step it runs get-weird-ids over the transaction reports collected so far.

(defn get-weird-ids [results]
(->> results
(map (fn [[{:keys [db-after]}]]
(mapv #(d/entity db-after %)
(distinct (map #(.e %) (d/datoms db-after :eavt))))))
(map-indexed #(vector %1 (->> %2
(filter (complement :type)) ;; every entity should have :type
(map d/touch))))))
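
The core of that check can be tried in isolation on a tiny database. This is only a minimal sketch, assuming DataScript is on the classpath; `entities-missing`, `sample-db` and the two sample entities are hypothetical names made up for illustration and are not part of the application described here.

```clojure
(require '[datascript.core :as d])

(defn entities-missing
  "Returns the ids of all entities in db that have no value for attr."
  [db attr]
  (->> (d/datoms db :eavt)
       (map :e)                  ; entity id of every datom
       distinct
       (map #(d/entity db %))
       (remove attr)             ; keep only entities where attr is nil
       (mapv :db/id)))

;; Hypothetical sample data: entity 1 has :type, entity 2 does not.
(def sample-db
  (d/db-with (d/empty-db)
             [{:db/id 1 :type :item :item/slot-num 10}
              {:db/id 2 :item/slot-num 59}]))

(entities-missing sample-db :type)
```

Running the same function against every saved `db-after` is what makes a past version stand out once it starts reporting ids it did not report before.
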
(map second
(reductions (fn [[results weird-ids] n]
;; events/step steps the system forward and produces a single large transaction.
(let [results (conj results (events/step conn :tx-report))]
[results (get-weird-ids results)]))
[(create-sample) (get-weird-ids results)]
(range 6)))

This is the result of the above script. You can see that for the first 8 cycles there is no problem, but after that, suddenly the data in several databases is changed/corrupted.

(([0 ()] [1 ()] [2 ()] [3 ()] [4 ()])
([0 ()] [1 ()] [2 ()] [3 ()] [4 ()] [5 ()])
([0 ()] [1 ()] [2 ()] [3 ()] [4 ()] [5 ()] [6 ()])
([0 ()] [1 ()] [2 ()] [3 ()] [4 ()] [5 ()] [6 ()] [7 ()])
([0 ()] [1 ()] [2 ()] [3 ()] [4 ()] [5 ()] [6 ()] [7 ()] [8 ()])
([0 ()] [1 ()] [2 (#:db{:id 16270})] [3 (#:db{:id 16270})] [4 (#:db{:id 16270})] [5 (#:db{:id 16270})] [6 (#:db{:id 16270})] [7 (#:db{:id 16270})] [8 (#:db{:id 16270})] [9 ()])
([0 ()] [1 ()] [2 (#:db{:id 16270})] [3 (#:db{:id 16270})] [4 (#:db{:id 16270})] [5 (#:db{:id 16270})] [6 (#:db{:id 16270})] [7 (#:db{:id 16270})] [8 (#:db{:id 16270})] [9 ()] [10 ()])
([0 ()]
[1 ()]
[2 (#:db{:id 16270})]
[3 (#:db{:id 16270})]
[4 (#:db{:id 16270})]
[5 (#:db{:id 16270})]
[6 (#:db{:id 16270})]
[7 (#:db{:id 16270})]
[8 (#:db{:id 16270})]
[9 ()]
[10 ({:item/link #:db{:id 7227}, :item/new-limit 3, :item/container #:db{:id 5220}, :item/slot-num 59, :item/type :two, :db/id 15448})]
[11 ()])

This final structure is data relevant to each eid found above. The structure is {eid {:v #{versions impacted}, :t {version-number [changed datoms in the tx that created that version]}}}. You can see that the entities are not modified in the transactions, and that the entities above should all have all of their properties. (edit: I originally included the wrong data here)

{16270
{:v #{7 4 6 3 2 5 8},
:t
{1
[#datascript/Datom [16270 :item/type :two 536871621 true]
#datascript/Datom [16270 :item/link 5133 536871621 true]
#datascript/Datom [16270 :item/new-limit 3 536871621 true]
#datascript/Datom [16270 :type :item 536871621 true]],
8 [#datascript/Datom [16270 :item/container 5209 536871628 true] #datascript/Datom [16270 :item/slot-num 10 536871628 true]],
9
[#datascript/Datom [16270 :type :item 536871629 false]
#datascript/Datom [16270 :item/link 5133 536871629 false]
#datascript/Datom [16270 :item/new-limit 3 536871629 false]
#datascript/Datom [16270 :item/container 5209 536871629 false]
#datascript/Datom [16270 :item/slot-num 10 536871629 false]
#datascript/Datom [16270 :item/type :two 536871629 false]]}},
15448
{:v #{10},
:t
{1
[#datascript/Datom [15448 :item/type :two 536871621 true]
#datascript/Datom [15448 :item/link 7227 536871621 true]
#datascript/Datom [15448 :item/new-limit 3 536871621 true]
#datascript/Datom [15448 :type :item 536871621 true]],
3 [#datascript/Datom [15448 :item/container 5220 536871623 true] #datascript/Datom [15448 :item/slot-num 59 536871623 true]],
11
[#datascript/Datom [15448 :type :item 536871631 false]
#datascript/Datom [15448 :item/link 7227 536871631 false]
#datascript/Datom [15448 :item/new-limit 3 536871631 false]
#datascript/Datom [15448 :item/container 5220 536871631 false]
#datascript/Datom [15448 :item/slot-num 59 536871631 false]
#datascript/Datom [15448 :item/type :two 536871631 false]]}}}

In the above example the affected records are deleted in the 11th version, but that is not always the trigger. In the example below, from a different run, none of the records are deleted...

{14978
{:v #{7},
:t
{1
[#datascript/Datom [14978 :item/type :two 536871621 true]
#datascript/Datom [14978 :item/link 8666 536871621 true]
#datascript/Datom [14978 :item/new-limit 3 536871621 true]
#datascript/Datom [14978 :type :item 536871621 true]]}},
16068
{:v #{6 5},
:t
{1
[#datascript/Datom [16068 :item/type :two 536871621 true]
#datascript/Datom [16068 :item/link 8187 536871621 true]
#datascript/Datom [16068 :item/new-limit 3 536871621 true]
#datascript/Datom [16068 :type :item 536871621 true]],
2 [#datascript/Datom [16068 :item/container 5220 536871622 true] #datascript/Datom [16068 :item/slot-num 5 536871622 true]]}},
16396
{:v #{7 6 5},
:t
{1
[#datascript/Datom [16396 :item/type :two 536871621 true]
#datascript/Datom [16396 :item/link 9201 536871621 true]
#datascript/Datom [16396 :item/new-limit 3 536871621 true]
#datascript/Datom [16396 :type :item 536871621 true]]}},
16388
{:v #{7 6 5},
:t
{1
[#datascript/Datom [16388 :item/type :two 536871621 true]
#datascript/Datom [16388 :item/link 7892 536871621 true]
#datascript/Datom [16388 :item/new-limit 3 536871621 true]
#datascript/Datom [16388 :type :item 536871621 true]]}},
16288
{:v #{7 6 5},
:t
{1
[#datascript/Datom [16288 :item/type :two 536871621 true]
#datascript/Datom [16288 :item/link 6751 536871621 true]
#datascript/Datom [16288 :item/new-limit 3 536871621 true]
#datascript/Datom [16288 :type :item 536871621 true]],
3 [#datascript/Datom [16288 :item/container 5249 536871623 true] #datascript/Datom [16288 :item/slot-num 108 536871623 true]]}},
16120
{:v #{5},
:t
{1
[#datascript/Datom [16120 :item/type :two 536871621 true]
#datascript/Datom [16120 :item/link 9296 536871621 true]
#datascript/Datom [16120 :item/new-limit 3 536871621 true]
#datascript/Datom [16120 :type :item 536871621 true]]}},
14404
{:v #{7},
:t
{1
[#datascript/Datom [14404 :item/type :two 536871621 true]
#datascript/Datom [14404 :item/link 7906 536871621 true]
#datascript/Datom [14404 :item/new-limit 3 536871621 true]
#datascript/Datom [14404 :type :item 536871621 true]],
3 [#datascript/Datom [14404 :item/container 5351 536871623 true] #datascript/Datom [14404 :item/slot-num 108 536871623 true]]}},
14974
{:v #{7},
:t
{1
[#datascript/Datom [14974 :item/type :two 536871621 true]
#datascript/Datom [14974 :item/link 4831 536871621 true]
#datascript/Datom [14974 :item/new-limit 3 536871621 true]
#datascript/Datom [14974 :type :item 536871621 true]],
2 [#datascript/Datom [14974 :item/container 5218 536871622 true] #datascript/Datom [14974 :item/slot-num 15 536871622 true]]}},
16106
{:v #{6 5},
:t
{1
[#datascript/Datom [16106 :item/type :two 536871621 true]
#datascript/Datom [16106 :item/link 6952 536871621 true]
#datascript/Datom [16106 :item/new-limit 3 536871621 true]
#datascript/Datom [16106 :type :item 536871621 true]],
2 [#datascript/Datom [16106 :item/container 5226 536871622 true] #datascript/Datom [16106 :item/slot-num 8 536871622 true]]}},
16270
{:v #{4 3 2},
:t
{1
[#datascript/Datom [16270 :item/type :two 536871621 true]
#datascript/Datom [16270 :item/link 5133 536871621 true]
#datascript/Datom [16270 :item/new-limit 3 536871621 true]
#datascript/Datom [16270 :type :item 536871621 true]]}},
16394
{:v #{7 6 5},
:t
{1
[#datascript/Datom [16394 :item/type :two 536871621 true]
#datascript/Datom [16394 :item/link 8001 536871621 true]
#datascript/Datom [16394 :item/new-limit 3 536871621 true]
#datascript/Datom [16394 :type :item 536871621 true]]}}}

;; These are the only relevant pieces of the schema for the affected records. The :type property is not in the schema.
(d/create-conn {:item/container {:db/valueType :db.type/ref
                                 :db/cardinality :db.cardinality/one}
                :item/link {:db/valueType :db.type/ref
                            :db/cardinality :db.cardinality/one}})
Can you try either this jar or this patch (both are inside the archive), whichever is easier?
I just tested it and the patch does fix the problem.
OK, thank you. For now I pushed 1.0.3, which disables transient indices. Later I’ll take a more detailed look into it. Please leave this issue open.
Thank you very much!
@pangloss can you provide a reproducible example?
Hi @darkleaf,

My previous quick attempt to make a very simple standalone repro to share didn't work. There's no way I could share any part of the actual database that I encountered the problems in, unfortunately, but I'll try to describe the conditions it appears in. It'll be a while before I have a chance to work on this since I've got a pretty big backlog at the moment.

In the application schema I have a few properties marked The other is an index to future events The future-timestamp also has a bunch of entities related to it; most of those get deleted in the same transaction in which the future-timestamp is deleted, except the ones that were created in the beginning.

In addition to the future-timestamps and related entities, every transaction has a handful of regular entities changed, added or removed. Most transactions have 50-100 datoms in the resulting tx-data, but sometimes it can be 1000-2000. The data is highly connected, so every change includes changes to attributes marked When I run it, I save all of the resulting transaction reports and eventually run analysis over them in sequential order. Most runs have 100-500 tx-results.

Hopefully that helps...

Cheers,
Thanks, but it's still hard for me to understand. I have a guess. You have an instance of DB, you change it in several ways several times, and then you run a query against the initial instance and its state has changed, which is wrong because DB is an immutable value. Am I right?
Yes, data changes and becomes invalid in past versions of the database even though the latest version always stays correct.
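
One way to turn that description into a check is to copy the contents of every `db-after` at the moment it is produced and compare the copies against the same db values later. The following is a minimal sketch under stated assumptions, not the reporter's code: `snapshot`, `record-step`, `mutated-versions` and the sample transactions are hypothetical, and this small example is not known to trigger the bug.

```clojure
(require '[datascript.core :as d])

(defn snapshot
  "Eagerly copies the datoms of db into plain vectors of [e a v tx]."
  [db]
  (mapv (juxt :e :a :v :tx) (d/datoms db :eavt)))

(defn record-step
  "Applies tx-data to conn and appends db-after plus its snapshot to history."
  [conn history tx-data]
  (let [{:keys [db-after]} (d/transact! conn tx-data)]
    (conj history {:db db-after :snap (snapshot db-after)})))

(defn mutated-versions
  "Returns the indices of recorded versions whose current contents
   differ from the snapshot taken when they were created."
  [history]
  (vec (keep-indexed (fn [i {:keys [db snap]}]
                       (when (not= snap (snapshot db)) i))
                     history)))

(def conn (d/create-conn {:item/link {:db/valueType :db.type/ref}}))

;; Five small transactions, each adding one new entity.
(def history
  (reduce (fn [h n]
            (record-step conn h [{:db/id -1 :type :item :item/new-limit n}]))
          []
          (range 5)))

(mutated-versions history)
```

Because a db is supposed to be an immutable value, any index returned by `mutated-versions` points at a past version that changed after the fact. On a build where past versions really are immutable the result should be empty.
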
@darkleaf the issue was probably in transient indices combined with restarting the transaction after tempid resolution (db/retry-with-tempid). Note: I disabled transient indices in the latest version.
Possibly related to @tonsky's comment, I do have some tx functions that are explicitly scheduled at the end of the transaction so that they can query the partially committed db to resolve possible conflicts.
@tonsky would this issue also affect the ClojureScript Datascript implementation? Over at RoamResearch we were looking into it and didn't seem to find many of these blocks that had lost attributes. We also dug a bit into asTransient and it looks like it's not doing anything in the CLJS impl, but is doing something in the CLJ impl. Didn't want to assume though, so thought we'd ask.
@filipesilva They are not implemented in CLJS, no need to worry. The JVM version is now fine too: I turned them off and will fix them later.
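
For readers unfamiliar with transients, the contract they are expected to honour can be shown with clojure.core's own vectors. This is only an illustration of the general idea using core collections; it is not DataScript's index implementation, and it does not reproduce the bug discussed here.

```clojure
;; A persistent vector that other code may still be holding on to.
(def original [1 2 3])

;; `transient` gives a mutable working copy, `conj!` mutates that copy in
;; place, and `persistent!` freezes it into a new persistent value.
(def extended (persistent! (conj! (transient original) 4)))

original ; still [1 2 3]
extended ; [1 2 3 4]
```

The symptom reported in this issue is what a violation of that contract looks like from the outside: a value that was handed out earlier changes after later in-place updates.
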
Hi, I'm encountering an issue where data is disappearing or reappearing in historical versions of my db when I do subsequent transactions against the current version. It only happens with entities which have a {:db/unique :db.unique/identity} property.

It seems to be a data structure bug in datascript.
I can run a few transactions, query a specific instance of the db history, then do additional transactions and when I query the same instance from the history, the data will have changed.
I'm running Clojure 1.10.1 on Java 11 and have confirmed it with both v1.0.1 and v1.0.2.
I'm hoping you'll have an idea what may be causing this!
Although I can reliably reproduce it in the context of my application (it happens every time), I so far cannot figure out exactly how to reproduce it in isolation, so can't provide you a repro right now, unfortunately.
Suggestions as to how to approach reproducing the bug would be welcome. Meanwhile I think a workaround will be to isolate indexed properties into independent entities that just point to the actual data...
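
A rough sketch of what such a workaround could look like, assuming DataScript is on the classpath. The attribute names `:index/key` and `:index/target`, the `lookup` helper and the sample data are all hypothetical; the idea is only that the unique attribute lives on a small separate entity that refers to the entity holding the actual data.

```clojure
(require '[datascript.core :as d])

;; Hypothetical schema: only the index entity carries a unique attribute.
(def schema
  {:index/key    {:db/unique :db.unique/identity}
   :index/target {:db/valueType :db.type/ref}})

(def db
  (d/db-with (d/empty-db schema)
             [;; the actual data, with no unique attributes on it
              {:db/id -1 :type :item :item/new-limit 3}
              ;; the index entity that points at the data entity
              {:db/id -2 :index/key "item-a" :index/target -1}]))

(defn lookup
  "Finds the data entity for key k via the index entity, or nil."
  [db k]
  (:index/target (d/entity db [:index/key k])))

(:type (lookup db "item-a"))
```

Lookups then take one extra hop through the index entity, while the data entities themselves carry no unique attributes.
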