Commit
fixup! feat: schema migration
TimoKramer committed Mar 10, 2023
1 parent 1636a3a commit ddadd46
Showing 3 changed files with 85 additions and 30 deletions.
63 changes: 48 additions & 15 deletions doc/schema-migration.md
@@ -1,23 +1,56 @@
# Schema Migration

Schema migration with Datahike is the evolution of your current schema into a future schema.
We call these schema migrations 'norms' to avoid confusion with migrating from an older
version of Datahike to a newer one.

## Why use the schema-migration tool?
You could use the `transact` function of the api namespace to apply your schema, but with our
`norm` namespace you can define your migrations centrally, and they will be applied once and
only once to your database.

This helps when your production database has limited accessibility from your developer
machines and you want to apply the migrations from a server next to your production code.
If you are setting up your database from scratch, e.g. for development purposes, you can
rely on your schema being up to date with your production environment because you keep
your original schema along with your migrations in a central repository.

## How to migrate a database schema
Changes to your schema should always add new definitions and never change existing
definitions. If you want to change existing data to a new format, you will have to create a
new schema and transact your existing data again, transformed. A good intro to this topic
[can be found here](https://docs.datomic.com/cloud/schema/schema-change.html).

## Transaction-functions
Your transaction functions need to be on your classpath to be called, and they need to take
one argument, the connection to your database. Each function needs to return a vector of
transactions so that they can be applied during migration.

Please be aware that transaction functions create transactions that need to be held in
memory. Very large migrations might exceed your available memory.
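As a sketch, a transaction function could look like the following. The namespace and function name are taken from the example later in this document; the query and attribute are illustrative only and assume a `:character/place-of-occupation` string attribute already exists:

```clojure
(ns my-transactions.my-project
  (:require [clojure.string :as string]
            [datahike.api :as d]))

;; A tx-fn is called with the connection and must return a vector of
;; transactions, which the migration tool then transacts.
(defn my-first-tx-fn [conn]
  (->> (d/q '[:find ?e ?v
              :where [?e :character/place-of-occupation ?v]]
            (d/db conn))
       (mapv (fn [[e v]]
               [:db/add e :character/place-of-occupation (string/lower-case v)]))))
```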

## Norms?
Like [conformity for Datomic](https://github.com/avescodes/conformity) we are using the term
norm for our tool. You can use it to declare expectations about the state of your database
and enforce those idempotently without repeatedly transacting schema. These expectations
can be the shape of your schema, data in a certain format, or pre-transacted data, e.g. for
a development database.

## Migration folder
Preferably create a folder in your project resources called `migrations`. You can, however,
use any folder you like, even outside your resources. If you don't want to package the
migrations into a jar, you can pass the folder path as a string to the migration functions
instead. Your migration files live in this folder. Be aware that your chosen migration
folder is read recursively, including all subfolders. Don't store other files in your
migration folder besides your migrations!
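A typical layout could look like this (the second file name is illustrative; `checksums.edn` is generated by the tool as described below):

```
resources/
└── migrations/
    ├── 001-my-first-norm.edn
    ├── 002-add-character-schema.edn
    └── checksums.edn
```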

## How to migrate

1. Create a folder of your choice, for now let's call it `migrations`. In this folder you
create a new file with an `.edn` extension, like `001-my-first-norm.edn`. Preferably you name
the file beginning with a number. Please be aware that the name of your file will be the id of
your migration. Taking into account that you might create more migrations in the future,
you should left-pad the names with zeros to keep a proper sorting. Keep in mind that your
migrations are transacted sorted by your chosen ids, one after another. Spaces will be
replaced with dashes to compose the id.

2. Write the transactions themselves into your newly created file. The content of the file
needs to be an edn map with one or both of the keys `:tx-data` and `:tx-fn`. `:tx-data` is just
@@ -36,17 +69,17 @@ Example of a migration:
:tx-fn my-transactions.my-project/my-first-tx-fn}
```
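For illustration, a complete migration file combining both keys might look like this; the attribute definition is a hypothetical example, and the `:tx-fn` symbol must resolve to a function on your classpath:

```edn
{:tx-data [{:db/ident       :character/place-of-occupation
            :db/valueType   :db.type/string
            :db/cardinality :db.cardinality/one}]
 :tx-fn   my-transactions.my-project/my-first-tx-fn}
```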

3. When you are sufficiently confident that your migrations will work, you usually want to
store them in some kind of version control system. To avoid conflicts with colleagues we
implemented a safety net. Run the function `update-checksums!` from the `datahike.norm.norm`
namespace to create or update a `checksums.edn` file inside your migrations-folder. This file
contains the names and checksums of your migration-files. In case a colleague of yours
checked in a migration that you were not aware of, your VCS will flag the conflicting
`checksums.edn` file instead of silently merging it.
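Assuming `update-checksums!` accepts the migrations folder path (the exact signature may differ; check the `datahike.norm.norm` namespace), a REPL session could look like:

```clojure
(require '[datahike.norm.norm :as norm])

;; Creates or refreshes checksums.edn inside the migrations folder.
(norm/update-checksums! "resources/migrations")
```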

4. To apply your migrations you most likely want to package them into a jar together with
Datahike and a piece of code that actually runs your migrations, and run it on a server.
You should check the correctness of the checksums with `datahike.norm.norm/verify-checksums`
and finally run the `datahike.norm.norm/ensure-norms!` function to apply your migrations.
For each applied migration a `:tx/norm` attribute is stored with the id of your migration
so it will not be applied twice.
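Putting it together, a migration runner on your server could do something like the following. The store configuration and paths are placeholders; `verify-checksums` and `ensure-norms!` accept a `java.io.File` (or a resource URL), as the test suite in this commit shows:

```clojure
(require '[clojure.java.io :as io]
         '[datahike.api :as d]
         '[datahike.norm.norm :as norm])

;; Placeholder config; use your production store configuration here.
(def conn (d/connect {:store {:backend :file :path "/var/lib/datahike"}}))

;; Abort if the migration files don't match checksums.edn.
(norm/verify-checksums (io/file "resources/migrations"))

;; Transact every migration that hasn't been applied yet.
(norm/ensure-norms! conn (io/file "resources/migrations"))
```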
44 changes: 30 additions & 14 deletions src/datahike/norm/norm.clj
@@ -128,12 +128,6 @@
hp/hash->str)))
{})))

(defn- verify-checksums [checksums checksums-edn file-or-resource]
(let [edn-content (-> (read-edn-file checksums-edn file-or-resource) first)
diff (data/diff checksums edn-content)]
(when-not (every? nil? (butlast diff))
(dt/raise "Deviation of the checksums found. Migration aborted." {:diff diff}))))

(s/def ::tx-data vector?)
(s/def ::tx-fn symbol?)
(s/def ::norm-map (s/keys :opt-un [::tx-data ::tx-fn]))
@@ -162,27 +156,49 @@
((var-get (requiring-resolve tx-fn)) conn)))})
(log/info "Done"))))))

(defn- diff-checksums [checksums edn-content]
(let [diff (data/diff checksums edn-content)]
(when-not (every? nil? (butlast diff))
(dt/raise "Deviation of the checksums found. Migration aborted." {:diff diff}))))

(defmulti verify-checksums
(fn [file-or-resource] (type file-or-resource)))

(defmethod verify-checksums File [file]
(let [norm-list (-> (retrieve-file-list file)
filter-file-list
(read-norm-files file))
edn-content (-> (io/file (io/file file) checksums-file)
(read-edn-file file)
first)]
(diff-checksums (compute-checksums norm-list)
edn-content)))

(defmethod verify-checksums URL [resource]
(let [file-list (retrieve-file-list resource)
norm-list (-> (filter-file-list file-list)
(read-norm-files resource))
edn-content (-> (->> file-list
(filter #(-> (.getName %) (string/ends-with? checksums-file)))
first)
(read-edn-file resource)
first)]
(diff-checksums (compute-checksums norm-list)
edn-content)))

(defmulti ^:private ensure-norms
(fn [_conn file-or-resource] (type file-or-resource)))

(defmethod ^:private ensure-norms File [conn file]
(let [norm-list (-> (retrieve-file-list file)
filter-file-list
(read-norm-files file))]
(verify-checksums (compute-checksums norm-list)
(io/file (io/file file) checksums-file)
file)
(transact-norms conn norm-list)))

(defmethod ^:private ensure-norms URL [conn resource]
(let [file-list (retrieve-file-list resource)
norm-list (-> (filter-file-list file-list)
(read-norm-files resource))]
(verify-checksums (compute-checksums norm-list)
(->> file-list
(filter #(-> (.getName %) (string/ends-with? checksums-file)))
first)
resource)
(transact-norms conn norm-list)))

(defn ensure-norms!
8 changes: 7 additions & 1 deletion test/datahike/norm/norm_test.clj
@@ -3,13 +3,14 @@
[clojure.string :as string]
[clojure.java.io :as io]
[datahike.api :as d]
[datahike.norm.norm :as sut :refer [verify-checksums]]
[datahike.test.utils :as tu]))

(def ensure-norms #'sut/ensure-norms)

(deftest simple-test
(let [conn (tu/setup-db {} true)
_ (verify-checksums (io/file "test/datahike/norm/resources/simple-test"))
_ (ensure-norms conn (io/file "test/datahike/norm/resources/simple-test"))
schema (d/schema (d/db conn))]
(is (= #:db{:valueType :db.type/string, :cardinality :db.cardinality/one, :doc "Place of occupation", :ident :character/place-of-occupation}
@@ -33,9 +34,11 @@

(deftest tx-fn-test
(let [conn (tu/setup-db {} true)
_ (verify-checksums (io/file "test/datahike/norm/resources/tx-fn-test/first"))
_ (ensure-norms conn (io/file "test/datahike/norm/resources/tx-fn-test/first"))
_ (d/transact conn {:tx-data [{:character/place-of-occupation "SPRINGFIELD ELEMENTARY SCHOOL"}
{:character/place-of-occupation "SPRINGFIELD NUCLEAR POWER PLANT"}]})
_ (verify-checksums (io/file "test/datahike/norm/resources/tx-fn-test/second"))
_ (ensure-norms conn (io/file "test/datahike/norm/resources/tx-fn-test/second"))]
(is (= #{["springfield elementary school"] ["springfield nuclear power plant"]}
(d/q '[:find ?v
@@ -60,12 +63,14 @@

(deftest tx-data-and-tx-fn-test
(let [conn (tu/setup-db {} true)
_ (verify-checksums (io/file "test/datahike/norm/resources/tx-data-and-tx-fn-test/first"))
_ (ensure-norms conn (io/file "test/datahike/norm/resources/tx-data-and-tx-fn-test/first"))
_ (d/transact conn {:tx-data [{:character/name "Homer Simpson"}
{:character/name "Marge Simpson"}
{:character/name "Bart Simpson"}
{:character/name "Lisa Simpson"}
{:character/name "Maggie Simpson"}]})
_ (verify-checksums (io/file "test/datahike/norm/resources/tx-data-and-tx-fn-test/second"))
_ (ensure-norms conn (io/file "test/datahike/norm/resources/tx-data-and-tx-fn-test/second"))
margehomer (d/q '[:find [?e ...]
:where
@@ -100,6 +105,7 @@

(deftest naming-and-sorting-test
(let [conn (tu/setup-db {} true)
_ (verify-checksums (io/file "test/datahike/norm/resources/naming-and-sorting-test"))
_ (sut/ensure-norms! conn (io/file "test/datahike/norm/resources/naming-and-sorting-test"))
lisabart (d/q '[:find [?e ...]
:where
