From b0264e61ecfded72c6fdc71f0b558e574776e224 Mon Sep 17 00:00:00 2001
From: Steve Brudenell
Date: Tue, 3 Sep 2024 07:13:43 -0800
Subject: [PATCH] document object storage scheme

---
 README.md | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/README.md b/README.md
index de0ee55..f7acc04 100644
--- a/README.md
+++ b/README.md
@@ -319,6 +319,67 @@ naming pattern cannot be configured.
 For cloud backups, `btrfs2s3` encodes metadata about the backup in the filename. This is
 so all metadata can be parsed from the result of one `ListObjectsV2` call.
 
+# Object storage scheme
+
+The content of each backup object is simply the output of `btrfs send [-p]` (plus
+`pipe_through`).
+
+[**Upcoming change**](https://github.com/sbrudenell/btrfs2s3/issues/54): To support
+backups larger than the provider's maximum object size, we will consider a backup to be
+split across multiple objects. The full backup will be the result of concatenating the
+splits.
+
+We use the *file name* (aka object key) to store metadata about each backup. We do this
+by appending specialized suffixes to a base name.
+
+The current metadata scheme looks like this (whitespace and line continuations added for
+clarity):
+
+```
+<base_name> \  # user-chosen base name, ignored
+  .ctim<ctime> \  # ctime of the snapshot
+  .ctid<ctransid> \  # ctransid of the snapshot
+  .uuid<uuid> \  # uuid of the snapshot
+  .sndp<send_parent_uuid> \  # uuid of the incremental parent
+  .prnt<parent_uuid> \  # uuid of the source subvol
+  .mdvn<metadata_version> \  # currently always 1
+  .seqn<sequence_number>  # currently must be 0
+```
+
+Metadata suffixes may appear in any order. Unrecognized suffixes are ignored, so
+suffixes like `.gz` may be added as desired. Metadata suffixes are designed such that
+the values never contain a period, and such that they are unlikely to collide with any
+user-chosen base names or suffixes.
+
+For `ctime`, we use an ISO 8601 timestamp including timezone. The intent is to make it
+easier to manually browse backups by filename if necessary.
+
+For full backups, `send_parent_uuid` is the zero UUID.
+
+`ctime`, `ctransid` and the `uuid`s are properties of the btrfs subvolume, generated by
+kernel code. `btrfs2s3` does not generate them.
+
+Note that while metadata names are *typically* shorter than the common Linux filename
+limit of 255 bytes, this is *not* currently a design goal. Our only goal is that names
+be shorter than S3's limit of 1024 bytes.
+
+An example list of names describing a backup tree might look like this:
+
+```
+my_subvol.ctim2006-01-01T00:00:00+00:00.ctid12345.uuid3fd11d8e-8110-4cd0-b85c-bae3dda86a3d.sndp00000000-0000-0000-0000-000000000000.prnt9d9d3bcb-4b62-46a3-b6e2-678eeb24f54e.mdvn1.seqn0.gz
+my_subvol.ctim2006-01-02T00:00:00+00:00.ctid12350.uuid721df607-3296-4f38-970e-630be8f36598.sndp3fd11d8e-8110-4cd0-b85c-bae3dda86a3d.prnt9d9d3bcb-4b62-46a3-b6e2-678eeb24f54e.mdvn1.seqn0.gz
+my_subvol.ctim2006-01-03T00:00:00+00:00.ctid12360.uuid5e8bb815-f8ce-43c5-95e0-08ace3c21459.sndp3fd11d8e-8110-4cd0-b85c-bae3dda86a3d.prnt9d9d3bcb-4b62-46a3-b6e2-678eeb24f54e.mdvn1.seqn0.gz
+```
+
+In this example:
+
+- There is one full backup on 2006-01-01
+- The other backups on 2006-01-02 and 2006-01-03 are incremental backups, because their
+  send-parent UUID is the UUID of the full backup
+- The parent UUID of each is `9d9d3bcb-4b62-46a3-b6e2-678eeb24f54e`. This is the UUID of
+  the original mutable subvolume
+- The base name `my_subvol` and suffix `.gz` are ignored by `btrfs2s3`
+
 # Cloud storage costs
 
 Cloud storage providers will charge a *storage cost*, which is a fixed amount per byte