document object storage scheme
sbrudenell committed Sep 3, 2024
1 parent 4973d42 commit b0264e6
Showing 1 changed file with 61 additions and 0 deletions.
61 changes: 61 additions & 0 deletions README.md
@@ -319,6 +319,67 @@ naming pattern cannot be configured.
For cloud backups, `btrfs2s3` encodes metadata about the backup in the filename. This is
so all metadata can be parsed from the result of one `ListObjectsV2` call.

# Object storage scheme

The content of each backup object is simply the output of `btrfs send [-p]` (plus
`pipe_through`).

[**Upcoming change**](https://github.com/sbrudenell/btrfs2s3/issues/54): To support
backups larger than the provider's maximum object size, we will consider a backup to be
split across multiple objects. The full backup will be the result of concatenating the
splits.

We use the *file name* (aka object key) to store metadata about each backup. We do this
by appending specialized suffixes to a base name.

The current metadata scheme looks like this (whitespace and line continuations added for
clarity):

```
<base_name> \ # user-chosen base name, ignored
.ctim<ctime> \ # ctime of the snapshot
.ctid<ctransid> \ # ctransid of the snapshot
.uuid<uuid> \ # uuid of the snapshot
.sndp<send_parent_uuid> \ # uuid of the incremental parent
.prnt<parent_uuid> \ # uuid of the source subvol
.mdvn<metadata_version> \ # currently always 1
.seqn<sequence_number> # currently must be 0
```

Metadata suffixes may appear in any order. Unrecognized suffixes are ignored, so
suffixes like `.gz` may be added as desired. Metadata suffixes are designed such that
the values never contain a period, and such that they are unlikely to collide with any
user-chosen base names or suffixes.
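The rules above (any order, unknown suffixes ignored, no periods inside values) suggest a simple split-on-period parser. The following is a minimal sketch, not the actual `btrfs2s3` implementation; `KNOWN_TAGS` and `parse_metadata` are hypothetical names chosen for illustration.

```python
# Hypothetical sketch: recover raw metadata values from an object key.
# Four-letter tags taken from the scheme above; anything else is ignored.
KNOWN_TAGS = ("ctim", "ctid", "uuid", "sndp", "prnt", "mdvn", "seqn")


def parse_metadata(name: str) -> dict[str, str]:
    """Return {tag: raw value} for each recognized suffix in an object key."""
    found = {}
    # The first dot-separated piece is the user-chosen base name, so skip it.
    # Values never contain a period, so splitting on "." is safe for them.
    for part in name.split(".")[1:]:
        tag, value = part[:4], part[4:]
        if tag in KNOWN_TAGS and value:
            found[tag] = value
    return found


example = (
    "my_subvol.gz.seqn0.mdvn1"
    ".uuid3fd11d8e-8110-4cd0-b85c-bae3dda86a3d"
    ".ctim2006-01-01T00:00:00+00:00"
)
print(parse_metadata(example))
```

Because the sketch looks only at the four-character tag of each piece, reordering the suffixes or adding `.gz` anywhere after the base name does not change the result.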

For `ctime`, we use an ISO 8601 timestamp including timezone. The intent is to make it
easier to manually browse backups by filename if necessary.
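As an illustration of the timestamp format, the standard library can round-trip such a value. This is a sketch under the assumption that timestamps are written with whole-second precision (a fractional second would introduce a period, which the suffix scheme avoids); how `btrfs2s3` itself formats the value is not specified here.

```python
from datetime import datetime, timezone

# Parse a ctime value as it would appear after the ".ctim" tag.
parsed = datetime.fromisoformat("2006-01-01T00:00:00+00:00")

# Format a timezone-aware datetime back into the same shape.
# timespec="seconds" is an assumption: it keeps the value free of periods.
formatted = datetime(2006, 1, 1, tzinfo=timezone.utc).isoformat(timespec="seconds")

print(parsed.tzinfo is not None, formatted)
```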

For full backups, `send_parent_uuid` is the zero UUID.
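A reader of these names could tell full from incremental backups by comparing the `sndp` value against the zero UUID. A minimal sketch follows; `is_full_backup` is a hypothetical helper, not part of `btrfs2s3`.

```python
import uuid

# The zero (nil) UUID: all 128 bits are zero.
ZERO_UUID = uuid.UUID(int=0)


def is_full_backup(send_parent_uuid: str) -> bool:
    """True when the send-parent value marks a full (non-incremental) backup."""
    return uuid.UUID(send_parent_uuid) == ZERO_UUID


print(str(ZERO_UUID))
print(is_full_backup("00000000-0000-0000-0000-000000000000"))
print(is_full_backup("3fd11d8e-8110-4cd0-b85c-bae3dda86a3d"))
```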

`ctime`, `ctransid` and the `uuid`s are properties of the btrfs subvolume, generated by
kernel code. `btrfs2s3` does not generate them.

Note that while names carrying this metadata are *typically* shorter than the common
Linux filename limit of 255 bytes, this is *not* currently a design goal. Our only goal
is that names fit within S3's limit of 1024 bytes.
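Both limits are measured in bytes, not characters, so a non-ASCII base name counts for more than its visible length. The check below is a sketch with hypothetical helper names; it treats a key of exactly 1024 bytes as acceptable, which matches S3's documented maximum key length.

```python
S3_MAX_KEY_BYTES = 1024  # S3 object key limit, in UTF-8 bytes
LINUX_MAX_NAME_BYTES = 255  # common Linux filename limit (not a design goal)


def key_size(key: str) -> int:
    """Size of an object key in UTF-8 bytes."""
    return len(key.encode("utf-8"))


def fits_s3_key_limit(key: str) -> bool:
    return key_size(key) <= S3_MAX_KEY_BYTES


name = (
    "my_subvol.ctim2006-01-01T00:00:00+00:00.ctid12345"
    ".uuid3fd11d8e-8110-4cd0-b85c-bae3dda86a3d"
    ".sndp00000000-0000-0000-0000-000000000000"
    ".prnt9d9d3bcb-4b62-46a3-b6e2-678eeb24f54e.mdvn1.seqn0.gz"
)
print(key_size(name), fits_s3_key_limit(name))
```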

An example list of names describing a backup tree might look like this:

```
my_subvol.ctim2006-01-01T00:00:00+00:00.ctid12345.uuid3fd11d8e-8110-4cd0-b85c-bae3dda86a3d.sndp00000000-0000-0000-0000-000000000000.prnt9d9d3bcb-4b62-46a3-b6e2-678eeb24f54e.mdvn1.seqn0.gz
my_subvol.ctim2006-01-02T00:00:00+00:00.ctid12350.uuid721df607-3296-4f38-970e-630be8f36598.sndp3fd11d8e-8110-4cd0-b85c-bae3dda86a3d.prnt9d9d3bcb-4b62-46a3-b6e2-678eeb24f54e.mdvn1.seqn0.gz
my_subvol.ctim2006-01-03T00:00:00+00:00.ctid12360.uuid5e8bb815-f8ce-43c5-95e0-08ace3c21459.sndp3fd11d8e-8110-4cd0-b85c-bae3dda86a3d.prnt9d9d3bcb-4b62-46a3-b6e2-678eeb24f54e.mdvn1.seqn0.gz
```

In this example:

- There is one full backup on 2006-01-01
- The other backups on 2006-01-02 and 2006-01-03 are incremental backups, because their
send-parent UUID is the UUID of the full backup
- The parent UUID of each is `9d9d3bcb-4b62-46a3-b6e2-678eeb24f54e`. This is the UUID of
the original mutable subvolume
- The base name `my_subvol` and suffix `.gz` are ignored by `btrfs2s3`
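The reading of the example above can be reproduced mechanically. The sketch below classifies each of the three names as full or incremental and resolves the send-parent to the date of the backup it refers to. `suffix_value` and `describe` are hypothetical helpers written for this illustration, not `btrfs2s3` functions.

```python
ZERO_UUID = "00000000-0000-0000-0000-000000000000"
SOURCE = ".prnt9d9d3bcb-4b62-46a3-b6e2-678eeb24f54e.mdvn1.seqn0.gz"

# The three example names from above, split across lines for readability.
NAMES = [
    "my_subvol.ctim2006-01-01T00:00:00+00:00.ctid12345"
    ".uuid3fd11d8e-8110-4cd0-b85c-bae3dda86a3d"
    ".sndp00000000-0000-0000-0000-000000000000" + SOURCE,
    "my_subvol.ctim2006-01-02T00:00:00+00:00.ctid12350"
    ".uuid721df607-3296-4f38-970e-630be8f36598"
    ".sndp3fd11d8e-8110-4cd0-b85c-bae3dda86a3d" + SOURCE,
    "my_subvol.ctim2006-01-03T00:00:00+00:00.ctid12360"
    ".uuid5e8bb815-f8ce-43c5-95e0-08ace3c21459"
    ".sndp3fd11d8e-8110-4cd0-b85c-bae3dda86a3d" + SOURCE,
]


def suffix_value(name: str, tag: str) -> str:
    """Return the value of the first suffix that starts with the given tag."""
    for part in name.split(".")[1:]:  # skip the user-chosen base name
        if part.startswith(tag):
            return part[len(tag):]
    raise KeyError(tag)


def describe(names):
    """Return (date, kind, date of send-parent or None) for each name."""
    by_uuid = {suffix_value(n, "uuid"): n for n in names}
    rows = []
    for n in names:
        day = suffix_value(n, "ctim")[:10]
        send_parent = suffix_value(n, "sndp")
        if send_parent == ZERO_UUID:
            rows.append((day, "full", None))
        else:
            parent_name = by_uuid.get(send_parent)
            parent_day = suffix_value(parent_name, "ctim")[:10] if parent_name else None
            rows.append((day, "incremental", parent_day))
    return rows


for row in describe(NAMES):
    print(row)
```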

# Cloud storage costs

Cloud storage providers will charge a *storage cost*, which is a fixed amount per byte