reintroduce find subcommand and unify rev spec #62

Merged
merged 2 commits into from
Feb 28, 2024
2 changes: 0 additions & 2 deletions .github/workflows/python-app.yml
Expand Up @@ -38,8 +38,6 @@ jobs:
- name: Install Nix
uses: DeterminateSystems/nix-installer-action@main
- uses: DeterminateSystems/magic-nix-cache-action@main
- name: Check Nixpkgs inputs
uses: DeterminateSystems/flake-checker-action@main
with:
fail-mode: true

3 changes: 3 additions & 0 deletions changelog.d/20231207_235814_jb_reintroduce_find.rst
@@ -0,0 +1,3 @@
.. A new scriv changelog fragment.

- reintroduce `find` subcommand
3 changes: 3 additions & 0 deletions changelog.d/20231208_201510_jb_reintroduce_find.rst
@@ -0,0 +1,3 @@
.. A new scriv changelog fragment.

- Unify and extend revision spec syntax
46 changes: 39 additions & 7 deletions doc/man-backy.rst
Expand Up @@ -141,16 +141,48 @@ Subcommand-specific options
Valid for **scheduler** and **check** subcommands.

**-r** *REVISION*
Selects a revision other than the last revision.
Selects one or more revisions other than the default.

Revisions can be specified in the following ways:
A single revision can be specified in the following ways:

* A full revision ID as printed with **backy status**. ID prefixes are OK as
long as they are unique.
* A full revision ID as printed with **backy status**.
* A relative revision count: 0 is the last revision, 1 the one before, ...
* The key word **last** or **latest** as alias for the last revision.
* A revision tag. If several revisions with the given tag exist, the newest
one will be given.
* The key word **last** or **latest** is an alias for the last revision.
* The key word **first** is an alias for the first revision.
* The function **first** followed by a revision specifier in parentheses.
This returns the first value in the list, not the earliest by date.
* The function **last** followed by a revision specifier in parentheses.
This returns the last value in the list, not the latest by date.

Multiple revisions can be specified in the following ways:

* A multi revision specifier enclosed in parentheses.
* The function **not** followed by a revision specifier in parentheses.
This returns every revision which is not in the list.
Ordered by date, oldest first.
* The function **reverse** followed by a revision specifier in parentheses.
This returns the list in reversed order.
* The key word **all** is an alias for all revisions.
Ordered by date, oldest first.
* The key word **clean** is an alias for all clean/completed revisions.
Ordered by date, oldest first.
* A Trust state with the **trust:** prefix: Selects all revisions with this
Trust state. Ordered by date, oldest first.
* A tag with the **tag:** prefix. Selects all revisions with this tag.
Ordered by date, oldest first.
* An inclusive range using two single revision specifiers separated with two
dots. The single revision specifiers may be omitted, in which case the
**first** and/or **last** revision is assumed.
In addition to the single revision specifiers, ISO dates are also
supported (YYYY-MM-DD[THH:MM:SS[.ffffff]][+HH:MM[:SS[.ffffff]]]). The time
defaults to 00:00 and the timezone to the local timezone. The result is
ordered by date, oldest first, regardless of the provided argument order.
* An intersection using an ampersand separated list of all the above
specifiers. The order will be preserved.
* A comma separated list of all the above specifiers. The order will be
preserved and duplicates removed.

All subcommands except restore accept multiple revisions.

Valid for **find** and **restore** subcommands.
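
As a rough illustration of the syntax documented above, the following sketch splits a few revision specs into tokens with the same regular expression that `find_revisions` in this PR applies. The `tokenize` helper name and the example specs are hypothetical, and the sketch only shows the tokenization step; it does not resolve tokens against a backup history.

```python
import re

# Same delimiter pattern as find_revisions in this PR: parentheses,
# comma, ampersand and the two-dot range operator are kept as tokens.
TOKEN_RE = re.compile(r"(\(|\)|,|&|\.\.)")


def tokenize(spec: str) -> list[str]:
    """Split a revision spec into operator and operand tokens.

    Hypothetical helper; empty fragments and surrounding whitespace
    are dropped, as in the PR.
    """
    return [t.strip() for t in TOKEN_RE.split(spec) if t.strip()]


if __name__ == "__main__":
    examples = [
        "tag:daily & clean",
        "first(trust:distrusted)..last",
        "2024-01-01..",
        "not(0, 1)",
    ]
    for spec in examples:
        print(spec, "->", tokenize(spec))
```

An omitted range endpoint simply yields no token on that side of `..`, which is why the range branch can fall back to **first** or **last**.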

212 changes: 153 additions & 59 deletions src/backy/backup.py
Expand Up @@ -3,16 +3,26 @@
import glob
import os
import os.path as p
import re
import subprocess
import time
from enum import Enum
from typing import IO, Optional, Type
from math import ceil, floor
from typing import IO, List, Optional, Type

import tzlocal
import yaml
from structlog.stdlib import BoundLogger

import backy.backends.chunked
from backy.utils import min_date
from backy.utils import (
duplicates,
list_get,
list_rindex,
list_split,
min_date,
unique,
)

from .backends import BackendException, BackyBackend
from .backends.chunked import ChunkedFileBackend
Expand Down Expand Up @@ -56,7 +66,9 @@ def locked(target=None, mode=None):
raise ValueError("Unknown lock mode '{}'".format(mode))

def wrap(f):
def locked_function(self, *args, **kw):
def locked_function(self, *args, skip_lock=False, **kw):
if skip_lock:
return f(self, *args, **kw)
if target in self._lock_fds:
raise RuntimeError("Bug: Locking is not re-entrant.")
target_path = p.join(self.path, target)
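
The `skip_lock` keyword added in this hunk lets a method that already holds a lock call another method guarded by the same lock. A minimal sketch of that pattern follows; it assumes an in-memory set (`_held`) instead of the real lock files, and the `Demo` class and its methods are hypothetical stand-ins, not backy code.

```python
import functools


def locked(target: str):
    """Simplified stand-in for the PR's decorator: tracks held targets
    in a set instead of taking real file locks (assumption)."""

    def wrap(f):
        @functools.wraps(f)
        def locked_function(self, *args, skip_lock=False, **kw):
            if skip_lock:
                # The caller states it already holds the lock.
                return f(self, *args, **kw)
            if target in self._held:
                raise RuntimeError("Bug: Locking is not re-entrant.")
            self._held.add(target)
            try:
                return f(self, *args, **kw)
            finally:
                self._held.discard(target)

        return locked_function

    return wrap


class Demo:
    def __init__(self):
        self._held = set()
        self.distrusted = False

    @locked(".backup")
    def distrust(self):
        self.distrusted = True

    @locked(".backup")
    def backup(self, skip: bool):
        # Nested call into a method guarded by the same lock.
        self.distrust(skip_lock=skip)
```

Calling `Demo().backup(skip=True)` succeeds, while `skip=False` triggers the non-re-entrant check, which mirrors why `backup` in this PR calls `self.distrust("all", skip_lock=True)`.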
Expand Down Expand Up @@ -201,13 +213,13 @@ def _clean(self):
revision.remove()

@locked(target=".backup", mode="exclusive")
def forget_revision(self, revision):
r = self.find(revision)
r.remove()
def forget(self, revision: str):
for r in self.find_revisions(revision):
r.remove()

@locked(target=".backup", mode="exclusive")
@locked(target=".purge", mode="shared")
def backup(self, tags, force=False):
def backup(self, tags: set[str], force=False):
if not force:
missing_tags = (
filter_schedule_tags(tags) - self.schedule.schedule.keys()
Expand Down Expand Up @@ -250,7 +262,7 @@ def backup(self, tags, force=False):
except BackendException:
self.log.exception("backend-error-distrust-all")
verified = False
self.distrust_range()
self.distrust("all", skip_lock=True)
if not verified:
self.log.error(
"verification-failed",
Expand Down Expand Up @@ -282,44 +294,16 @@ def backup(self, tags, force=False):
break

@locked(target=".backup", mode="exclusive")
def distrust(
self,
revision=None,
from_: Optional[datetime.date] = None,
until: Optional[datetime.date] = None,
):
if revision:
r = self.find(revision)
r.distrust()
r.write_info()
else:
self.distrust_range(from_, until)

def distrust_range(
self,
from_: Optional[datetime.date] = None,
until: Optional[datetime.date] = None,
):
for r in self.clean_history:
if from_ and r.timestamp.date() < from_:
continue
if until and r.timestamp.date() > until:
continue
def distrust(self, revision: str):
for r in self.find_revisions(revision):
r.distrust()
r.write_info()

@locked(target=".purge", mode="shared")
def verify(self, revision=None):
if revision:
r = self.find(revision)
def verify(self, revision: str):
for r in self.find_revisions(revision):
backend = self.backend_factory(r, self.log)
backend.verify()
else:
for r in list(self.clean_history):
if r.trust != Trust.DISTRUSTED:
continue
backend = self.backend_factory(r, self.log)
backend.verify()

@locked(target=".purge", mode="exclusive")
def purge(self):
Expand Down Expand Up @@ -498,34 +482,131 @@ def upgrade(self):
######################
# Looking up revisions

def last_by_tag(self):
def last_by_tag(self) -> dict[str, datetime.datetime]:
"""Return a dictionary showing the last time each tag was
backed up.

Tags that have never been backed up won't show up here.

"""
last_times = {}
last_times: dict[str, datetime.datetime] = {}
for revision in self.clean_history:
for tag in revision.tags:
last_times.setdefault(tag, min_date())
last_times[tag] = max([last_times[tag], revision.timestamp])
return last_times

def find_revisions(self, spec):
def find_revisions(
self, spec: str | List[str | Revision | List[Revision]]
) -> List[Revision]:
"""Get a sorted list of revisions, oldest first, that match the given
specification.
"""
if isinstance(spec, str) and spec.startswith("tag:"):
tag = spec.replace("tag:", "")
result = [r for r in self.history if tag in r.tags]
elif spec == "all":
result = self.history[:]

tokens: List[str | Revision | List[Revision]]
if isinstance(spec, str):
tokens = [
t.strip()
for t in re.split(r"(\(|\)|,|&|\.\.)", spec)
if t.strip()
]
else:
result = [self.find(spec)]
return result
tokens = spec
if "(" in tokens and ")" in tokens:
i = list_rindex(tokens, "(")
j = tokens.index(")", i)
prev, middle, next = tokens[:i], tokens[i + 1 : j], tokens[j + 1 :]

functions = {
"first": lambda x: x[0],
"last": lambda x: x[-1],
"not": lambda x: [r for r in self.history if r not in x],
"reverse": lambda x: list(reversed(x)),
}
if prev and isinstance(prev[-1], str) and prev[-1] in functions:
return self.find_revisions(
prev[:-1]
+ [functions[prev[-1]](self.find_revisions(middle))]
+ next
)
return self.find_revisions(
prev + [self.find_revisions(middle)] + next
)
elif "," in tokens:
i = tokens.index(",")
return unique(
self.find_revisions(tokens[:i])
+ self.find_revisions(tokens[i + 1 :])
)
elif "&" in tokens:
i = tokens.index("&")
return duplicates(
self.find_revisions(tokens[:i]),
self.find_revisions(tokens[i + 1 :]),
)
elif ".." in tokens:
_a, _b = list_split(tokens, "..")
assert len(_a) <= 1 and len(_b) <= 1
a = self.index_by_token(list_get(_a, 0, "first"))
b = self.index_by_token(list_get(_b, 0, "last"))
return self.history[ceil(min(a, b)) : floor(max(a, b)) + 1]
assert len(tokens) == 1
token = tokens[0]
if isinstance(token, Revision):
return [token]
elif isinstance(token, list):
return token
if token.startswith("tag:"):
tag = token.removeprefix("tag:")
return [r for r in self.history if tag in r.tags]
elif token.startswith("trust:"):
trust = Trust(token.removeprefix("trust:").lower())
return [r for r in self.history if trust == r.trust]
elif token == "all":
return self.history[:]
elif token == "clean":
return self.clean_history[:]
else:
return [self.find(token)]
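
The comma and ampersand branches above rely on `unique` and `duplicates` from `backy.utils`, whose bodies are not part of this diff. The sketch below shows one plausible reading based on the man page text (order preserved, duplicates removed); both implementations are assumptions, not the actual helpers.

```python
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def unique(items: Iterable[T]) -> List[T]:
    """Drop repeated items, keeping the first occurrence of each (assumed behavior)."""
    seen: List[T] = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


def duplicates(a: List[T], b: List[T]) -> List[T]:
    """Items present in both lists, in the order of the first list (assumed behavior)."""
    return unique(item for item in a if item in b)


if __name__ == "__main__":
    print(unique([3, 1, 3, 2, 1]))
    print(duplicates([1, 2, 3, 4], [4, 2]))
```

Because `find_revisions` checks for a comma before it checks for an ampersand, a spec such as `a, b & c` is evaluated as the union of `a` with the intersection of `b` and `c`.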

def index_by_token(self, spec: str | Revision | List[Revision]):
assert not isinstance(
spec, list
), "can only index a single revision specifier"
if isinstance(spec, str):
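# Compare against None explicitly: an exact date match on the oldest
# revision yields index 0.0, which is falsy.
index = self.index_by_date(spec)
if index is not None:
return index
return self.history.index(self.find(spec))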
else:
return self.history.index(spec)

def find_by_number(self, spec):
def index_by_date(self, spec: str) -> Optional[float]:
"""Return index of revision matched by datetime.
Index may be fractional if there is no exact datetime match.
Index range: [-0.5, len - 0.5]
"""
try:
date = datetime.datetime.fromisoformat(spec)
date = date.replace(tzinfo=date.tzinfo or tzlocal.get_localzone())
l = list_get(
[i for i, r in enumerate(self.history) if r.timestamp <= date],
-1,
-1,
)
r = list_get(
[i for i, r in enumerate(self.history) if r.timestamp >= date],
0,
len(self.history),
)
assert (
0 <= r - l <= 1
), "cannot index by date if multiple revisions have the same timestamp"
return (l + r) / 2.0
except ValueError:
return None
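
To make the fractional index concrete, here is a small standalone sketch of the same idea on a plain list of timestamps. It assumes the list is sorted ascending with distinct entries; `select_range` is a hypothetical helper that combines the index lookup with the `ceil`/`floor` slicing used by the `..` branch of `find_revisions`.

```python
import datetime
from math import ceil, floor
from typing import List


def index_by_date(timestamps: List[datetime.datetime], date: datetime.datetime) -> float:
    """Return the position of `date` within sorted, distinct timestamps.

    An exact match gives a whole number; a date between two entries
    (or outside the list) gives a value ending in .5.
    """
    older = [i for i, ts in enumerate(timestamps) if ts <= date]
    newer = [i for i, ts in enumerate(timestamps) if ts >= date]
    left = older[-1] if older else -1
    right = newer[0] if newer else len(timestamps)
    return (left + right) / 2.0


def select_range(timestamps, start, end):
    """Inclusive date range; argument order does not matter (hypothetical helper)."""
    a = index_by_date(timestamps, start)
    b = index_by_date(timestamps, end)
    return timestamps[ceil(min(a, b)) : floor(max(a, b)) + 1]


if __name__ == "__main__":
    day = lambda d: datetime.datetime(2024, 1, d)
    history = [day(1), day(3), day(5)]
    print(index_by_date(history, day(2)))
    print(select_range(history, day(6), day(2)))
```

Rounding the lower bound up and the upper bound down is what keeps a date that falls between two revisions from pulling in the neighbor outside the requested range.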

def find_by_number(self, _spec: str) -> Revision:
"""Returns revision by relative number.

0 is the newest,
Expand All @@ -535,22 +616,23 @@ def find_by_number(self, spec):

Raises IndexError or ValueError if no revision is found.
"""
spec = int(spec)
spec = int(_spec)
if spec < 0:
raise KeyError("Integer revisions must be positive")
return self.history[-spec - 1]

def find_by_tag(self, spec):
def find_by_tag(self, spec: str) -> Revision:
"""Returns the latest revision matching a given tag.

Raises IndexError or ValueError if no revision is found.
"""
if spec in ["last", "latest"]:
return self.history[-1]
matching = [r for r in self.history if spec in r.tags]
return max((r.timestamp, r) for r in matching)[1]
if spec == "first":
return self.history[0]
raise ValueError()

def find_by_uuid(self, spec):
def find_by_uuid(self, spec: str) -> Revision:
"""Returns revision matched by UUID.

Raises IndexError if no revision is found.
Expand All @@ -560,16 +642,28 @@ def find_by_uuid(self, spec):
except KeyError:
raise IndexError()

def find(self, spec) -> Revision:
def find_by_function(self, spec: str):
m = re.fullmatch(r"(\w+)\(.+\)", spec)
if m and m.group(1) in ["first", "last"]:
return self.find_revisions(m.group(0))[0]
raise ValueError()

def find(self, spec: str) -> Revision:
"""Flexible revision search.

Locates a revision by relative number, by tag, or by uuid.

"""
if spec is None or spec == "" or not self.history:
spec = spec.strip()
if spec == "" or not self.history:
raise KeyError(spec)

for find in (self.find_by_number, self.find_by_uuid, self.find_by_tag):
for find in (
self.find_by_number,
self.find_by_uuid,
self.find_by_tag,
self.find_by_function,
):
try:
return find(spec)
except (ValueError, IndexError):