Merge branch 'master' into docs-refresh
oetiker authored Jan 16, 2024
2 parents 168e63c + fd8c6a8 commit eadd3ec
Showing 6 changed files with 179 additions and 30 deletions.
1 change: 1 addition & 0 deletions .github/workflows/spelling/expect.txt
@@ -339,6 +339,7 @@ manpath
manualsnap
mariadb
mariadblock
Mbuf
mbuffer
mbuffersize
MConfig
21 changes: 21 additions & 0 deletions README.md
@@ -224,6 +224,27 @@ and the I/O speeds of the storage and networking involved. As a rule of thumb,
let it absorb at least a minute of I/O, so while one side of the ZFS dialog
is deeply thinking, another can do its work.

> **_NOTE:_** For backwards compatibility, the legacy
> `--mbuffer=...` setting applies by default to all destination datasets
> (and to the sender, in the case of the `--mbuffer=/path/to/mbuffer:port`
> variant). This may work if the needed programs are all found in `PATH`
> under the same short name, but it fails if different systems require
> different full path names.
>
> To avoid this limitation, ZnapZend now lets you specify custom path
> and buffer size settings individually for each source and destination
> dataset in each backup/retention schedule configuration (using the
> `znapzendzetup` program, or directly via ZFS dataset properties such as
> `org.znapzend:src_mbuffer`). The legacy configuration properties are now
> used as fallback defaults, and may emit warnings whenever they are
> applied as such.
>
> With this feature in place, the sender may run the only `mbuffer`,
> without requiring one on the receiver (e.g. to limit the impact on
> RAM usage on the backup server). You may also run an `mbuffer` on
> each side of the SSH tunnel, if network latency is erratic and
> has a considerable impact.

The remote system does not need anything other than ZFS functionality, an
SSH server, a user account with prepared SSH key based log-in (optionally
an unprivileged one with `zfs allow` settings on a particular target dataset
75 changes: 64 additions & 11 deletions bin/znapzendzetup
@@ -80,6 +80,16 @@ sub parseArguments {
#option must be a dataset or invalid
$state eq 'src' && do {
$backupSet{src} = $_;
$state = 'srcMbuf';
next;
};
$state eq 'srcMbuf' && do {
$backupSet{src_mbuffer} = $_;
$state = 'srcMbufSize';
next;
};
$state eq 'srcMbufSize' && do {
$backupSet{src_mbuffer_size} = $_;
$state = '';
next;
};
@@ -98,10 +108,20 @@ sub parseArguments {
#catch post-command if any
$state eq 'pst' && do {
$backupSet{'dst_' . $key . '_pstcmd'} = $_;
$state = 'dstMbuf';
next;
};
$state eq 'dstMbuf' && do {
$backupSet{'dst_' . $key . '_mbuffer'} = $_;
$state = 'dstMbufSize';
next;
};
$state eq 'dstMbufSize' && do {
$backupSet{'dst_' . $key . '_mbuffer_size'} = $_;
$state = '';
next;
};
die "ERROR: dont know what to do with $_. check the syntax\n";
die "ERROR: don't know what to do with $_. check the syntax\n";
}

#check if we have a valid source as this is crucial
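The two hunks above extend the positional-argument state machine so that an optional mbuffer path and buffer size may follow the source dataset. A minimal standalone sketch of the SRC part, assuming a simplified loop: the helper name `parse_src_args`, the `plan` state and the `src_plan` key are illustrative assumptions and are not taken from this diff, and DST handling is left out.

```perl
use strict;
use warnings;

# Hypothetical, simplified restatement of the SRC branch of parseArguments:
#   SRC plan dataset [src_mbuffer_path [src_mbuffer_size]]
# Each positional argument is consumed according to the current state.
sub parse_src_args {
    my @args = @_;
    my %backupSet;
    my $state = '';
    for my $arg (@args) {
        if ($arg eq 'SRC')           { $state = 'plan'; next; }
        # 'plan' state and 'src_plan' key are assumptions for this sketch
        if ($state eq 'plan')        { $backupSet{src_plan} = $arg; $state = 'src'; next; }
        if ($state eq 'src')         { $backupSet{src} = $arg; $state = 'srcMbuf'; next; }
        if ($state eq 'srcMbuf')     { $backupSet{src_mbuffer} = $arg; $state = 'srcMbufSize'; next; }
        if ($state eq 'srcMbufSize') { $backupSet{src_mbuffer_size} = $arg; $state = ''; next; }
        die "ERROR: don't know what to do with $arg. check the syntax\n";
    }
    return \%backupSet;
}
```

Because both new arguments are positional, the buffer size can only be given once a path has been given before it, which matches the `[src_mbuffer_path [src_mbuffer_size]]` nesting in the usage text.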
@@ -539,9 +559,10 @@ and where 'command' and its unique options is one of the following:
[--post-snap-command=<command>] \
[--tsformat=<format>] --donotask \
[--send-delay=<time>] \
SRC plan dataset \
SRC plan dataset [src_mbuffer_path [src_mbuffer_size]] \
[ DST[:key] plan [[user@]host:]dataset
[pre-send-command] [post-send-command] ]
[pre-send-command] [post-send-command] \
[dst_mbuffer_path[:port]] [dst_mbuffer_size] ]
NOTE: If you specify [user@]host:dataset for remote replication
over SSH, make use of ~/.ssh/config for any advanced options
@@ -555,9 +576,10 @@ and where 'command' and its unique options is one of the following:
[--post-snap-command=<command>|off] \
[--tsformat=<format>] --donotask \
[--send-delay=<time>] \
SRC [plan] dataset \
SRC [plan] dataset [src_mbuffer_path [src_mbuffer_size]] \
[ DST:key [plan] [dataset] \
[pre-send-command|off] [post-send-command|off] ]
[pre-send-command|off] [post-send-command|off] \
[dst_mbuffer_path[:port]|off] [dst_mbuffer_size] ]
edit <src_dataset>
@@ -701,14 +723,39 @@ separator.
=item B<--mbuffer>=I</usr/bin/mbuffer>
Specify the path to your copy of the mbuffer utility.
DEPRECATED: Specify the path to your copy of the mbuffer utility.
NOTE: with this option, the same path would be used for all remote
destinations - this can misfire if they run different operating systems.
It is currently recommended to define individual B<dst_mbuffer_path>
options for each separate destination in each dataset configuration.
The B<--mbuffer> value would be used as a fallback default for those.
Per legacy-default behavior, the mbuffer program was not used by the
sender (unless using a dedicated port, see below). Nowadays it is
possible to specify it instead of (or in addition to) a destination
side mbuffer, using the B<src_mbuffer_path> in each source dataset
configuration.
=item B<--mbuffer>=I</usr/bin/mbuffer:31337>
Specify the path to your copy of the mbuffer utility and the port used
on the destination. Caution: znapzend will send the data directly
from source mbuffer to destination mbuffer, thus data stream is B<not>
encrypted.
DEPRECATED: Specify the path to your copy of the mbuffer utility and
the port used on the destination. Caution: znapzend will use SSH to
set up the remote mbuffer receiver, but will send the snapshot data
stream directly from source mbuffer to destination mbuffer. In other
words, the data stream is B<not> encrypted. Use this only in a trusted
LAN or over VPN, where you can safely avoid the overheads of an SSH
tunnel.
NOTE: with this option, the same path would be used for all remote
destinations as well as the source system - this can misfire if they
run different operating systems.
It is currently recommended to define individual B<*_mbuffer_path>
options for each source and each separate destination in each dataset
configuration. The B<--mbuffer> value would be used as a fallback
default for those (with only path component for the source).
=item B<--mbuffersize>=I<number>{B<b>|B<k>|B<M>|B<G>}
@@ -723,6 +770,10 @@ To specify a mbuffer size of 100MB:
If not set, the buffer size defaults to 1GB.
It is currently suggested to define individual B<*_mbuffer_size> options for
each source and each separate destination in each dataset configuration.
The B<--mbuffersize> value would be used as a fallback default for those.
=item B<--donotask>
Apply changes immediately. Without being asked if the config is as you
@@ -820,11 +871,13 @@ create a complex backup task
--pre-snap-command="/bin/sh /usr/local/bin/lock_flush_db.sh" \
--post-snap-command="/bin/sh /usr/local/bin/unlock_db.sh" \
SRC '7d=>1h,30d=>4h,90d=>1d' tank/home \
"/usr/bin/mbuffer" "128M" \
DST:a '7d=>1h,30d=>4h,90d=>1d,1y=>1w,10y=>1month' backup/home \
DST:b '7d=>1h,30d=>4h,90d=>1d,1y=>1w,10y=>1month' \
root@bserv:backup/home \
"/root/znapzend.sh dst_b pool on" \
"/root/znapzend.sh dst_b pool off"
"/root/znapzend.sh dst_b pool off" \
"/opt/bin64/mbuffer" "4G"
copy the setup from one fileset to another
6 changes: 4 additions & 2 deletions lib/ZnapZend.pm
@@ -597,7 +597,8 @@ my $sendRecvCleanup = sub {
$lastSnapshotToSee = ${$srcSnapshots}[$seenX];
}
$self->zZfs->sendRecvSnapshots($srcDataSet, $dstDataSet, $dst,
$backupSet->{mbuffer}, $backupSet->{mbuffer_size},
$backupSet->{src_mbuffer}, $backupSet->{src_mbuffer_size},
$backupSet->{"dst_$key" . '_mbuffer'}, $backupSet->{"dst_$key" . '_mbuffer_size'},
$backupSet->{snapSendFilter}, $lastSnapshotToSee,
( $backupSet->{"dst_$key" . '_justCreated'} ? 1 : ($doPromote > 1 ? $doPromote : undef ) )
);
@@ -626,7 +627,8 @@ my $sendRecvCleanup = sub {
# Note this can fail if we forbidDestRollback and there are
# snapshots or data on dst newer than the last common snap.
$self->zZfs->sendRecvSnapshots($srcDataSet, $dstDataSet, $dst,
$backupSet->{mbuffer}, $backupSet->{mbuffer_size},
$backupSet->{src_mbuffer}, $backupSet->{src_mbuffer_size},
$backupSet->{"dst_$key" . '_mbuffer'}, $backupSet->{"dst_$key" . '_mbuffer_size'},
$backupSet->{snapSendFilter}, undef,
( $backupSet->{"dst_$key" . '_justCreated'} ? 1 : undef )
);
65 changes: 60 additions & 5 deletions lib/ZnapZend/Config.pm
@@ -148,6 +148,45 @@ my $checkBackupSets = sub {
or die "ERROR: property $prop is not valid on dataset " . $backupSet->{src} . "\n";
}
}

# mbuffer properties not set for source? legacy behavior was to not use
# any on the sender, except when in port-to-port mode
if (!exists($backupSet->{src_mbuffer}) or !($backupSet->{src_mbuffer})) {
# Have *something* defined to avoid further exists() checks at least
$backupSet->{src_mbuffer} = undef;
if ($backupSet->{mbuffer}) {
if ($backupSet->{mbuffer} eq 'off') {
# Only use the setting for source if legacy "off" is set
$backupSet->{src_mbuffer} = $backupSet->{mbuffer};
$self->zLog->info("WARNING: property 'src_mbuffer' not set on backup for " . $backupSet->{src} . ", inheriting 'off' from legacy 'mbuffer'");
} else {
my ($mbuffer, $mbufferPort) = split /:/, $backupSet->{mbuffer}, 2;
#check if port is numeric
if ($mbufferPort &&
$mbufferPort =~ /^\d{1,5}$/ && int($mbufferPort) < 65535
) {
# Only use the setting for source program if the legacy
# "/path/to/mbuffer:port" is set (note we would use a
# port defined by each destination separately - maybe
# inherited from the legacy setting, maybe re-defined
# locally or even avoided for that destination link).
$backupSet->{src_mbuffer} = $mbuffer;
$self->zLog->info("WARNING: property 'src_mbuffer' not set on backup for " . $backupSet->{src} . ", inheriting path from legacy 'mbuffer': " . $backupSet->{src_mbuffer});
}
}
}
}
if ($backupSet->{src_mbuffer}) {
if (!($self->zfs->fileExistsAndExec($backupSet->{src_mbuffer}))) {
warn "*** WARNING: executable '$backupSet->{src_mbuffer}' does not exist on source system, will ignore\n\n";
$backupSet->{src_mbuffer} = undef;
}
}
if (!exists($backupSet->{src_mbuffer_size}) or !($backupSet->{src_mbuffer_size})) {
$backupSet->{src_mbuffer_size} = $backupSet->{mbuffer_size};
$self->zLog->info("WARNING: property 'src_mbuffer_size' not set on backup for " . $backupSet->{src} . ", inheriting from legacy 'mbuffer_size': " . $backupSet->{src_mbuffer_size}) if $backupSet->{src_mbuffer_size};
}
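The source-side fallback in this hunk is asymmetric: a legacy `mbuffer` value is inherited by the sender only when it is `off` or carries a valid port. A minimal standalone sketch of that decision, assuming a plain hash reference as input. The helper name `effective_src_mbuffer` is hypothetical, and the logging and the `fileExistsAndExec` check from the commit are omitted.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the effective sender-side mbuffer setting,
# or undef when the sender should run without mbuffer.
sub effective_src_mbuffer {
    my ($backupSet) = @_;

    # An explicit per-source setting always wins.
    return $backupSet->{src_mbuffer} if $backupSet->{src_mbuffer};

    my $legacy = $backupSet->{mbuffer};
    return unless $legacy;
    return 'off' if $legacy eq 'off';

    my ($path, $port) = split /:/, $legacy, 2;
    # Same port test as in the commit: up to five digits, below 65535.
    return $path if $port && $port =~ /^\d{1,5}$/ && int($port) < 65535;

    # A plain legacy path (no port) never enabled mbuffer on the sender.
    return;
}
```

With this shape, a legacy `/usr/bin/mbuffer` setting keeps the old behavior (no sender-side buffer), while `/usr/bin/mbuffer:31337` still yields a local program for port-to-port mode.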

#check destination plans and datasets
for my $dst (grep { /^dst_[^_]+$/ } keys %$backupSet){
#store backup destination validity. will be checked where used
@@ -158,17 +197,33 @@ my $checkBackupSets = sub {

$backupSet->{$dst . '_plan'} = $self->$checkBackupPlan($backupSet->{$dst . '_plan'});

# mbuffer properties not set for destination? inherit the legacy default ones.
if (!exists($backupSet->{$dst . '_mbuffer'}) or !($backupSet->{$dst . '_mbuffer'})) {
###if ($backupSet->{mbuffer}) {
$backupSet->{$dst . '_mbuffer'} = $backupSet->{mbuffer};
### Do not preclude inheritance when legacy setting changes
###} else {
### $backupSet->{$dst . '_mbuffer'} = 'off';
###}
$self->zLog->info("WARNING: property '" . $dst . "_mbuffer' not set on backup for " . $backupSet->{src} . ", inheriting path[:port] from legacy 'mbuffer': " . $backupSet->{$dst . '_mbuffer'}) if $backupSet->{$dst . '_mbuffer'};
}
if (!exists($backupSet->{$dst . '_mbuffer_size'}) or !($backupSet->{$dst . '_mbuffer_size'})) {
$backupSet->{$dst . '_mbuffer_size'} = $backupSet->{mbuffer_size};
$self->zLog->info("WARNING: property '" . $dst . "_mbuffer_size' not set on backup for " . $backupSet->{src} . ", inheriting from legacy 'mbuffer_size': " . $backupSet->{$dst . '_mbuffer_size'}) if $backupSet->{$dst . '_mbuffer_size'};
}

# mbuffer property set? check if executable is available on remote host
if ($backupSet->{mbuffer} ne 'off'){
my ($mbuffer, $mbufferPort) = split /:/, $backupSet->{mbuffer}, 2;
if ($backupSet->{$dst . '_mbuffer'} ne 'off') {
my ($mbuffer, $mbufferPort) = split /:/, $backupSet->{$dst . '_mbuffer'}, 2;
my ($remote, $dataset) = $splitHostDataSet->($backupSet->{$dst});
my $file = ($remote ? "$remote:" : '') . $mbuffer;
$self->zfs->fileExistsAndExec($file)
or warn "*** WARNING: executable '$mbuffer' does not exist" . ($remote ? " on $remote\n\n" : "\n\n");
or warn "*** WARNING: executable '$mbuffer' does not exist on " . ($remote ? "remote $remote" : "local") . " system, zfs receive can fail\n\n";
# TOTHINK: Reset to 'off'/undef and ignore the validity checks below?

#check if mbuffer size is valid
$backupSet->{mbuffer_size} =~ /^\d+[bkMG%]?$/
or die "ERROR: mbuffer size '" . $backupSet->{mbuffer_size} . "' invalid\n";
$backupSet->{$dst . '_mbuffer_size'} =~ /^\d+[bkMG%]?$/
or die "ERROR: mbuffer size '" . $backupSet->{$dst . '_mbuffer_size'} . "' invalid\n";
#check if port is numeric
$mbufferPort && do {
$mbufferPort =~ /^\d{1,5}$/ && int($mbufferPort) < 65535
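The destination checks in this hunk split the `path[:port]` value, validate the buffer size suffix and then validate the optional port. A minimal standalone sketch of the same checks, assuming they are wrapped in one function. The helper name `check_dst_mbuffer` and the wording of the port error message are assumptions, since the hunk is cut off before the original message.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the destination checks. Dies on invalid
# input, otherwise returns the parsed components as a hash reference.
sub check_dst_mbuffer {
    my ($spec, $size) = @_;
    return { path => 'off' } if $spec eq 'off';

    # Limit of 2 keeps everything after the first colon in $port.
    my ($path, $port) = split /:/, $spec, 2;

    # Size pattern as in the commit: digits plus an optional b, k, M, G or %.
    $size =~ /^\d+[bkMG%]?$/
        or die "ERROR: mbuffer size '$size' invalid\n";

    if ($port) {
        # Port test as in the commit; the message text is an assumption.
        $port =~ /^\d{1,5}$/ && int($port) < 65535
            or die "ERROR: mbuffer port '$port' invalid\n";
    }
    return { path => $path, port => $port, size => $size };
}
```

Note that the size pattern accepts a single-letter suffix only, so `4G` passes while `4GB` is rejected.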
41 changes: 29 additions & 12 deletions lib/ZnapZend/ZFS.pm
@@ -603,8 +603,10 @@ sub sendRecvSnapshots {
my $srcDataSet = shift;
my $dstDataSet = shift;
my $dstName = shift; # name of the znapzend policy => property prefix
my $mbuffer = shift;
my $mbufferSize = shift;
my $srcMbuffer = shift // 'off';
my $srcMbufferSize = shift // '1G'; # documented default for mbuffer_size
my $dstMbuffer = shift // 'off';
my $dstMbufferSize = shift // '1G';
my $snapFilter = shift // qr/.*/;

# Limit creation-ordered listing after registering this snapshot name,
@@ -631,7 +633,7 @@ sub sendRecvSnapshots {
push @sendOpt, '-w' if $self->sendRaw;
push @recvOpt, '-s' if $self->resume;
my $remote;
my $mbufferPort;
my $dstMbufferPort;

my $dstDataSetPath;
($remote, $dstDataSetPath) = $splitHostDataSet->($dstDataSet);
@@ -705,7 +707,7 @@ sub sendRecvSnapshots {
}
}

($mbuffer, $mbufferPort) = split /:/, $mbuffer, 2;
($dstMbuffer, $dstMbufferPort) = split /:/, $dstMbuffer, 2;

my @cmd;
if ($lastCommon){
@@ -715,12 +717,23 @@ sub sendRecvSnapshots {
@cmd = ([@{$self->priv}, 'zfs', 'send', @sendOpt, $lastSnapshot]);
}

#if mbuffer port is set, run in 'network mode'
if ($remote && $mbufferPort && $mbuffer ne 'off'){
# if mbuffer port is set for this destination (or inherited by it
# from the legacy "mbuffer" setting), we run in 'network mode'
if ($remote && $dstMbufferPort && $dstMbuffer ne 'off' && $srcMbuffer eq 'off'){
# Not a fatal situation - we have SSH anyway, to spawn that remote
# mbuffer. The "problem" is that we would encrypt the data by SSH,
# which may be a bit of useless overhead in a trusted LAN.
$self->zLog->warn('WARNING: remote destination ' . $dstName
. ' at ' . $remote . ' asked for port-to-port mbuffer connection,'
. ' but no local path to mbuffer program was set on source.'
. ' Will try to use the usual SSH tunnel for data instead.');
}

if ($remote && $dstMbufferPort && $dstMbuffer ne 'off' && $srcMbuffer ne 'off'){
my $recvPid;

my @recvCmd = $self->$buildRemoteRefArray($remote, [$mbuffer, @{$self->mbufferParam},
$mbufferSize, '-4', '-I', $mbufferPort], [@{$self->priv}, 'zfs', 'recv', @recvOpt, $dstDataSetPath]);
my @recvCmd = $self->$buildRemoteRefArray($remote, [$dstMbuffer, @{$self->mbufferParam},
$dstMbufferSize, '-4', '-I', $dstMbufferPort], [@{$self->priv}, 'zfs', 'recv', @recvOpt, $dstDataSetPath]);

my $cmd = $shellQuote->(@recvCmd);

@@ -751,8 +764,8 @@ sub sendRecvSnapshots {
$remote =~ s/^[^@]+\@//; #remove username if given
$self->zLog->debug("receive process on $remote spawned ($pid)");

push @cmd, [$mbuffer, @{$self->mbufferParam}, $mbufferSize,
'-O', "$remote:$mbufferPort"];
push @cmd, [$srcMbuffer, @{$self->mbufferParam}, $srcMbufferSize,
'-O', "$remote:$dstMbufferPort"];

$cmd = $shellQuote->(@cmd);

@@ -781,10 +794,14 @@ sub sendRecvSnapshots {
$subprocess->ioloop->start if !$subprocess->ioloop->is_running;
}
else {
my @mbCmd = $mbuffer ne 'off' ? ([$mbuffer, @{$self->mbufferParam}, $mbufferSize]) : () ;
my $srcMbCmd = [$srcMbuffer, @{$self->mbufferParam}, $srcMbufferSize];
my @dstMbCmd = $dstMbuffer ne 'off' ? ([$dstMbuffer, @{$self->mbufferParam}, $dstMbufferSize]) : () ;
my $recvCmd = [@{$self->priv}, 'zfs', 'recv' , @recvOpt, $dstDataSetPath];

push @cmd, $self->$buildRemoteRefArray($remote, @mbCmd, $recvCmd);
if ($srcMbuffer ne 'off') {
push @cmd, $srcMbCmd;
}
push @cmd, $self->$buildRemoteRefArray($remote, @dstMbCmd, $recvCmd);

my $cmd = $shellQuote->(@cmd);
print STDERR "# " . ($self->noaction ? "WOULD # " : "" ) . "$cmd\n" if $self->debug;
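Taken together, the `sendRecvSnapshots` hunks choose between a direct mbuffer-to-mbuffer connection and a pipeline through the usual channel, with an optional buffer on either side. A minimal standalone sketch of that choice, assuming the inputs are the already resolved settings. The helper name `plan_transfer` and the returned description strings are hypothetical; the real code builds command arrays and spawns processes instead.

```perl
use strict;
use warnings;

# Hypothetical helper: describes which data path would be chosen for the
# given remote host, source mbuffer and destination "path[:port]" setting.
sub plan_transfer {
    my ($remote, $srcMbuffer, $dstMbuffer) = @_;
    $srcMbuffer //= 'off';
    my ($dstPath, $dstPort) = split /:/, ($dstMbuffer // 'off'), 2;
    $dstPath //= 'off';

    # Port-to-port mode needs a remote host, a destination port and an
    # mbuffer program on both ends.
    return 'mbuffer-to-mbuffer'
        if $remote && $dstPort && $dstPath ne 'off' && $srcMbuffer ne 'off';

    # Otherwise the data flows through a pipeline, over SSH when remote.
    my @stages = ('zfs send');
    push @stages, 'mbuffer (source)'      if $srcMbuffer ne 'off';
    push @stages, 'ssh'                   if $remote;
    push @stages, 'mbuffer (destination)' if $dstPath ne 'off';
    push @stages, 'zfs recv';
    return join ' -> ', @stages;
}
```

A destination that asks for a port while the source has no mbuffer falls through to the SSH pipeline, which corresponds to the non-fatal warning added in this commit.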