Hi,
I was backing up some virtual disk images (which are sparse), and I noticed that 'rsync' does a great job of making the target file sparse when the '--sparse' flag is used - sometimes even producing a file with a smaller footprint than the original - but it still seemed to be scanning the entire apparent size of the source file (I'm not sure what it was actually transferring across the wire).
For example, I have some VM disk images that are 1TB in apparent size but only about 100MB in actual size. With 'rsync --sparse ...', these took about an hour to transfer from a USB3 SSD to an onboard NVMe drive.
By comparison, the same files took about one minute (same source and destination) with 'cp --sparse=always ...'.
Seeing that 'dd' also has a 'conv=sparse' option, I tried it as well, with about the same timing as 'rsync'.
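As far as I can tell, that matches how both of those tools behave: they still read every byte of the source and only punch holes on the write side, by skipping over blocks that turn out to be all zeros. A rough sketch of that write-side approach (just my reading of the behavior, not the actual 'rsync' or 'dd' code):

```c
#include <unistd.h>

/* Returns 1 if the buffer is entirely zero bytes. */
static int is_zero(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Write-side sparseness: every byte of the source is still read,
 * but runs of zeros become holes in the destination by seeking
 * past them instead of writing. */
ssize_t copy_sparse_write_side(int in_fd, int out_fd)
{
    char buf[65536];
    ssize_t n;

    while ((n = read(in_fd, buf, sizeof buf)) > 0) {   /* reads the full apparent size */
        if (is_zero(buf, (size_t)n)) {
            if (lseek(out_fd, n, SEEK_CUR) < 0)        /* leave a hole */
                return -1;
        } else if (write(out_fd, buf, (size_t)n) != n) {
            return -1;
        }
    }
    /* If the file ends in a hole, an ftruncate() to the full length
     * would also be needed; omitted here. */
    return n;
}
```

So even when nearly all of the 1TB is holes, the whole 1TB still gets read, which would explain the hour-long transfers.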
I compared the 'cp' and 'rsync' code a bit, and I might be able to make the modifications that would let 'rsync' handle this situation more like 'cp' does, but I'm sure there's more nuance involved than my initial cursory scan revealed.
So I was wondering whether this has already been attempted by someone with a better grasp of the nuances of handling sparse files properly. It sure would make a lot of people happier if 'rsync' (and 'dd', for that matter) could transfer sparse files as efficiently as 'cp' does.
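For reference, my (possibly naive) understanding of the read-side approach that makes 'cp' so much faster is something like the sketch below: lseek() with SEEK_DATA/SEEK_HOLE asks the filesystem where the data actually is, so the holes are never read at all. (I haven't checked exactly which mechanism current coreutils uses - it may also rely on FIEMAP or copy_file_range - so treat this as an illustration, not the real 'cp' code.)

```c
#define _GNU_SOURCE        /* for SEEK_DATA / SEEK_HOLE on Linux */
#include <unistd.h>
#include <errno.h>

/* Read-side sparseness: only the extents that actually contain data
 * are read and written; the holes are skipped entirely. */
int copy_sparse_read_side(int in_fd, int out_fd)
{
    char buf[65536];
    off_t end = lseek(in_fd, 0, SEEK_END);
    off_t data = 0, hole;

    if (end < 0)
        return -1;

    while ((data = lseek(in_fd, data, SEEK_DATA)) >= 0) {
        hole = lseek(in_fd, data, SEEK_HOLE);      /* end of this data run */
        if (hole < 0)
            return -1;

        if (lseek(in_fd, data, SEEK_SET) < 0 ||
            lseek(out_fd, data, SEEK_SET) < 0)
            return -1;

        for (off_t off = data; off < hole; ) {     /* copy just this extent */
            size_t want = (size_t)(hole - off) < sizeof buf
                              ? (size_t)(hole - off) : sizeof buf;
            ssize_t n = read(in_fd, buf, want);
            if (n <= 0 || write(out_fd, buf, (size_t)n) != n)
                return -1;
            off += n;
        }
        data = hole;
    }
    if (errno != ENXIO)        /* ENXIO just means "no more data" */
        return -1;

    /* Preserve the full apparent length (including a trailing hole). */
    return ftruncate(out_fd, end);
}
```

With that approach, only the ~100MB of real data ever gets read, which is presumably why 'cp --sparse=always' finishes in about a minute.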
Thoughts?
Thanks,
Mylo