Hi,
I was backing up some virtual disk images (which are sparse), and I noticed that 'rsync' does a great job of making the target file sparse when the '--sparse' flag is used - sometimes even producing a file with a smaller footprint than the original - but it still seemed to be scanning the entire apparent size of the source file (I'm not sure what it was actually transferring across the wire).
For example, I have some VM disk images that are 1TB in apparent size but only about 100MB in actual size. With 'rsync --sparse ...', these took about an hour to transfer from a USB3 SSD to an onboard NVMe drive.
By comparison, the same files took about one minute (same source and destination) with 'cp --sparse=always ...'.
Seeing that 'dd' also has a 'conv=sparse' option, I tried it as well, with about the same timing as 'rsync'.
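As far as I can tell, that matches how both of those tools behave: they still read every byte of the source and only punch holes on the write side, by skipping over blocks that turn out to be all zeros. A rough sketch of that write-side approach (just my reading of the behavior, not the actual 'rsync' or 'dd' code):

```c
#include <unistd.h>

/* Returns 1 if the buffer is entirely zero bytes. */
static int is_zero(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Write-side sparseness: every byte of the source is still read,
 * but runs of zeros become holes in the destination by seeking
 * past them instead of writing. */
ssize_t copy_sparse_write_side(int in_fd, int out_fd)
{
    char buf[65536];
    ssize_t n;

    while ((n = read(in_fd, buf, sizeof buf)) > 0) {   /* reads the full apparent size */
        if (is_zero(buf, (size_t)n)) {
            if (lseek(out_fd, n, SEEK_CUR) < 0)        /* leave a hole */
                return -1;
        } else if (write(out_fd, buf, (size_t)n) != n) {
            return -1;
        }
    }
    /* If the file ends in a hole, an ftruncate() to the full length
     * would also be needed; omitted here. */
    return n;
}
```

So even when nearly all of the 1TB is holes, the whole 1TB still gets read, which would explain the hour-long transfers.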
I compared the 'cp' and 'rsync' code a bit, and I might be able to make the modifications that would let 'rsync' handle this situation more like 'cp' does, but I'm sure there's more nuance involved than my initial cursory scan revealed.
So I was wondering whether this has already been attempted by someone with a better grasp of the nuances of handling sparse files properly. It sure would make a lot of people happier if 'rsync' (and 'dd', for that matter) could transfer sparse files as efficiently as 'cp' does.
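For reference, my (possibly naive) understanding of the read-side approach that makes 'cp' so much faster is something like the sketch below: lseek() with SEEK_DATA/SEEK_HOLE asks the filesystem where the data actually is, so the holes are never read at all. (I haven't checked exactly which mechanism current coreutils uses - it may also rely on FIEMAP or copy_file_range - so treat this as an illustration, not the real 'cp' code.)

```c
#define _GNU_SOURCE        /* for SEEK_DATA / SEEK_HOLE on Linux */
#include <unistd.h>
#include <errno.h>

/* Read-side sparseness: only the extents that actually contain data
 * are read and written; the holes are skipped entirely. */
int copy_sparse_read_side(int in_fd, int out_fd)
{
    char buf[65536];
    off_t end = lseek(in_fd, 0, SEEK_END);
    off_t data = 0, hole;

    if (end < 0)
        return -1;

    while ((data = lseek(in_fd, data, SEEK_DATA)) >= 0) {
        hole = lseek(in_fd, data, SEEK_HOLE);      /* end of this data run */
        if (hole < 0)
            return -1;

        if (lseek(in_fd, data, SEEK_SET) < 0 ||
            lseek(out_fd, data, SEEK_SET) < 0)
            return -1;

        for (off_t off = data; off < hole; ) {     /* copy just this extent */
            size_t want = (size_t)(hole - off) < sizeof buf
                              ? (size_t)(hole - off) : sizeof buf;
            ssize_t n = read(in_fd, buf, want);
            if (n <= 0 || write(out_fd, buf, (size_t)n) != n)
                return -1;
            off += n;
        }
        data = hole;
    }
    if (errno != ENXIO)        /* ENXIO just means "no more data" */
        return -1;

    /* Preserve the full apparent length (including a trailing hole). */
    return ftruncate(out_fd, end);
}
```

With that approach, only the ~100MB of real data ever gets read, which is presumably why 'cp --sparse=always' finishes in about a minute.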
Thoughts?
Thanks,
Mylo