Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(io): compare actual size in CheckIsFileSizeSame #49

Merged
merged 1 commit into from
Jul 2, 2024

Conversation

c3y1huang
Copy link
Contributor

Which issue(s) this PR fixes:

Issue longhorn/longhorn#4105

What this PR does / why we need it:

longhorn/longhorn-engine#1003 (comment)

Special notes for your reviewer:

Additional documentation or context

@c3y1huang c3y1huang self-assigned this Jul 1, 2024
@c3y1huang c3y1huang requested a review from derekbit July 1, 2024 01:46
Copy link

codecov bot commented Jul 1, 2024

Codecov Report

Attention: Patch coverage is 62.50000% with 6 lines in your changes missing coverage. Please review.

Project coverage is 78.93%. Comparing base (d78642c) to head (949a952).

Files Patch % Lines
io/file.go 62.50% 3 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #49      +/-   ##
==========================================
- Coverage   79.21%   78.93%   -0.29%     
==========================================
  Files          32       32              
  Lines        1506     1519      +13     
==========================================
+ Hits         1193     1199       +6     
- Misses        200      204       +4     
- Partials      113      116       +3     
Flag Coverage Δ
unittests 78.93% <62.50%> (-0.29%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@c3y1huang
Copy link
Contributor Author

I was expecting this TestCopyFile test case to fail after this PR due to this and this, but it didn't. I am not sure why. @derekbit any idea?

@c3y1huang
Copy link
Contributor Author

c3y1huang commented Jul 1, 2024

I cannot see how copySparseData in the unit test handles sparse data differently from io.Copy.

Achieves by ready and writing by intervals.

@c3y1huang c3y1huang force-pushed the fix-check-is-file-size-same branch from b00dcaf to eeff2ce Compare July 1, 2024 02:53
@derekbit
Copy link
Member

derekbit commented Jul 1, 2024

I was expecting this TestCopyFile test case to fail after this PR due to this and this, but it didn't. I am not sure why. @derekbit any idea?

IIUC, the test case creates a temp file with a string "content" and truncate the file to 20 bytes, so the file content is
'c' 'o' 'n' 't' 'e' 'n' 't' '\x00' .... '\x00'. Then, copy the content to the destination file.
Can you elaborate more on that why you think the test case should fail?

derekbit
derekbit previously approved these changes Jul 1, 2024
Copy link
Member

@derekbit derekbit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@c3y1huang c3y1huang marked this pull request as ready for review July 1, 2024 08:06
@c3y1huang
Copy link
Contributor Author

c3y1huang commented Jul 1, 2024

IIUC, the test case creates a temp file with a string "content" and truncate the file to 20 bytes, so the file content is
'c' 'o' 'n' 't' 'e' 'n' 't' '\x00' .... '\x00'. Then, copy the content to the destination file.
Can you elaborate more on that why you think the test case should fail?

I’m a bit confused. I was expecting it to fail because we are now comparing actual sizes, and io.Copy doesn't support sparse file copy?

longhorn/longhorn-engine#1003 (comment)

@derekbit
Copy link
Member

derekbit commented Jul 1, 2024

IIUC, the test case creates a temp file with a string "content" and truncate the file to 20 bytes, so the file content is
'c' 'o' 'n' 't' 'e' 'n' 't' '\x00' .... '\x00'. Then, copy the content to the destination file.
Can you elaborate more on that why you think the test case should fail?

I’m a bit confused. I was expecting it to fail because we are now comparing actual sizes, and io.Copy doesn't support sparse file copy?

longhorn/longhorn-engine#1003 (comment)

I see.
The size of the sparse file is only 20 bytes. A file is composed of blocks, with each block typically being 512 bytes or 4096 bytes. Although the file's apparent size is 20 bytes, it actually occupies 8 blocks on the disk. This results in a total disk usage of 8 blocks x 512 bytes (disk block size). Therefore, the file is not sparse.

Note: Blocks is the number of blocks allocated on the disk, and the disk block size is 512 bytes. IO Block is the block size of the filesystem.

# echo content > test
# truncate --size 20 test
# stat test
  File: test
  Size: 20        	Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d	Inode: 3407888     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2024-07-01 15:09:03.514369106 +0000
Modify: 2024-07-01 15:10:02.550371344 +0000
Change: 2024-07-01 15:10:02.550371344 +0000
 Birth: 2024-07-01 15:09:03.514369106 +0000

You can try

  • Write string "content"
  • Truncate the file to 4097 bytes
    Then, the file should be sparse. You can see the actual size is 8 blocks x 512 bytes (disk block size), but the apparent size is 4097.
# echo content > test
# truncate --size 4097 test
# stat test
  File: test
  Size: 4097      	Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d	Inode: 3407888     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2024-07-01 15:09:03.514369106 +0000
Modify: 2024-07-01 15:16:08.726461833 +0000
Change: 2024-07-01 15:16:08.726461833 +0000
 Birth: 2024-07-01 15:09:03.514369106 +0000

longhorn/longhorn-4105

Signed-off-by: Chin-Ya Huang <[email protected]>
@c3y1huang
Copy link
Contributor Author

I see. The size of the sparse file is only 20 bytes. A file is composed of blocks, with each block typically being 512 bytes or 4096 bytes. Although the file's apparent size is 20 bytes, it actually occupies 8 blocks on the disk. This results in a total disk usage of 8 blocks x 512 bytes (disk block size). Therefore, the file is not sparse.

Note: Blocks is the number of blocks allocated on the disk, and the disk block size is 512 bytes. IO Block is the block size of the filesystem.

# echo content > test
# truncate --size 20 test
# stat test
  File: test
  Size: 20        	Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d	Inode: 3407888     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2024-07-01 15:09:03.514369106 +0000
Modify: 2024-07-01 15:10:02.550371344 +0000
Change: 2024-07-01 15:10:02.550371344 +0000
 Birth: 2024-07-01 15:09:03.514369106 +0000

You can try

  • Write string "content"
  • Truncate the file to 4097 bytes
    Then, the file should be sparse. You can see the actual size is 8 blocks x 512 bytes (disk block size), but the apparent size is 4097.
# echo content > test
# truncate --size 4097 test
# stat test
  File: test
  Size: 4097      	Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d	Inode: 3407888     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2024-07-01 15:09:03.514369106 +0000
Modify: 2024-07-01 15:16:08.726461833 +0000
Change: 2024-07-01 15:16:08.726461833 +0000
 Birth: 2024-07-01 15:09:03.514369106 +0000

Thank you for the explanation, it's clear to me now.

Copy link
Member

@derekbit derekbit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@derekbit derekbit merged commit ad3e7f0 into longhorn:main Jul 2, 2024
4 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants