btrfs scrub start -r tries to write data unless mounted read-only #934
Comments
It is expected that Btrfs tries to write to the block devices, even when mounting ro (log replay, etc). I do not think btrfs can run on a ro block device.
The man-page - btrfs-scrub(8) - about the
As I wrote, everything's fine when mounted ro. No complaints about writes to an ro-device.
There are multiple agents here. The documentation could be clearer.
The scrub is read-only, i.e. errors found in blocks that are read and verified by the scrub ioctl are not corrected.
The filesystem is read-write. Errors have been found while running the scrub, so the device stats are incremented. These updates to the device stats items will be committed in the next transaction, which is what failed in the logs above.
Also, scrub reads the filesystem metadata trees in order to get device maps, extent maps, and data csums for verification. If any of these reads fail, the filesystem will attempt to correct these pages on disk by writing the correct data over the incorrect data.
If any other process reads the filesystem while the scrub is running, the other process is not affected by the
Try it with the preferred metadata patches and set up data-only and metadata-only drives. You should see that
That's what I guessed too after finding out I forgot to mount ro the first time. A process running with an ro option causing writes was still scary enough for me to report it.
I agree. While this might be a corner case, I still think it should be noted that the fs could still try to fix things by itself.
Firstly, if scrub finds no errors, it should not trigger any write into the fs. So even if the target block device is RO, as long as no data/metadata/superblock errors are found, scrub itself will not trigger a write. According to your output, scrub had found no errors so far, so the write was not triggered by scrub itself.
The direct cause is that a transaction needed to be committed, and we failed to commit it.
The root cause is that scrub works on commit roots, so to avoid writing to and scrubbing the same block group at the same time, we mark the current scrub target read-only. But marking it read-only needs to start a transaction and may even force a chunk allocation, which needs to join/start a new transaction, which in turn causes new metadata to be created and written back. That's why scrub provides a read-only mode, which will not try to allocate a chunk (aka, update the metadata) during scrub.
As for why even a RW scrub is fine when your fs is mounted RO: that's because the function
So there is nothing special here and nothing related to any particular patchset; these are just corner cases of the scrub implementation.
And your report matches the first case, RW scrub on RW fs, so the write is expected.
That statement is not true. I clearly stated that I started an RO scrub on an RW fs which resides on an RO device. Worth mentioning: Since you already closed this issue, I guess you do not deem "RO scrub may cause writes to the underlying device unless mounted RO" worthy enough to be noted?
OK, the problem is in the Thus a RO scrub will trigger a transaction on RW mounted fs. I can add an extra check to avoid this. Although on such RW mounted fs, you may hit -ENOSPC if there is not much space left.
This sounds unintentional and IMHO deserves to be fixed. Thank you very much!
This seems like a very minor inconvenience.
Unfortunately the code is not that easy to handle the RO scrub on RW mount:
So this means even if we skip the chunk allocation part, we will have an empty transaction to commit and have to update the super block. But if we skip holding a transaction and continue, it means we will have the chance to conflict and corrupt the target block group. I'd go with a doc update for now, to warn about the modification to the fs.
[BUG]
There is a bug report that read-only scrub on a read-write fs still causes writes into the fs, and that will be caught if there is a read-only block device among the storage stack.

This will cause a kernel warning on failed transaction commit:

  BTRFS info (device dm-3): first mount of filesystem e18f0c40-88de-413f-9d7e-dcc8136ad6dd
  BTRFS info (device dm-3): using crc32c (crc32c-intel) checksum algorithm
  BTRFS info (device dm-3): using free-space-tree
  BTRFS info (device dm-3): scrub: started on devid 1
  Trying to write to read-only block-device md127
  btrfs_dev_stat_inc_and_print: 362 callbacks suppressed
  BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
  BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
  BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
  BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 4, rd 0, flush 0, corrupt 0, gen 0
  BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 5, rd 0, flush 0, corrupt 0, gen 0
  BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 6, rd 0, flush 0, corrupt 0, gen 0
  BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 7, rd 0, flush 0, corrupt 0, gen 0
  BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 8, rd 0, flush 0, corrupt 0, gen 0
  BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 9, rd 0, flush 0, corrupt 0, gen 0
  BTRFS error (device dm-3): bdev /dev/mapper/data errs: wr 10, rd 0, flush 0, corrupt 0, gen 0
  BTRFS: error (device dm-3) in btrfs_commit_transaction:2523: errno=-5 IO failure (Error while writing out transaction)
  BTRFS info (device dm-3 state E): forced readonly
  BTRFS warning (device dm-3 state E): Skipping commit of aborted transaction.
  BTRFS error (device dm-3 state EA): Transaction aborted (error -5)
  BTRFS: error (device dm-3 state EA) in cleanup_transaction:2017: errno=-5 IO failure
  BTRFS warning (device dm-3 state EA): failed setting block group ro: -5
  BTRFS info (device dm-3 state EA): scrub: not finished on devid 1 with status: -5

[CAUSE]
The root cause is inside btrfs_inc_block_group_ro(), where we need to hold a transaction handle, to prevent the transaction to be committed, until we hold ro_block_group_mutex.

This will cause an empty transaction by itself, thus even if we can mark the block group read-only without any extra workload, we still need to commit the new and empty transaction.

Unfortunately this means RO scrub on RW filesystem will always cause the fs to be updated.

[FIX]
The best fix is to make btrfs to avoid empty commit transaction, but even with that done, read-only scrub on rw mount can still cause real metadata updates (e.g. allocate new chunks and update device error statistics).

It will be very complex to make read-only scrub to be fully read-only on a read-write btrfs.

Thankfully read-only scrub on read-write mount with read-only device in the storage stack is pretty rare, thus a documentation update should be enough.

Issue: kdave#934
Signed-off-by: Qu Wenruo <[email protected]>
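When triaging a log like the one in the commit message, it can help to pull out the per-device error counters instead of reading them by eye. The following is only a rough sketch in Python: the regular expression is inferred from the `bdev ... errs:` lines quoted above, and the function name `latest_dev_stats` is made up for illustration. Since the kernel rate-limits these messages ("callbacks suppressed"), the values found in a log may lag behind the real counters, which `btrfs device stats` reports.

```python
import re

# Pattern inferred from the "bdev <dev> errs: wr N, rd N, ..." lines in the
# quoted kernel log; other kernel versions may format this differently.
DEV_STAT_RE = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

COUNTERS = ("wr", "rd", "flush", "corrupt", "gen")


def latest_dev_stats(log_text):
    """Return {device: {counter: value}} using the last matching line per device.

    The counters in these messages are cumulative, so the last line seen for
    a device carries the most recent values that made it into the log.
    """
    stats = {}
    for match in DEV_STAT_RE.finditer(log_text):
        stats[match.group("dev")] = {
            name: int(match.group(name)) for name in COUNTERS
        }
    return stats
```

A device whose `wr` counter climbs while `rd`, `corrupt` and `gen` stay at zero, as in the report above, points at rejected writes rather than at damaged data.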
Happened to me while readonly-checking a recovered md raid.
System information:
This lsblk snip visualizes the block device layers:
Note that md127 was started in read-only mode.
When running btrfs scrub -r on the fs of data (mounted rw), the kernel reports attempted writes to the read-only device md127 after about 10G of scrubbed data:
Everything's fine when mounted ro.
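Since the scrub only stayed quiet when the filesystem was mounted ro, one precaution is to check the mount options before starting a read-only scrub on a stack that contains a read-only device. The sketch below is a minimal illustration in Python, not an official tool: the helper names (`mount_options`, `is_mounted_readonly`, `scrub_if_readonly`) are hypothetical, and it assumes the usual `/proc/mounts` layout of device, mount point, filesystem type and options.

```python
import subprocess


def mount_options(mounts_text, mount_point):
    """Return the option set of the last /proc/mounts entry for mount_point.

    Returns None if the mount point is not listed. Mount points containing
    spaces appear escaped (as \\040) in /proc/mounts and are not handled here.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, comma-separated options, ...
        if len(fields) >= 4 and fields[1] == mount_point:
            found = set(fields[3].split(","))
    return found


def is_mounted_readonly(mounts_text, mount_point):
    """True only if mount_point is listed and carries the 'ro' option."""
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and "ro" in opts


def scrub_if_readonly(mount_point):
    """Start a foreground read-only scrub, refusing to run on a rw mount."""
    with open("/proc/mounts") as f:
        mounts_text = f.read()
    if not is_mounted_readonly(mounts_text, mount_point):
        raise RuntimeError(
            f"{mount_point} is not mounted ro; a scrub may still write to it"
        )
    # -B keeps the scrub in the foreground, -r requests read-only mode.
    subprocess.run(["btrfs", "scrub", "start", "-B", "-r", mount_point], check=True)
```

Calling `scrub_if_readonly` with the mount point of the filesystem would then refuse to scrub until the filesystem has been remounted or mounted with `-o ro`.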