No Discard/Trim/Unmap ? #230

Open
bbs2web opened this issue Jan 7, 2021 · 4 comments


bbs2web commented Jan 7, 2021

We got the iSCSI gateway working with Ceph Octopus, but a Windows client sees the drive as a standard HDD, so it won't trim.

The SUSE ceph-iscsi documentation has a myriad of options available, similar to the functionality we're used to when running a small Debian VM that exports an RBD block device. I presume these are perhaps only available when the backstore is krbd instead of user:rbd (tcmu-runner)?
https://documentation.suse.com/ses/6/html/ses-all/cha-ceph-as-iscsi.html

I presume this functionality is exclusive to SUSE with their target_core_rbd module, and that I may have initially misinterpreted the following discussion, where I understood kernel 4.16+ to include the necessary plumbing:
https://www.spinics.net/lists/ceph-users/msg53920.html
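
(In case it helps, the backstore type in use should be visible with something like the following; the commands are from memory, so treat them as approximate:)

gwcli ls
targetcli ls /backstores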


dillaman commented Jan 7, 2021

The upstream Linux kernel does not contain SUSE's target_core_rbd and I am not aware of any other downstream kernels including it.

For tcmu-runner to denote that the LUN is non-rotational, your CRUSH map needs to have the pool mapped to the ssd or nvme device class [1][2].

[1] https://github.com/open-iscsi/tcmu-runner/blob/master/rbd.c#L474
[2] https://ceph.io/community/new-luminous-crush-device-classes/
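
For example, something along these lines should show which device class the pool's CRUSH rule actually resolves to (substitute your own pool and rule names):

ceph osd pool get iscsi crush_rule
ceph osd crush rule dump <rule-name>
ceph osd crush class ls-osd ssd

The first command reports the rule the pool uses, the dump shows whether that rule is pinned to a device class, and the last one lists the OSDs registered under the ssd class.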


bbs2web commented Jan 7, 2021

Thank you for this. I'll need to turn on debugging then, as the device class for the pool is ssd.
However, wouldn't we want to always present the discard option, even when the pool uses hdd, since we'll still want to pass through commands to reclaim deleted space?
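
(For my own notes, I believe the tcmu-runner log level can be raised in /etc/tcmu/tcmu.conf; the exact option values are from memory, so please correct me if this is wrong:)

# /etc/tcmu/tcmu.conf
log_level = 4
# then restart the handler: systemctl restart tcmu-runner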


dillaman commented Jan 7, 2021

We also set the UNMAP bit in the VPD inquiry along with all the alignment and max length hints. If Windows is keying off HDD vs SSD, though, that's a different issue.

If you have a Linux initiator connected, you can run sg_inq -p 0xB0 /path/to/device and see the block limits VPD (and code 0xB1 for the characteristics VPD).
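
For example (the device path below is just a placeholder):

sg_inq -p 0xB1 /path/to/device

In the characteristics VPD, a medium rotation rate of 1 indicates a non-rotating (solid state) device, while 0 means the rate is not reported.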


bbs2web commented Jan 8, 2021

Is there any way to override the auto-detection, to explicitly set solid state, thin provisioned, or hard disk?

After temporarily replacing the RBD image with a straight-up replicated SSD one, Windows now picks up either image as solid state. Both images reside in a replicated SSD pool, but one has its data stored in an erasure-coded hdd pool which has an ssd cache tier.
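
(For reference, an image with a separate data pool would be created roughly like this; the data pool name below is a placeholder rather than the real name:)

rbd create iscsi/vm-169-disk-2 --size 500G --data-pool <ec-hdd-pool>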

I'll try to reproduce the issue; is there any additional diagnostic information I should collect if I'm able to reproduce it?

Both images now work perfectly:
[two screenshots attached]

Space reclamation is also confirmed to be working perfectly, before and after running Windows Disk Defragmenter on the image that stores its data in the erasure-coded hdd pool:

[admin@kvm7e ~]# rbd du iscsi/vm-169-disk-2
NAME           PROVISIONED  USED
vm-169-disk-2      500 GiB  1.1 GiB
[admin@kvm7e ~]# rbd du iscsi/vm-169-disk-2
NAME           PROVISIONED  USED
vm-169-disk-2      500 GiB  132 MiB
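
(On a Linux initiator the same reclamation can presumably be triggered manually with fstrim; the mount point below is only an example:)

fstrim -v /mnt/iscsi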

More for my own record, should I be able to reproduce the previous behaviour, the following is what the initiator reports when everything is working correctly:

Debian 10 client:
[root@debian ~]# sg_inq -p 0xB0 /dev/sdb
VPD INQUIRY: Block limits page (SBC)
  Maximum compare and write length: 1 blocks
  Optimal transfer length granularity: 0 blocks [not reported]
  Maximum transfer length: 1024 blocks
  Optimal transfer length: 1024 blocks
  Maximum prefetch transfer length: 0 blocks [ignored]
  Maximum unmap LBA count: 32768
  Maximum unmap block descriptor count: 4
  Optimal unmap granularity: 8192 blocks
  Unmap granularity alignment valid: true
  Unmap granularity alignment: 0
  Maximum write same length: 0xffffffff blocks
  Maximum atomic transfer length: 0 blocks [not reported]
  Atomic alignment: 0 [unaligned atomic writes permitted]
  Atomic transfer length granularity: 0 [no granularity requirement]
  Maximum atomic transfer length with atomic boundary: 0 blocks [not reported]
  Maximum atomic boundary size: 0 blocks [can only write atomic 1 block]
[root@debian sdb]# cat /sys/block/sdb/queue/logical_block_size;
512
[root@debian sdb]# cat /sys/block/sdb/queue/physical_block_size;
512
[root@debian sdb]# cat /sys/block/sdb/queue/hw_sector_size;
512
[root@debian sdb]# cat /sys/block/sdb/queue/rotational;
0
[root@debian sdb]# cat /sys/block/sdb/queue/discard_max_bytes;
16777216
[root@debian sdb]# cat /sys/block/sdb/queue/discard_max_hw_bytes;
16777216
[root@debian sdb]# cat /sys/block/sdb/queue/minimum_io_size;
512
[root@debian sdb]# cat /sys/block/sdb/queue/optimal_io_size;
524288
[root@debian sdb]# cat /sys/block/sdb/queue/discard_granularity;
4194304
[root@debian sdb]# cat /sys/block/sdb/discard_alignment;
0
[root@debian sdb]# cat /sys/block/sdb/queue/discard_zeroes_data;
0
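
Should the HDD detection come back, the logical block provisioning VPD should also show whether UNMAP is being advertised at all (command from memory; check the sg_vpd man page for the exact page name):

sg_vpd --page=lbpv /dev/sdb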
