changes to single osd restart all osds on a server, and naming /dev/sdX devices #24

Open
cernceph opened this issue Jul 16, 2013 · 3 comments

@cernceph

Today we have been testing drive failure and replacement, and noticed a couple of shortcomings in the device.pp manifest:

  1. When a disk is replaced, ceph.conf changes, and this triggers a service restart of every OSD on the server (because each osd service has subscribe => Concat['/etc/ceph/ceph.conf']). These restarts cause a noticeable disruption. Ideally we want to start only the affected service, not all of them!
  2. Using the /dev/sdX names for disks isn't ideal, since a replacement drive gets a new name when it is inserted (e.g. today we pulled sdq, then reinserted it, and it came back as sdab). We then need to do one of the following:
    (a) change our host manifests to add osd::device (sdab), but this isn't good since the device will return to sdq after a reboot, or
    (b) reboot the server, so that the device is called sdq once again.
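
To make point 1 concrete, here is a minimal sketch of the difference between the two metaparameters. It is not the module's actual code: the service title 'ceph-osd.0' is made up, and it assumes Concat['/etc/ceph/ceph.conf'] is declared elsewhere in the catalog.

```puppet
# Hypothetical sketch: the service title 'ceph-osd.0' is made up, and this is
# not the module's actual code. It assumes Concat['/etc/ceph/ceph.conf'] is
# declared elsewhere in the catalog.

# Behaviour described above: every OSD service subscribes to ceph.conf, so any
# change to the file refreshes (restarts) every OSD on the host.
# service { 'ceph-osd.0':
#   ensure    => running,
#   subscribe => Concat['/etc/ceph/ceph.conf'],
# }

# Ordering-only variant: ceph.conf is still written before the service is
# managed, but a change to the file no longer triggers a restart.
service { 'ceph-osd.0':
  ensure  => running,
  require => Concat['/etc/ceph/ceph.conf'],
}
```

The trade-off with `require` is that Puppet would never restart an OSD on a config change, so changes that really do need a restart would have to be handled some other way.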

Do people already have experience with better practices to prevent these two problems? Help is much appreciated!

Cheers, Dan
CERN IT

@dotwaffle
Contributor

Would it be possible to reference devices by UUID? Or indeed, to not partition at all? That would be useful
when you just want to try it out on a spare LVM LV :)

M

@cernceph
Author

Sure, we could use /dev/disk/by-id or similar, but we would need to patch device.pp, since it seems to expect /dev/sdX names, with lines like:

   $devname = regsubst($name, '.*/', '')

and

    command => "mkfs.xfs -f -d agcount=${::processorcount} -l \
size=1024m -n size=64k ${name}1",
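
As a rough illustration of where a by-id path would break, here is a small standalone manifest that could be run with `puppet apply`. The id 'scsi-EXAMPLE' is made up and the variable names are only for this sketch. As far as I can tell the regsubst line copes with either form, and it is the hard-coded "1" suffix that does not.

```puppet
# Sketch only: 'scsi-EXAMPLE' is a made-up device id, not a real disk.
$by_name = '/dev/sdq'
$by_id   = '/dev/disk/by-id/scsi-EXAMPLE'

# regsubst strips everything up to the last '/', so both forms work here.
notice(regsubst($by_name, '.*/', ''))  # sdq
notice(regsubst($by_id, '.*/', ''))    # scsi-EXAMPLE

# Appending "1" only gives a real partition path for /dev/sdX names;
# udev names by-id partitions with a "-part1" suffix instead.
notice("${by_name}1")     # /dev/sdq1
notice("${by_id}1")       # /dev/disk/by-id/scsi-EXAMPLE1 (no such node)
notice("${by_id}-part1")  # /dev/disk/by-id/scsi-EXAMPLE-part1
```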

I am wondering what people are using on clusters today... or is no one using this module in pseudo-production?

@dotwaffle
Contributor

Perhaps just strip the "1" from the ${name}1, and specify a partition directly. Then, use a third-party module to partition your disk -- or, in the case of moving to btrfs, which can (IIRC) use a whole disk without a partition table being present, just use the disk "raw", as it were.
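
A minimal sketch of that idea, assuming a hypothetical define (the name osd_filesystem_sketch, the blkid guard and the example device id are all made up here, and this is not the module's device.pp): the resource title is the exact block device to format, so nothing gets appended to it.

```puppet
# Hypothetical define, not the module's device.pp: the title is the exact
# block device to format (a partition, an LV, or a by-id path).
define osd_filesystem_sketch () {
  $devname = regsubst($name, '.*/', '')

  exec { "mkfs_${devname}":
    command => "mkfs.xfs -f -d agcount=${::processorcount} -l size=1024m -n size=64k ${name}",
    path    => ['/sbin', '/usr/sbin', '/bin', '/usr/bin'],
    # Assumption: blkid exits non-zero when it finds no filesystem signature,
    # so a device that is already formatted is left alone.
    unless  => "blkid ${name}",
  }
}

# 'scsi-EXAMPLE-part1' is a made-up id; the partition would be created
# beforehand by whatever does the partitioning.
osd_filesystem_sketch { '/dev/disk/by-id/scsi-EXAMPLE-part1': }
```

The same define would accept an LV path such as /dev/vgname/lvname unchanged, which covers the "spare LVM LV" case too.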

Just some thoughts, anyway!
