Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gluster Server compatibility regression with xattr #16828

Open
darkain opened this issue Dec 2, 2024 · 30 comments
Open

Gluster Server compatibility regression with xattr #16828

darkain opened this issue Dec 2, 2024 · 30 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@darkain
Copy link

darkain commented Dec 2, 2024

System information

Type Version/Name
Distribution Name FreeBSD
Distribution Version 14.1-RELEASE
Architecture AMD64
OpenZFS Version zfs-2.2.4-FreeBSD_g256659204

Describe the problem you're observing

5c00613

This commit broke compatibility with the Gluster server application. Gluster relies heavily upon xattrs, more specifically, the "trusted" namespace xattrs. Without these, Gluster server fails to function at all.

gluster/glusterfs#4340

This issue has been open for months over on the Gluster GitHub repo, however I believe it should also be noted here as a regression in ZFS.

Describe how to reproduce the problem

A quick repro example without installing Gluster:

setextattr user trusted.test test /zfs/test

Alternatively: Installer Gluster and attempt to create a volume. gluster volume create will fail with the following error message:

Include any warning/errors/backtraces from the system logs

setextattr: /zfs/test: failed: Invalid argument

NOTE: Gluster produces a similar error with a different path.

Testing

ZFS_XA_NS_PREFIX_MATCH(LINUX_TRUSTED, name) || \

On my personal machine, I removed the single ZFS_XA_NS_PREFIX_MATCH(LINUX_TRUSTED, name) line from ZFS_XA_NS_PREFIX_FORBIDDEN, recompiled, and Gluster Server started to function as normal again.

Upsteam Docs

https://man7.org/linux/man-pages/man7/xattr.7.html

       Trusted extended attributes are visible and accessible only to
       processes that have the CAP_SYS_ADMIN capability.  Attributes in
       this class are used to implement mechanisms in user space (i.e.,
       outside the kernel) which keep information in extended attributes
       to which ordinary processes should not have access.

This directly states that it is intended to be used by trusted user-space applications, which Gluster Server most likely would fall into. I'm uncertain why ZFS would want to remove this functionality entirely. I understand wanting to unify feature sets between Linux, FreeBSD, and others, but to flat out "forbid" flags like this which is actively used seems to be the wrong decision. At worst, this should be gated with a dataset attribute of some kind, rather than a hard-coded "FORBIDDEN"

@darkain darkain added the Type: Defect Incorrect behavior (e.g. crash, hang) label Dec 2, 2024
@amotin
Copy link
Member

amotin commented Dec 2, 2024

@darkain If you read FreeBSD's extattr(2) and extattr(9) man pages, you will see that respective extended attribute APIs on FreeBSD are taking namespace specification as separate parameter. What ZFS_XA_NS_PREFIX_FORBIDDEN() macro does is preventing namespace specification as part of the name, which is not practiced on FreeBSD and would cause potential (security) problems would the pool be moved between operating systems or may be in other cases. There are two namespaces defined now on FreeBSD EXTATTR_NAMESPACE_SYSTEM and EXTATTR_NAMESPACE_USER. Man page allow file systems to define additional namespaces, but as I can see, there is not trusted namespace currently. You are welcome to start implementing it from FreeBSD side, if you need it. Or you can patch Gluster to use user or system namespace on FreeBSD (I have no idea what does it store there and why it should be "trusted").

@snajpa
Copy link
Contributor

snajpa commented Dec 3, 2024

This should probably be fixed in the OS+documentation then, it confuses people - there seem to exist setups with OpenZFS 2.2 + FreeBSD + GlusterFS, which rely on the ability to store "trusted" xattrs - I'd wager that we're kinda responsible for this state of things, so it should probably be us to do something about it

take a look at this StackExchange post

Now, the FreeBSD extattr(9) manpage says:

[...] The following
two namespaces are defined universally, although individual file sys-
tems may implement additional namespaces, or not implement these name-
spaces: EXTATTR_NAMESPACE_USER, EXTATTR_NAMESPACE_SYSTEM.

Which, from the point of view of OpenZFS 2.2 users, we did.

So I'd say it's not exactly clear it's OK to say they should change their implementation - we are somewhat responsible for this, aren't we :) It would make sense to me to allow for seamless OS transition between Linux and FreeBSD when it comes to Gluster (and other trusted xattrs users) going forward.

@amotin
Copy link
Member

amotin commented Dec 3, 2024

@snajpa If you compare Linux and FreeBSD extended attributes APIs, you'll see that Linux does not have separate namespace argument, but uses prefix within the name. FreeBSD uses separate argument, so passing prefix with the name won't make it any special. Passing "trusted" or "system" any other namespace as prefix there in combination with EXTATTR_NAMESPACE_USER won't make them protected by OS or mean anything special, so this practice should be discouraged, which the mentioned patch has done. To build on FreeBSD Gluster obviously has to use different API, so I don't see why it should not use it properly. If that API needs an extension (addition of new namespace with respective logic), then it has to be done on the OS side, including providing expected guaranties. Though I haven't dig this area too deep myself and would be happy if somebody else lead the project, just with a proper level of responsibility and bigger picture vision.

@snajpa
Copy link
Contributor

snajpa commented Dec 3, 2024

@amotin I'm not familiar with FreeBSD kernel source, but that ^ StackExchange post would suggest it's up to the FS to implement xattrs support more or less completely - as I said, I don't know FreeBSD code, so am taking a leap from that tmpfs note in that post. It's cetainly that way on Linux side - it's upto the FS to uphold the guarantee that only CAP_SYS_ADMIN can access trusted xattrs - so if previously FreeBSD OpenZFS implemented trusted namespace with this guarantee by sharing that logic with the Linux side, I think that was the right thing to do, even if trusted xattrs aren't standardized in FreeBSD. It's actually a compatibility design win it could (and used to) work and it didn't require any kernel change, isn't it?

Trusted xattrs are for userspace use, but unlike user.*, regular users just get no access at all.

Gluster uses this to store its own state it needs for managing replication, etc. - in the early days of our vpsAdminOS we didn't support trusted xattrs in our containers and I got Gluster to work by just s/trusted\./user\./g on the whole codebase, but that was in the Gluster 2.x days and once that stopped working b/c their code got more complicated, I just went and extended what OpenZFS used to do in a dedicated function to allow first-level user namespace's root users to manage their trusted attrs, cross-container safety guarantee provided by mount namespace. A bit off topic, but demonstrates how it's up the FS implementation to uphold the trusted xattrs semantics.

That is to say that I think that OpenZFS can support this first and other FreeBSD FS can follow if there's ever interest in that - one isn't a dependency of the other + both are not mutually exclusive, what do you think?

@amotin
Copy link
Member

amotin commented Dec 3, 2024

@snajpa You are missing my focus. I am not arguing about usefulness of "trusted" attributes implementation, I just don't know enough. I am talking that FreeBSD APIs expect namespace to be passed as separate parameter. It should not be sent as part of the name. So the hack you've done before on FreeBSD should actually look as s/trusted\.//g with addition of EXTATTR_NAMESPACE_USER argument or EXTATTR_NAMESPACE_TRUSTED once it is implemented.

@snajpa
Copy link
Contributor

snajpa commented Dec 3, 2024

@snajpa You are missing my focus.

Ah I see, so the problem is what number to use for it and how to guarantee that won't have to change in the future, right? Then I'd be interested in how the old code did it and if there was any discussion around that. I'll take a look tomorrow, out of curiosity

@snajpa
Copy link
Contributor

snajpa commented Dec 4, 2024

@amotin I just went through current git tip of GlusterFS and it would seem that it uses EXTATTR_NAMESPACE_USER on FreeBSD... so I don't think there's any action needed from us, if it doesn't work on current FreeBSD/OpenZFS combo, then it most likely doesn't work with UFS either, which it probably should, at least from the quick glance at the Gluster+kernel code

@snajpa
Copy link
Contributor

snajpa commented Dec 4, 2024

@darkain if you'd be so kind and could get a syscall trace from glusterfsd when this failure happens so we could confirm it doesn't reach for the trusted namespace, that'd be awesome - I think it doesn't and if there's a bug it's probably unrelated to the lack of support for trusted xattrs namespace on FreeBSD

@amotin
Copy link
Member

amotin commented Dec 4, 2024

@snajpa But did they remove "trusted." prefix from the attribute name or they are trying to specify two namespaces in two ways?

@snajpa
Copy link
Contributor

snajpa commented Dec 4, 2024

@amotin they have these OS-dependent ifdefs which I think imply that there was never any interop between Gluster on Linux and Gluster on FreeBSD - it would seem that Gluster under FreeBSD puts everything under the EXTATTR_NAMESPACE_USER and then still uses the prefixes in the xattr names as it would on Linux

@snajpa
Copy link
Contributor

snajpa commented Dec 4, 2024

umm maybe we're just trying to be smart in the code and actually disallow the combination of EXTATTR_NAMESPACE_USER + trusted.* in the xattr name?

@amotin
Copy link
Member

amotin commented Dec 4, 2024

@snajpa Yes. I think so. And it seems Gluster uses all sort of prefixes:

#define GF_NAMESPACE_KEY "trusted.glusterfs.namespace"

#define GF_XATTR_LINKINFO_KEY "trusted.distribute.linkinfo"
#define GFID_XATTR_KEY "trusted.gfid"
#define PGFID_XATTR_KEY_PREFIX "trusted.pgfid."
#define GFID2PATH_VIRT_XATTR_KEY "glusterfs.gfidtopath"
#define GFID2PATH_XATTR_KEY_PREFIX "trusted.gfid2path."
#define GFID2PATH_XATTR_KEY_PREFIX_LENGTH 18
#define VIRTUAL_GFID_XATTR_KEY_STR "glusterfs.gfid.string"
#define VIRTUAL_GFID_XATTR_KEY "glusterfs.gfid"
#define GF_XATTR_MDATA_KEY "trusted.glusterfs.mdata"

@snajpa
Copy link
Contributor

snajpa commented Dec 4, 2024

@amotin yes those virtual ones glusterfs.* seem to be used only during run-time, they're filtered out and not saved to disk

@snajpa
Copy link
Contributor

snajpa commented Dec 4, 2024

olol I should have just read through the commit OP linked :D it's all in there - and now that we know that Gluster would like to store xattrs starting with trusted.* it's obvious where the problem is. Not quite obvious what to do about it, hm (still looking).

@snajpa
Copy link
Contributor

snajpa commented Dec 4, 2024

Maybe it could be as simple as this?

diff --git a/include/sys/fs/zfs.h b/include/sys/fs/zfs.h
index 1676020d0..ac01b3ad2 100644
--- a/include/sys/fs/zfs.h
+++ b/include/sys/fs/zfs.h
@@ -1942,8 +1942,6 @@ enum zio_encrypt {
 #define        ZFS_XA_NS_LINUX_SYSTEM_PREFIX_LEN       strlen("system.")
 #define        ZFS_XA_NS_LINUX_TRUSTED_PREFIX          "trusted."
 #define        ZFS_XA_NS_LINUX_TRUSTED_PREFIX_LEN      strlen("trusted.")
-#define        ZFS_XA_NS_LINUX_USER_PREFIX             "user."
-#define        ZFS_XA_NS_LINUX_USER_PREFIX_LEN         strlen("user.")
 
 #define        ZFS_XA_NS_PREFIX_MATCH(ns, name) \
        (strncmp(name, ZFS_XA_NS_##ns##_PREFIX, \
@@ -1953,8 +1951,7 @@ enum zio_encrypt {
        (ZFS_XA_NS_PREFIX_MATCH(FREEBSD, name) || \
            ZFS_XA_NS_PREFIX_MATCH(LINUX_SECURITY, name) || \
            ZFS_XA_NS_PREFIX_MATCH(LINUX_SYSTEM, name) || \
-           ZFS_XA_NS_PREFIX_MATCH(LINUX_TRUSTED, name) || \
-           ZFS_XA_NS_PREFIX_MATCH(LINUX_USER, name))
+           ZFS_XA_NS_PREFIX_MATCH(LINUX_TRUSTED, name)
 
 #ifdef __cplusplus
 }

OK maybe not entirely that simple but by forking this macro so we have a Linux and FreeBSD version it might work, maybe... the zfs_xattr_compat toggle certainly does not help :D

@snajpa
Copy link
Contributor

snajpa commented Dec 4, 2024

@amotin so my idea is to instead of forbidding certain prefixes entirely on FreeBSD, move it always into the user.* xattr namespace - so that trusted.whatever would end up stored as user.trusted.whatever and that exactly would be retrieved when the user asks for trusted.whatever. At least when we're on FreeBSD and not in zfs_xattr_compat mode - here's the PoC(compiles but stupid mistakes & omissions inside) - I don't think I can do that for Linux but I also don't think similar situation could arise there. In this PoC, trusted.whatever is renamed to user.trusted.whatever, but user.whatever stays the same and doesn't get prefixed, that way, since zfs_xattr_compat = 0 is the only case when this applies, the user.whatever is stored on disk as user.user.whatever (if I didn't handle this we'd end up with user.user.user.whatever on disk :D)

The problem with this approach is it that it forces Gluster+OpenZFS+FreeBSD users to take action when migrating onto a version with this integrated (but still better than not being able to use Gluster+OpenZFS+FreeBSD at all going fwd).

There's also the possibility of also allowing trusted.* on FreeBSD to be stored and enforced the Linux way, if user on FreeBSD requests trusted.* with EXTATTR_NAMESPACE_USER - we could store it as trusted.* instead of user.* and enforce the root user semantics. This wouldn't require any action from users, AFAICT. Patch for that would probably require something similar as the PoC above but also handling the trusted.* case, so it would be longer. The downside of this approach is that it might surprise those on FreeBSD that would like to be able to store trusted.whatever with EXTATTR_NAMESPACE_USER with only EXTATTR_NAMESPACE_USER original semantics. That might be solved with yet-another-tunable tho :D

Don't really have a preference but if 'user delight' is one of OpenZFS goals then the second path might be preferable. Or if anyone can think of a third... :))

@amotin
Copy link
Member

amotin commented Dec 4, 2024

@snajpa The whole goal of that project was to allow compatibility between different OS'es. Adding duplicate prefixes aside if being a mess would not allow that, but instead would make already existing mess even worse. What Gluster does on FreeBSD I find just broken, and instead of conforming to it we might provide them a way to do it properly. That might include addition of trusted attributes support to FreeBSD (I have no idea why it is not there, may have a reason or just be an omission), and patching the code in OS-dependent ifdefs to cut the prefix from the attribute name and passing it as via the namespace parameter instead of hardcoding EXTATTR_NAMESPACE_USER.

@snajpa
Copy link
Contributor

snajpa commented Dec 4, 2024

@amotin if compatibility between OSes is the primary goal then trusted.* in EXTATTR_NAMESPACE_USER on FreeBSD could be stored and enforced the Linux way entirely from OpenZFS - and then EXTATTR_NAMESPACE_TRUSTED could be added and even that could be supported, on disk both of these variants end up trusted.*, there is no number on disk, is there :)

@snajpa
Copy link
Contributor

snajpa commented Dec 4, 2024

Gluster evidently doesn't really care if the CAP_SYS_ADMIN/root condition is enforced, but I think we have no business telling users in general they can't name their user or EXTATTR_NAMESPACE_USER xattrs user.whatever, trusted.whatever if that's what they'd like - even on FreeBSD that is allowed, isn't it?

@amotin
Copy link
Member

amotin commented Dec 4, 2024

@snajpa With compatibility I mean on disk. APIs are just different between the OS'es, period. You should not try to push things where they should not go. Yes, on-disk format for the attributes will likely include the prefixes. But we can not say that "we have no business telling users in general they can't name the xattrs user.whatever, trusted.whatever if that's what they'd like", since next step they try to read those attributes from user namespace and we would have to allow that, even though process that written them expected "trusted." to have limited access, unless we double-prefix them, which is not what we want either.

@snajpa
Copy link
Contributor

snajpa commented Dec 4, 2024

#16828 (comment) then this is my best answer, compatible Linux<=>FreeBSD - with that "trusted.*" in EXTATTR_NAMESPACE_USER + root enforced part being optional.

No idea about Illumos and if anyone would care to enforce the root boundary there - but it probably should be, if things shouldn't get more messy. This would IMHO decrease entropy on the user side, but it does increase code complexity, since we have to deal with the zfs_xattr_compat non-prefixed way of storing things.

@snajpa
Copy link
Contributor

snajpa commented Dec 5, 2024

There is the thing that if people now on-disk have trusted.stuff and now we only change the code on FreeBSD with EXTATTR_NAMESPACE_USER to be prefixed with user. in all cases, on disk that translates to user.trusted.stuff which is logical, but existing Gluster/FreeBSD installations will have to migrate somehow and we won't have the interop we had before 5c00613

I can already do user.trusted.stuff on Linux/ZFS on disk but Gluster doesn't expect that... I have no idea how do they enforce their Linux security model - if they even do - on other platforms, seems like they might not care.

But currently, in master, even if we ignore Gluster, I won't be able to change ondisk stored user.trusted.stuff from FreeBSD b/c it's forbidden, while normally able to both change and read user.untrusted.stuff (see the paradox?)

root@fbsd-14-1:~ # setextattr user untrusted.test lolo f
root@fbsd-14-1:~ # setextattr user trusted.test lolo f
setextattr: f: failed: Invalid argument

This code would be a whole lot easier if we could rely that there's only the new prefixed way of storing xattrs.

@amotin
Copy link
Member

amotin commented Dec 5, 2024

now we only change the code on FreeBSD with EXTATTR_NAMESPACE_USER to be prefixed with user. in all cases, on disk that translates to user.trusted.stuff

That is why I'd prefer adding the proper namespace to FreeBSD and patch the Gluster. It would avoid double-prefixing and zillion other problems. Code before 5c00613 added no prefixes so should migrate to addition of EXTATTR_NAMESPACE_TRUSTED just fine. Just Gluster would need to be updated together with OS.

@snajpa
Copy link
Contributor

snajpa commented Dec 5, 2024

@amotin OK, then user.trusted.stuff maybe not working will be the result of OpenZFS+Gluster+Illumos with zfs_xattr_compat=0 if that even exists (have no idea what Illumos ZFS code looks like these days)

@snajpa
Copy link
Contributor

snajpa commented Dec 5, 2024

It'd be awesome if there was a way out of this by upgrading within supported release versions of FreeBSD/illumos + OpenZFS with the trusted semantics respected with a non-hacky implementation.

@amotin Gluster doesn't even have to be updated if we don't disallow what we shouldn't really be disallowing, which I think any respectable fix for this situation should rectify too, I think there shouldn't be any limitation on using trusted.test on FreeBSD in user xattr namespace, when untrusted.test works too - but users also shouldn't expect the Linux access semantics to be there.

But in the situation where we fix this even while also delivering proper trusted xattr namespace support in FreeBSD, if anyone's running Gluster which isn't updated, the interop/migration problems still arise. So this needs some solution, it'd be best if that solution was in Gluster in their supported releases before this has a chance to surprise users (I'm almost certain there will be people who will be hit by this and so it needs at least some manual howto they can follow, if Gluster devs wouldn't want to automate it better).

@amotin
Copy link
Member

amotin commented Dec 5, 2024

@snajpa I would agree that trusted.test may be a valid user namespace xattr if it means that on Linux it will be accessed as user.trusted.test. But it is not true for Gluster, and I bet most of software that may appear doing the same. The way Gluster does it is incorrect and not cross-platform compatible! We should not stimulate improper behavior! We should provide a right way, and may be, if it does not create more problems, turn a blind eye on others. Broken software can be fixed once and the problem will be forgotten in few years, while supporting improper behavior will extend it into eternity.

I feel like I am repeating myself third time and you don't hear me. I've already told how I see the proper final solution. And if on top of that you can propose temporary transitional shims -- I'll give them a fair review. But I will not personally accept simple affirmation of improper API usage, considering the consequences.

@snajpa
Copy link
Contributor

snajpa commented Dec 5, 2024

@amotin yes it would be accessed as user.trusted.test on Linux. I do hear you, I'm just saying that currently we're in a business of telling FreeBSD users what they can't store in EXTATTR_NAMESPACE_USER in terms of keys, the core point is that this shouldn't have happened in the first place. I do recognize what you're saying, I just think that we should change this primarily - as this is our bug. Maybe you don't agree but I would tend to view this issue that way, IMO the OS documentation doesn't create any expectations for devs of userspace apps some names won't be allowed in EXTATTR_NAMESPACE_USER while others are fine...

Gluster's current improper usage of xattr APIs on top of OpenZFS across supported platforms will need to be addressed more broadly than just releasing a fixed version that uses new trusted xattr namespace in FreeBSD - because people's requirements are different, you can see that a lot of times in issues here too - some have policies that they can upgrade this or that component, but that one closer to the core of their business needs to undergo more rigorous testing and sometimes even certifications, etc., so I can imagine there could be situations where people are unwilling/unable to upgrade Gluster itself in lockstep with kernel+ZFS. Others might be able to upgrade Gluster but kernel+ZFS upgrade could be late - though I think we're okay in this particular scenario, because that upgrade would be blocked and the fact it could happen doesn't mean it needs to, in environments that care more about stability of ZFS+kernel than Gluster. For the former, some script for one-off migration between improper xattrs in user.trusted.* to trusted.* seems like it could address the issue. I don't think we can avoid having a solution here (by we I mean Gluster+OpenZFS teams together, not just us).

@amotin
Copy link
Member

amotin commented Dec 5, 2024

@snajpa I am not sure we have the user.trusted.* situation now. IIRC before this change ZFS on FreeBSD was not adding user. prefixes. After the change it returns error instead. Removal of the condition now would actually get us the problem, requiring some mentioned fixup later. Gluster could have #ifdef EXTATTR_NAMESPACE_TRUSTED to switch between the behaviors, but without some additional efforts it would have to be updated in lock-step with kernel+ZFS to work. Since most FreeBSD users likely use in-base ZFS version, having kernel to ZFS sync is not a big problem. Remains question of sync with Gluster. At worst we could require its rebuild (if the ifdef is planted there in advance) with new major FreeBSD version, but yea some nicer transition plan would be nice to have, unless it creates more problems than solves.

@snajpa
Copy link
Contributor

snajpa commented Dec 5, 2024

If we can accept that we'll only fix this slowly, if we say that it's ok for some parts of this to stay broken for a while - then we can use time to our advantage -

  • introduce trusted xattrs ns in FreeBSD
  • enable support for them in OpenZFS (and Gluster)
  • wait till most FreeBSD/Gluster deployments are up to date
  • only then fix the problem with user.trusted.*, user.user.*, user.system.* being unreachable with -EINVAL on FreeBSD

@amotin is that really okay though? it's a bet on the theory there's no other significant breakage in other software caused by the EINVALs :)

@amotin
Copy link
Member

amotin commented Dec 5, 2024

@snajpa "only then fix the problem with user.trusted., user.user., user.system.* being unreachable with -EINVAL on FreeBSD" -- I am still not sure it is a big problem, but yes. Though I would consider the problem solved after the step 2. We could add a tunable to allow disabling the check with a big warning that xattrs written that way might be inaccessible on different platforms or when the possibly broken software is finally fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)
Projects
None yet
Development

No branches or pull requests

3 participants