Fix libvendor/osm_vendor_ibumad.c so clang -Werror does not complain #14

Closed
wants to merge 1 commit into from

Conversation

hnrose
Contributor

@hnrose hnrose commented Apr 21, 2019

Clang doesn't like getting pointers from packed struct members,
even if aligned

Pointed-out-by: Nicolas Morey-Chaisemartin [email protected]

Signed-off-by: Hal Rosenstock [email protected]

@hnrose hnrose mentioned this pull request Apr 21, 2019
@hnrose
Contributor Author

hnrose commented Apr 21, 2019

@nmorey Is libibumad from rdma-core being picked up ?

osm_vendor_ibumad.c:746:56: error: incompatible pointer types passing '__be64 *'
      (aka 'unsigned long long *') to parameter of type 'uint64_t *'
      (aka 'unsigned long *') [-Werror,-Wincompatible-pointer-types]
...if ((r = umad_get_ca_portguids(p_vend->ca_names[ca], &portguids[0],
                                                        ^~~~~~~~~~~~~
/usr/include/infiniband/umad.h:166:59: note: passing argument to parameter
      'portguids' here
int umad_get_ca_portguids(const char *ca_name, uint64_t * portguids, int max);
                                                          ^

This declaration of umad_get_ca_portguids looks to be from umad.h prior to rdma-core.

I also see clang complaints of packed member alignment issue with p_mad->trans_id

@nmorey
Contributor

nmorey commented Apr 23, 2019

@hnrose I'm guessing this is picking a debian version and not necessarily the latest.

@hnrose hnrose force-pushed the clangerr1 branch 4 times, most recently from 0e9b7a3 to 4508e3b on April 23, 2019 14:26
@hnrose
Contributor Author

hnrose commented Apr 23, 2019

@nmorey libibumad appears to be picked up from the following:
http://us-east-1.ec2.archive.ubuntu.com/ubuntu xenial/universe amd64 libibumad3 amd64 1.3.10.2-1 [16.7 kB]
which is the old (pre rdma-core) one. Any idea how to make it pick up libibumad from some released rdma-core ?

This is just a perceived incompatible pointer type by clang.

Main issue is clang not liking the alignment of ib_mad_t, specifically when going after the transaction ID.

@nmorey
Contributor

nmorey commented Apr 23, 2019

@hnrose: I checked and xenial is just using a very, very old version of everything...
This can be solved either by using a container within travis to build on a newer release, or by adding a pre-build step that installs a specific rdma-core release.

I'll look into that

@nmorey
Contributor

nmorey commented Apr 23, 2019

@jgunthorpe Any idea on how to deal with this ?
Could we publish the Xenial packages with the releases so they can be used by other github projects ?
I'd rather avoid pulling all the cbuild stuff from rdma-core just for that.

Building rdma-core debian packages locally then installing them works but it's not very clean...

@jgunthorpe
Member

Best is to just not use travis, it is horrible for this kind of stuff :( I've been slowly working to replace travis for rdma-core, but haven't got it yet

Also, this patch looks kind of bonkers, foo and &foo[0] are the same thing... Not sure what 'packed' has to do with this

The reason you can't take the address of a packed member is because it is not aligned, it is simply an error and you shouldn't ever do it - it will crash at runtime on ARM. If the member is actually aligned then don't use packed, but use the proper attribute aligned to tell the compiler what is happening and it won't complain.

@nmorey
Contributor

nmorey commented Apr 23, 2019

Agreed that travis sucks. But it's easy enough to set up a minimal validation set.

But yes. The PACKED attribute should probably be dropped on most of these structs

@hnrose
Contributor Author

hnrose commented Apr 23, 2019

[jgunthorpe wrote:]
Also, this patch looks kind of bonkers, foo and &foo[0] are the same thing... Not sure what 'packed' has to do with this

Yes, I know they're the same thing; the change from foo -> &foo[0] was just a test to see if clang would stop complaining about the incompatible pointer type.

[jgunthorpe wrote:]
The reason you can't take the address of a packed member is because it is not aligned, it is simply an error and you shouldn't ever do it - it will crash at runtime on ARM. If the member is actually aligned then don't use packed, but use the proper attribute aligned to tell the compiler what is happening and it won't complain.

The structure packing in OpenSM has been there for a long time and one needs to be very careful about undoing it. This is a bigger effort which should be done and I'll enter an issue for this.

I think I'm going to drop this specific patch.

@jgunthorpe
Member

Generally all MAD structures are aligned to 4 bytes, so what we did for srp_daemon/etc is to increase the alignment and use pahole & static_assert to validate the struct layout didn't change.

@nmorey
Contributor

nmorey commented Apr 24, 2019

I did a quick check with pahole. A lot of structs just change size because they get 4 or 8B aligned, which should not be an issue. But some get internal padding between fields, so we'll have to deal carefully here.

@hnrose
Contributor Author

hnrose commented Apr 24, 2019

@nmorey Most MAD attributes in IBA were spec'd to follow natural alignment but there are a small but significant number which do not. AFAIR NodeInfo is one of those because the GUIDs are not 64 bit aligned. There are others I've run across over time. Do you have a list of the ones which pahole found ?

I am tracking relevant comments for this in issue #15 - Remove structure packing where not needed

@jgunthorpe
Member

Generally the MADs have a natural alignment of 4 bytes and 64 bit values are only aligned to 4 bytes, not 8. This is why we ended up defining umad_gid as aligned(4) so it was compatible with MAD structures that have 4 byte GID alignment.

When we did srp_daemon it took some fussing with attributes and other adjustments to make the structs have the same layout with a higher alignment than packed. But pahole is reliable and if it says the struct has the same layout, then it does.

@rleon rleon closed this Apr 12, 2020