Document alternatives to Mask<T, LANES> that guarantee layout. Fixes #332 #333

Open: wants to merge 6 commits into master

@reinerp reinerp commented Mar 8, 2023

Intended to fix #332.

crates/core_simd/src/masks.rs (outdated)
Comment on lines 92 to 94
/// For a type with layout guaranteed equivalent to `[T; LANES]`, use
/// `SIMD<T, LANES>`. For a type with layout guaranteed to use 1 bit per
/// lane (padded up to full bytes), use `LANES::BitMask`.
Member

So, the exact number of bytes is an implementation detail (we've used u64 before and we might choose it again... basically we want to retain the freedom to use an integer of a reasonable size, OR to use a byte array, depending on what strikes the best ergonomics/codegen balance). Right now, this seems like it might suggest it uses exactly the next byte in size. I am not sure what would be most clear to express our desired reservation while providing you the minimum of guarantees you want (or if you even consider that acceptable).

Co-authored-by: Jubilee <[email protected]>
@workingjubilee
Member

workingjubilee commented Mar 9, 2023

I've been working on a generic integers RFC with the intention that in the future we can simply use those, but it might still be that rounding up significantly in raw size is in some cases still preferred, unfortunately.

@programmerjake
Member

programmerjake commented Mar 9, 2023

true, e.g. on SVP64 all mask vectors being a single u64 internally (when LANES <= 64) is likely most efficient

///
/// For a type with layout guaranteed equivalent to `[T; LANES]`, use
/// `Simd<T, LANES>`. For a type with layout guaranteed to use 1 bit per
/// lane (padded up to full bytes), use `LANES::BitMask`.
Member

The type is actually ToBitMask::BitMask: https://doc.rust-lang.org/std/simd/trait.ToBitMask.html#associatedtype.BitMask. It's not padded up to the next byte, but the next integer.

@workingjubilee
Member

workingjubilee commented Mar 9, 2023

And in the original AVX512 case mentioned, the plausible masks use hardware representation of a u16 and a u64 (even if they are logically much smaller at times), and anything else is a peculiar choice.

It's not the most important thing that it be exactly-so, however, I suspect, because kmovb also exists? Hmm.
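To make the "padded up to the next integer" wording concrete, here is a minimal sketch in stable Rust. It does not use `std::simd`; `bitmask_width` and `pack_u8` are hypothetical helper names, the set of widths (u8/u16/u32/u64) is taken from this discussion rather than from any documented guarantee, and the lane-0-in-lowest-bit order is an assumption.

```rust
/// Hypothetical helper: smallest integer width (in bits), chosen from
/// u8/u16/u32/u64, that can hold one bit per lane.
fn bitmask_width(lanes: usize) -> Option<u32> {
    match lanes {
        1..=8 => Some(8),
        9..=16 => Some(16),
        17..=32 => Some(32),
        33..=64 => Some(64),
        // Zero lanes, or more lanes than a u64 can hold: no single
        // integer is assumed in this sketch.
        _ => None,
    }
}

/// Hypothetical helper: pack up to 8 lanes into a u8, with lane 0 in the
/// least significant bit (the bit order is an assumption).
fn pack_u8(lanes: &[bool]) -> u8 {
    assert!(lanes.len() <= 8, "a u8 holds at most 8 lanes");
    lanes
        .iter()
        .enumerate()
        .fold(0u8, |acc, (i, &set)| acc | ((set as u8) << i))
}
```

Under these assumptions a 4-lane mask still occupies a full u8 and a 9-lane mask a full u16, which is the difference between "next byte" and "next integer" raised above.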

@reinerp
Author

reinerp commented Mar 13, 2023

Thanks everyone for the helpful comments.

Per Caleb's comment on #332, this guarantee is much less important to me than it was previously. Even so, my suspicion is that it's still worth providing some pointers in the documentation.

Indeed, ToBitMask::BitMask is the type I meant to refer to. The layout of that type appears to be guaranteed by this library, in that it is u8/u16/u32/u64 rather than an opaque wrapper type around one of those integer types. It seems to me that changing those types to something wider would constitute a breaking change in the library. Likewise, changing those types when compiling for different targets would likely break portability. Am I understanding this right?

So, I think the right approach is to document ToBitMask::BitMask as "1 bit per lane, padded up to the next integer type", but I'm not confident. Going with this assumption for now, I've updated the PR to state this.

Let me know what you think.

Member

@workingjubilee workingjubilee left a comment

Oh is it? Okay! It's... been a while since I've read things and I've lately just had my head down on trying to figure out how we'll make dynamic byte swizzles work and catching up with other stuff.

As a nightly feature, the API is still unstable, but we do try to keep things broadly consistent. But yes, you're right, changing them overly much from target to target when compiling things would at the very least probably make things a bit confusing and thus also make it easy for people to make mistakes.

@programmerjake
Member

do note that Simd<T, N> isn't guaranteed layout equivalent to [T; N] -- Simd often has greater alignment and may currently have some padding. #319 is trying to change that so Simd will never have padding (beyond what T itself has) and still may have greater alignment.

@programmerjake
Member

also, ToBitMask won't necessarily be implemented for all lane counts, e.g. there currently is no integer type with >128 bits so Simd<T, 129> likely won't impl ToBitMask.

if you want a bitmask type that always exists no matter the lane count, use ToBitMaskArray, though that depends on the portable-simd crate's generic_const_exprs cfg feature, which isn't currently enabled by std.
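As a rough illustration of the byte-array idea, the sketch below packs any number of boolean lanes into bytes using stable Rust only. `pack_bytes` is a hypothetical helper, not the crate's `ToBitMaskArray` implementation, and the bit order is an assumption. It returns a `Vec<u8>` because a return type like `[u8; (N + 7) / 8]` is exactly the kind of signature that needs `generic_const_exprs`.

```rust
/// Hypothetical helper: pack N boolean lanes into ceil(N / 8) bytes, with
/// lane 0 in the least significant bit of byte 0 (assumed bit order).
fn pack_bytes<const N: usize>(lanes: [bool; N]) -> Vec<u8> {
    // Round the lane count up to a whole number of bytes.
    let mut bytes = vec![0u8; (N + 7) / 8];
    for (i, &set) in lanes.iter().enumerate() {
        if set {
            bytes[i / 8] |= 1u8 << (i % 8);
        }
    }
    bytes
}
```

With 129 lanes this yields 17 bytes, so a representation exists even where no single integer type is wide enough.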

@reinerp
Author

reinerp commented Mar 13, 2023

Thanks folks. I have updated wording to mention "size" rather than "layout" guarantees. I have mentioned BitMaskArray too.

(I was not familiar with the all_lane_counts crate feature. Without it, ToBitMask::BitMask is a sufficient type.)

@programmerjake
Member

> do note that Simd<T, N> isn't guaranteed layout equivalent to [T; N] -- Simd often has greater alignment and may currently have some padding.

this means that "size guaranteed equal to [T; LANES], use Simd<T, LANES>" is currently incorrect since size_of::<Simd<T, N>>() >= size_of::<[T; N]>(), not size_of::<Simd<T, N>>() == size_of::<[T; N]>(). this is what I created #319 to fix.
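The size inequality can be reproduced on stable Rust with an over-aligned wrapper. This is only a sketch: `Vec3Like` is a hypothetical stand-in and not `Simd<f32, 3>`, and the 16-byte alignment is chosen for illustration rather than taken from any target.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for an over-aligned vector type. This is NOT
/// `Simd<f32, 3>`; it only mimics "greater alignment than the array".
#[repr(C, align(16))]
struct Vec3Like([f32; 3]);

/// Returns (size of the plain array, size of the wrapper, alignment of
/// the wrapper), all in bytes.
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<[f32; 3]>(),
        size_of::<Vec3Like>(),
        align_of::<Vec3Like>(),
    )
}
```

Because a type's size is always a multiple of its alignment, the 12-byte array grows to 16 bytes inside the wrapper, so only `>=` holds between the two sizes, not `==`.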

@@ -88,6 +88,11 @@ impl_element! { isize }
/// The layout of this type is unspecified, and may change between platforms
/// and/or Rust versions, and code should not assume that it is equivalent to
/// `[T; LANES]`.
///
/// For a type with size guaranteed equal to `[T; LANES]`, use
Member

Suggested change
- /// For a type with size guaranteed equal to `[T; LANES]`, use
+ /// For a type with size guaranteed equal-or-greater-than `[T; LANES]`, use

Member

Suggested change
- /// For a type with size guaranteed equal to `[T; LANES]`, use
+ /// For a type equivalent to `[T; LANES]` (with possible trailing padding for alignment purposes), use

maybe equivalent is too strong? ideas?

Member

> /// For a type with size guaranteed equal-or-greater-than `[T; LANES]`, use

imho that loses most meaning, e.g. [u8; 5000] fits but is unlikely to be desired.

Member

"equivalent" is indeed too strong.


Successfully merging this pull request may close these issues.

Support avx512 bitmasks with dynamic feature detection
4 participants