Unaligned access on PowerPC: forgotten macros in common_defs.h
#362
Comments
I mainly focus on x86 and arm; the PowerPC support isn't something I've really worked on yet. From searching around it sounds like all PowerPC processors support unaligned memory accesses. Do you agree? I'm not familiar with the differences between the various PowerPC macros. I'm surprised that |
@ebiggers Thank you for responding!
AFAIK, yes, they do, including G4/G5. (I am not a hardware expert, but I can look for documentation on this.)
PowerPC has OS-specific macros. I am not sure exactly which ones are used for Linux, BSD and AIX (I think at least AIX has its own unique ones), but I know for sure what Darwin uses (and, well, BeOS). Adding AIX, apparently, uses Apple code used a whole bunch of macros for PowerPC detection: https://opensource.apple.com/source/WTF/WTF-7601.1.46.42/wtf/Platform.h.auto.html
Thank you, I can do that. P.S. If you add optimized code for PowerPC, we will need to detect support for the specific ISA extensions. Specifically, more recent POWER CPUs have VSX support, while G4 and G5 have AltiVec only, and some early POWER CPUs and the G3 have none at all. Basic info can be found here: https://en.wikipedia.org/wiki/Power_ISA (the details are in IBM's specification documents, which are publicly available). |
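As a rough illustration of the ISA detection mentioned above, here is a minimal compile-time sketch. It assumes a GCC- or Clang-style compiler, which predefines `__VSX__` and `__ALTIVEC__` when the corresponding target features are enabled; the function name `example_ppc_simd_level` is hypothetical and not part of libdeflate.

```c
#include <string.h>

/*
 * Hypothetical helper (not libdeflate API): reports which PowerPC vector
 * extension the compiler was told to target. This reflects compiler flags
 * such as -mvsx or -maltivec, not what the CPU supports at run time.
 */
static const char *example_ppc_simd_level(void)
{
#if defined(__VSX__)
    return "vsx";       /* recent POWER CPUs */
#elif defined(__ALTIVEC__)
    return "altivec";   /* e.g. G4/G5 */
#else
    return "none";      /* no vector extension enabled, or not PowerPC */
#endif
}
```

Because this is decided at compile time, a binary meant to run on both G3 and G4/G5 machines would still need some form of runtime detection, which this sketch does not attempt.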
By the way, how could I test if adding unaligned access support for PowerPC actually works here or at least does not break anything or introduce any unwanted effects (assuming the build itself succeeds)? |
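One cheap sanity check, independent of the project's own test suite, is to compare a native-endian load at every byte offset against a byte-wise reference. The sketch below is only an assumption about how such a check could look; all `example_*` names are hypothetical. It uses `memcpy` for the load, which is well-defined at any alignment and which compilers typically turn into a single load instruction on targets where unaligned access is cheap.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: native-endian 32-bit load, safe at any alignment. */
static uint32_t example_load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Runtime endianness probe: inspect the first byte of a known 16-bit value. */
static int example_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Byte-wise reference load that never performs a multi-byte access. */
static uint32_t example_ref_u32(const unsigned char *p, int big_endian)
{
    if (big_endian)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Returns the number of offsets where the two loads disagree (0 is good). */
static int example_check_unaligned_loads(void)
{
    unsigned char buf[64];
    const int big_endian = example_is_big_endian();
    int mismatches = 0;
    size_t i;

    for (i = 0; i < sizeof(buf); i++)
        buf[i] = (unsigned char)(i * 37 + 11);   /* arbitrary test pattern */

    for (i = 0; i + 4 <= sizeof(buf); i++)
        if (example_load_u32(buf + i) != example_ref_u32(buf + i, big_endian))
            mismatches++;

    return mismatches;
}
```

This only checks correctness of the loads themselves; it says nothing about whether unaligned access is actually fast on a given CPU.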
You could run the If you want you can run the tests too, using |
Sure, will do it now. UPD. How long running
Here it is with 0, initially it was with default 6, but behavior is the same. |
FWIW, here are tests on PowerPC. There is no meaningful difference, and the time between runs with the same code can differ more than between some runs with different code (random variation obscures any deterministic difference).
But another run with the same patch:
|
Somewhat ironically, tests run slower – and much so – on M1:
Or do they just skip test cases on PowerPC, which would make the result misleading? |
@ebiggers Going by documentation,
|
The ctest suite (which is separate from the benchmark program) only tests for correctness, not performance.
Perhaps you invoked it with no arguments, causing it to read its input file from standard input? |
@ebiggers Is there documentation on how to test it with benchmarks? |
No. It's pretty straightforward, though; just build the benchmark program ( |
The code here defines `UNALIGNED_ACCESS_IS_FAST` for `ppc64` on Linux/BSD:
libdeflate/common_defs.h
Lines 402 to 419 in 275aa51
But it does not define it for `ppc` (any OS) or for `ppc64` on Darwin.
Is this intended or a mere omission? If this should apply to big-endian PowerPC, then at least `__POWERPC__` (Darwin, both 32- and 64-bit) needs to be added there, and probably `__powerpc__` (Linux 32-bit?).
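To make the macro question concrete, here is a hedged sketch of a broader PowerPC check. It is not the code in `common_defs.h`. The `EXAMPLE_*` names are hypothetical, and the exact set of predefined macros per compiler and OS is an assumption that should be verified on each target (for example with `cc -dM -E - < /dev/null`).

```c
/*
 * Hypothetical detection macros (not libdeflate's). The spellings below are
 * ones commonly predefined by GCC/Clang on Linux and BSD and by Apple's
 * compilers on Darwin; treat the list as an assumption, not a complete set.
 */
#if defined(__powerpc64__) || defined(__ppc64__) || \
    defined(__PPC64__) || defined(_ARCH_PPC64)
#  define EXAMPLE_ARCH_PPC64 1
#else
#  define EXAMPLE_ARCH_PPC64 0
#endif

/* Any PowerPC, 32- or 64-bit. */
#if EXAMPLE_ARCH_PPC64 || defined(__powerpc__) || defined(__ppc__) || \
    defined(__POWERPC__) || defined(__PPC__) || defined(_ARCH_PPC)
#  define EXAMPLE_ARCH_PPC 1
#else
#  define EXAMPLE_ARCH_PPC 0
#endif

/* Small accessors so the result can be inspected at run time. */
static int example_is_ppc(void)   { return EXAMPLE_ARCH_PPC; }
static int example_is_ppc64(void) { return EXAMPLE_ARCH_PPC64; }
```

Whether every target matched by such a check should also get `UNALIGNED_ACCESS_IS_FAST` is the open question of this issue; the sketch only shows how the detection itself could be widened.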