-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gcc's version of inline assembly for the maths helpers #51
base: master
Are you sure you want to change the base?
Conversation
df0aac6
to
1e79205
Compare
Mind explaining what exactly this helps with? Size? Performance? I'm not against or anything, just curious what motivated the the increased amount of code... |
Size of what, the compiled binary, or the source code? The former shouldn't change Windows, and the extra source code eliminates usage of the libm functions while having a code size very comparable to the Windows version, so it helps with the binary size on Linux a bit. (glibc libm really likes using IFUNCs, which require extra logic for resolving those, as compared to regular symbols, so not relying on these will hopefully pay off.) |
I meant the source code size, yeah. More source code means a higher maintenance burden. Usually, it's hard to beat an import in terms of size, at least for 64k targets, where you'll likely do a bunch of CRT import anyway. But I guess the IFUNC stuff throws a wrench in that plan. But perhaps using the |
|
I'm not sure I understand the problem. All we need is exp2 and cos. Both of these seems to behave just fine on both x86 and x64 as far as I can tell (they simply seem to import the libm-functions, and call them normally). What happens for other math functions and for other platforms here is quite irrelevant, no? |
Define "behave just fine"? The libm versions are both IFUNCs, and the
#include <stdio.h>
int main(int argc, char* argv[]) {
printf("cos(%d) = %f\n", argc, __builtin_cos/*or exp2*//*with or without f*/(argc));
return 0;
}
I suppose |
Results in plain function calls to "normal" imported functions. Yes, they are IFUNCs, but I don't see any resolve logic injected into our binary for that. Maybe I'm missing something here? I guess with this stuff we can avoid having libm in the import-table altogether, which might be a win in some cases. I'm still not convinced it ends up as a net win in a real-world 64k intro, though; those will probably benefit from an import of libm as they are more likely to actually do some more CPU work. But I guess that depends on the intro. |
In normal, non-minified binaries, you indeed don't have to include any extra resolving logic, as ld.so does that for you. However, executable packers for Linux such as smol and dnload both have to deal with IFUNCs, and that code does end up in your final executable, because they circumvent a number of ld.so things in order to avoid having to include a symbol table and symbol names (they're hash-based) in the first place. |
OK, that makes a bit of sense. But, it kinda seems like a slippery slope, really. Because this can be the case for pretty much any CRT function you're using, so you might have to replace a lot of functions in the end. It seems like it's better to fix this in the packers to me... |
Also, just want to make it clear; I'm not trying to throw doubt at this solution, just having a discussion here. It's probably not a bad idea for WaveSabre to not depend on libm.so at all, which I believe this MR solves. |
That's true, which is why smol has an option to enable/disable IFUNC support. If it can be left out, that's a few bytes gained. And indeed, right now the only external symbols I can see that are being used (in Core) are |
Just want to mention an alternative for cos: I did another attempt at dropping the fpuCos-helper in the past, which you can see here. In the end it didn't work out, because the resulting code actually got slightly larger. Not sure if that's the case on other compilers, though. |
This one should be usable as-is, but maybe the pow functions should be reworked a bit to work with LTO (cf. the cos impl.)
Also, do you want #37 to be included here?rebased on top of that one.