Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for importing symbols from static library #902

Open
wants to merge 41 commits into
base: main
Choose a base branch
from

Conversation

testhound
Copy link

This pull request adds support for importing symbols from a static library into the execution engine.

@testhound testhound changed the title Testhound/add archive Add support for importing symbols from static library Feb 3, 2023
Copy link
Member

@gmarkall gmarkall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had a quick skim (not a proper review) and added some comments on the diff. I haven't yet considered whether the strategy of compiling and archiving during the test run is the best path, but I've added a couple of pointers on the test code anyway.


# Create compiler with default options
c = new_compiler()
workdir = os.path.dirname(llvmlite.tests.__file__) + "/"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can't be assumed that the llvmlite test dir is writable (and I don't think it's good form to drop files into a package dir as part of running the tests anyway). If the test dir isn't writable then the test errors:

======================================================================
ERROR: test_add_archive (llvmlite.tests.test_binding.TestArchiveFile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/gmarkall/numbadev/llvmlite/llvmlite/tests/test_binding.py", line 1969, in test_add_archive
    c.create_static_lib(objects, library_name, output_dir=workdir)
  File "/home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/site-packages/setuptools/_distutils/unixccompiler.py", line 131, in create_static_lib
    self.spawn(self.archiver +
  File "/home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/site-packages/setuptools/_distutils/ccompiler.py", line 917, in spawn
    spawn(cmd, dry_run=self.dry_run, **kwargs)
  File "/home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/site-packages/setuptools/_distutils/spawn.py", line 68, in spawn
    raise DistutilsExecError(
distutils.errors.DistutilsExecError: command '/opt/binutils/2.37/bin/ar' failed with exit code 1

----------------------------------------------------------------------

self.assertEqual(mac_func(10, 10, 20), 120)
self.assertEqual(msub_func(10, 10, 20), 80)

os.unlink(objects[0])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the test fails then this leaves files on disk. The tempfile module has context managers for ensuring cleanup: https://docs.python.org/3/library/tempfile.html

Add the symbols from the specified static archive file to the execution
engine. It is a fatal error in LLVM if the *archive_file* does not exist.

* *archive* str: a path to the static object file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter name doesn't match that from above. Is "static object file" appropriate here, or is "archive file" more consistent?

Suggested change
* *archive* str: a path to the static object file
* *archive_file* str: path to the archive file

@@ -79,6 +79,13 @@ The ExecutionEngine class
or a object file instance. Object file instance is not usable after this
call.

* .. method:: add_archive(archive_file)

Add the symbols from the specified static archive file to the execution
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the word "static" mean here? My understanding is that in general, an archive contains unlinked objects, and those objects might eventually take part in a static or dynamic linking process (depending on how they were compiled, of course).

According to the LLVM docs, it looks like adding an archive uses

Suggested change
Add the symbols from the specified static archive file to the execution
Add the symbols from the specified archive file to the execution

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmarkall static here means static archive (i.e libc.a) vs shared library (libc.so). I double checked the code on the LLVM C++ side of things and it is only expecting a static archive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like somehow my comment got truncated, apologies - what I'm trying to suggest is that "static archive" feels like a weird term - it's an archive file of objects, which will probably be linked statically. So either "archive file" or "static library" seem like appropriate terms (but "archive file" is better here since the method name refers to archives), but "static archive" is a mashup of the two, which I can't find any reference to in common use.

…emporary directory and cleanup. Also fix typos in docs.
@testhound
Copy link
Author

@gmarkall I made fixes/updates for all your comments.

@sklam
Copy link
Member

sklam commented Feb 14, 2023

cc1: fatal error: /home/vsts/miniconda3/envs/cienv/lib/python3.8/site-packages/llvmlite/tests/a.c: No such file or directory

see error log at https://dev.azure.com/numba/numba/_build/results?buildId=14315&view=logs&j=fd18f1e2-1ae3-5f4d-9319-996752c1f67c&t=85c24074-1b78-5ee7-1f6d-6e0b210c5911&l=27

You'll need to add the .c files as package data. They are not being installed.

@testhound
Copy link
Author

@sklam and @gmarkall all errors resolved.

}

Expected<std::unique_ptr<object::Archive>> ArchiveOrError =
object::Archive::create(ArBufOrErr.get()->getMemBufferRef());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The llvm::object namespace is already visible in this scope:

Suggested change
object::Archive::create(ArBufOrErr.get()->getMemBufferRef());
Archive::create(ArBufOrErr.get()->getMemBufferRef());

Copy link
Member

@gmarkall gmarkall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had a look at the C++ side of things, which seems mostly OK but I've added some comments on there.

Comment on lines 201 to 202
object::OwningBinary<object::Archive> owningBinaryArchive(std::move(Ar),
std::move(ArBuf));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
object::OwningBinary<object::Archive> owningBinaryArchive(std::move(Ar),
std::move(ArBuf));
OwningBinary<object::Archive> owningBinaryArchive(std::move(Ar),
std::move(ArBuf));

object::OwningBinary<object::Archive> owningBinaryArchive(std::move(Ar),
std::move(ArBuf));
engine->addArchive(std::move(owningBinaryArchive));
*OutError = LLVMPY_CreateString("");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to create an empty string if we're returning success.

Suggested change
*OutError = LLVMPY_CreateString("");

@@ -79,6 +79,13 @@ The ExecutionEngine class
or a object file instance. Object file instance is not usable after this
call.

* .. method:: add_archive(archive_file)

Add the symbols from the specified static archive file to the execution
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like somehow my comment got truncated, apologies - what I'm trying to suggest is that "static archive" feels like a weird term - it's an archive file of objects, which will probably be linked statically. So either "archive file" or "static library" seem like appropriate terms (but "archive file" is better here since the method name refers to archives), but "static archive" is a mashup of the two, which I can't find any reference to in common use.

using namespace llvm::object;
auto engine = unwrap(EE);

ErrorOr<std::unique_ptr<MemoryBuffer>> ArBufOrErr =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm a little lazy (as a C++ programmer) but it seems like we could simplify some of these declarations by using auto a bit more. I don't have strong feelings either way, but tend to prefer it when I'm writing.

Copy link
Author

@testhound testhound Feb 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmarkall it's up to you, I find "auto" can obfuscate in a situation like this especially when you need to determine what to unpack such as getting the error message.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmarkall to your comment "... which will probably be linked statically", a static archive can only be linked statically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also prefer these to consistently be auto given some assignments in the function are auto

Comment on lines 193 to 195
std::string archiveErrStr =
"Unable to load archive: " + std::string(ArchiveName);
*OutError = LLVMPY_CreateString(archiveErrStr.c_str());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than providing a generic error message, passing along the error provided by LLVM will probably be more helpful to the user. I'm not sure of the cleanest way to do it (I often have some trouble picking and choosing between LLVM C++ and C APIs, for example, but here's one way to do it:

Suggested change
std::string archiveErrStr =
"Unable to load archive: " + std::string(ArchiveName);
*OutError = LLVMPY_CreateString(archiveErrStr.c_str());
LLVMErrorRef errorRef = wrap(std::move(takeErr));
*OutError = LLVMPY_CreateString(LLVMGetErrorMessage(errorRef));

@testhound
Copy link
Author

@gmarkall all feedback incorporated.

@testhound
Copy link
Author

@gmarkall ready for review.

@gmarkall gmarkall self-assigned this Mar 6, 2023
@sklam sklam added this to the v0.41.0 milestone Mar 9, 2023
using namespace llvm::object;
auto engine = unwrap(EE);

ErrorOr<std::unique_ptr<MemoryBuffer>> ArBufOrErr =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also prefer these to consistently be auto given some assignments in the function are auto

MemoryBuffer::getFile(ArchiveName);

std::error_code EC = ArBufOrErr.getError();
if (EC) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more idiomatically written as:

if (!ArBufOrErr) {... }

without the creation of EC.


if (!ArchiveOrError) {
auto takeErr = ArchiveOrError.takeError();
LLVMErrorRef errorRef = wrap(std::move(takeErr));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This copy and move can be collapsed using wrap(ArchiveOrError.takeError())

return 1;
}

std::unique_ptr<MemoryBuffer> &ArBuf = ArBufOrErr.get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These can also be simplified to:

    OwningBinary<object::Archive> owningBinaryArchive(std::move(*ArchiveOrError),
                                                      std::move(*ArBufOrErr));

c = new_compiler()
customize_compiler(c)
workdir = tmpdir.name
srcdir = os.path.dirname(llvmlite.tests.__file__) + "/"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use os.path.join to make this portable and same for the path manipulation below and don't add the .c files to the repository. Store the code as inline strings and write them to a temporary directory during the test.

setup.py Outdated
@@ -100,6 +100,7 @@ def run(self):
from llvmlite.utils import get_library_files
self.distribution.package_data = {
"llvmlite.binding": get_library_files(),
"llvmlite.tests": ["*.c"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you inline the .c files, revert these changes.

@testhound
Copy link
Author

@apmasell and @gmarkall I have updated the code. I am skipping the test on Windows and provided a detailed comment at the top of the test class. There is a error on macOS/Python 3.9 that appears related to a missing llvm-ar. All other versions of macOS succeed.

@sklam
Copy link
Member

sklam commented May 11, 2023

Note for 269cd6d
The Anaconda build of Python 3.9 is using a hardcoded path for llvm-ar from the build environment. I tested it on conda-forge build of Python 3.9 and it works without problem.

@testhound
Copy link
Author

@apmasell and @gmarkall ready for review.

@apmasell
Copy link
Contributor

If there's a way to get the binary to load on Windows using a non-distutils tool chain, that should be done rather than skipping the test.

Comment on lines +2009 to +2011
# Skip this test on Windows due to the fact that
# the /GL cross-module optimization has the side-effect
# of removing external symbols in this test case.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe adding a DEF file can fix this; see https://devblogs.microsoft.com/oldnewthing/20140321-00/?p=1433

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sklam I have explored using a DEF file, however DEF files can only be provided to the "LINK" command for creating a dynamic library not the 'LIB" command that creates a static library. Additionally I have not found any way to modify the command line options to the compiler via the compile options. If I could modify the command line I would simply remove the "/GL" option.

@testhound
Copy link
Author

If there's a way to get the binary to load on Windows using a non-distutils tool chain, that should be done rather than skipping the test.

Are there other

If there's a way to get the binary to load on Windows using a non-distutils tool chain, that should be done rather than skipping the test.

@apmasell unless you are aware of another solution it appears I will need to invoke the visual studio c++ compiler directly from the Python code for Windows plaltforms.

@testhound
Copy link
Author

@apmasell, @sklam and @gmarkall Windows error resolved and ready for review.

@gmarkall gmarkall modified the milestones: v0.41.0rc1, PR Backlog Jul 4, 2023
@esc esc removed this from the PR Backlog milestone Jul 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants