Fixing xsmm runner dynamic load #146
Conversation
Not sure what the problem with Python is. This is what fixed it for us in tpp-mlir, but that was a C++ application, not a shared object. I have copied the …
Note, the library path is set correctly by running …
Looking more at this, Python can load the library, but later on the JIT fails to find the symbols.
The way we fixed this in C++ was to pre-load the library onto the … Unfortunately, looking at Orc (LLVM's JIT compiler), the error messages are triggered by helper classes, emitted by some other loader. I imagine the openvino binary is the one that needs to load that library and tell the JIT where it is.
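For illustration, here is a minimal sketch of that pre-loading idea. It is not the actual tpp-mlir or OpenVINO wiring: the function name and library path are placeholders, and it only shows the mechanism of loading the runner library into the current process with global visibility so that a JIT running in the same process can resolve its exported symbols (Orc setups that search the current process, as MLIR's ExecutionEngine does as far as I know, would then see them).

```cpp
// Sketch of the pre-loading idea, not the actual tpp-mlir or OpenVINO code:
// load the xsmm runner library into the current process so that a JIT running
// in the same process can resolve its symbols.
#include "llvm/Support/DynamicLibrary.h"

#include <cstdio>
#include <string>

// The function name and the library path passed in are placeholders.
bool preloadXsmmRunnerLib(const std::string &path) {
  std::string err;
  // LoadLibraryPermanently returns true on failure and keeps the library
  // loaded for the lifetime of the process.
  if (llvm::sys::DynamicLibrary::LoadLibraryPermanently(path.c_str(), &err)) {
    std::fprintf(stderr, "failed to pre-load %s: %s\n", path.c_str(), err.c_str());
    return false;
  }
  return true;
}
```

Whether such a call should live in the openvino binary, the CPU plugin, or the runner registration code is exactly the open question above.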
… names of libs explicitly (FIXME)
```cmake
#FIXME: Provide platform-independent way of doing that:
install(FILES ${TPP_MLIR_DIR}/lib/libtpp_xsmm_runner_utils.so ${TPP_MLIR_DIR}/lib/libtpp_xsmm_runner_utils.so.19.0git DESTINATION ${OV_CPACK_RUNTIMEDIR})
```
@rengolin, please suggest a proper alternative.
That's actually not a bad idea, tbh. An alternative is to change the `setupvars.sh` to add the TPP build directory to the `LD_LIBRARY_PATH`.
Unless TPP can be installed as a proper library (on a system path), there's not much else we can do.
The idea is to have a self-contained openvino package, as it is now, to model a final product without any extra dependencies. This is how the binary size of the package will be calculated, and that is one of the important product-level metrics.
I initially intended to provide a proper cmake statement without all these `.so` etc. things. Do we have a normal way to include TPP-MLIR with `find_package`, similar to what we have in LLVM/MLIR?
Now it works for Linux and C++ only. The needed libraries are installed in the target ov directory.
How can I test this in C++?
To test it in C++ you need two programs: one to emit OpenVINO IR with the desired model (the Python part that uses PyTorch), and a second to run that IR in a C++ application. It is not very convenient, but you cannot convert a PyTorch model in a C++ app, so Python is a requirement in this case. And now C++ is a requirement for the xsmm runner part, so we need the two programs. I would like to see a PR that shows how a library could be registered for the JIT in the MLIR/LLVM world and makes it functional for both Python and C++.

The first program:

```python
import torch
import torch.nn as nn
import openvino as ov

# Define a synthetic model
class LinearModel(nn.Module):
    def __init__(self, input_size, output_size):
        super(LinearModel, self).__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, a):
        # some random element-wise stuff first just to see how it can be combined with MatMul
        b = a*a + 2.0
        x = ((a+a) * (a-b)) / a
        out = self.linear(x)
        return out

# Create an instance of the model
input_size = 1024
output_size = 128
model = LinearModel(input_size, output_size)

# Generate random weights
model.linear.weight.data.normal_(0, 0.01)
model.linear.bias.data.fill_(0.01)

input_data = torch.tensor(range(1, input_size*output_size+1)).to(torch.float32).view(output_size, input_size)
with torch.no_grad():
    reference = model(input_data)
print('Reference:\n', reference)

ov_model = ov.convert_model(model, example_input=input_data)
ov.save_model(ov_model, "simple_model.matmul.1024x128.xml")
```

The second program:

```cpp
#include <openvino/openvino.hpp>

#include <iostream>

int main() {
    ov::Core core;
    auto compiled_model = core.compile_model("simple_model.matmul.1024x128.xml");
    auto infer_request = compiled_model.create_infer_request();

    // Fill the input with the same 1..N sequence used on the Python side.
    auto input_tensor_1 = infer_request.get_input_tensor(0);
    size_t size1 = 128;
    size_t size2 = 1024;
    input_tensor_1.set_shape({size1, size2});
    auto data_1 = input_tensor_1.data<float>();
    for (size_t i = 0; i < size1*size2; ++i)
        data_1[i] = i+1;

    infer_request.infer();

    auto output_tensor = infer_request.get_output_tensor(0);
    auto output_data = output_tensor.data<float>();
    for (size_t i = 0; i < output_tensor.get_size(); ++i) {
        std::cout << "[" << i << "]: " << output_data[i] << "\n";
    }
}
```

You can build it with:

```sh
g++ example.cpp -I/where/openvino/installed/runtime/include -lopenvino -L/where/openvino/installed/runtime/lib/intel64
```
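As a side note (this is a sketch, not part of the PR, and the library name below is an assumption), one cheap way to probe the symbol-visibility theory from the C++ test program is to pre-load the runner library with `RTLD_GLOBAL` before calling `compile_model`, so anything JIT'd inside the same process can resolve its exports:

```cpp
// Sketch only: pre-load the runner library with global symbol visibility
// before compiling the model, to check whether symbol visibility is what the
// JIT is missing. The library name is an assumption and must be discoverable
// via the runtime linker search path (e.g. the directory set up by setupvars.sh).
#include <openvino/openvino.hpp>

#include <dlfcn.h>
#include <iostream>

int main() {
    void *handle = dlopen("libtpp_xsmm_runner_utils.so", RTLD_NOW | RTLD_GLOBAL);
    if (!handle) {
        std::cerr << "dlopen failed: " << dlerror() << "\n";
        return 1;
    }

    ov::Core core;
    auto compiled_model = core.compile_model("simple_model.matmul.1024x128.xml");
    std::cout << "compile_model finished\n";
    return 0;
}
```

If that changes the behaviour, the real fix is to make whichever component owns the JIT do the equivalent load; the install rule or `LD_LIBRARY_PATH` change discussed above is still needed so the `.so` can be found at all.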
Ok, I think we can go with that for now. The important process is: …
Now we need a set of benchmarks: …
Aiming for these performance targets: …
Later on (or in parallel) we can work on the Python issues, but these are not critical to demonstrate impact.
TODO: Still doesn't work in Python.