Extend onnxruntime gpu interface to producers using onnxruntime #39402

Open · wants to merge 7 commits into master

Changes from 5 commits
2 changes: 2 additions & 0 deletions PhysicsTools/ONNXRuntime/BuildFile.xml
@@ -1,5 +1,7 @@
<use name="onnxruntime"/>
<use name="FWCore/Utilities"/>
<use name="HeterogeneousCore/CUDAServices" source_only="1"/>
<use name="FWCore/ServiceRegistry" source_only="1"/>
Contributor:
I think both of these should be without the source_only as the shared libraries are needed.

Contributor Author:

Perhaps my understanding of what source_only means here is wrong. I used it to mean that only the source (e.g., the interface) of PhysicsTools/ONNXRuntime needs these packages (i.e., I was trying to avoid adding a library dependency that wasn't already there). Do I have that backwards? (@smuzaffar)

Contributor:

source_only means the package depends only on the headers of its dependency, with no linking required. The flag was not needed before, since headers of dependent packages were always available to include, but we added it for the cxxmodules IBs so that scram can properly build the dependent package's module first. And no, there is no inverse of this flag.
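To make the distinction concrete, a minimal BuildFile.xml sketch contrasting the two dependency forms (the package names here are only illustrative):

```xml
<!-- Illustrative only: a header-only dependency vs. a linked one -->
<use name="FWCore/ServiceRegistry" source_only="1"/>  <!-- headers only, no library linked -->
<use name="FWCore/Utilities"/>                        <!-- headers plus shared library -->
<export>
  <lib name="1"/>
</export>
```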

Contributor Author:

OK, thanks - I've updated accordingly.

<export>
<lib name="1"/>
</export>
32 changes: 32 additions & 0 deletions PhysicsTools/ONNXRuntime/interface/ONNXSessionOptions.h
@@ -0,0 +1,32 @@
#ifndef PHYSICSTOOLS_ONNXRUNTIME_ONNXSESSIONOPTIONS_H
#define PHYSICSTOOLS_ONNXRUNTIME_ONNXSESSIONOPTIONS_H

#include "HeterogeneousCore/CUDAServices/interface/CUDAService.h"
#include "FWCore/ServiceRegistry/interface/Service.h"
#include "ONNXRuntime.h"
davidlange6 marked this conversation as resolved.
#include "onnxruntime/core/session/onnxruntime_cxx_api.h"
#include <string>

namespace cms::Ort {

  // param_backend
  //   cpu     -> Use CPU backend
  //   cuda    -> Use cuda backend
  //   default -> Use best available
  inline ::Ort::SessionOptions getSessionOptions(const std::string &param_backend) {
    auto backend = cms::Ort::Backend::cpu;
    if (param_backend == "cuda")
      backend = cms::Ort::Backend::cuda;
Contributor:

I'm wondering what should happen if process.options.accelerators = ["cpu"] on a machine with a GPU and the job has an ONNX module explicitly configured to use cuda. I suppose in general we'd want such a setup not to run any GPU code (this is what happens with SwitchProducerCUDA and in all CUDA EDModules).

If we want ONNX modules to also avoid using GPUs when process.options.accelerators = ["cpu"], the case of explicit cuda choice should also check if the CUDAService is enabled, and throw an exception if it is not.

(and in the longer term we should think of how to improve the mechanism)
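The conflicting setup being discussed would look roughly like this as a configuration fragment (the module label and model are illustrative; any ONNX producer gaining the onnx_backend parameter in this PR would do):

```python
# Illustrative CMSSW configuration fragment: the job restricts itself to the
# CPU while an ONNX-based module explicitly requests the cuda backend.
import FWCore.ParameterSet.Config as cms

process = cms.Process("TEST")
process.options.accelerators = ["cpu"]  # job-wide: no GPU use

# Hypothetical module label; parameters other than onnx_backend omitted.
process.pfDeepFlavourJetTags = cms.EDProducer(
    "DeepFlavourONNXJetTagsProducer",
    onnx_backend=cms.string("cuda"),  # conflicts with the accelerators setting above
)
```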

Contributor Author:

We also encountered this ONNX issue in SONIC tests. I think it's microsoft/onnxruntime#12321. There's a fix merged, but not in a release yet.

ah - great - I saw the issue at some point, but missed that it was fixed (as the issue was open...)

Contributor:

@davidlange6 Here is the code snippet I mentioned, to make the param_backend == "cuda" case fail the job in the process.options.accelerators = ["cpu"] case:

if (param_backend == "cuda") {
  edm::Service<CUDAService> cs;
  if (cs.isAvailable() and cs->enabled()) {
    backend = cms::Ort::Backend::cuda;
  } else {
    edm::Exception ex(edm::errors::UnavailableAccelerator);
    ex << "cuda backend requested, but no NVIDIA GPU available in the job";
    ex.addContext("Calling cms::Ort::getSessionOptions()");
    throw ex;
  }
}

(although I have a feeling there is room for future simplification in conjunction with the default case handling below)

Contributor:

How should we proceed here? Improve the logic now, or merge this PR and make a quick follow-up?

Contributor:

(I thought I had written this comment already earlier, but apparently not)

We could also make the code here less dependent on the availability of the CUDA runtime in the CMSSW build by using the edm::ResourceInformation service, along the lines of:

if (param_backend == "cuda") {
  edm::Service<edm::ResourceInformation> ri;
  if (not ri->nvidiaDriverVersion().empty()) {
    backend = cms::Ort::Backend::cuda;
  } else {
    edm::Exception ex(edm::errors::UnavailableAccelerator);
    ex << "cuda backend requested, but no NVIDIA GPU available in the job";
    ex.addContext("Calling cms::Ort::getSessionOptions()");
    throw ex;
  }
}

if (param_backend == "default") {
  edm::Service<edm::ResourceInformation> ri;
  if (not ri->nvidiaDriverVersion().empty()) {
    backend = cms::Ort::Backend::cuda;
  }
}

(although we should probably think of better API in edm::ResourceInformation for this use case, e.g. the number of devices)


    if (param_backend == "default") {
      edm::Service<CUDAService> cs;
      if (cs.isAvailable() and cs->enabled()) {
        backend = cms::Ort::Backend::cuda;
      }
    }

    return ONNXRuntime::defaultSessionOptions(backend);
  }
}  // namespace cms::Ort

#endif
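Putting the thread's suggestions together, the intended backend selection can be sketched self-contained, with a plain boolean standing in for the CUDAService/ResourceInformation availability check (an assumption; the real code uses edm::Service and edm::Exception):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for cms::Ort::Backend.
enum class Backend { cpu, cuda };

// Sketch of the selection logic in getSessionOptions(): an explicit "cuda"
// request requires a GPU and fails the job otherwise, while "default"
// upgrades to cuda only when one is available. cuda_available stands in for
// the edm::Service<CUDAService> enabled check.
Backend chooseBackend(const std::string& param_backend, bool cuda_available) {
  if (param_backend == "cuda") {
    if (!cuda_available)
      throw std::runtime_error("cuda backend requested, but no NVIDIA GPU available in the job");
    return Backend::cuda;
  }
  if (param_backend == "default" && cuda_available)
    return Backend::cuda;
  return Backend::cpu;  // "cpu", or "default" without a GPU
}
```

This is where the "room for future simplification" remark applies: the explicit and default cases share a single availability probe.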
9 changes: 8 additions & 1 deletion PhysicsTools/ONNXRuntime/src/ONNXRuntime.cc
@@ -20,7 +20,11 @@ namespace cms::Ort {

  using namespace ::Ort;

#ifdef ONNXDebug
  const Env ONNXRuntime::env_(ORT_LOGGING_LEVEL_INFO, "");
#else
  const Env ONNXRuntime::env_(ORT_LOGGING_LEVEL_ERROR, "");
#endif

  ONNXRuntime::ONNXRuntime(const std::string& model_path, const SessionOptions* session_options) {
    // create session
@@ -80,10 +80,12 @@ namespace cms::Ort {
    SessionOptions sess_opts;
    sess_opts.SetIntraOpNumThreads(1);
    if (backend == Backend::cuda) {
      // https://www.onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html
      OrtCUDAProviderOptions options;
      sess_opts.AppendExecutionProvider_CUDA(options);
    }
#ifdef ONNX_PROFILE
    sess_opts.EnableProfiling("ONNXProf");
#endif
    return sess_opts;
  }

@@ -140,6 +146,7 @@ namespace cms::Ort {
    }

    // run

    auto output_tensors = session_->Run(RunOptions{nullptr},
                                        input_node_names_.data(),
                                        input_tensors.data(),
12 changes: 10 additions & 2 deletions RecoBTag/ONNXRuntime/plugins/BoostedJetONNXJetTagsProducer.cc
@@ -14,7 +14,7 @@
#include "DataFormats/BTauReco/interface/DeepBoostedJetTagInfo.h"

#include "PhysicsTools/ONNXRuntime/interface/ONNXRuntime.h"

#include "PhysicsTools/ONNXRuntime/interface/ONNXSessionOptions.h"
#include "RecoBTag/FeatureTools/interface/deep_helpers.h"

#include <iostream>
@@ -126,12 +126,20 @@ void BoostedJetONNXJetTagsProducer::fillDescriptions(edm::ConfigurationDescripti
"probQCDothers",
});
desc.addOptionalUntracked<bool>("debugMode", false);
desc.add<std::string>("onnx_backend", "default");

descriptions.addWithDefaultLabel(desc);
}

std::unique_ptr<ONNXRuntime> BoostedJetONNXJetTagsProducer::initializeGlobalCache(const edm::ParameterSet &iConfig) {
return std::make_unique<ONNXRuntime>(iConfig.getParameter<edm::FileInPath>("model_path").fullPath());
std::string backend = iConfig.getParameter<std::string>("onnx_backend");

auto session_options = cms::Ort::getSessionOptions(backend);
// Sept 8, 2022 - on gpu, this model crashes with all optimizations on
if (backend != "cpu")
session_options.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_BASIC);
return std::make_unique<ONNXRuntime>(iConfig.getParameter<edm::FileInPath>("model_path").fullPath(),
&session_options);
}

void BoostedJetONNXJetTagsProducer::globalEndJob(const ONNXRuntime *cache) {}
@@ -14,7 +14,7 @@
#include "DataFormats/BTauReco/interface/DeepFlavourTagInfo.h"

#include "PhysicsTools/ONNXRuntime/interface/ONNXRuntime.h"

#include "PhysicsTools/ONNXRuntime/interface/ONNXSessionOptions.h"
#include "RecoBTag/ONNXRuntime/interface/tensor_fillers.h"
#include "RecoBTag/ONNXRuntime/interface/tensor_configs.h"

@@ -136,12 +136,15 @@ void DeepCombinedONNXJetTagsProducer::fillDescriptions(edm::ConfigurationDescrip
desc.add<std::vector<std::string>>("flav_names", std::vector<std::string>{"probb", "probc", "probuds", "probg"});
desc.add<double>("min_jet_pt", 15.0);
desc.add<double>("max_jet_eta", 2.5);
desc.add<std::string>("onnx_backend", "default");

descriptions.add("pfDeepCombinedJetTags", desc);
}

std::unique_ptr<ONNXRuntime> DeepCombinedONNXJetTagsProducer::initializeGlobalCache(const edm::ParameterSet& iConfig) {
return std::make_unique<ONNXRuntime>(iConfig.getParameter<edm::FileInPath>("model_path").fullPath());
auto session_options = cms::Ort::getSessionOptions(iConfig.getParameter<std::string>("onnx_backend"));
return std::make_unique<ONNXRuntime>(iConfig.getParameter<edm::FileInPath>("model_path").fullPath(),
&session_options);
Contributor:

Not really for this PR, but I think the ONNXRuntime constructor interface would be cleaner if it took session_options by value or by const reference.
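A sketch of what that cleaner signature could look like, using minimal stand-ins rather than the actual CMSSW and ONNX Runtime classes (the member names here are illustrative):

```cpp
#include <cassert>
#include <string>

// Minimal stand-in for ::Ort::SessionOptions, only to illustrate the
// suggested pass-by-const-reference constructor.
struct SessionOptions {
  int intra_op_num_threads = 1;
};

class ONNXRuntime {
public:
  // Taking the options by const reference removes the nullable-pointer
  // contract; a defaulted argument covers the previous "no options" callers.
  explicit ONNXRuntime(const std::string& model_path,
                       const SessionOptions& opts = SessionOptions{})
      : model_path_(model_path), options_(opts) {}

  const SessionOptions& options() const { return options_; }
  const std::string& modelPath() const { return model_path_; }

private:
  std::string model_path_;
  SessionOptions options_;
};
```

Call sites then no longer need to keep the options object alive or pass its address.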

}

void DeepCombinedONNXJetTagsProducer::globalEndJob(const ONNXRuntime* cache) {}
@@ -14,7 +14,7 @@
#include "DataFormats/BTauReco/interface/DeepDoubleXTagInfo.h"

#include "PhysicsTools/ONNXRuntime/interface/ONNXRuntime.h"

#include "PhysicsTools/ONNXRuntime/interface/ONNXSessionOptions.h"
#include <algorithm>
#include <iostream>
#include <fstream>
@@ -121,6 +121,8 @@ void DeepDoubleXONNXJetTagsProducer::fillDescriptions(edm::ConfigurationDescript
"CvB" >> (PDPSD("flav_names", std::vector<std::string>{"probHbb", "probHcc"}, true) and
PDFIP("model_path", FIP("RecoBTag/Combined/data/DeepDoubleX/94X/V01/DDCvB.onnx"), true));
};
desc.add<std::string>("onnx_backend", "default");

auto descBvL(desc);
descBvL.ifValue(edm::ParameterDescription<std::string>("flavor", "BvL", true), flavorCases());
descriptions.add("pfDeepDoubleBvLJetTags", descBvL);
@@ -135,7 +137,9 @@ void DeepDoubleXONNXJetTagsProducer::fillDescriptions(edm::ConfigurationDescript
}

std::unique_ptr<ONNXRuntime> DeepDoubleXONNXJetTagsProducer::initializeGlobalCache(const edm::ParameterSet& iConfig) {
return std::make_unique<ONNXRuntime>(iConfig.getParameter<edm::FileInPath>("model_path").fullPath());
auto session_options = cms::Ort::getSessionOptions(iConfig.getParameter<std::string>("onnx_backend"));
return std::make_unique<ONNXRuntime>(iConfig.getParameter<edm::FileInPath>("model_path").fullPath(),
&session_options);
}

void DeepDoubleXONNXJetTagsProducer::globalEndJob(const ONNXRuntime* cache) {}
@@ -14,6 +14,7 @@
#include "DataFormats/BTauReco/interface/DeepFlavourTagInfo.h"

#include "PhysicsTools/ONNXRuntime/interface/ONNXRuntime.h"
#include "PhysicsTools/ONNXRuntime/interface/ONNXSessionOptions.h"

using namespace cms::Ort;

@@ -84,12 +85,15 @@ void DeepFlavourONNXJetTagsProducer::fillDescriptions(edm::ConfigurationDescript
desc.add<std::vector<std::string>>("output_names", {"ID_pred/Softmax:0"});
desc.add<std::vector<std::string>>(
"flav_names", std::vector<std::string>{"probb", "probbb", "problepb", "probc", "probuds", "probg"});
desc.add<std::string>("onnx_backend", "default");

descriptions.add("pfDeepFlavourJetTags", desc);
}

std::unique_ptr<ONNXRuntime> DeepFlavourONNXJetTagsProducer::initializeGlobalCache(const edm::ParameterSet& iConfig) {
return std::make_unique<ONNXRuntime>(iConfig.getParameter<edm::FileInPath>("model_path").fullPath());
auto session_options = cms::Ort::getSessionOptions(iConfig.getParameter<std::string>("onnx_backend"));
return std::make_unique<ONNXRuntime>(iConfig.getParameter<edm::FileInPath>("model_path").fullPath(),
&session_options);
}

void DeepFlavourONNXJetTagsProducer::globalEndJob(const ONNXRuntime* cache) {}
@@ -14,6 +14,7 @@
#include "DataFormats/BTauReco/interface/DeepFlavourTagInfo.h"

#include "PhysicsTools/ONNXRuntime/interface/ONNXRuntime.h"
#include "PhysicsTools/ONNXRuntime/interface/ONNXSessionOptions.h"

#include "RecoBTag/ONNXRuntime/interface/tensor_fillers.h"
#include "RecoBTag/ONNXRuntime/interface/tensor_configs.h"
@@ -111,12 +112,15 @@ void DeepVertexONNXJetTagsProducer::fillDescriptions(edm::ConfigurationDescripti
desc.add<std::vector<std::string>>("flav_names", std::vector<std::string>{"probb", "probc", "probuds", "probg"});
desc.add<double>("min_jet_pt", 15.0);
desc.add<double>("max_jet_eta", 2.5);
desc.add<std::string>("onnx_backend", "default");

descriptions.add("pfDeepVertexJetTags", desc);
}

std::unique_ptr<ONNXRuntime> DeepVertexONNXJetTagsProducer::initializeGlobalCache(const edm::ParameterSet& iConfig) {
return std::make_unique<ONNXRuntime>(iConfig.getParameter<edm::FileInPath>("model_path").fullPath());
auto session_options = cms::Ort::getSessionOptions(iConfig.getParameter<std::string>("onnx_backend"));
return std::make_unique<ONNXRuntime>(iConfig.getParameter<edm::FileInPath>("model_path").fullPath(),
&session_options);
}

void DeepVertexONNXJetTagsProducer::globalEndJob(const ONNXRuntime* cache) {}
5 changes: 4 additions & 1 deletion RecoParticleFlow/PFProducer/plugins/MLPFProducer.cc
@@ -5,6 +5,7 @@

#include "DataFormats/ParticleFlowCandidate/interface/PFCandidate.h"
#include "PhysicsTools/ONNXRuntime/interface/ONNXRuntime.h"
#include "PhysicsTools/ONNXRuntime/interface/ONNXSessionOptions.h"
#include "RecoParticleFlow/PFProducer/interface/MLPFModel.h"

#include "DataFormats/ParticleFlowReco/interface/PFBlockElementTrack.h"
@@ -160,7 +161,8 @@ void MLPFProducer::produce(edm::Event& event, const edm::EventSetup& setup) {
}

std::unique_ptr<ONNXRuntime> MLPFProducer::initializeGlobalCache(const edm::ParameterSet& params) {
return std::make_unique<ONNXRuntime>(params.getParameter<edm::FileInPath>("model_path").fullPath());
auto session_options = cms::Ort::getSessionOptions(params.getParameter<std::string>("onnx_backend"));
return std::make_unique<ONNXRuntime>(params.getParameter<edm::FileInPath>("model_path").fullPath(), &session_options);
}

void MLPFProducer::globalEndJob(const ONNXRuntime* cache) {}
@@ -173,6 +175,7 @@ void MLPFProducer::fillDescriptions(edm::ConfigurationDescriptions& descriptions
edm::FileInPath(
"RecoParticleFlow/PFProducer/data/mlpf/"
"mlpf_2021_11_16__no_einsum__all_data_cms-best-of-asha-scikit_20211026_042043_178263.workergpu010.onnx"));
desc.add<std::string>("onnx_backend", "default");
descriptions.addWithDefaultLabel(desc);
}
