NonZero Plugin for TensorRT using IPluginV3

Table Of Contents

Description
How does this sample work?
Running the sample
- Sample --help options
Additional resources
License
Changelog
Known issues

Description

This sample, sampleNonZeroPlugin, implements a plugin for the NonZero operation. The plugin can be configured to output the non-zero indices in either row order (each set of indices in its own row) or column order (each set of indices in its own column).

NonZero is an operation that finds the indices of the non-zero elements of the input tensor.
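To make the definition concrete, here is a minimal CPU sketch of NonZero for a 2-D input. It is not the sample's CUDA kernel: the function name `nonZeroIndices`, the row-major storage, and the row-major scan order are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical CPU reference, not part of the sample.
// Returns the (row, column) index pair of every non-zero element of a
// row-major rows x cols matrix, visiting elements in row-major order.
std::vector<std::pair<int32_t, int32_t>> nonZeroIndices(
    std::vector<float> const& input, int32_t rows, int32_t cols)
{
    assert(rows >= 0 && cols >= 0);
    assert(input.size() == static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols));

    std::vector<std::pair<int32_t, int32_t>> indices;
    for (int32_t r = 0; r < rows; ++r)
    {
        for (int32_t c = 0; c < cols; ++c)
        {
            std::size_t const offset
                = static_cast<std::size_t>(r) * static_cast<std::size_t>(cols) + static_cast<std::size_t>(c);
            if (input[offset] != 0.0F)
            {
                indices.emplace_back(r, c);
            }
        }
    }
    return indices;
}
```

For a 2 x 3 input `{0, 1.5, 0, 0, 0, 2}`, this returns the pairs (0, 1) and (1, 2). The number of pairs is only known after inspecting the values, which is why the output shape is data-dependent.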

How does this sample work?

This sample creates and runs a TensorRT engine built from a network containing a single NonZeroPlugin node. It demonstrates how custom layers with data-dependent output shapes can be implemented and added to a TensorRT network.

Specifically, this sample:

- Implements a TensorRT plugin for the NonZero operation
- Creates a network and builds an engine
- Runs inference using the generated TensorRT network

Implementing a NonZero plugin using IPluginV3 interface

Before IPluginV3 (and its associated interfaces), TensorRT plugins could not have outputs whose shapes depended on the input values; output shapes could depend only on input shapes. IPluginV3OneBuild, which exposes a build capability for IPluginV3, provides support for such data-dependent output shapes.

NonZeroPlugin in this sample is written to handle 2-D input tensors of shape $R \times C$. Assume that the tensor contains $K$ non-zero elements and that the non-zero indices are required in a row ordering (each set of indices in its own row). Then the output shape would be $K \times 2$.

The output shapes are expressed to the TensorRT builder through the IPluginV3OneBuild::getOutputShapes() API. Expressing the second dimension of the output is straightforward:

outputs[0].d[1] = exprBuilder.constant(2);

The extent of each data-dependent dimension in the plugin must be expressed in terms of a size tensor. A size tensor is a scalar output of DataType::kINT32 or DataType::kINT64 that must be added as one of the plugin outputs. In this case, it is sufficient to declare one size tensor to denote the extent of the first dimension of the non-zero indices output. To declare a size tensor, one must provide an upper-bound and optimum value for its extent as IDimensionExprs. These can be formed through the IExprBuilder argument passed to the IPluginV3OneBuild::getOutputShapes() method.

For unknown inputs, the upper bound is the total number of elements in the input:

auto upperBound = exprBuilder.operation(DimensionOperation::kPROD, *inputs[0].d[0], *inputs[0].d[1]);

A reasonable estimate for the optimum is that half of the elements are non-zero:

auto optValue = exprBuilder.operation(DimensionOperation::kFLOOR_DIV, *upperBound, *exprBuilder.constant(2));
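As a worked illustration of the two expressions above, the sketch below evaluates the same arithmetic for concrete dimensions. In the plugin these are symbolic IDimensionExprs that TensorRT evaluates; the struct `SizeTensorBounds` and the function `nonZeroCountBounds` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the sample: holds the two values that
// are supplied when declaring the size tensor.
struct SizeTensorBounds
{
    int64_t upperBound; // maximum possible number of non-zero elements
    int64_t optValue;   // estimate used as the optimum value
};

// Evaluates, for concrete R and C, the arithmetic that the plugin expresses
// symbolically: a product (kPROD) followed by a floor division (kFLOOR_DIV).
SizeTensorBounds nonZeroCountBounds(int64_t rows, int64_t cols)
{
    assert(rows >= 0 && cols >= 0);
    int64_t const upperBound = rows * cols;  // every element could be non-zero
    int64_t const optValue = upperBound / 2; // assume half are non-zero
    return {upperBound, optValue};
}
```

For example, a 28 x 28 input gives an upper bound of 784 and an optimum of 392, while a 3 x 3 input gives 9 and 4 (floor of 4.5).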

Now we can declare the size tensor using the IExprBuilder::declareSizeTensor() method, which also requires the specification of the output index at which the size tensor would reside. Let us place it after the non-zero indices output:

auto numNonZeroSizeTensor = exprBuilder.declareSizeTensor(1, *optValue, *upperBound);

Now we are ready to specify the extent of the first dimension of the non-zero indices output:

outputs[0].d[0] = numNonZeroSizeTensor;

Finally, the size tensor must be declared as a scalar (0-D):

outputs[1].nbDims = 0;

The NonZeroPlugin can also be configured to emit the non-zero indices in column order by setting the rowOrder plugin attribute to 0. In this case, the first output of the plugin has shape $2 \times K$, and the output shape specification must be adjusted accordingly.
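The difference between the two layouts can be sketched on the CPU as follows. This assumes the output is stored densely in row-major order, so that the $K \times 2$ layout interleaves row and column indices while the $2 \times K$ layout stores all row indices followed by all column indices. The function `packIndices` is a hypothetical helper, not the sample's kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the sample.
// Packs K (row, column) index pairs into a flat buffer of 2 * K values.
//   rowOrder == true : K x 2 layout, pair i occupies out[2*i] and out[2*i + 1]
//   rowOrder == false: 2 x K layout, pair i occupies out[i] and out[K + i]
std::vector<int32_t> packIndices(
    std::vector<int32_t> const& rowIdx, std::vector<int32_t> const& colIdx, bool rowOrder)
{
    assert(rowIdx.size() == colIdx.size());
    std::size_t const k = rowIdx.size();

    std::vector<int32_t> out(2 * k);
    for (std::size_t i = 0; i < k; ++i)
    {
        if (rowOrder)
        {
            out[2 * i] = rowIdx[i];
            out[2 * i + 1] = colIdx[i];
        }
        else
        {
            out[i] = rowIdx[i];
            out[k + i] = colIdx[i];
        }
    }
    return out;
}
```

With the pairs (0, 2), (1, 0), (1, 2), row order yields `0 2 1 0 1 2` and column order yields `0 1 1 2 0 2`. Under this assumption, only the two output dimensions swap places in the shape specification; the size tensor is unchanged.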

Creating network and building the engine

To add the plugin to the network, the INetworkDefinition::addPluginV3() method must be used.

Just as V2 plugins require an IPluginCreator, a V3 plugin must be accompanied by a registered plugin creator that implements the IPluginCreatorV3One interface.

Running inference

As sample inputs, random images from the MNIST dataset are selected and scaled to the range [0, 1]. The network outputs both the non-zero indices and the non-zero count.

Preparing sample data

Download the sample data from the TensorRT release tarball.

Running the sample

Compile the sample by following the build instructions in the TensorRT README.

Run the sample to build and run the engine containing the NonZero plugin.

./sample_non_zero_plugin [-h or --help] [-d or --datadir=<path to data directory>] [--columnOrder] [--fp16]

Verify that the sample ran successfully. If it did, you should see output similar to the following:

&&&& RUNNING TensorRT.sample_non_zero_plugin # ./sample_non_zero_plugin
...
[I] Input:
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.854902, 0
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.858824, 0, 0, 0.0745098, 0, 0.564706, 0
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.317647, 0, 0, 0.47451, 0, 0, 0
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0431373, 0, 0, 0
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.854902, 0, 0, 0.145098
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.564706, 0, 0, 0.996078
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.282353
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.854902
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.854902, 0, 0, 0.145098, 0, 0.564706
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.564706, 0, 0, 0.996078, 0, 0
[I] 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.282353, 0, 0
[I]
[I] Output:
[I] 2 14
[I] 3 9
[I] 3 12
[I] 3 14
[I] 4 9
[I] 4 12
[I] 5 12
[I] 8 12
[I] 8 15
[I] 9 12
[I] 9 15
[I] 10 15
[I] 13 15
[I] 14 10
[I] 14 13
[I] 14 15
[I] 15 10
[I] 15 13
[I] 16 13
&&&& PASSED TensorRT.sample_non_zero_plugin # ./sample_non_zero_plugin

Sample `--help` options

To see the full list of available options and their descriptions, use the -h or --help command line option.

Additional resources

The following resources provide a deeper understanding of V3 TensorRT plugins and the NonZero operation:

NonZero

ONNX: NonZero

TensorRT plugins

Extending TensorRT with Custom Layers

Other documentation

License

For terms and conditions for use, reproduction, and distribution, see the TensorRT Software License Agreement documentation.

Changelog

March 2024: This is the first version of this README.md file.

Known issues

There are no known issues in this sample.
