
Model Support and Limitations

The FIL backend is designed to accelerate inference for tree-based models. If the model you are trying to deploy is not tree-based, consider using one of Triton's other backends.

Frameworks

The FIL backend supports most XGBoost and LightGBM models using their native serialization formats. The FIL backend also supports the following model types from Scikit-Learn and cuML using Treelite's checkpoint serialization format:

  • GradientBoostingClassifier
  • GradientBoostingRegressor
  • IsolationForest
  • RandomForestRegressor
  • ExtraTreesClassifier
  • ExtraTreesRegressor

In addition, the FIL backend can perform inference on tree models from any framework if they are first exported to Treelite's checkpoint serialization format.

Serialization Formats

The FIL backend currently supports the following serialization formats:

  • XGBoost JSON (supported versions depend on the Triton release; see the compatibility matrix below)
  • XGBoost Binary
  • LightGBM Text
  • Treelite binary checkpoint

The FIL backend does not support direct ingestion of Pickle files. The pickled model must be converted to one of the above formats before it can be used in Triton.

Version Compatibility

Before version 3.0, Treelite offered no backward compatibility for its checkpoint format, even between minor releases, so the Treelite version used to save a checkpoint had to exactly match the version used by the FIL backend. Starting with version 3.0, Treelite can load checkpoints produced by Treelite 2.7 or any later version up to the next major release.

XGBoost's JSON format also changes periodically between minor versions, and older versions of Treelite used in the FIL backend may not support those changes.

The compatibility matrix for Treelite and XGBoost with the FIL backend is shown below:

| Triton Version | Supported Treelite Version(s) | Supported XGBoost JSON Version(s) |
| -------------- | ----------------------------- | --------------------------------- |
| 21.08          | 1.3.0                         | <1.6                              |
| 21.09-21.10    | 2.0.0                         | <1.6                              |
| 21.11-22.02    | 2.1.0                         | <1.6                              |
| 22.03-22.06    | 2.3.0                         | <1.6                              |
| 22.07          | 2.4.0                         | <1.7                              |
| 22.08-24.02    | 2.4.0; >=3.0.0,<4.0.0         | <1.7                              |
| 24.03+         | 3.9.0; >=4.0.0,<5.0.0         | 1.7+                              |

Limitations

The FIL backend currently does not support any multi-output regression models.

Double-Precision Support

While the FIL backend can load double-precision models, it performs all computations in single-precision mode. This can lead to slight differences in model output for frameworks like LightGBM which natively use double precision. Support for double-precision execution is planned for an upcoming release.
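The rounding involved is easy to see in isolation. This minimal illustration (plain NumPy, not the FIL backend itself) shows how a double-precision value such as a split threshold shifts when narrowed to single precision:

```python
# Sketch: a double-precision value (as LightGBM uses internally) is not
# preserved exactly when narrowed to float32, which is why FIL's
# single-precision execution can produce slightly different outputs.
import numpy as np

threshold64 = np.float64(0.1)
threshold32 = np.float32(threshold64)  # narrowed to single precision

print(float(threshold64))  # 0.1
print(float(threshold32))  # ~0.1000000015, no longer exactly equal
```

A sample lying between the two threshold values would be routed down different branches of the tree, producing a small output discrepancy.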

Categorical Feature Support

As of version 21.11, the FIL backend supports models with categorical features (e.g. some XGBoost and LightGBM models). These models can be deployed just like any other model, but as with any inference pipeline that includes categorical features, care must be taken to ensure that the categorical encoding used during inference matches the one used during training. If the data passed in at inference time does not contain all of the categories used during training, there is no way to reconstruct the correct mapping of features, so some record must be kept of the complete set of training categories. With that record, categorical columns can be converted to float32 columns and submitted to Triton like any other input.
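One simple way to keep such a record is to fix the category list at training time and reuse it at inference time. The sketch below uses pandas; the column values and category names are illustrative:

```python
# Sketch: keep a fixed category list from training and reuse it at
# inference so categorical codes stay consistent, then convert to
# float32 before submitting to Triton. Values here are illustrative.
import numpy as np
import pandas as pd

# Training time: record the complete set of categories.
train_colors = pd.Series(["red", "green", "blue", "green"])
categories = list(pd.Categorical(train_colors).categories)

# Inference time: encode against the recorded categories, so codes are
# stable even when some categories are absent from this batch.
infer_colors = pd.Series(["green", "red"])
codes = pd.Categorical(infer_colors, categories=categories).codes

# Convert to float32, as for any other input column sent to Triton.
features = codes.astype(np.float32)
```

Unseen values encode as -1 under this scheme, so you may also want an explicit check for out-of-vocabulary categories before submitting the batch.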

For a fully-worked example of using a model with categorical features, check out the introductory fraud detection notebook.