Packages:

- serving.kubeflow.org/v1alpha2

Package v1alpha2 contains API Schema definitions for the serving v1alpha2 API group.

Resource Types:

- InferenceService

## AlibiExplainerSpec

(Appears on: ExplainerSpec)
AlibiExplainerSpec defines the arguments for configuring an Alibi explanation server.

Field | Description
---|---
type AlibiExplainerType | The type of Alibi explainer.
storageUri string | The location of a trained explanation model.
runtimeVersion string | Alibi docker image version; defaults to the latest release.
resources Kubernetes core/v1.ResourceRequirements | Defaults to requests and limits of 1 CPU and 2Gi memory.
config map[string]string | Inline custom parameter settings for the explainer.
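As a quick orientation, the sketch below shows how these fields surface under an endpoint's explainer in an InferenceService manifest. The AnchorTabular type value, storage URI, image tag, and config parameter are assumed examples, not values taken from this reference:

```yaml
# Hypothetical explainer snippet; all values are assumed examples.
explainer:
  alibi:
    type: AnchorTabular                      # assumed explainer type value
    storageUri: gs://example-bucket/income/explainer
    runtimeVersion: "0.2.2"                  # assumed image tag
    resources:
      requests:
        cpu: "1"
        memory: 2Gi
    config:
      threshold: "0.95"                      # inline explainer parameter
```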
## AlibiExplainerType (string alias)

(Appears on: AlibiExplainerSpec)

AlibiExplainerType is the type of the Alibi explainer.
## Batcher

(Appears on: DeploymentSpec)

Batcher provides an optional payload batcher for all endpoints.

Field | Description
---|---
maxBatchSize int | (Optional) Maximum batch size for the batcher service.
maxLatency int | (Optional) Maximum latency for the batcher service.
timeout int | (Optional) Timeout for the batcher service.
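Since Batcher hangs off DeploymentSpec, it can be enabled on any component. A minimal sketch with assumed values (this reference does not state the units of maxLatency or timeout):

```yaml
# Hypothetical batcher snippet on a predictor; values are assumed.
predictor:
  batcher:
    maxBatchSize: 32    # batch up to 32 requests together
    maxLatency: 500     # units not stated in this reference
    timeout: 60         # units not stated in this reference
  sklearn:
    storageUri: gs://example-bucket/model   # assumed path
```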
## EndpointStatusMap

EndpointStatusMap defines the observed state of InferenceService endpoints.
## CustomSpec

(Appears on: ExplainerSpec, PredictorSpec, TransformerSpec)

CustomSpec provides a hook for arbitrary container configuration.

Field | Description
---|---
container Kubernetes core/v1.Container |
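A minimal sketch of a custom component, assuming a hypothetical image name; the container field accepts standard Kubernetes core/v1.Container settings:

```yaml
# Hypothetical custom predictor wrapping an arbitrary container.
predictor:
  custom:
    container:
      image: example.registry/my-model-server:latest   # assumed image
      env:
        - name: MODEL_NAME
          value: my-model                              # assumed value
```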
## DeploymentSpec

(Appears on: ExplainerSpec, PredictorSpec, TransformerSpec)

DeploymentSpec defines the configuration for a given InferenceService component.

Field | Description
---|---
serviceAccountName string | (Optional) ServiceAccountName is the name of the ServiceAccount used to run the service.
minReplicas int | (Optional) Minimum number of replicas; defaults to 1. When minReplicas is 0, pods scale down to zero when there is no traffic.
maxReplicas int | (Optional) Upper bound for the autoscaler to scale to.
parallelism int | Parallelism specifies how many requests can be processed concurrently; this sets the hard limit of the container concurrency (https://knative.dev/docs/serving/autoscaling/concurrency).
logger Logger | (Optional) Activate request/response logging.
batcher Batcher | (Optional) Activate request batching.
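These fields are set inline on each component, alongside its framework or custom spec. A sketch with assumed values:

```yaml
# Hypothetical predictor showing the shared DeploymentSpec fields.
predictor:
  serviceAccountName: models-sa   # assumed ServiceAccount name
  minReplicas: 0                  # allow scale-to-zero when idle
  maxReplicas: 5
  parallelism: 4                  # hard container-concurrency limit
  tensorflow:
    storageUri: gs://example-bucket/model   # assumed path
```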
## EndpointSpec

(Appears on: InferenceServiceSpec)

Field | Description
---|---
predictor PredictorSpec | Predictor defines the model serving spec.
explainer ExplainerSpec | (Optional) Explainer defines the model explanation service spec; the explainer service calls the predictor, or the transformer if one is specified.
transformer TransformerSpec | (Optional) Transformer defines the pre/post processing before and after the predictor call; the transformer service calls the predictor service.
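A skeleton of one endpoint, under assumed image names and paths; only predictor is required:

```yaml
# Hypothetical endpoint with all three components.
default:
  predictor:
    tensorflow:
      storageUri: gs://example-bucket/model            # assumed path
  transformer:
    custom:
      container:
        image: example.registry/preprocessor:latest    # assumed image
  explainer:
    alibi:
      type: AnchorTabular                              # assumed type value
      storageUri: gs://example-bucket/explainer        # assumed path
```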
## ExplainerConfig

(Appears on: ExplainersConfig)

Field | Description
---|---
image string |
defaultImageVersion string |
## ExplainerSpec

(Appears on: EndpointSpec)

ExplainerSpec defines the arguments for a model explanation server. The following fields follow a “1-of” semantic: users must specify exactly one spec.

Field | Description
---|---
alibi AlibiExplainerSpec | Spec for the alibi explainer.
custom CustomSpec | Spec for a custom explainer.
DeploymentSpec DeploymentSpec | (Members of DeploymentSpec are embedded into this type.)
## ExplainersConfig

(Appears on: InferenceServicesConfig)

Field | Description
---|---
alibi ExplainerConfig |
## InferenceService

InferenceService is the Schema for the services API.

Field | Description
---|---
metadata Kubernetes meta/v1.ObjectMeta | Refer to the Kubernetes API documentation for the fields of the metadata field.
spec InferenceServiceSpec |
status InferenceServiceStatus |
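Putting the pieces together, a minimal manifest might look like the sketch below. The apiVersion is inferred from the serving v1alpha2 package described above (verify it against your installed CRD); the name and storage URI are assumed:

```yaml
# Hypothetical minimal InferenceService manifest.
apiVersion: serving.kubeflow.org/v1alpha2   # inferred group/version
kind: InferenceService
metadata:
  name: flowers-sample                      # assumed name
spec:
  default:
    predictor:
      tensorflow:
        storageUri: gs://example-bucket/flowers/model   # assumed path
```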
## InferenceServiceSpec

(Appears on: InferenceService)

InferenceServiceSpec defines the desired state of the InferenceService.

Field | Description
---|---
default EndpointSpec | Default defines the default InferenceService endpoints.
canary EndpointSpec | (Optional) Canary defines alternate endpoints to route a percentage of traffic to.
canaryTrafficPercent int | (Optional) CanaryTrafficPercent defines the percentage of traffic going to the canary InferenceService endpoints.
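For example, a canary rollout that sends 10% of traffic to a new model revision could look like this sketch (paths assumed):

```yaml
# Hypothetical canary rollout: 90% default, 10% canary.
spec:
  default:
    predictor:
      tensorflow:
        storageUri: gs://example-bucket/model/v1   # assumed path
  canary:
    predictor:
      tensorflow:
        storageUri: gs://example-bucket/model/v2   # assumed path
  canaryTrafficPercent: 10
```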
## InferenceState

InferenceState describes the readiness of the InferenceService.
## InferenceServiceStatus

(Appears on: InferenceService)

InferenceServiceStatus defines the observed state of the InferenceService.

Field | Description
---|---
Status knative.dev/pkg/apis/duck/v1beta1.Status | (Members of Status are embedded into this type.)
url string | URL of the InferenceService.
traffic int | Traffic percentage that goes to the default services.
canaryTraffic int | Traffic percentage that goes to the canary services.
default map[github.com/kubeflow/kfserving/pkg/constants.InferenceServiceComponent]github.com/kubeflow/kfserving/pkg/apis/serving/v1alpha2.StatusConfigurationSpec | Statuses for the default endpoints of the InferenceService.
canary map[github.com/kubeflow/kfserving/pkg/constants.InferenceServiceComponent]github.com/kubeflow/kfserving/pkg/apis/serving/v1alpha2.StatusConfigurationSpec | Statuses for the canary endpoints of the InferenceService.
address knative.dev/pkg/apis/duck/v1beta1.Addressable | Addressable URL for eventing.
## InferenceServicesConfig

Field | Description
---|---
transformers TransformersConfig |
predictors PredictorsConfig |
explainers ExplainersConfig |
## Logger

(Appears on: DeploymentSpec)

Logger provides optional payload logging for all endpoints.

Field | Description
---|---
url string | (Optional) URL to send request logging CloudEvents to.
mode LoggerMode | Which payloads to log: one of [all, request, response].
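A sketch of enabling the logger on a predictor; the sink URL is an assumed example of an in-cluster event consumer:

```yaml
# Hypothetical logger snippet; the sink URL is assumed.
predictor:
  logger:
    url: http://message-dumper.default/   # assumed CloudEvents sink
    mode: all                             # one of: all, request, response
  sklearn:
    storageUri: gs://example-bucket/model   # assumed path
```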
## LoggerMode (string alias)

(Appears on: Logger)
## ONNXSpec

(Appears on: PredictorSpec)

ONNXSpec defines the arguments for configuring ONNX model serving.

Field | Description
---|---
storageUri string | The URI of the exported ONNX model (model.onnx).
runtimeVersion string | ONNX Runtime docker image version; the default version can be set in the inferenceservice configmap.
resources Kubernetes core/v1.ResourceRequirements | Defaults to requests and limits of 1 CPU and 2Gi memory.
## PredictorConfig

(Appears on: PredictorsConfig)

Field | Description
---|---
image string |
defaultImageVersion string |
defaultGpuImageVersion string |
## PredictorSpec

(Appears on: EndpointSpec)

PredictorSpec defines the configuration for a predictor. The following fields follow a “1-of” semantic: users must specify exactly one spec.

Field | Description
---|---
custom CustomSpec | Spec for a custom predictor.
tensorflow TensorflowSpec | Spec for TensorFlow Serving (https://github.com/tensorflow/serving).
triton TritonSpec | Spec for Triton Inference Server (https://github.com/NVIDIA/triton-inference-server).
xgboost XGBoostSpec | Spec for the XGBoost predictor.
sklearn SKLearnSpec | Spec for the SKLearn predictor.
onnx ONNXSpec | Spec for ONNX Runtime (https://github.com/microsoft/onnxruntime).
pytorch PyTorchSpec | Spec for the PyTorch predictor.
DeploymentSpec DeploymentSpec | (Members of DeploymentSpec are embedded into this type.)
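To make the “1-of” rule concrete: each sketch below is valid on its own, but setting, say, both tensorflow and triton in one predictor would violate it (paths assumed):

```yaml
# Valid: exactly one framework spec in this predictor.
predictor:
  triton:
    storageUri: gs://example-bucket/model-repository   # assumed path

# Also valid as a separate alternative (shown commented out):
# predictor:
#   sklearn:
#     storageUri: gs://example-bucket/sklearn-model    # assumed path
```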
## PredictorsConfig

(Appears on: InferenceServicesConfig)

Field | Description
---|---
tensorflow PredictorConfig |
triton PredictorConfig |
xgboost PredictorConfig |
sklearn PredictorConfig |
pytorch PredictorConfig |
onnx PredictorConfig |
## PyTorchSpec

(Appears on: PredictorSpec)

PyTorchSpec defines the arguments for configuring PyTorch model serving.

Field | Description
---|---
storageUri string | The URI of the trained model, which contains model.pt.
modelClassName string | The PyTorch model class name; defaults to ‘PyTorchModel’.
runtimeVersion string | PyTorch KFServer docker image version; defaults to the latest release.
resources Kubernetes core/v1.ResourceRequirements | Defaults to requests and limits of 1 CPU and 2Gi memory.
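A sketch assuming a hypothetical model class Net defined in the archive at storageUri:

```yaml
# Hypothetical PyTorch predictor; class name and path are assumed.
predictor:
  pytorch:
    storageUri: gs://example-bucket/cifar10   # assumed path containing model.pt
    modelClassName: Net                       # assumed class; default is PyTorchModel
```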
## SKLearnSpec

(Appears on: PredictorSpec)

SKLearnSpec defines the arguments for configuring SKLearn model serving.

Field | Description
---|---
storageUri string | The URI of the trained model, which contains model.pickle, model.pkl, or model.joblib.
runtimeVersion string | SKLearn KFServer docker image version; defaults to the latest release.
resources Kubernetes core/v1.ResourceRequirements | Defaults to requests and limits of 1 CPU and 2Gi memory.
## StatusConfigurationSpec

(Appears on: InferenceServiceStatus)

StatusConfigurationSpec describes the state of the configuration receiving traffic.

Field | Description
---|---
name string | Latest revision name that is in ready state.
host string | Host name of the service.
## TensorflowSpec

(Appears on: PredictorSpec)

TensorflowSpec defines the arguments for configuring TensorFlow model serving.

Field | Description
---|---
storageUri string | The URI of the saved model (https://www.tensorflow.org/tutorials/keras/save_and_load).
runtimeVersion string | TFServing docker image version (https://hub.docker.com/r/tensorflow/serving); the default version can be set in the inferenceservice configmap.
resources Kubernetes core/v1.ResourceRequirements | Defaults to requests and limits of 1 CPU and 2Gi memory.
## Transformer

Transformer is the interface implemented by all transformers.
## TransformerConfig

(Appears on: TransformersConfig)

Field | Description
---|---
image string |
defaultImageVersion string |
## TransformerSpec

(Appears on: EndpointSpec)

TransformerSpec defines a transformer service for pre/post processing.

Field | Description
---|---
custom CustomSpec | Spec for a custom transformer.
DeploymentSpec DeploymentSpec | (Members of DeploymentSpec are embedded into this type.)
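A sketch of a custom transformer placed in front of a predictor; the transformer service calls the predictor after preprocessing (image and path assumed):

```yaml
# Hypothetical transformer + predictor pair in one endpoint.
default:
  transformer:
    custom:
      container:
        image: example.registry/image-transformer:latest   # assumed image
  predictor:
    pytorch:
      storageUri: gs://example-bucket/model                # assumed path
```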
## TransformersConfig

(Appears on: InferenceServicesConfig)

Field | Description
---|---
feast TransformerConfig |
## TritonSpec

(Appears on: PredictorSpec)

TritonSpec defines the arguments for configuring Triton Inference Server.

Field | Description
---|---
storageUri string | The URI of the trained model repository (https://docs.nvidia.com/deeplearning/triton-inference-server/master-user-guide/docs/model_repository.html).
runtimeVersion string | Triton Inference Server docker image version; the default version can be set in the inferenceservice configmap.
resources Kubernetes core/v1.ResourceRequirements | Defaults to requests and limits of 1 CPU and 2Gi memory.
## VirtualServiceStatus

VirtualServiceStatus captures the status of the virtual service.

Field | Description
---|---
URL string |
CanaryWeight int |
DefaultWeight int |
address knative.dev/pkg/apis/duck/v1beta1.Addressable | (Optional) Address holds the information needed for a Route to be the target of an event.
Status knative.dev/pkg/apis/duck/v1beta1.Status |
## XGBoostSpec

(Appears on: PredictorSpec)

XGBoostSpec defines the arguments for configuring XGBoost model serving.

Field | Description
---|---
storageUri string | The URI of the trained model, which contains model.bst.
nthread int | Number of threads to be used by XGBoost.
runtimeVersion string | XGBoost KFServer docker image version; defaults to the latest release.
resources Kubernetes core/v1.ResourceRequirements | Defaults to requests and limits of 1 CPU and 2Gi memory.
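A sketch showing nthread alongside an aligned CPU request (values assumed):

```yaml
# Hypothetical XGBoost predictor with an explicit thread count.
predictor:
  xgboost:
    storageUri: gs://example-bucket/iris   # assumed path containing model.bst
    nthread: 4
    resources:
      requests:
        cpu: "4"                           # align CPU request with nthread
        memory: 2Gi
```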
Generated with gen-crd-api-reference-docs on git commit d7f65bc.