
Packages:

## serving.kubeflow.org/v1alpha2

Package v1alpha2 contains API Schema definitions for the serving v1alpha2 API group.

Resource Types:

### AlibiExplainerSpec

(Appears on: ExplainerSpec)

AlibiExplainerSpec defines the arguments for configuring an Alibi explanation server.

| Field | Description |
| ----- | ----------- |
| `type`<br>_AlibiExplainerType_ | The type of Alibi explainer. |
| `storageUri`<br>_string_ | The location of a trained explanation model. |
| `runtimeVersion`<br>_string_ | Alibi Docker image version, which defaults to the latest release. |
| `resources`<br>_Kubernetes core/v1.ResourceRequirements_ | Defaults to requests and limits of 1 CPU and 2Gi memory. |
| `config`<br>_map[string]string_ | Inline custom parameter settings for the explainer. |
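
As a minimal sketch, an InferenceService that pairs an SKLearn predictor with an Alibi explainer might look like the following; the resource name, storage URIs, and the `AnchorTabular` type value are illustrative assumptions, not values required by the schema:

```yaml
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: income-classifier        # hypothetical name
spec:
  default:
    predictor:
      sklearn:
        storageUri: gs://example-bucket/sklearn/income/model       # assumed path
    explainer:
      alibi:
        type: AnchorTabular                                        # an AlibiExplainerType value
        storageUri: gs://example-bucket/sklearn/income/explainer   # assumed path
        config:
          threshold: "0.95"      # inline custom explainer parameter
```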

### AlibiExplainerType (`string` alias)

(Appears on: AlibiExplainerSpec)

### Batcher

(Appears on: DeploymentSpec)

Batcher provides an optional payload batcher for all endpoints.

| Field | Description |
| ----- | ----------- |
| `maxBatchSize`<br>_int_ | (Optional) MaxBatchSize of the batcher service. |
| `maxLatency`<br>_int_ | (Optional) MaxLatency of the batcher service. |
| `timeout`<br>_int_ | (Optional) Timeout of the batcher service. |
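
For illustration, a sketch of a predictor with batching enabled; the model URI is assumed, and the units are also assumptions since the schema above does not state them:

```yaml
spec:
  default:
    predictor:
      batcher:
        maxBatchSize: 32    # batch up to 32 requests together
        maxLatency: 5000    # assumed to be milliseconds
        timeout: 60         # assumed to be seconds
      sklearn:
        storageUri: gs://example-bucket/sklearn/iris   # assumed path
```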

### ComponentStatusMap

ComponentStatusMap defines the observed state of the InferenceService endpoints.

### CustomSpec

(Appears on: ExplainerSpec, PredictorSpec, TransformerSpec)

CustomSpec provides a hook for arbitrary container configuration.

| Field | Description |
| ----- | ----------- |
| `container`<br>_Kubernetes core/v1.Container_ | |
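
A sketch of a custom predictor built from an arbitrary container; the image name and environment variable are hypothetical:

```yaml
spec:
  default:
    predictor:
      custom:
        container:
          image: example.com/my-model-server:latest   # hypothetical image
          env:
            - name: MODEL_NAME                        # hypothetical variable
              value: my-model
          resources:
            requests:
              cpu: "1"
              memory: 2Gi
```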

### DeploymentSpec

(Appears on: ExplainerSpec, PredictorSpec, TransformerSpec)

DeploymentSpec defines the configuration for a given InferenceService service component.

| Field | Description |
| ----- | ----------- |
| `serviceAccountName`<br>_string_ | (Optional) ServiceAccountName is the name of the ServiceAccount to use to run the service. |
| `minReplicas`<br>_int_ | (Optional) Minimum number of replicas; defaults to 1. When `minReplicas` is 0, pods scale down to zero when there is no traffic. |
| `maxReplicas`<br>_int_ | (Optional) The upper bound for the autoscaler to scale to. |
| `parallelism`<br>_int_ | Parallelism specifies how many requests can be processed concurrently; this sets the hard limit of the [container concurrency](https://knative.dev/docs/serving/autoscaling/concurrency). |
| `logger`<br>_Logger_ | (Optional) Activate request/response logging. |
| `batcher`<br>_Batcher_ | (Optional) Activate request batching. |
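
A sketch of these deployment knobs applied to a predictor; the ServiceAccount name and storage URI are assumptions:

```yaml
spec:
  default:
    predictor:
      serviceAccountName: models-reader   # hypothetical ServiceAccount
      minReplicas: 0                      # scale to zero when there is no traffic
      maxReplicas: 5                      # upper bound for the autoscaler
      parallelism: 4                      # hard limit on concurrent requests per container
      tensorflow:
        storageUri: gs://example-bucket/tf/flowers   # assumed path
```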

### EndpointSpec

(Appears on: InferenceServiceSpec)

| Field | Description |
| ----- | ----------- |
| `predictor`<br>_PredictorSpec_ | Predictor defines the model serving spec. |
| `explainer`<br>_ExplainerSpec_ | (Optional) Explainer defines the model explanation service spec; the explainer service calls the predictor, or the transformer if one is specified. |
| `transformer`<br>_TransformerSpec_ | (Optional) Transformer defines the pre/post processing before and after the predictor call; the transformer service calls the predictor service. |

### Explainer

### ExplainerConfig

(Appears on: ExplainersConfig)

| Field | Description |
| ----- | ----------- |
| `image`<br>_string_ | |
| `defaultImageVersion`<br>_string_ | |

### ExplainerSpec

(Appears on: EndpointSpec)

ExplainerSpec defines the arguments for a model explanation server. The following fields follow a "1-of" semantic: users must specify exactly one spec.

| Field | Description |
| ----- | ----------- |
| `alibi`<br>_AlibiExplainerSpec_ | Spec for the Alibi explainer. |
| `custom`<br>_CustomSpec_ | Spec for a custom explainer. |
| `DeploymentSpec`<br>_DeploymentSpec_ | (Members of DeploymentSpec are embedded into this type.) |

### ExplainersConfig

(Appears on: InferenceServicesConfig)

| Field | Description |
| ----- | ----------- |
| `alibi`<br>_ExplainerConfig_ | |

### InferenceService

InferenceService is the Schema for the services API.

| Field | Description |
| ----- | ----------- |
| `metadata`<br>_Kubernetes meta/v1.ObjectMeta_ | Refer to the Kubernetes API documentation for the fields of the `metadata` field. |
| `spec`<br>_InferenceServiceSpec_ | |
| `spec.default`<br>_EndpointSpec_ | Default defines the default InferenceService endpoints. |
| `spec.canary`<br>_EndpointSpec_ | (Optional) Canary defines alternate endpoints to route a percentage of traffic to. |
| `spec.canaryTrafficPercent`<br>_int_ | (Optional) CanaryTrafficPercent defines the percentage of traffic going to the canary InferenceService endpoints. |
| `status`<br>_InferenceServiceStatus_ | |
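
A minimal InferenceService sketch with only the required default predictor; the name and storage URI are assumptions:

```yaml
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: flowers-sample                               # hypothetical name
spec:
  default:
    predictor:
      tensorflow:
        storageUri: gs://example-bucket/tf/flowers   # assumed path
```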

### InferenceServiceSpec

(Appears on: InferenceService)

InferenceServiceSpec defines the desired state of the InferenceService.

| Field | Description |
| ----- | ----------- |
| `default`<br>_EndpointSpec_ | Default defines the default InferenceService endpoints. |
| `canary`<br>_EndpointSpec_ | (Optional) Canary defines alternate endpoints to route a percentage of traffic to. |
| `canaryTrafficPercent`<br>_int_ | (Optional) CanaryTrafficPercent defines the percentage of traffic going to the canary InferenceService endpoints. |
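
A sketch of a canary rollout: 10% of traffic is routed to the canary endpoints and the rest to the default endpoints; the storage URIs are assumptions:

```yaml
spec:
  default:
    predictor:
      tensorflow:
        storageUri: gs://example-bucket/tf/flowers/v1   # assumed path
  canaryTrafficPercent: 10    # 10% of traffic goes to the canary
  canary:
    predictor:
      tensorflow:
        storageUri: gs://example-bucket/tf/flowers/v2   # assumed path
```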

### InferenceServiceState (`string` alias)

InferenceServiceState describes the readiness of the InferenceService.

### InferenceServiceStatus

(Appears on: InferenceService)

InferenceServiceStatus defines the observed state of the InferenceService.

| Field | Description |
| ----- | ----------- |
| `Status`<br>_knative.dev/pkg/apis/duck/v1beta1.Status_ | (Members of Status are embedded into this type.) |
| `url`<br>_string_ | URL of the InferenceService. |
| `traffic`<br>_int_ | Traffic percentage that goes to the default services. |
| `canaryTraffic`<br>_int_ | Traffic percentage that goes to the canary services. |
| `default`<br>_map[github.com/kubeflow/kfserving/pkg/constants.InferenceServiceComponent]github.com/kubeflow/kfserving/pkg/apis/serving/v1alpha2.StatusConfigurationSpec_ | Statuses for the default endpoints of the InferenceService. |
| `canary`<br>_map[github.com/kubeflow/kfserving/pkg/constants.InferenceServiceComponent]github.com/kubeflow/kfserving/pkg/apis/serving/v1alpha2.StatusConfigurationSpec_ | Statuses for the canary endpoints of the InferenceService. |
| `address`<br>_knative.dev/pkg/apis/duck/v1beta1.Addressable_ | Addressable URL for eventing. |

### InferenceServicesConfig

| Field | Description |
| ----- | ----------- |
| `transformers`<br>_TransformersConfig_ | |
| `predictors`<br>_PredictorsConfig_ | |
| `explainers`<br>_ExplainersConfig_ | |

### Logger

(Appears on: DeploymentSpec)

Logger provides optional payload logging for all endpoints.

| Field | Description |
| ----- | ----------- |
| `url`<br>_string_ | (Optional) URL to send request logging CloudEvents to. |
| `mode`<br>_LoggerMode_ | What payloads to log: one of `all`, `request`, `response`. |
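
A sketch of payload logging on a predictor; the sink URL is hypothetical and simply needs to accept CloudEvents:

```yaml
spec:
  default:
    predictor:
      logger:
        url: http://message-dumper.default.svc.cluster.local   # hypothetical CloudEvents sink
        mode: all                                              # log both requests and responses
      sklearn:
        storageUri: gs://example-bucket/sklearn/iris           # assumed path
```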

### LoggerMode (`string` alias)

(Appears on: Logger)

### ONNXSpec

(Appears on: PredictorSpec)

ONNXSpec defines the arguments for configuring ONNX model serving.

| Field | Description |
| ----- | ----------- |
| `storageUri`<br>_string_ | The URI of the exported ONNX model (`model.onnx`). |
| `runtimeVersion`<br>_string_ | ONNX Runtime Docker image version; the default version can be set in the inferenceservice configmap. |
| `resources`<br>_Kubernetes core/v1.ResourceRequirements_ | Defaults to requests and limits of 1 CPU and 2Gi memory. |

### Predictor

### PredictorConfig

(Appears on: PredictorsConfig)

| Field | Description |
| ----- | ----------- |
| `image`<br>_string_ | |
| `defaultImageVersion`<br>_string_ | |
| `defaultGpuImageVersion`<br>_string_ | |

### PredictorSpec

(Appears on: EndpointSpec)

PredictorSpec defines the configuration for a predictor. The following fields follow a "1-of" semantic: users must specify exactly one spec.

| Field | Description |
| ----- | ----------- |
| `custom`<br>_CustomSpec_ | Spec for a custom predictor. |
| `tensorflow`<br>_TensorflowSpec_ | Spec for [TensorFlow Serving](https://github.com/tensorflow/serving). |
| `triton`<br>_TritonSpec_ | Spec for [Triton Inference Server](https://github.com/NVIDIA/triton-inference-server). |
| `xgboost`<br>_XGBoostSpec_ | Spec for the XGBoost predictor. |
| `sklearn`<br>_SKLearnSpec_ | Spec for the SKLearn predictor. |
| `onnx`<br>_ONNXSpec_ | Spec for [ONNX Runtime](https://github.com/microsoft/onnxruntime). |
| `pytorch`<br>_PyTorchSpec_ | Spec for the PyTorch predictor. |
| `DeploymentSpec`<br>_DeploymentSpec_ | (Members of DeploymentSpec are embedded into this type.) |

### PredictorsConfig

(Appears on: InferenceServicesConfig)

| Field | Description |
| ----- | ----------- |
| `tensorflow`<br>_PredictorConfig_ | |
| `triton`<br>_PredictorConfig_ | |
| `xgboost`<br>_PredictorConfig_ | |
| `sklearn`<br>_PredictorConfig_ | |
| `pytorch`<br>_PredictorConfig_ | |
| `onnx`<br>_PredictorConfig_ | |

### PyTorchSpec

(Appears on: PredictorSpec)

PyTorchSpec defines the arguments for configuring PyTorch model serving.

| Field | Description |
| ----- | ----------- |
| `storageUri`<br>_string_ | The URI of the trained model, which contains `model.pt`. |
| `modelClassName`<br>_string_ | The PyTorch model class name; defaults to `PyTorchModel`. |
| `runtimeVersion`<br>_string_ | PyTorch KFServer Docker image version, which defaults to the latest release. |
| `resources`<br>_Kubernetes core/v1.ResourceRequirements_ | Defaults to requests and limits of 1 CPU and 2Gi memory. |
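
A PyTorch predictor sketch; the storage URI and class name are assumptions:

```yaml
spec:
  default:
    predictor:
      pytorch:
        storageUri: gs://example-bucket/pytorch/cifar10   # assumed path; must contain model.pt
        modelClassName: Net                               # overrides the default 'PyTorchModel'
```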

### SKLearnSpec

(Appears on: PredictorSpec)

SKLearnSpec defines the arguments for configuring SKLearn model serving.

| Field | Description |
| ----- | ----------- |
| `storageUri`<br>_string_ | The URI of the trained model, which contains `model.pickle`, `model.pkl`, or `model.joblib`. |
| `runtimeVersion`<br>_string_ | SKLearn KFServer Docker image version, which defaults to the latest release. |
| `resources`<br>_Kubernetes core/v1.ResourceRequirements_ | Defaults to requests and limits of 1 CPU and 2Gi memory. |

### StatusConfigurationSpec

StatusConfigurationSpec describes the state of the configuration receiving traffic.

| Field | Description |
| ----- | ----------- |
| `name`<br>_string_ | Latest revision name that is in a ready state. |
| `host`<br>_string_ | Host name of the service. |

### TensorflowSpec

(Appears on: PredictorSpec)

TensorflowSpec defines the arguments for configuring TensorFlow model serving.

| Field | Description |
| ----- | ----------- |
| `storageUri`<br>_string_ | The URI of the [saved model](https://www.tensorflow.org/tutorials/keras/save_and_load). |
| `runtimeVersion`<br>_string_ | [TFServing Docker image](https://hub.docker.com/r/tensorflow/serving) version; the default version can be set in the inferenceservice configmap. |
| `resources`<br>_Kubernetes core/v1.ResourceRequirements_ | Defaults to requests and limits of 1 CPU and 2Gi memory. |

### Transformer

The Transformer interface is implemented by all transformers.

### TransformerConfig

(Appears on: TransformersConfig)

| Field | Description |
| ----- | ----------- |
| `image`<br>_string_ | |
| `defaultImageVersion`<br>_string_ | |

### TransformerSpec

(Appears on: EndpointSpec)

TransformerSpec defines the transformer service for pre/post processing.

| Field | Description |
| ----- | ----------- |
| `custom`<br>_CustomSpec_ | Spec for a custom transformer. |
| `DeploymentSpec`<br>_DeploymentSpec_ | (Members of DeploymentSpec are embedded into this type.) |
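
A sketch of a custom transformer placed in front of a predictor; the transformer image and storage URI are hypothetical:

```yaml
spec:
  default:
    transformer:
      custom:
        container:
          image: example.com/image-transformer:latest    # hypothetical pre/post-processing image
    predictor:
      pytorch:
        storageUri: gs://example-bucket/pytorch/cifar10  # assumed path
```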

### TransformersConfig

(Appears on: InferenceServicesConfig)

| Field | Description |
| ----- | ----------- |
| `feast`<br>_TransformerConfig_ | |

### TritonSpec

(Appears on: PredictorSpec)

TritonSpec defines the arguments for configuring Triton Inference Server.

| Field | Description |
| ----- | ----------- |
| `storageUri`<br>_string_ | The URI of the trained [model repository](https://docs.nvidia.com/deeplearning/triton-inference-server/master-user-guide/docs/model_repository.html). |
| `runtimeVersion`<br>_string_ | Triton Inference Server Docker image version; the default version can be set in the inferenceservice configmap. |
| `resources`<br>_Kubernetes core/v1.ResourceRequirements_ | Defaults to requests and limits of 1 CPU and 2Gi memory. |

### VirtualServiceStatus

VirtualServiceStatus captures the status of the virtual service.

| Field | Description |
| ----- | ----------- |
| `URL`<br>_string_ | |
| `CanaryWeight`<br>_int_ | |
| `DefaultWeight`<br>_int_ | |
| `address`<br>_knative.dev/pkg/apis/duck/v1beta1.Addressable_ | (Optional) Address holds the information needed for a Route to be the target of an event. |
| `Status`<br>_knative.dev/pkg/apis/duck/v1beta1.Status_ | |

### XGBoostSpec

(Appears on: PredictorSpec)

XGBoostSpec defines the arguments for configuring XGBoost model serving.

| Field | Description |
| ----- | ----------- |
| `storageUri`<br>_string_ | The URI of the trained model, which contains `model.bst`. |
| `nthread`<br>_int_ | Number of threads to be used by XGBoost. |
| `runtimeVersion`<br>_string_ | XGBoost KFServer Docker image version, which defaults to the latest release. |
| `resources`<br>_Kubernetes core/v1.ResourceRequirements_ | Defaults to requests and limits of 1 CPU and 2Gi memory. |
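
An XGBoost predictor sketch; the storage URI is an assumption:

```yaml
spec:
  default:
    predictor:
      xgboost:
        storageUri: gs://example-bucket/xgboost/iris   # assumed path; must contain model.bst
        nthread: 4                                     # threads used by XGBoost
```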


_Generated with gen-crd-api-reference-docs on git commit d7f65bc._