Scala API for XGBoost-Spark3.0

This document covers the GPU-related Scala API. Only one new API is introduced to support training on GPU.

XGBoost-Spark3.0 provides the following four classes for machine learning on Spark:

XGBoostClassifier

The full name is ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier. It extends ProbabilisticClassifier[Vector, XGBoostClassifier, XGBoostClassificationModel].

Constructors
  • XGBoostClassifier(xgboostParams: Map[String, Any])
    • All standard XGBoost parameters are supported.
    • eval_sets: Map[String, DataFrame] sets the named evaluation dataset(s) used during training.
Methods

Note: Only GPU related methods are listed below.

  • setFeaturesCols(value: Seq[String]): XGBoostClassifier. This method sets the feature columns for training.
    • value: a sequence of feature column names
    • returns the classifier itself
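
The constructor and setFeaturesCols described above can be combined roughly as follows. This is a minimal sketch, not a definitive recipe: the column names (`label`, `f0`, `f1`, `f2`) and the parameter values are hypothetical, and `tree_method -> gpu_hist` assumes an XGBoost version that accepts that value as the GPU training method.

```scala
import ml.dmlc.xgboost4j.scala.spark.{XGBoostClassificationModel, XGBoostClassifier}
import org.apache.spark.sql.DataFrame

object ClassifierSketch {
  // Trains a binary classifier on a DataFrame that holds one numeric column
  // per feature plus a label column. All column names are hypothetical.
  def train(trainDf: DataFrame): XGBoostClassificationModel = {
    // Standard XGBoost parameters; the values are illustrative only.
    val params: Map[String, Any] = Map(
      "objective" -> "binary:logistic",
      "tree_method" -> "gpu_hist", // assumption: GPU histogram method is available
      "num_round" -> 100,
      "num_workers" -> 1
    )

    // Hypothetical feature column names.
    val featureNames = Seq("f0", "f1", "f2")

    new XGBoostClassifier(params)
      .setFeaturesCols(featureNames) // the GPU-related method documented above
      .setLabelCol("label")
      .fit(trainDf)
  }
}
```

Passing the feature columns by name means the training DataFrame does not need a single assembled Vector column, which is the usual input for the non-GPU setFeaturesCol method.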

XGBoostClassificationModel

The full name is ml.dmlc.xgboost4j.scala.spark.XGBoostClassificationModel. It extends ProbabilisticClassificationModel[Vector, XGBoostClassificationModel].

Methods

There are no GPU-specific methods; use it like any other Spark model.
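
As an illustration of using the model like any other Spark model, the sketch below scores a test set and computes accuracy with Spark's MulticlassClassificationEvaluator. It assumes the test DataFrame carries the same feature columns used for training, a hypothetical `label` column, and the default `prediction` output column.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassificationModel
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.sql.DataFrame

object ClassificationModelSketch {
  // Returns the accuracy of a trained model on a labelled test set.
  def accuracy(model: XGBoostClassificationModel, testDf: DataFrame): Double = {
    // transform appends the prediction columns to the input DataFrame.
    val predictions = model.transform(testDf)

    new MulticlassClassificationEvaluator()
      .setLabelCol("label")           // hypothetical label column name
      .setPredictionCol("prediction") // Spark's default prediction column
      .setMetricName("accuracy")
      .evaluate(predictions)
  }
}
```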

XGBoostRegressor

The full name is ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor. It extends Predictor[Vector, XGBoostRegressor, XGBoostRegressionModel].

Constructors
  • XGBoostRegressor(xgboostParams: Map[String, Any])
    • All standard XGBoost parameters are supported.
    • eval_sets: Map[String, DataFrame] sets the named evaluation dataset(s) used during training.
Methods

Note: Only GPU related methods are listed below.

  • setFeaturesCols(value: Seq[String]): XGBoostRegressor. This method sets the feature columns for training.
    • value: a sequence of feature column names to set
    • returns the regressor itself
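
The sketch below shows one way the regressor might be configured with a named evaluation set through eval_sets. The column names, the evaluation set name `eval`, and the parameter values are hypothetical; `tree_method -> gpu_hist` again assumes an XGBoost version that accepts that value.

```scala
import ml.dmlc.xgboost4j.scala.spark.{XGBoostRegressionModel, XGBoostRegressor}
import org.apache.spark.sql.DataFrame

object RegressorSketch {
  // Trains a regressor and reports metrics on a named evaluation set.
  def train(trainDf: DataFrame, evalDf: DataFrame): XGBoostRegressionModel = {
    val params: Map[String, Any] = Map(
      "objective" -> "reg:squarederror",
      "tree_method" -> "gpu_hist", // assumption: GPU histogram method is available
      "num_round" -> 100,
      "num_workers" -> 1,
      // Named evaluation dataset(s); the name "eval" is illustrative.
      "eval_sets" -> Map("eval" -> evalDf)
    )

    new XGBoostRegressor(params)
      .setFeaturesCols(Seq("f0", "f1", "f2")) // hypothetical feature columns
      .setLabelCol("label")                   // hypothetical label column
      .fit(trainDf)
  }
}
```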

XGBoostRegressionModel

The full name is ml.dmlc.xgboost4j.scala.spark.XGBoostRegressionModel. It extends PredictionModel[Vector, XGBoostRegressionModel].

Methods

There are no GPU-specific methods; use it like any other Spark model.
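
Because the regression model behaves like a regular Spark ML model, it can be persisted and reloaded with the standard ML writer and reader. The sketch below assumes a caller-supplied `modelPath` and a test DataFrame with the same feature columns used for training; both are hypothetical.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostRegressionModel
import org.apache.spark.sql.DataFrame

object RegressionModelSketch {
  // Saves the model, loads it back, and scores the test set with the reloaded copy.
  def saveReloadAndPredict(
      model: XGBoostRegressionModel,
      testDf: DataFrame,
      modelPath: String): DataFrame = {
    // Standard Spark ML persistence; overwrite() replaces any existing model at the path.
    model.write.overwrite().save(modelPath)

    val reloaded = XGBoostRegressionModel.load(modelPath)

    // transform appends a prediction column to the input DataFrame.
    reloaded.transform(testDf)
  }
}
```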