SVM is used for classification and regression analysis.
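SVM solves the following optimization problem (written here in the standard L2-regularized soft-margin form, for $n$ training samples $x_i$ with labels $y_i \in \{-1, +1\}$ and weight vector $w$):

$$\min_{w} \; \frac{\lambda}{2} \lVert w \rVert_2^2 + \frac{1}{n} \sum_{i=1}^{n} \max\left(0,\, 1 - y_i w^\top x_i\right)$$

where $\frac{\lambda}{2} \lVert w \rVert_2^2$ is the regularization term; $\lambda$ is the regularization coefficient; $\max(0,\, 1 - y_i w^\top x_i)$ is the hinge loss, which is zero when the margin $y_i w^\top x_i$ is at least 1 and grows linearly as the margin falls below 1.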
Angel MLLib optimizes the SVM objective with mini-batch gradient descent.
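As a rough illustration of mini-batch (sub)gradient descent on an L2-regularized hinge loss, here is a minimal single-machine sketch in NumPy. It is not Angel's implementation: the function names, the way the batch size is derived from the sampling ratio, and the inverse-square-root learning-rate decay are all assumptions made for this example.

```python
import numpy as np

def hinge_objective(w, X, y, reg_l2):
    """L2-regularized mean hinge loss for labels y in {-1, +1}."""
    margins = y * (X @ w)
    return 0.5 * reg_l2 * np.dot(w, w) + np.mean(np.maximum(0.0, 1.0 - margins))

def train_svm(X, y, epochs=20, updates_per_epoch=10, sample_ratio=0.1,
              learn_rate=0.1, decay=0.5, reg_l2=0.01, seed=0):
    """Mini-batch subgradient descent on the hinge-loss objective."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    # Assumption: every mini-batch samples a sample_ratio fraction of the data.
    batch_size = max(1, int(n * sample_ratio))
    for epoch in range(epochs):
        # Assumed schedule (inverse square root); Angel's schedule may differ.
        lr = learn_rate / np.sqrt(1.0 + decay * epoch)
        for _ in range(updates_per_epoch):
            idx = rng.choice(n, size=batch_size, replace=False)
            Xb, yb = X[idx], y[idx]
            # Only samples with margin < 1 contribute to the hinge subgradient.
            active = yb * (Xb @ w) < 1.0
            grad = reg_l2 * w - (yb[active] @ Xb[active]) / batch_size
            w -= lr * grad
    return w

# Usage on synthetic, linearly separable data.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(400, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)
w = train_svm(X, y)
accuracy = np.mean(np.sign(X @ w) == y)
```

In a distributed setting the gradient computation would run on workers and the weight vector would live on the parameter servers; the sketch keeps everything in one process to show only the update rule.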
- Data format is set in "ml.data.type", supporting "libsvm" and "dummy" types. For details, see Angel Data Format.
- Feature vector's dimension is set in "ml.feature.num".
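To make the two data types concrete, the sketch below parses one line of each. It assumes the commonly described layouts: a "libsvm" line is `label index:value index:value ...`, and a "dummy" line is `label index index ...`, listing only the non-zero features with an implied value of 1. The helper names are hypothetical, and index base and label conventions are not handled here; Angel Data Format remains the authoritative description.

```python
def parse_libsvm(line):
    """Parse 'label idx:val idx:val ...' into (label, {idx: val})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        idx, val = item.split(":")
        features[int(idx)] = float(val)
    return label, features

def parse_dummy(line):
    """Parse 'label idx idx ...'; each listed feature gets value 1.0 (assumed)."""
    parts = line.split()
    return float(parts[0]), {int(idx): 1.0 for idx in parts[1:]}

def parse_line(line, data_type):
    """Dispatch on the value that would be set in ml.data.type."""
    if data_type == "libsvm":
        return parse_libsvm(line)
    if data_type == "dummy":
        return parse_dummy(line)
    raise ValueError(f"unsupported data type: {data_type}")

libsvm_sample = parse_line("1 3:0.5 7:2", "libsvm")
dummy_sample = parse_line("-1 0 4", "dummy")
```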
- Algorithm Parameters
  - ml.epoch.num: number of epochs
  - ml.batch.sample.ratio: sampling rate for each epoch
  - ml.num.update.per.epoch: number of mini-batches in each epoch
  - ml.data.validate.ratio: proportion of data used for validation, no validation when set to 0
  - ml.learn.rate: initial learning rate
  - ml.learn.decay: decay rate of the learning rate
  - ml.svm.reg.l2: coefficient of the L2 penalty
- I/O Parameters
  - angel.train.data.path: input path for training data
  - angel.predict.data.path: input path for prediction data
  - ml.feature.num: number of features
  - ml.data.type: Angel Data Format, supporting "dummy" and "libsvm"
  - angel.save.model.path: save path for the trained model
  - angel.predict.out.path: output path for prediction results
  - angel.log.path: save path for the log
- Resource Parameters
  - angel.workergroup.number: number of workers
  - angel.worker.memory.mb: memory requested for each worker, in MB
  - angel.worker.task.number: number of tasks on each worker, default is 1
  - angel.ps.number: number of parameter servers (PS)
  - angel.ps.memory.mb: memory requested for each PS, in MB
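The parameters above can be gathered in one place before a job is submitted. The sketch below collects them in a dictionary and renders each as a `--key=value` string. The rendering style, the helper name `to_submit_args`, every value, and the paths are illustrative assumptions, not defaults or recommendations from Angel.

```python
def to_submit_args(params):
    """Render a parameter dict as a list of '--key=value' strings."""
    return [f"--{key}={value}" for key, value in params.items()]

svm_params = {
    # Algorithm parameters (illustrative values)
    "ml.epoch.num": 10,
    "ml.batch.sample.ratio": 0.1,
    "ml.num.update.per.epoch": 10,
    "ml.data.validate.ratio": 0.1,
    "ml.learn.rate": 0.01,
    "ml.learn.decay": 0.5,
    "ml.svm.reg.l2": 0.03,
    # I/O parameters (hypothetical paths)
    "angel.train.data.path": "hdfs://example/svm/train",
    "angel.save.model.path": "hdfs://example/svm/model",
    "angel.log.path": "hdfs://example/svm/log",
    "ml.feature.num": 10000,
    "ml.data.type": "libsvm",
    # Resource parameters (illustrative values)
    "angel.workergroup.number": 3,
    "angel.worker.memory.mb": 8000,
    "angel.worker.task.number": 1,
    "angel.ps.number": 1,
    "angel.ps.memory.mb": 5000,
}

submit_args = to_submit_args(svm_params)
```

For prediction, the training-specific entries would be swapped for angel.predict.data.path and angel.predict.out.path.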