[BUG] Results from liblinear and SAGAClassifier do not match #113
I wanted to check the differences between the output of SAGAClassifier and the output of liblinear, via LogisticRegression in scikit-learn. It turns out that the two estimators return very different coefficients when solving what I thought was the same problem. I suspect that I am not setting the regularisation parameters correctly; would you be able to enlighten me?

Comments
I should add that SAGClassifier (for l2 penalty) seems to behave the same way.

```python
import numpy as np
from scipy import sparse
from sklearn.datasets import load_iris, make_classification
from sklearn.linear_model import LogisticRegression
from lightning.classification import SAGClassifier, SAGAClassifier

# Binary subset of iris, replicated 10 times, with labels in {-1, 1}.
iris = load_iris()
X, y = iris.data, iris.target
X = np.concatenate([X] * 10)
y = np.concatenate([y] * 10)
X_bin = X[y <= 1]
y_bin = y[y <= 1] * 2 - 1

# A second, sparse problem.
X_sparse, y_sparse = make_classification(n_samples=1000, n_features=50,
                                         random_state=0)
X_sparse = sparse.csr_matrix(X_sparse)

for (X, y) in ((X_bin, y_bin), (X_sparse, y_sparse)):
    for penalty in ['l2']:
        n_samples = X.shape[0]
        for alpha in np.logspace(-3, 3, 5):
            liblinear = LogisticRegression(
                C=1. / (n_samples * alpha),
                solver='liblinear',
                multi_class='ovr',
                max_iter=500,
                fit_intercept=False,
                penalty=penalty, random_state=0, tol=1e-24)
            if penalty == 'l1':
                lalpha = 0
                lbeta = alpha
                lpenalty = 'l1'
            else:
                lalpha = alpha
                lbeta = 0
                lpenalty = None
            lsag = SAGClassifier(loss='log',
                                 beta=lbeta, penalty=lpenalty,
                                 alpha=lalpha,
                                 max_iter=3000,
                                 random_state=0)
            lsaga = SAGAClassifier(loss='log',
                                   beta=lbeta, penalty=lpenalty,
                                   alpha=lalpha,
                                   max_iter=3000,
                                   random_state=0)
            lsaga.fit(X, y)
            lsag.fit(X, y)
            liblinear.fit(X, y)
            print('[penalty=%s, alpha=%s, solver=liblinear]' % (penalty, alpha),
                  liblinear.coef_[0, :4])
            print('[penalty=%s, alpha=%s, solver=lightning saga]' % (penalty, alpha),
                  lsaga.coef_[0, :4])
            print('[penalty=%s, alpha=%s, solver=lightning sag]' % (penalty, alpha),
                  lsag.coef_[0, :4])
            print('-------------------------------')
```
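For reference, a minimal sketch of the correspondence the `C = 1. / (n_samples * alpha)` line above relies on, assuming lightning's SAG/SAGA minimize the mean logistic loss plus `0.5 * alpha * ||w||^2`, while liblinear minimizes `0.5 * ||w||^2` plus `C` times the summed loss; under that assumption the two objectives differ only by the constant factor `C * n`, so they share the same minimizer:

```python
import numpy as np

def liblinear_objective(w, X, y, C):
    # 0.5 * ||w||^2 + C * sum of logistic losses (liblinear's parameterization)
    losses = np.log1p(np.exp(-y * (X @ w)))
    return 0.5 * w @ w + C * losses.sum()

def lightning_objective(w, X, y, alpha):
    # mean logistic loss + 0.5 * alpha * ||w||^2 (assumed lightning parameterization)
    losses = np.log1p(np.exp(-y * (X @ w)))
    return losses.mean() + 0.5 * alpha * w @ w

rng = np.random.RandomState(0)
X = rng.randn(200, 5)
y = np.sign(rng.randn(200))
w = rng.randn(5)

n, alpha = X.shape[0], 0.1
C = 1. / (n * alpha)

# Rescaling the liblinear objective by 1 / (C * n) recovers the lightning one,
# so both values printed below coincide.
print(liblinear_objective(w, X, y, C) / (C * n))
print(lightning_objective(w, X, y, alpha))
```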
Add tol=-1 to the kwargs of SAG*Classifier :-) Also, note that lightning has a definition of the logistic loss that is very slightly different from that of liblinear. In particular, the one in lightning is an approximation using some heuristics. See https://github.com/scikit-learn-contrib/lightning/blob/master/lightning/impl/sgd_fast.pyx , class Log. This typically only matters if you look beyond 10^-4 precision (e.g. if you plot convergence curves).
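Concretely, a minimal sketch of the suggested change for the l2 case of the snippet above; the only addition is tol=-1, presumably so that the default stopping tolerance never halts the solvers before max_iter:

```python
from lightning.classification import SAGClassifier, SAGAClassifier

alpha = 1e-3  # squared-l2 regularization, as in the l2 branch above

# tol=-1 keeps the solvers running for the full max_iter instead of stopping
# early on the default tolerance, so they can match liblinear's tol=1e-24 runs.
lsag = SAGClassifier(loss='log', alpha=alpha, beta=0, penalty=None,
                     max_iter=3000, tol=-1, random_state=0)
lsaga = SAGAClassifier(loss='log', alpha=alpha, beta=0, penalty=None,
                       max_iter=3000, tol=-1, random_state=0)
```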
Indeed, sorry for the noise, it now works. Do you use this approximation for speed or for underflow control?
Mathieu did this, but I would say speed, since underflow can be controlled in other ways (liblinear does this pretty well).
If I am not mistaken, the log loss trick comes from scikit-learn, which itself comes from Leon Bottou's SGD code. |
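For illustration, a rough sketch of that kind of piecewise trick; the cutoffs below follow scikit-learn's sgd_fast.pyx from memory, and the lightning file linked above is the authoritative version of what SAG/SAGA actually use:

```python
import numpy as np

def log_loss_exact(z):
    # exact logistic loss of the margin z = y * w.x
    return np.log1p(np.exp(-z))

def log_loss_approx(z):
    # piecewise version in the style of Bottou's SGD code: for large |z| the
    # log is skipped, which is cheaper and avoids overflow of exp(-z)
    if z > 18.0:
        return np.exp(-z)      # log(1 + e^-z) ~ e^-z when e^-z is tiny
    if z < -18.0:
        return -z              # log(1 + e^-z) ~ -z when e^-z dominates
    return np.log(1.0 + np.exp(-z))

# The two agree to about 1e-8 near the cutoffs, which is why the difference
# only shows up when comparing solvers at very high precision.
for z in (-30.0, -18.0, 0.0, 18.0, 30.0):
    print(z, log_loss_exact(z), log_loss_approx(z))
```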