About Label Smoothing #55

Open
helenxu opened this issue Nov 29, 2019 · 4 comments

helenxu commented Nov 29, 2019

I read from line 141 in main.py about label smoothing:

e2_multi = ((1.0-args.label_smoothing)*e2_multi) + (1.0/e2_multi.size(1))

Shouldn't it be the following instead?

e2_multi = ((1.0-args.label_smoothing)*e2_multi) + (args.label_smoothing/e2_multi.size(1))
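
To make the difference concrete, here is a minimal sketch in plain Python. It uses lists instead of the PyTorch tensors in main.py, and the function names and the 4-entity one-hot target are made up for illustration:

```python
def smooth_bugged(row, label_smoothing):
    """Line 141 as quoted: the added constant is 1/N, independent of label_smoothing."""
    n = len(row)
    return [(1.0 - label_smoothing) * y + 1.0 / n for y in row]


def smooth_proposed(row, label_smoothing):
    """Proposed fix: the added constant is label_smoothing/N."""
    n = len(row)
    return [(1.0 - label_smoothing) * y + label_smoothing / n for y in row]


# Hypothetical one-hot target over 4 entities, label_smoothing = 0.1.
target = [0.0, 1.0, 0.0, 0.0]
bugged = smooth_bugged(target, 0.1)
proposed = smooth_proposed(target, 0.1)

print("bugged:  ", bugged, "sum =", sum(bugged))
print("proposed:", proposed, "sum =", sum(proposed))
```

For this toy target, the current line gives roughly 0.25 for each negative and 1.15 for the positive, while the proposed line gives roughly 0.025 and 0.925, which sum to 1.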
@TimDettmers
Owner

Yes, you are correct, thank you for reporting this! I will need to study whether the results change slightly or significantly. If the difference is slight, I will fix it directly in the codebase. If the difference is significant, I will need to build a workaround.

@TimDettmers TimDettmers self-assigned this Dec 14, 2019
@TimDettmers
Owner

With the fixed label smoothing, one needs much higher label smoothing values to get a good score. Here are the results of my grid search. The metric mean value reported below is the Mean Reciprocal Rank (MRR).

WN18RR Bugged

Summary for config (('data', 'WN18RR'), ('epochs', '150'), ('label_smoothing', '0.2'), ('lr', '0.003')):
Metric mean value (SE): 0.424 (0.0003). 95% CI (0.424, 0.425). Sample size: 2
================================================================================
Summary for config (('data', 'WN18RR'), ('epochs', '150'), ('label_smoothing', '0.1'), ('lr', '0.003')):
Metric mean value (SE): 0.424 (0.0002). 95% CI (0.424, 0.425). Sample size: 2
================================================================================
Summary for config (('data', 'WN18RR'), ('epochs', '150'), ('label_smoothing', '0.4'), ('lr', '0.003')):
Metric mean value (SE): 0.425 (0.0004). 95% CI (0.424, 0.426). Sample size: 2
================================================================================
Summary for config (('data', 'WN18RR'), ('epochs', '150'), ('label_smoothing', '0.5'), ('lr', '0.003')):
Metric mean value (SE): 0.425 (0.0005). 95% CI (0.424, 0.425). Sample size: 2
================================================================================
Summary for config (('data', 'WN18RR'), ('epochs', '150'), ('label_smoothing', '0.3'), ('lr', '0.003')):
Metric mean value (SE): 0.425 (0.0011). 95% CI (0.423, 0.428). Sample size: 2

WN18RR Fixed

Summary for config (('data', 'WN18RR'), ('epochs', '150'), ('label_smoothing', '0.2'), ('lr', '0.003')):
Metric mean value (SE): 0.421 (0.0009). 95% CI (0.419, 0.423). Sample size: 2
================================================================================
Summary for config (('data', 'WN18RR'), ('epochs', '150'), ('label_smoothing', '0.5'), ('lr', '0.003')):
Metric mean value (SE): 0.422 (0.0011). 95% CI (0.420, 0.425). Sample size: 2
================================================================================
Summary for config (('data', 'WN18RR'), ('epochs', '150'), ('label_smoothing', '0.3'), ('lr', '0.003')):
Metric mean value (SE): 0.421 (0.0002). 95% CI (0.421, 0.422). Sample size: 2
================================================================================
Summary for config (('data', 'WN18RR'), ('epochs', '150'), ('label_smoothing', '0.4'), ('lr', '0.003')):
Metric mean value (SE): 0.423 (0.0006). 95% CI (0.422, 0.424). Sample size: 2
================================================================================
Summary for config (('data', 'WN18RR'), ('epochs', '150'), ('label_smoothing', '0.1'), ('lr', '0.003')):
Metric mean value (SE): 0.417 (0.0012). 95% CI (0.415, 0.420). Sample size: 2

FB15k-237 Bugged

Summary for config (('data', 'FB15k-237'), ('epochs', '150'), ('label_smoothing', '0.3'), ('lr', '0.001')):
Metric mean value (SE): 0.325 (0.0006). 95% CI (0.324, 0.326). Sample size: 2
================================================================================
Summary for config (('data', 'FB15k-237'), ('epochs', '150'), ('label_smoothing', '0.5'), ('lr', '0.001')):
Metric mean value (SE): 0.322 (0.0000). 95% CI (0.322, 0.322). Sample size: 2
================================================================================
Summary for config (('data', 'FB15k-237'), ('epochs', '150'), ('label_smoothing', '0.4'), ('lr', '0.001')):
Metric mean value (SE): 0.323 (0.0009). 95% CI (0.321, 0.325). Sample size: 2
================================================================================
Summary for config (('data', 'FB15k-237'), ('epochs', '150'), ('label_smoothing', '0.1'), ('lr', '0.001')):
Metric mean value (SE): 0.324 (0.0016). 95% CI (0.321, 0.327). Sample size: 2
================================================================================
Summary for config (('data', 'FB15k-237'), ('epochs', '150'), ('label_smoothing', '0.2'), ('lr', '0.001')):
Metric mean value (SE): 0.324 (0.0002). 95% CI (0.323, 0.324). Sample size: 2

FB15k-237 Fixed

Summary for config (('data', 'FB15k-237'), ('epochs', '150'), ('label_smoothing', '0.5'), ('lr', '0.001')):
Metric mean value (SE): 0.325 (0.0004). 95% CI (0.324, 0.325). Sample size: 2
================================================================================
Summary for config (('data', 'FB15k-237'), ('epochs', '150'), ('label_smoothing', '0.3'), ('lr', '0.001')):
Metric mean value (SE): 0.321 (0.0003). 95% CI (0.321, 0.322). Sample size: 2
================================================================================
Summary for config (('data', 'FB15k-237'), ('epochs', '150'), ('label_smoothing', '0.1'), ('lr', '0.001')):
Metric mean value (SE): 0.319 (nan). 95% CI (nan, nan). Sample size: 1
================================================================================
Summary for config (('data', 'FB15k-237'), ('epochs', '150'), ('label_smoothing', '0.2'), ('lr', '0.001')):
Metric mean value (SE): 0.319 (0.0011). 95% CI (0.316, 0.321). Sample size: 2
================================================================================
Summary for config (('data', 'FB15k-237'), ('epochs', '150'), ('label_smoothing', '0.4'), ('lr', '0.001')):
Metric mean value (SE): 0.320 (0.0005). 95% CI (0.319, 0.321). Sample size: 2

As such, it would distort the results if I just changed this blindly. I will think about a solution. I will probably add an extra parameter so that one can run with the correct label smoothing.


lvermue commented Dec 17, 2019

I think that the correct way should be
e2_multi = e2_multi * (1-Config.label_smoothing_epsilon) + (1 - e2_multi) * (Config.label_smoothing_epsilon / (e2_multi.size(1) - 1))
citing
http://www.deeplearningbook.org/contents/regularization.html, Chapter 7.5.1.
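
A minimal sketch of this variant in plain Python, assuming a single positive label per row. It uses lists instead of tensors, and the function name and toy target are hypothetical:

```python
def smooth_off_target(row, epsilon):
    """Positives become 1 - epsilon; each negative gets epsilon / (N - 1)."""
    n = len(row)
    off_value = epsilon / (n - 1)
    return [y * (1.0 - epsilon) + (1.0 - y) * off_value for y in row]


# Hypothetical one-hot target over 4 entities, epsilon = 0.1.
one_hot = [0.0, 1.0, 0.0, 0.0]
smoothed = smooth_off_target(one_hot, 0.1)

print(smoothed, "sum =", sum(smoothed))
```

With exactly one positive, the row still sums to 1: the positive keeps 1 - epsilon and the remaining epsilon is split evenly over the N - 1 negatives.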


nxznm commented Jan 12, 2021

Hi @lvermue, I think the equation still has a problem. Each position in e2_multi whose label is 1 becomes 1*(1-Config.label_smoothing_epsilon), and each position whose label is 0 becomes Config.label_smoothing_epsilon / (e2_multi.size(1) - 1). However, e2_multi can have more than one position whose label is 1, so the distribution sums to more than 1, while it should sum to exactly 1. Do you have any ideas?
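
The effect can be checked on a toy multi-hot row. The sketch below is plain Python with made-up names. `smooth_normalized` is only one hypothetical way to make the row sum to 1 (split 1 - epsilon over the positives and epsilon over the negatives); it is not something proposed elsewhere in this thread:

```python
def smooth_off_target(row, epsilon):
    """Formula quoted above: positives -> 1 - epsilon, negatives -> epsilon / (N - 1)."""
    n = len(row)
    return [y * (1.0 - epsilon) + (1.0 - y) * epsilon / (n - 1) for y in row]


def smooth_normalized(row, epsilon):
    """Hypothetical variant: share 1 - epsilon among positives, epsilon among negatives."""
    n = len(row)
    k = sum(1 for y in row if y == 1.0)  # number of positive labels
    if k == 0 or k == n:
        raise ValueError("need at least one positive and one negative label")
    return [(1.0 - epsilon) / k if y == 1.0 else epsilon / (n - k) for y in row]


# Hypothetical row with two positives among 4 entities, epsilon = 0.1.
multi_hot = [1.0, 1.0, 0.0, 0.0]
row_sum = sum(smooth_off_target(multi_hot, 0.1))
normalized_sum = sum(smooth_normalized(multi_hot, 0.1))

print("quoted formula sum:    ", row_sum)
print("normalized variant sum:", normalized_sum)
```

With k positives among N entries, the quoted formula sums to k*(1 - epsilon) + (N - k)*epsilon/(N - 1), which exceeds 1 as soon as k > 1. Whether the row needs to sum to 1 at all depends on the loss being used, which this sketch does not assume.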



4 participants