BackPACK's extensions that rely on the probabilistic interpretation of a loss function as a negative log likelihood (quantities based on the Fisher, i.e. BatchDiagGGNMC, DiagGGNMC, SqrtGGNMC, KFAC) are limited to binary labels for BCEWithLogitsLoss.
This issue serves as documentation for the required steps and problems to support continuous-valued labels.
Description: Currently, we assume binary labels $y_n \in \{0; 1\}$. In this case, BCEWithLogitsLoss corresponds to the negative log likelihood of a Bernoulli distribution $p(y \mid f_n)$ with $f_n \in (0; 1)$ the sigmoid probability.
But BCEWithLogitsLoss also supports continuous labels $y_n \in [0; 1]$. In this case, BCEWithLogitsLoss corresponds to the negative log likelihood of a continuous Bernoulli distribution $p(y \mid f_n) \propto f_n^{y} (1 - f_n)^{1 - y}$, such that, up to the additive term from the normalization constant, $-\log p(y = y_n \mid f_n) = -y_n \log(f_n) - (1 - y_n) \log(1 - f_n)$.
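As a quick numerical check of the expression above, the following sketch evaluates BCEWithLogitsLoss on continuous labels and compares it with the formula written out by hand. The tensor values and variable names are illustrative assumptions.

```python
import torch

torch.manual_seed(0)

# Illustrative values; float64 keeps the comparison numerically tight.
logits = torch.randn(5, dtype=torch.float64)  # raw model outputs
y = torch.rand(5, dtype=torch.float64)        # continuous labels in [0, 1)
f = torch.sigmoid(logits)                     # f_n, the sigmoid probability

# Per-sample loss as computed by PyTorch.
loss = torch.nn.BCEWithLogitsLoss(reduction="none")(logits, y)

# Same quantity written out: -y_n log(f_n) - (1 - y_n) log(1 - f_n).
manual = -y * torch.log(f) - (1 - y) * torch.log(1 - f)

matches = torch.allclose(loss, manual)
print(matches)
```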
Implementation: Depending on the nature of the labels (binary or continuous), a different distribution (Bernoulli or continuous Bernoulli) must be used to compute sampled gradients. However, at the moment the _make_distribution function does not take the labels into account; it only receives the subsampled inputs. Hence, the interface must be adapted in order to support continuous labels in BCEWithLogitsLoss.
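One possible shape for the adapted interface is sketched below. The helper name make_distribution, its labels_are_binary argument and the tensor shapes are hypothetical and not BackPACK's actual code; only torch.distributions.Bernoulli and torch.distributions.ContinuousBernoulli are existing PyTorch classes.

```python
import torch
from torch.distributions import Bernoulli, ContinuousBernoulli


def make_distribution(logits, labels_are_binary):
    """Hypothetical helper: pick the likelihood that matches the label type."""
    if labels_are_binary:
        return Bernoulli(logits=logits)
    return ContinuousBernoulli(logits=logits)


torch.manual_seed(0)
logits = torch.randn(4, 1, requires_grad=True)  # illustrative model outputs
loss_func = torch.nn.BCEWithLogitsLoss(reduction="sum")

# Draw would-be labels from the chosen likelihood, then differentiate the
# loss w.r.t. the logits to obtain one sampled gradient.
dist = make_distribution(logits.detach(), labels_are_binary=False)
y_sampled = dist.sample()
(mc_grad,) = torch.autograd.grad(loss_func(logits, y_sampled), logits)
```

With labels_are_binary=True the same code path would draw labels in {0, 1} instead of [0, 1].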
Problems:
This approach would determine the nature of the labels at run time. If we use a data set with non-binary labels but coincidentally feed a batch that contains only binary labels (or a single sample), it will pick the wrong distribution. Not sure how to fix this, other than asking the user for the nature of their data.
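The failure mode can be reproduced in a few lines. The run-time check looks_binary below is a hypothetical illustration of the approach discussed above, not an existing BackPACK function.

```python
import torch


def looks_binary(labels):
    """Hypothetical run-time check: True if every label is exactly 0 or 1."""
    return bool(((labels == 0) | (labels == 1)).all())


# A data set with continuous labels that happens to contain some 0s and 1s.
labels = torch.tensor([0.0, 0.3, 1.0, 0.7, 1.0, 0.0])

full_check = looks_binary(labels)              # whole data set: not binary
batch_check = looks_binary(labels[[0, 2, 4]])  # this batch only holds 0s and 1s
print(full_check, batch_check)
```

The batch-level check disagrees with the data set as a whole, which is why a user-supplied flag may be the more robust option.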