ENH: calculating d-prime from confusion matrices and samples #8
`.gitignore`:

```diff
@@ -3,3 +3,6 @@
 __pycache__
 .idea
 build
+*.DS_Store
+*~
+.*swp
```
`bangmetric/dprime.py`:

```diff
@@ -2,15 +2,20 @@
 # Authors: Nicolas Pinto <[email protected]>
 #          Nicolas Poilvert <[email protected]>
+#          Ha Hong <[email protected]>
 #
 # License: BSD

-__all__ = ['dprime']
+__all__ = ['dprime', 'dprime_from_confusion_ova']

 import numpy as np
 from scipy.stats import norm

+DEFAULT_FUDGE_FACTOR = 0.5
+DEFAULT_FUDGE_MODE = 'correction'

-def dprime(y_pred, y_true):
+def dprime(y_pred, y_true, **kwargs):
     """Computes the d-prime sensitivity index of the predictions.

     Parameters
```

Review thread on the new `dprime(y_pred, y_true, **kwargs)` wrapper:

- Reviewer: This wrapper looks redundant; we should merge it with […]
- Author: You mean we don't need to have a separate […]?
- Reviewer: Yep! I think we should have the convention that […]
- Author: Wait, do you want to compute d' with sample values ([…])? I think the primary use of […]. And I think the primary use of […]. Well, now it seems to me that we need both for usability.
- Reviewer: Still not sure about the kwargs here, and the separation/wrapping around […].
- Author: Hmm, I don't think having […]. Just in case, if you want to remove […]. CC'ing others @yamins81, @cadieu --- any thoughts? PS: d-prime is used quite often... at least by me ;-] and there are many different ways to compute it, so that's why I'm obsessing over it.
- Reviewer: It looks like the wrong pattern in this case; how about just adding a fully documented kwarg that changes the interpretation of […]?
- Author: This sounds good to me. I'll work on that!
- Author: I thought I understood your comment, but now I'm not so sure. Would the new dprime() code (https://github.com/dicarlolab/bangmetric/blob/6d48df3ea74131fb712aaf434b064fb0c287b387/bangmetric/dprime.py) be okay for you?
- Author: Also, if you like the idea about […]. After that, […] --- and of course, […].
```diff
@@ -23,10 +28,14 @@ def dprime(y_pred, y_true):
     y_pred: array, shape = [n_samples]
         Predicted values (real).

+    kwargs: named arguments, optional
+        Passed to ``dprime_from_samp()``.

     Returns
     -------
     dp: float or None
-        d-prime, None if d-prime is undefined
+        d-prime, None if d-prime is undefined and raw d-prime value (``safedp=False``)
+        is not requested (default).

     References
     ----------
```

Review thread on the ``Returns`` section:

- Reviewer: […]
- Author: Gosh, you're right. Thanks for the note!
```diff
@@ -45,18 +54,145 @@ def dprime(y_pred, y_true):
     assert y_pred.ndim == 1

     # -- actual computation
-    pos = y_true > 0
-    neg = ~pos
-    pos_mean = y_pred[pos].mean()
-    neg_mean = y_pred[neg].mean()
-    pos_var = y_pred[pos].var(ddof=1)
-    neg_var = y_pred[neg].var(ddof=1)
-    num = pos_mean - neg_mean
-    div = np.sqrt((pos_var + neg_var) / 2.)
-    if div == 0:
-        dp = None
-    else:
-        dp = num / div
+    i_pos = y_true > 0
+    i_neg = ~i_pos
+
+    pos = y_pred[i_pos]
+    neg = y_pred[i_neg]
+
+    dp = dprime_from_samp(pos, neg, bypass_nchk=True, **kwargs)
     return dp
```
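For concreteness, a minimal sketch of the delegation above; the labels and projection values are made up, and the import path assumes this PR's module layout (``bangmetric/dprime.py``):

```python
import numpy as np
from bangmetric.dprime import dprime, dprime_from_samp

y_true = np.array([1, 1, 1, -1, -1, -1])            # >0 marks the positives
y_pred = np.array([0.9, 0.7, 0.8, 0.1, 0.3, 0.2])   # classifier projections

dp_wrapped = dprime(y_pred, y_true)                  # wrapper form
dp_samples = dprime_from_samp(y_pred[y_true > 0],    # explicit sample form
                              y_pred[y_true <= 0])
assert np.isclose(dp_wrapped, dp_samples)            # both give d' = 6.0
```

Clipping kwargs such as ``maxv``/``minv`` given to ``dprime()`` are simply forwarded to ``dprime_from_samp()``.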
New function added by the PR:

```python
def dprime_from_samp(pos, neg, maxv=None, minv=None, safedp=True, bypass_nchk=False):
    """Computes the d-prime sensitivity index from positive and negative samples.

    Parameters
    ----------
    pos: array-like
        Positive sample values (e.g., raw projection values of the positive classifier).

    neg: array-like
        Negative sample values.

    maxv: float, optional
        Maximum possible d-prime value. If None (default), there's no limit on
        the maximum value.

    minv: float, optional
        Minimum possible d-prime value. If None (default), there's no limit.

    safedp: bool, optional
        If True (default), this function will return None if the resulting
        d-prime value becomes non-finite.

    bypass_nchk: bool, optional
        If False (default), do not bypass the test that ensures there are
        enough positive and negative samples for the variance estimation.

    Returns
    -------
    dp: float or None
        d-prime, None if d-prime is undefined and raw d-prime value (``safedp=False``)
        is not requested (default).

    References
    ----------
    http://en.wikipedia.org/wiki/D'
    """

    pos = np.array(pos)
    neg = np.array(neg)

    if not bypass_nchk:
        assert pos.size > 1, 'Not enough positive samples to estimate the variance'
        assert neg.size > 1, 'Not enough negative samples to estimate the variance'

    pos_mean = pos.mean()
    neg_mean = neg.mean()
    pos_var = pos.var(ddof=1)
    neg_var = neg.var(ddof=1)

    num = pos_mean - neg_mean
    div = np.sqrt((pos_var + neg_var) / 2.)

    # from Dan's suggestion about clipping d' values...
    if maxv is None:
        maxv = np.inf
    if minv is None:
        minv = -np.inf

    dp = np.clip(num / div, minv, maxv)

    if safedp and not np.isfinite(dp):
        dp = None

    return dp
```

Review threads on ``dprime_from_samp``:

- On the ``maxv`` parameter --- Reviewer: ``max_value : None or float, optional``; Author: Sounds good.
- On "If None (default), there's no limit" --- Reviewer: In this case we should use […]; Author: Sounds good.
- On ``safedp`` --- Reviewer: Why is this needed? Reviewer: Should it raise an exception instead? Or let the user deal with it? Author: Well, I inherited that from the existing code. I don't like returning […]
- On ``bypass_nchk`` --- Reviewer: Any context where this flag would be used? Should we delegate this to the caller? Author: Agreed. Will remove that part.
- On the ``assert pos.size > 1`` check --- Reviewer: should be an exception; Author: OK.
- On the ``np.inf`` fallbacks --- Reviewer: should be their default value?; Author: Agreed.
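As a hand-check of the formula the new function implements, d' = (mean(pos) − mean(neg)) / sqrt((var(pos) + var(neg)) / 2) with unbiased variances; the sample values below are made up:

```python
import numpy as np

pos = np.array([1.2, 0.9, 1.5, 1.1])    # made-up positive samples
neg = np.array([0.2, -0.1, 0.4, 0.0])   # made-up negative samples

num = pos.mean() - neg.mean()                             # mean separation
div = np.sqrt((pos.var(ddof=1) + neg.var(ddof=1)) / 2.)   # pooled (ddof=1) spread
print(num / div)                                          # raw, unclipped d-prime
```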
Second new function added by the PR:

```python
def dprime_from_confusion_ova(M, fudge_mode=DEFAULT_FUDGE_MODE, \
        fudge_fac=DEFAULT_FUDGE_FACTOR):
    """Computes the one-vs-all d-prime sensitivity index of the confusion matrix.

    Parameters
    ----------
    M: array, shape = [n_classes (true), n_classes (pred)]
        Confusion matrix, where the element M_{rc} means the number of
        times the classifier guesses that a test sample in the r-th class
        belongs to the c-th class.

    fudge_fac: float, optional
        A small factor to avoid non-finite numbers when the TPR or FPR
        becomes 0 or 1.

    fudge_mode: str, optional
        Determines how to apply the fudge factor:
        'always': always apply the fudge factor
        'correction': apply only when needed

    Returns
    -------
    dp: array, shape = [n_classes]
        Array of d-primes, one element per class.

    References
    ----------
    http://en.wikipedia.org/wiki/D'
    http://en.wikipedia.org/wiki/Confusion_matrix
    """

    M = np.array(M)
    assert M.ndim == 2
    assert M.shape[0] == M.shape[1]

    P = np.sum(M, axis=1)    # number of positives, for each class
    N = np.sum(P) - P

    TP = np.diag(M)
    FP = np.sum(M, axis=0) - TP

    if fudge_mode == 'always':           # always apply fudge factor
        TPR = (TP.astype('float') + fudge_fac) / (P + 2. * fudge_fac)
        FPR = (FP.astype('float') + fudge_fac) / (N + 2. * fudge_fac)

    elif fudge_mode == 'correction':     # apply fudge factor only when needed
        TP = TP.astype('float')
        FP = FP.astype('float')

        TP[TP == P] = P[TP == P] - fudge_fac    # 100% correct
        TP[TP == 0] = fudge_fac                 # 0% correct
        FP[FP == N] = N[FP == N] - fudge_fac    # always false alarm
        FP[FP == 0] = fudge_fac                 # no false alarm

        TPR = TP / P
        FPR = FP / N

    else:
        assert False, 'Not implemented'

    dp = norm.ppf(TPR) - norm.ppf(FPR)

    # If there are only two d-primes, it must be an "A" vs. "~A" task,
    # so return a single value.
    if len(dp) == 2:
        dp = np.array([dp[0]])

    return dp
```

Review threads on ``dprime_from_confusion_ova``:

- On ``fudge_fac`` --- Reviewer: […]; Author: OK.
- On ``TP.astype('float')`` --- Reviewer: […]; Author: float64.
- On ``assert False, 'Not implemented'`` --- Reviewer: […]; Author: OK.
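A minimal usage sketch (the 3-class confusion matrix is made up; the import path assumes this PR's module layout). With the default ``fudge_mode='correction'``, the fudge factor only kicks in where the raw TPR or FPR would be exactly 0 or 1, as for class 0 below:

```python
import numpy as np
from bangmetric.dprime import dprime_from_confusion_ova

M = np.array([[10,  0,  0],    # true class 0: classified perfectly
              [ 1,  8,  1],    # true class 1
              [ 0,  2,  8]])   # true class 2

dp = dprime_from_confusion_ova(M)   # one-vs-all d', one entry per class
print(dp)                           # shape (3,); class 0 uses the fudged TPR

dp_always = dprime_from_confusion_ova(M, fudge_mode='always')  # fudge every class
```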
Review thread on the ``_ova`` naming:

- Reviewer: Not sure about the ``_ova`` part; it could be made more generic if one provides the binary output codes used to compute the confusion matrix --- what do you think?
- Reviewer: By default it could be interpreted as OvR (only if it is square), but any binary-like output code could work --- we would just need a convention for it (e.g. ``-1 = False``, ``+1 = True``, ``0 = ignore``). At this stage we should just assume it's OvR and let the user know in the docstring, but possibly open up the possibility of other output codes (e.g. OvO). Thoughts?
- Author: That is a definite possibility. I like that convention --- will work on that. Meanwhile, the current version of ``dprime_from_confusion_ova()`` is meant to be used when there's no access to the internal representation / decision making (as with human data). This function computes n OvA d's from the given n-by-n confusion matrix --- but I admit that there's no clear analytical/mathematical connection between a d' computed from an n-way confusion matrix and one from a 2-way binary classifier.
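For illustration only (not part of the PR): a sketch of the output-code convention floated above, under the ``-1 = False``, ``+1 = True``, ``0 = ignore`` assumption; ``ovr_code`` is a hypothetical helper:

```python
import numpy as np

def ovr_code(n_classes):
    """OvR as an output-code matrix: one column per one-vs-rest task,
    +1 for the target class and -1 for all others (no 0/ignore entries)."""
    code = -np.ones((n_classes, n_classes), dtype=int)
    np.fill_diagonal(code, 1)
    return code

print(ovr_code(3))
# [[ 1 -1 -1]
#  [-1  1 -1]
#  [-1 -1  1]]
```

Under this reading, a square confusion matrix corresponds to the default OvR case, while other codes (e.g. OvO) would use different column patterns.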