ENH: calculating d-prime from confusion matrices and samples #8

Open
wants to merge 56 commits into base: master
Show file tree
Hide file tree
Changes from 9 commits

Commits (56)
340f5d3
MISC: updating .gitignore
hahong Jul 17, 2012
fa13692
ENH: added d-prime calculation from a confusion matrix
hahong Jul 17, 2012
9db9829
Merge branch 'master' of https://github.com/npinto/bangmetric
hahong Jul 17, 2012
547d490
Merge branch 'feature_dprime'
hahong Jul 17, 2012
0b00116
MISC: small cosmetics changes and assertions to check positives and n…
hahong Jul 17, 2012
69f0974
ENH: added d-prime calculation function that directly takes sample va…
hahong Jul 18, 2012
43cabf5
MISC: small changes for 2x2 confusion matrix d' calculation
hahong Jul 18, 2012
5f8f071
MISC: no need to "balance" data for d' calculation
hahong Jul 18, 2012
a056d02
Merge branch 'feature_dprime'
hahong Jul 18, 2012
d6a9cf6
MISC: addressing most stuffs in github.com/npinto/bangmetric/pull/8 (…
hahong Jul 19, 2012
6f5cbac
DOC: small retouches
hahong Jul 19, 2012
afa86fe
COSMIT
hahong Jul 19, 2012
8babb28
COSMIT
hahong Jul 19, 2012
d6ab20b
ENH: more general dprime_from_confusion (thanks, @npinto!)
hahong Jul 19, 2012
3f5eb03
Merge branch 'feature_dprime'
hahong Jul 19, 2012
60814d8
COSMIT
hahong Jul 19, 2012
1d926a2
Merge branch 'feature_dprime'
hahong Jul 19, 2012
056aa5e
ENH: refactoring out a function that computes stats of a confu matrix.
hahong Jul 19, 2012
396224b
COSMIT: refactoring confusion matrix handling part
hahong Jul 19, 2012
dd59101
Merge branch 'feature_utils' into feature_dprime
hahong Jul 19, 2012
c2f53ee
Merge branch 'feature_dprime'
hahong Jul 19, 2012
ad8e3af
COSMIT
hahong Jul 19, 2012
1339ae2
Merge branch 'feature_utils' into feature_dprime
hahong Jul 19, 2012
b1d8b77
DOC: small changes
hahong Jul 19, 2012
99f5354
Merge branch 'feature_utils' into feature_dprime
hahong Jul 19, 2012
b0d58c1
DOC: small changes
hahong Jul 19, 2012
b28f6f3
Merge branch 'feature_utils' into feature_dprime
hahong Jul 19, 2012
0deae47
Merge branch 'feature_dprime'
hahong Jul 19, 2012
15295c5
COSMIT: combined dprime() and dprime_from_samp()
hahong Jul 21, 2012
69f89ec
COSMIT
hahong Jul 21, 2012
f515857
Merge branch 'feature_utils' into feature_dprime
hahong Jul 21, 2012
341d29a
COSMIT
hahong Jul 21, 2012
6d48df3
Merge branch 'feature_dprime'
hahong Jul 21, 2012
de48e46
MISC: small errors and cosmetic changes
hahong Jul 24, 2012
cad170b
MISC: merge dprime_from_confusion_matrix and dprime
hahong Jul 24, 2012
7c47499
Merge branch 'feature_dprime'
hahong Jul 24, 2012
f0a4f1b
DOC: small changes
hahong Jul 24, 2012
2df5287
Merge branch 'master' into feature_utils
hahong Jul 24, 2012
5b07e4c
COSMIT
hahong Jul 24, 2012
962885b
Merge branch 'feature_utils'
hahong Jul 24, 2012
0748c3a
ENH: added metrics for human data
hahong Jul 24, 2012
75e3679
Merge branch 'feature_humans'
hahong Jul 24, 2012
95ed1fc
COSMIT
hahong Jul 24, 2012
8b2f3e6
Merge branch 'feature_humans'
hahong Jul 24, 2012
ab6df56
ENH: added confusion matrix support to accuracy()
hahong Jul 24, 2012
608f869
Merge branch 'feature_machlearning'
hahong Jul 24, 2012
b1dedff
DOC: misc changes
hahong Jul 24, 2012
a5357fc
Merge branch 'feature_dprime'
hahong Jul 24, 2012
2e34b76
TST: fixed bugs in reference value
hahong Jul 24, 2012
dbd5326
Merge branch 'feature_dprime'
hahong Jul 24, 2012
f3fd043
MISC: small changes to clip ppf values in dprime()
hahong Jul 25, 2012
a198d44
Merge branch 'feature_dprime'
hahong Jul 25, 2012
b58a0fe
DOC: small typos..
hahong Jul 25, 2012
e614a6f
Merge branch 'feature_humans'
hahong Jul 25, 2012
9f0cbd7
fixed a bug: np.sort makes a copy while array.sort is inplace
cadieu Oct 26, 2012
d575111
updated installation
qbilius Dec 7, 2016
3 changes: 3 additions & 0 deletions .gitignore
@@ -3,3 +3,6 @@
__pycache__
.idea
build
*.DS_Store
*~
.*swp
158 changes: 147 additions & 11 deletions bangmetric/dprime.py
@@ -2,15 +2,20 @@

# Authors: Nicolas Pinto <[email protected]>
# Nicolas Poilvert <[email protected]>
# Ha Hong <[email protected]>
#
# License: BSD

__all__ = ['dprime']
__all__ = ['dprime', 'dprime_from_confusion_ova']
Owner:
not sure about the _ova part; it could be made more generic if one provides the binary output codes used to compute the confusion matrix -- what do you think?

Owner:
By default it could be interpreted as OvR (only if it is square), but any binary-like output code could work -- we would just need a convention for it (e.g. -1 = False, +1 = True, 0 = ignore). At this stage we should just assume it's OvR and let the user know in the docstring, but possibly open up the possibility of other output codes (e.g. OvO).

Thoughts?

Author:
That is a definite possibility. I like that convention --- will work on that.

Meanwhile, the current version of dprime_from_confusion_ova() is meant to be used when there's no access to internal representations / decision making (like human data). This function computes n OVA d's from the given n-by-n confusion matrix --- but I admit that there's no clear analytical/mathematical connection between d' computed from an n-way confusion matrix and a 2-way binary classifier.
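A possible reading of the ±1/0 output-code convention sketched above, on a made-up 3-way confusion matrix (the single OvR column [+1, -1, -1] just reproduces what dprime_from_confusion_ova() already does for class 0; general output-code handling itself is not part of this PR):

import numpy as np

# hypothetical output code for one binary task over 3 classes:
# +1 = positive class, -1 = negative class, 0 = ignored
code = np.array([+1, -1, -1])
M = np.array([[8, 1, 1],    # made-up confusion matrix,
              [2, 7, 1],    # rows = true class,
              [0, 2, 8]])   # cols = predicted class

pos, neg = code > 0, code < 0
TP = M[np.ix_(pos, pos)].sum()        # 8:  positive-class samples predicted positive
P  = M[np.ix_(pos, pos | neg)].sum()  # 10: all positive-class samples (ignored cols dropped)
FP = M[np.ix_(neg, pos)].sum()        # 2:  negative-class samples predicted positive
N  = M[np.ix_(neg, pos | neg)].sum()  # 20: all negative-class samples
# TPR = 8/10, FPR = 2/20, and d' = norm.ppf(TPR) - norm.ppf(FPR) as in the code below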


import numpy as np
from scipy.stats import norm

DEFAULT_FUDGE_FACTOR = 0.5
DEFAULT_FUDGE_MODE = 'correction'

def dprime(y_pred, y_true):

def dprime(y_pred, y_true, **kwargs):
Owner:
This wrapper looks redundant, we should merge it with dprime_from_samp.

Author:
You mean we don't need to have a separate dprime() and dprime_from_samp() --- and you mean we should compute d' directly from positives and negatives like in dprime_from_samp(), right? If this is what you think, I completely agree with you and can reflect that immediately.

Owner:
Yep! I think we should have the convention that y_pred should be binary, or interpreted as such (with positives => True, else False), what's your take on that?

Author:
Wait, do you want to compute d' with sample values (pos and neg) or with y_pred and y_true?

I think the primary use of dprime() would be for classifier outputs: as y_pred and y_true would be already available from the classification result.

And I think the primary use of dprime_from_samp() would be computing d' from raw neural outputs or human subject magnitude data. For instance, consider computing d' for one neuron's face sensitivity, based on its firing rate when a face is presented vs. when a blank is presented --- in this case, the user needs to repack the raw data into y_pred and y_true.

Well, now it seems to me that we need both for usability.
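A minimal sketch of the two calling conventions under discussion, with made-up firing-rate values (the import assumes this PR's module layout):

import numpy as np
from bangmetric.dprime import dprime, dprime_from_samp

face_resp  = np.array([12.1, 9.8, 11.5, 10.2])  # responses when a face is shown
blank_resp = np.array([4.3, 5.1, 3.9, 4.8])     # responses when a blank is shown

# sample-based form: pass the two groups of values directly
dp1 = dprime_from_samp(face_resp, blank_resp)

# label-based form: repack the same data into y_pred / y_true first
y_pred = np.concatenate([face_resp, blank_resp])
y_true = np.concatenate([np.ones(face_resp.size), -np.ones(blank_resp.size)])
dp2 = dprime(y_pred, y_true)                    # same value as dp1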

Owner:
still not sure about the kwargs here, and the separation/wrapping around dprime_from_samp -- do we have to make it that difficult / over-parametrized for such a simple, not-often-used metric?

Author:
Hmm... I don't think having kwargs hurts us. Apart from kwargs, dprime() is already simple to use. Advanced/cautious users will look up kwargs anyway --- and others will simply ignore kwargs and dprime() will still work happily.

Just in case you want to remove dprime_from_samp(): I would like to keep it, because I often encounter cases where pos and neg are naturally available first, and thus I would need to re-package them into y_true and y_pred to use dprime(). I know this repackaging would be one line of code, but --- if we didn't have a separate dprime_from_samp() --- dprime() would unpack the things in y_true and y_pred back into pos and neg internally anyway, and this feels unnecessary to me.

CC'ing others @yamins81, @cadieu --- any thoughts?

PS: and d-prime is used quite often... at least by me ;-] and there are many different ways to compute it, so that's why I'm obsessing over it.

Owner:
It looks like the wrong pattern in this case; how about just adding a fully documented kwarg that changes the interpretation of y_true and y_pred?

Author:
This sounds good to me. I'll work on that!

Author:
I thought I understood your comment but now I'm not so sure...

Would the new dprime() code (https://github.com/dicarlolab/bangmetric/blob/6d48df3ea74131fb712aaf434b064fb0c287b387/bangmetric/dprime.py) be okay for you?

Author:
Also, if you like the idea about pos and neg kwargs, then what about migrating the following code in dprime_from_confusion_matrix() into dprime() as well?

TPR = TP / P
FPR = FP / N
dp = np.clip(norm.ppf(TPR) - norm.ppf(FPR), min_value, max_value)

After that, dprime() will have three different modes:

  • y_pred and y_true
  • pos and neg
  • TPR and FPR

--- and of course, dprime_from_confusion_matrix() will be a small wrapper function if we do that.
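For what it's worth, a rough sketch of what such a three-mode entry point could look like (the name dprime_any and its keyword layout are hypothetical, not something this PR settles on):

import numpy as np
from scipy.stats import norm

def dprime_any(pos=None, neg=None, y_pred=None, y_true=None,
               TPR=None, FPR=None, min_value=-np.inf, max_value=np.inf):
    """Hypothetical single entry point for the three modes discussed above."""
    if TPR is not None and FPR is not None:
        # rate-based mode: d' = z(TPR) - z(FPR)
        return np.clip(norm.ppf(TPR) - norm.ppf(FPR), min_value, max_value)
    if y_pred is not None and y_true is not None:
        # label-based mode: split predictions by the sign of y_true
        y_pred, y_true = np.asarray(y_pred), np.asarray(y_true)
        pos, neg = y_pred[y_true > 0], y_pred[y_true <= 0]
    # sample-based mode: standard d' from two groups of values
    pos, neg = np.asarray(pos), np.asarray(neg)
    num = pos.mean() - neg.mean()
    div = np.sqrt((pos.var(ddof=1) + neg.var(ddof=1)) / 2.)
    return np.clip(num / div, min_value, max_value)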

"""Computes the d-prime sensitivity index of the predictions.

Parameters
@@ -23,10 +28,14 @@ def dprime(y_pred, y_true):
y_pred: array, shape = [n_samples]
Predicted values (real).

kwargs: named arguments, optional
Passed to ``dprime_from_samp()``.

Returns
Owner:
dprime_from_samp will not be visible when doing from bangmetric import *

Author:
Gosh, you're right. Thanks for the note!

-------
dp: float or None
d-prime, None if d-prime is undefined
d-prime, or None if d-prime is non-finite and the raw value is not
requested (i.e. ``safedp=True``, the default).

References
----------
@@ -45,18 +54,145 @@ def dprime(y_pred, y_true):
assert y_pred.ndim == 1

# -- actual computation
pos = y_true > 0
neg = ~pos
pos_mean = y_pred[pos].mean()
neg_mean = y_pred[neg].mean()
pos_var = y_pred[pos].var(ddof=1)
neg_var = y_pred[neg].var(ddof=1)
i_pos = y_true > 0
i_neg = ~i_pos

pos = y_pred[i_pos]
neg = y_pred[i_neg]

dp = dprime_from_samp(pos, neg, bypass_nchk=True, **kwargs)
return dp


def dprime_from_samp(pos, neg, maxv=None, minv=None, safedp=True, bypass_nchk=False):
"""Computes the d-prime sensitivity index from positive and negative samples.

Parameters
----------
pos: array-like
Positive sample values (e.g., raw projection values of the positive classifier).

neg: array-like
Negative sample values.

maxv: float, optional
Owner:
max_value : None or float, optional

Author:
Sounds good.

Maximum possible d-prime value. If None (default), there's no limit on
Owner:
In this case we should use np.inf instead of None

Author:
Sounds good.

the maximum value.

minv: float, optional
Minimum possible d-prime value. If None (default), there's no limit.

safedp: bool, optional
If True (default), this function will return None if the resulting d-prime
value becomes non-finite.
Owner:
Why is this needed?

Owner:
Should it raise an exception instead? Or let the user deal with it?

Author:
Well, I inherited that from the existing code. I don't like returning None either. I think we should return dp regardless of whether it's np.inf or np.nan, and the user should deal with that.


bypass_nchk: bool, optional
If False (default), do not bypass the test that ensures there are enough
positive and negative samples for the variance estimation.
Owner:
Any context where this flag would be used? Should we delegate this to the caller?

Author:
Agreed. Will remove that part.


Returns
-------
dp: float or None
d-prime, or None if d-prime is non-finite and the raw value is not
requested (i.e. ``safedp=True``, the default).

References
----------
http://en.wikipedia.org/wiki/D'
"""

pos = np.array(pos)
neg = np.array(neg)

if not bypass_nchk:
assert pos.size > 1, 'Not enough positive samples to estimate the variance'
Owner:
should be an exception

Author:
OK

assert neg.size > 1, 'Not enough negative samples to estimate the variance'

pos_mean = pos.mean()
neg_mean = neg.mean()
pos_var = pos.var(ddof=1)
neg_var = neg.var(ddof=1)

num = pos_mean - neg_mean
div = np.sqrt((pos_var + neg_var) / 2.)

# from Dan's suggestion about clipping d' values...
if maxv is None:
maxv = np.inf
if minv is None:
minv = -np.inf
Owner:
should be their default value?

Author:
Agreed.


dp = np.clip(num / div, minv, maxv)

if safedp and not np.isfinite(dp):
dp = None

return dp
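As a small illustration of the clipping and safedp behavior implemented above (the import path assumes this PR's module layout):

from bangmetric.dprime import dprime_from_samp

pos = [1., 1., 1.]   # zero sample variance
neg = [0., 0., 0.]   # zero sample variance -> pooled sd is 0, raw d' is +inf

print(dprime_from_samp(pos, neg))            # None: inf is not finite and safedp=True
print(dprime_from_samp(pos, neg, maxv=5.0))  # 5.0: inf is clipped to maxv before the check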


def dprime_from_confusion_ova(M, fudge_mode=DEFAULT_FUDGE_MODE, \
fudge_fac=DEFAULT_FUDGE_FACTOR):
"""Computes the one-vs-all d-prime sensitivity index of the confusion matrix.

Parameters
----------
M: array, shape = [n_classes (true), n_classes (pred)]
Confusion matrix, where element M_{rc} is the number of times
the classifier guesses that a test sample in the r-th class
belongs to the c-th class.

fudge_fac: float, optional
Owner:
fudge_factor

Author:
OK

A small factor to avoid non-finite numbers when TPR or FPR becomes 0 or 1.

fudge_mode: str, optional
Determines how to apply the fudge factor:
'always': always apply the fudge factor
'correction': apply only when needed

Returns
-------
dp: array, shape = [n_classes]
Array of d-primes, each element corresponding to each class

References
----------
http://en.wikipedia.org/wiki/D'
http://en.wikipedia.org/wiki/Confusion_matrix
"""

M = np.array(M)
assert M.ndim == 2
assert M.shape[0] == M.shape[1]

P = np.sum(M, axis=1) # number of positives, for each class
N = np.sum(P) - P

TP = np.diag(M)
FP = np.sum(M, axis=0) - TP

if fudge_mode == 'always': # always apply fudge factor
TPR = (TP.astype('float') + fudge_fac) / (P + 2.*fudge_fac)
Owner:
float32 or float64 ?

Author:
float64

FPR = (FP.astype('float') + fudge_fac) / (N + 2.*fudge_fac)

elif fudge_mode == 'correction': # apply fudge factor only when needed
TP = TP.astype('float')
FP = FP.astype('float')

TP[TP == P] = P[TP == P] - fudge_fac # 100% correct
TP[TP == 0] = fudge_fac # 0% correct
FP[FP == N] = N[FP == N] - fudge_fac # always FAR
FP[FP == 0] = fudge_fac # no false alarm

TPR = TP / P
FPR = FP / N

else:
assert False, 'Not implemented'
Owner:
raise NotImplementedError("message")

Author:
OK


dp = norm.ppf(TPR) - norm.ppf(FPR)
# if there are only two d's, it must be an "A" vs. "~A" task; if so, just give one value
if len(dp) == 2:
dp = np.array([dp[0]])

return dp
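A quick usage sketch for dprime_from_confusion_ova() (the confusion matrices are made up; the import assumes this PR's module layout):

import numpy as np
from bangmetric.dprime import dprime_from_confusion_ova

# 3-way task: rows = true class, cols = predicted class
M = np.array([[8, 1, 1],
              [2, 7, 1],
              [0, 2, 8]])
print(dprime_from_confusion_ova(M))   # three one-vs-all d' values, one per class

# 2-way ("A" vs. "~A") task: collapsed to a single d' by the code above
M2 = np.array([[9, 1],
               [3, 7]])
print(dprime_from_confusion_ova(M2))  # array with a single element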