Understanding how crossnobis distance metric is derived #423
-
The diagonal is always set to 0, as the interpretation of noise on the diagonal is not obvious.
-
Yes, the distance between condition i and condition i is always zero, even for cross-validated estimates (we do not set the diagonal artificially to zero). It's pretty easy to see from the math. Check the following paper: For the Euclidean distances (Eq. 1), the \delta (difference between conditions) is always zero if the condition is the same. Joern
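A sketch of that argument, written in assumed notation rather than copied from the paper: \hat{b}_i^{(m)} denotes the pattern estimate for condition i in partition m, and the normalisation constant is left out because it does not affect the conclusion.

```latex
\hat{\delta}_{ij}^{(m)} = \hat{b}_i^{(m)} - \hat{b}_j^{(m)},
\qquad
d_{ij} \;\propto\; \sum_{m \neq n} \hat{\delta}_{ij}^{(m)\,\top}\, \hat{\delta}_{ij}^{(n)}

i = j \;\Rightarrow\; \hat{\delta}_{ii}^{(m)} = \hat{b}_i^{(m)} - \hat{b}_i^{(m)} = 0 \ \ \text{for every } m
\;\Rightarrow\; d_{ii} = 0
```

The difference is taken between conditions within a partition, and only the product is taken across partitions, so measurement noise never enters the diagonal.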
-
Thank you, that cleared things up!
-
Hi Joern and Jasper,
I hope you had a wonderful new year! From the section of your paper that describes cross-validated distances, I understand now that the distance between two identical images (the diagonal values) must be 0 by definition. However, I'm wondering what your thoughts are on estimating the distances across partitions, as follows:
Originally, the equation for the c.v. Euclidean distances is written as where If i = j, then But, if we take differences across partitions, we can rewrite the equation as:
Note that I swapped the positions of m and n. While the distance between identical images should still be small, it is now non-zero by definition. Our hope is that this will capture the variability in fMRI noise. Then, for statistical testing, we could do a t-test against this non-zero diagonal value instead of against 0, as the rsatoolbox documentation suggests (https://rsatoolbox.readthedocs.io/en/stable/distances.html#crossnobis-dissimilarity).
What do you think about this alternative approach to generating a "lower ceiling bound" for neural distance estimation? I have tried this, and the RDM (even the off-diagonal values) is less clear and less interpretable. Why do you think this might be? Intuitively the math seems similar to me, other than implementing a cross-validation between partitions with unequal numbers of measurement repetitions.
Thank you so much for your help.
-
Hi @ahachisuka, I've converted this issue to a "discussion" as it is more about RSA methodology than a software issue.
-
Hello,
I have a question about the crossnobis distance metric from the toolbox. I am actually not using the covariance matrix, so "noise" is just the identity matrix and the distance is simply cross-validated Euclidean distance.
Conceptually, I understand the crossvalidation procedure to mean: If we have k runs, average across (k-1) runs and calculate the distance with the left-out kth run. Repeat iteratively and average across runs.
Code where this is implemented (in rdm/calc.py):
for i_fold, fold in enumerate(cv_folds):
    data_test = datasetCopy.subset_obs(cv_descriptor, fold)
    data_train = datasetCopy.subset_obs(
        cv_descriptor,
        np.setdiff1d(cv_folds, fold)
    )
    measurements_train, _, _ = \
        average_dataset_by(data_train, descriptor)
    measurements_test, _, _ = \
        average_dataset_by(data_test, descriptor)
    rdm = _calc_rdm_crossnobis_single(
        measurements_train, measurements_test, noise)
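For readers without the toolbox at hand, the same leave-one-run-out idea can be sketched in plain NumPy. This is not the rsatoolbox implementation: the function name `cv_euclidean_rdm`, the `(n_runs, n_conditions, n_channels)` array layout, and the simulated data are all assumptions made for illustration, and the noise precision is taken to be the identity.

```python
import numpy as np


def cv_euclidean_rdm(data):
    """Leave-one-run-out cross-validated squared Euclidean distances.

    data: hypothetical array of shape (n_runs, n_conditions, n_channels)
    holding one pattern estimate per condition and run.
    """
    n_runs, n_cond, n_chan = data.shape
    rdm = np.zeros((n_cond, n_cond))
    for test_run in range(n_runs):
        # Average the remaining runs (training set); keep one run for testing.
        train = np.delete(data, test_run, axis=0).mean(axis=0)
        test = data[test_run]
        # Cross-partition inner products between all pairs of conditions.
        kernel = train @ test.T
        diag = np.diag(kernel)
        rdm += diag[None, :] + diag[:, None] - kernel - kernel.T
    # Average over folds and normalise by the number of channels.
    return rdm / (n_runs * n_chan)


rng = np.random.default_rng(0)
signal = rng.normal(size=(4, 20))              # 4 conditions, 20 channels
data = signal + rng.normal(size=(6, 4, 20))    # 6 noisy runs of the same signal
rdm = cv_euclidean_rdm(data)
```

Even though every run carries independent noise, `np.diag(rdm)` comes out as zero, because each diagonal entry is built from the difference of a condition with itself.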
The _calc_rdm_crossnobis_single function is as follows:
def _calc_rdm_crossnobis_single(meas1, meas2, noise) -> NDArray:
    kernel = meas1 @ noise @ meas2.T
    rdm = np.expand_dims(np.diag(kernel), 0) + \
        np.expand_dims(np.diag(kernel), 1) - kernel - kernel.T
    return extract_triu(rdm) / meas1.shape[1]
How does this map onto the Euclidean distance formula, which seems to be of the form d_Euc(x, y) = ||x||^2 + ||y||^2 - 2 x^T y?
Because cross-validation takes the distance between the average of (k-1) runs and the left-out kth run, and fMRI noise differs across runs, I had not expected the distances between identical images to be 0, but they are.
I have also written out the matrix multiplication as for-loops to help with my understanding, but this only confirms that, somehow, the code gives me zero values on the diagonal.
# expanded for-loops:
kernel = np.zeros((meas1.shape[0], meas2.shape[0]))
for i in range(meas1.shape[0]):
    for j in range(meas2.shape[0]):
        kernel[i, j] = np.dot(np.dot(meas1[i], noise), meas2[j])
rdm = np.zeros((meas1.shape[0], meas2.shape[0]))
for i in range(meas1.shape[0]):
    for j in range(meas2.shape[0]):
        rdm[i, j] = kernel[i, i] + kernel[j, j] - kernel[i, j] - kernel[j, i]
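One way to read the last line of the loop: with a = meas1 and b = meas2, the four kernel terms factor as (a_i - a_j)^T N (b_i - b_j), that is, a difference between conditions within each partition, multiplied across partitions. The check below is a sketch on simulated data (the sizes, seed, and identity `noise` matrix are assumptions) comparing the kernel form with that factored form, and with the within-condition gap between partitions, which is what would carry the run-to-run noise.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cond, n_chan = 5, 30
meas1 = rng.normal(size=(n_cond, n_chan))
# Same underlying patterns plus independent noise, standing in for a second partition.
meas2 = meas1 + rng.normal(size=(n_cond, n_chan))
noise = np.eye(n_chan)  # identity precision: plain cross-validated Euclidean

# Kernel form, as in the loop above.
kernel = meas1 @ noise @ meas2.T
diag = np.diag(kernel)
rdm_kernel = diag[None, :] + diag[:, None] - kernel - kernel.T

# Factored form: (a_i - a_j)^T N (b_i - b_j) for every pair (i, j).
diff1 = meas1[:, None, :] - meas1[None, :, :]
diff2 = meas2[:, None, :] - meas2[None, :, :]
rdm_direct = np.einsum('ijp,pq,ijq->ij', diff1, noise, diff2)

# Ordinary squared distance between the two estimates of the SAME condition.
same_condition_gap = np.sum((meas1 - meas2) ** 2, axis=1)
```

`rdm_kernel` and `rdm_direct` agree, and both have a zero diagonal because a_i - a_i = 0 regardless of noise. `same_condition_gap` is strictly positive, but it is a different quantity: it compares one condition across partitions rather than two conditions within a partition.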
Which distance metric is being used for crossnobis/cross-validated Euclidean? How does cross-validation achieve zero values in the diagonal; shouldn't it contain non-zero values due to fMRI noise?
Thank you.