Multicollinearity #413

Open
2 tasks
rob-luke opened this issue Nov 16, 2021 · 2 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@rob-luke
Member

I recently saw an insightful presentation from @helenacockx describing how multicollinearity/collinearity (wikipedia) can affect fNIRS GLM analysis. MNE-NIRS supports both averaging and GLM analysis. The issue of collinearity in fNIRS GLM analysis has been discussed in the literature: "a challenge of these regressor models is collinearity introduced between the task and nuisance regressors, which can happen if the systemic physiological response is correlated with the performance of the task. Collinearity in the regression analysis can destabilize it due to poor mathematical conditioning of the model and can produce unpredictable results." (Santosa et al., 2020). This is of particular concern because short channels (designed to measure systemic activity, not neural activity) are included in the design matrix to remove systemic contributions to the signal, and they can be highly correlated with the task regressor.

Relevant reading:

  • https://dartbrains.org/content/GLM_Single_Subject_Model.html#multicollinearity
  • Santosa et al. (2020) used PCA to remove collinearity among the nuisance regressors, but this did not address collinearity between the short channels and the task regressor: "This was only used to remove collinearity from using multiple short-separation channels in the model, but did not reduce any collinearity between the task regressors and the short-separation nuisance regressors."
  • Need to find more articles on the topic

Tasks

  • Add metrics to quantify collinearity in the design matrix
  • Implement methods to mitigate the problem of collinearity
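
As a starting point for the first task, here is a minimal sketch of two simple diagnostics: the largest absolute correlation between the task regressor and any other column, and the condition number of the standardized design matrix. The function name `collinearity_summary`, the `task_col` argument, and the synthetic regressors are assumptions for illustration only, not part of MNE-NIRS. The sketch also assumes the design matrix has no constant column, since standardizing one would divide by zero.

```python
import numpy as np


def collinearity_summary(design, task_col=0):
    """Hypothetical helper: simple collinearity diagnostics for a design matrix.

    Assumes `design` is (n_samples, n_regressors) with no constant column.
    """
    X = np.asarray(design, dtype=float)
    # Standardize columns so the condition number is not driven by scaling.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    corr = np.corrcoef(Xz, rowvar=False)
    # Correlations between the task regressor and every other regressor.
    task_corr = np.delete(corr[task_col], task_col)
    return {
        "max_abs_task_corr": float(np.max(np.abs(task_corr))),
        "condition_number": float(np.linalg.cond(Xz)),
    }


# Synthetic example: a boxcar task and a short channel that tracks it closely.
rng = np.random.default_rng(0)
n = 400
task = np.tile(np.r_[np.ones(20), np.zeros(20)], n // 40)
short = 0.9 * task + 0.1 * rng.standard_normal(n)
other = rng.standard_normal(n)
summary = collinearity_summary(np.column_stack([task, short, other]))
print(summary)
```

A large `max_abs_task_corr` flags the specific problem described above (a nuisance regressor that tracks the task), while the condition number reflects the overall conditioning of the model.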

References

Hendrik Santosa, Xuetong Zhai, Frank Fishburn, Patrick J. Sparto, Theodore J. Huppert, "Quantitative comparison of correction techniques for removing systemic physiological signal in functional near-infrared spectroscopy studies," Neurophoton. 7(3) 035009 (23 September 2020)

@helenacockx

Thanks for picking this up, Rob! The Santosa article is indeed the only fNIRS-related paper that I found on this topic. In this paper, they propose addressing collinearity by performing a regularized mixed-effects model estimation. However, they concluded that the mixed-effects model showed only a slight improvement in performance (type I errors) compared to the AR-IRLS model, and that this came at the cost of a 10-fold increase in computation time.

I found that this article also gave insight into the concept of multicollinearity and why it should not be solved with orthogonalization: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4412813/

@robertoostenveld asked two of our colleagues at the Donders Institute with more fMRI expertise how to deal with this issue. They described a two-step approach (first regressing out the short channels, then performing the GLM without nuisance regressors) as a conservative but robust method (reducing type I errors at the cost of possible type II errors). "The alternative is to better understand the noise, the collinearity, try to avoid it by changing the task design (not possible here anymore) and apply a more optimized but also more liberal (i.e. more false alarms) joint fit. One of the colleagues also came up with the solution to use model comparison, which goes in the direction of Bayesian stats."
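
To make the difference between the two approaches concrete, here is a minimal sketch on synthetic data using plain ordinary least squares. Everything in it is an assumption for illustration: the signal model, the noise levels, and the names `ols`, `beta_joint`, and `beta_two_step`. It is not the AR-IRLS or mixed-effects estimation from Santosa et al., and the numbers say nothing about real fNIRS data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 600

# Boxcar task regressor (30 samples on, 30 samples off).
task = np.tile(np.r_[np.ones(30), np.zeros(30)], n // 60)
# Assumed systemic signal that is partly correlated with the task.
systemic = 0.7 * task + 0.5 * rng.standard_normal(n)
# Short channel: a noisy measurement of the systemic signal.
short = systemic + 0.1 * rng.standard_normal(n)
# Long channel: task response (true beta = 1.0) plus systemic signal plus noise.
long_ch = 1.0 * task + 1.0 * systemic + 0.3 * rng.standard_normal(n)


def ols(X, y):
    """Ordinary least-squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


ones = np.ones(n)

# Joint fit: task and short channel in the same design matrix.
beta_joint = ols(np.column_stack([ones, task, short]), long_ch)[1]

# Two-step: regress the short channel out first, then fit the task alone.
X_short = np.column_stack([ones, short])
cleaned = long_ch - X_short @ ols(X_short, long_ch)
beta_two_step = ols(np.column_stack([ones, task]), cleaned)[1]

print(f"joint fit beta:    {beta_joint:.2f}")
print(f"two-step fit beta: {beta_two_step:.2f}")
```

In this setup the first step also removes the task-related variance that the short channel shares with the task, so the two-step estimate shrinks toward zero, which matches the "conservative" description above. The joint fit keeps that variance available to the task regressor.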

@helenacockx

I have also been thinking about a metric to quantify collinearity. The Variance Inflation Factor (VIF) is often used in fMRI studies and describes how much the variance of an estimated regression coefficient is inflated by correlation among the predictor variables in the model (https://online.stat.psu.edu/stat501/lesson/12/12.4).
If I understand it correctly, we are mostly interested in how much the task regressor can be explained by the short channels (and less in how much the short channels are correlated with each other, because we are not interested in the betas of the nuisance regressors), so it might be enough to calculate the VIF of the task regressor only. However, this is calculated for each model/subject, so it is not clear to me how to deal with the VIF at the group level.
Furthermore, when using the VIF, we should think carefully about the threshold: https://link.springer.com/article/10.1007%2Fs11135-006-9018-6
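
A minimal sketch of the task-regressor VIF, using the standard definition VIF = 1 / (1 - R²), where R² comes from regressing the task regressor on all other columns of the design matrix. The function name `vif` and the synthetic regressors are assumptions for illustration, not existing MNE-NIRS API. The design matrix is assumed not to contain its own constant column, because the function adds an intercept.

```python
import numpy as np


def vif(design, col):
    """Hypothetical helper: VIF of column `col` of a design matrix.

    Regresses the column on all other columns (plus an intercept)
    and returns 1 / (1 - R^2).
    """
    design = np.asarray(design, dtype=float)
    y = design[:, col]
    others = np.delete(design, col, axis=1)
    X = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return 1.0 / (1.0 - r2)


# Synthetic example: one short channel tracks the task, the other does not.
rng = np.random.default_rng(0)
n = 500
task = np.tile(np.r_[np.ones(25), np.zeros(25)], n // 50)
short_corr = 0.8 * task + 0.2 * rng.standard_normal(n)
short_indep = rng.standard_normal(n)

vif_corr = vif(np.column_stack([task, short_corr]), col=0)
vif_indep = vif(np.column_stack([task, short_indep]), col=0)
print(f"VIF with task-correlated short channel: {vif_corr:.2f}")
print(f"VIF with independent short channel:     {vif_indep:.2f}")
```

Because only the task column is evaluated, correlation among the short channels themselves does not affect the result, which fits the reasoning above. This yields one value per model/subject, so the group-level question remains open.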
