Mapping fMRI brain volumes to a joint low-dimensional space to generate descriptions of the images the subject is seeing.
You can see the results here and check the numbers here; they are average.
Below is a rough outline of what is needed to recreate the experiments; see the documentation in each file for details.
- Process the BOLD activations
  - Use `generate_responses.py` to extract the BOLD activations and ROI info from the .mat file. Activations are saved both with and without temporal smoothing, at delays of 4, 5, 6, and 7 seconds (see the sketch after this list).
  - Use `generate_wu_deconv.m` to estimate voxel-specific HRFs and neural responses via Wiener deconvolution (as in Wu et al., 2013; sketch after this list).
- Create the image vectors from the experimental stimuli
  - Use `create_stim_images.py` to put all stimulus images (from `Stimuli.mat`) into a single folder (`/images`) with names `img_xx.png` for xx = 1 ... n (sketch after this list).
  - Use `img2vec.lua` to create the feature-vector representation of each image in the `/images` folder (sketch after this list).
  - Use `generate_feats.py` to generate the averaged and convolved feature representations for each second of stimuli (sketch after this list).
- Fit the models
  - Use `train_models.py` to fit the different models (sketch after this list).
- Predict the sentences
  - Use `vec2seq.lua` to generate a sentence for each predicted image vector.
Done!
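A minimal sketch of the extraction step in `generate_responses.py`, assuming the .mat file stores a voxels × seconds matrix under a key named `data` (a made-up name) and a 1 s TR; the real script also pulls the ROI info and the temporally smoothed variants:

```python
import numpy as np
from scipy.io import loadmat

def load_responses(mat_path, delay, key="data"):
    """Load BOLD activations shifted by `delay` seconds (1 s TR assumed)."""
    mat = loadmat(mat_path)          # dict of variables stored in the .mat file
    bold = np.asarray(mat[key])      # assumed shape: (voxels, seconds)
    # Dropping the first `delay` columns pairs the response at second t + delay
    # with the stimulus shown at second t, compensating for hemodynamic lag.
    return bold[:, delay:]

# One response matrix per candidate lag, mirroring the 4-7 s delays above.
responses = {d: load_responses("subject1.mat", d) for d in (4, 5, 6, 7)}
```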
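`generate_wu_deconv.m` is MATLAB; the core Wiener filter it applies looks roughly like the Python below. The voxel-specific HRF estimation from Wu et al. (2013) is omitted here: the HRF is taken as given, and `noise_level` is a placeholder for the noise-to-signal ratio.

```python
import numpy as np

def wiener_deconvolve(bold, hrf, noise_level=0.1):
    """Estimate the underlying neural signal from one voxel's BOLD time series."""
    n = len(bold)
    H = np.fft.rfft(hrf, n)          # transfer function of the (given) HRF
    B = np.fft.rfft(bold, n)
    # Wiener filter: conj(H) / (|H|^2 + noise-to-signal ratio)
    G = np.conj(H) / (np.abs(H) ** 2 + noise_level)
    return np.fft.irfft(B * G, n)
```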
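A sketch of `create_stim_images.py`, assuming `Stimuli.mat` holds the frames as a (height, width, 3, n) uint8 array under a hypothetical key `st`:

```python
import numpy as np
from pathlib import Path
from scipy.io import loadmat
from PIL import Image

Path("images").mkdir(exist_ok=True)
stim = loadmat("Stimuli.mat")["st"]           # key name is an assumption
for i in range(stim.shape[-1]):
    frame = stim[..., i].astype(np.uint8)     # one (height, width, 3) frame
    Image.fromarray(frame).save(f"images/img_{i + 1}.png")
```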
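`img2vec.lua` runs under Torch; an equivalent Python/torchvision sketch of the same idea, taking a pretrained CNN's penultimate layer as the image vector, is below. The choice of VGG-16 and its 4096-d layer is an assumption, not necessarily what the Lua script uses.

```python
import torch
from PIL import Image
from torchvision import models, transforms

# VGG-16 penultimate layer (4096-d) as the image vector -- an assumption.
model = models.vgg16(weights="IMAGENET1K_V1")
model.classifier = model.classifier[:-1]      # drop the final class layer
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

with torch.no_grad():
    img = preprocess(Image.open("images/img_1.png").convert("RGB")).unsqueeze(0)
    vec = model(img).squeeze(0)               # feature vector for one image
```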
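For `generate_feats.py`, "averaged" is read here as pooling the frame vectors within each second and "convolved" as filtering each feature's time course with a canonical HRF; the frame rate and the double-gamma HRF parameters below are assumptions.

```python
import numpy as np
from scipy.stats import gamma

def canonical_hrf(tr=1.0, length=30.0):
    """SPM-style double-gamma HRF sampled at the TR."""
    t = np.arange(0.0, length, tr)
    return gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0

def per_second_feats(frame_vecs, fps=15):
    """frame_vecs: (n_frames, n_features) CNN vectors; fps is an assumption."""
    n_sec = frame_vecs.shape[0] // fps
    # 'Averaged': mean of the frame vectors within each 1 s window.
    avg = frame_vecs[: n_sec * fps].reshape(n_sec, fps, -1).mean(axis=1)
    # 'Convolved': each feature's time course convolved with the HRF.
    hrf = canonical_hrf()
    conv = np.apply_along_axis(lambda f: np.convolve(f, hrf)[:n_sec], 0, avg)
    return avg, conv
```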
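The README does not say which models `train_models.py` fits; cross-validated ridge regression is a standard choice for this voxel-to-feature mapping, so the sketch below is illustrative rather than a description of the actual script.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

def fit_decoder(X, Y):
    """X: (n_seconds, n_voxels) BOLD; Y: (n_seconds, n_features) image vectors."""
    model = RidgeCV(alphas=np.logspace(0, 4, 10))   # regularization path
    model.fit(X, Y)
    return model

# Toy shapes, just to show the call pattern.
rng = np.random.default_rng(0)
decoder = fit_decoder(rng.normal(size=(300, 1000)), rng.normal(size=(300, 4096)))
pred_vecs = decoder.predict(rng.normal(size=(50, 1000)))  # input to vec2seq.lua
```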
Some ideas for improving the results:

- Spatial and temporal smoothing of the BOLD signal should help with the noise; a 4-D Gaussian filter (three spatial axes plus time) is the likely candidate (see the first sketch after this list).
- Estimate the HRF with flexible models (e.g., smooth FIR or logit) to account for the different BOLD profiles across brain regions (see the second sketch after this list).
- Use attentional image-captioning models (such as Xu et al., 2015). Attentional models generate better captions, use lower-level CNN features that are probably easier to pick up from brain volumes, and could learn to attend to the features that are predicted well from brain volumes while ignoring the hard ones.
- Use the studyforrest fMRI data set, which has naturalistic visual stimuli at better resolution, more than twenty subjects, and fMRI with whole-brain coverage.
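A sketch of the proposed 4-D smoothing with `scipy.ndimage.gaussian_filter`; the sigma values are placeholders rather than tuned settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Toy volume: x, y, z, time. Sigmas are per-axis (in voxels / TRs), untuned.
bold = np.random.default_rng(0).normal(size=(64, 64, 30, 200))
smoothed = gaussian_filter(bold, sigma=(1.5, 1.5, 1.5, 0.8))
```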
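And a sketch of the FIR idea: one regressor per post-stimulus lag, fit by least squares, so each voxel gets an unconstrained HRF shape. The smoothness penalty implied by "smooth FIR" is omitted, and the onsets and sizes below are toy values.

```python
import numpy as np

def fir_design(onsets, n_scans, n_lags=20):
    """Binary design matrix whose k-th column is the stimulus shifted by k TRs."""
    X = np.zeros((n_scans, n_lags))
    for t in onsets:
        for k in range(n_lags):
            if t + k < n_scans:
                X[t + k, k] = 1.0
    return X

# Toy onsets and data; the fitted coefficients trace out the voxel's HRF.
onsets = [10, 60, 110]
X = fir_design(onsets, n_scans=200)
y = np.random.default_rng(0).normal(size=200)          # one voxel's BOLD
hrf_estimate, *_ = np.linalg.lstsq(X, y, rcond=None)   # lag-wise amplitudes
```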