Paper link: [2503.02351] MindSimulator: Exploring Brain Concept Localization via Synthetic fMRI
The English here is typed entirely by hand and is my own summarizing and paraphrasing of the original paper, so occasional spelling and grammatical errors are hard to avoid; if you spot any, comments and corrections are welcome. This post is closer to study notes, so read it with caution.
1. Thoughts
(1) A paper that looks simple but is actually complex. Haha, quite delightful, a bit like dry bread that tastes better the longer you chew it
(2) Papers in the brain-signal encoding and decoding field really have been going in all sorts of interesting directions lately
2. Section-by-Section Reading of the Paper
2.1. Abstract
①Brain responses evoked by stimuli of a specific object are subjective
②They proposed MindSimulator to locate concept-selective regions
2.2. Introduction
①Specific concepts (such as places, bodies, faces, words, colors, and foods) activate different corresponding cortical regions
②Limitations of prior work: limited data, bias from artificially selected stimuli, and isolated objects shown outside natural scenes (this seems to be the opposite of the previous paper, which argued that one should view a single object in isolation and perceive only that object, whereas this one holds that objects should not exist independently of a scene; probably just a different framing of context.)
2.3. Related Works
①Lists fMRI encoding/decoding and generative methods
2.4. Method
2.4.1. Motivation and Overview
①⭐The same stimulus can yield different fMRI recordings, so one stimulus corresponds to multiple fMRI recordings. A regression model maps a stimulus to only a single fMRI recording, whereas a generative model is stochastic, which is why the authors adopt a generative model
②Overview of model:
2.4.2. fMRI Autoencoder
①Paired data come as samples $(v, s)$, where $v$ is the preprocessed BOLD signal and $s$ is the corresponding visual stimulus (coming back here from much later in the paper: the signal is one-dimensional because it has been flattened)
②The voxel encoder embeds $v$ into a representation $z$ with a higher dimension
③The voxel decoder decodes $z$ back to the fMRI voxels
④Loss of the autoencoder block (reconstructing the original voxels from $z$):
⑤They use a pre-trained CLIP-ViT to align the fMRI representation with the stimuli
⑥Alignment loss:
⑦Total autoencoder loss (a sketch of this stage follows this list):
⑧The diffusion estimator is a Transformer with cross-attention
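To make the autoencoder stage more concrete, here is a minimal PyTorch sketch of a voxel encoder/decoder with a CLIP-alignment term. The layer sizes, the MSE forms of the reconstruction and alignment losses, and the `align_weight` parameter are my assumptions for illustration, not the paper's exact definitions.

```python
# Hypothetical sketch of the fMRI autoencoder with CLIP alignment.
# Shapes and loss forms are assumptions; the paper defines the exact ones.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VoxelEncoder(nn.Module):
    """MLP with a residual connection: flattened voxels -> token-like representation z."""
    def __init__(self, n_voxels: int, n_tokens: int = 257, dim: int = 768):
        super().__init__()
        self.proj = nn.Linear(n_voxels, n_tokens * dim)
        self.res = nn.Sequential(
            nn.LayerNorm(n_tokens * dim),
            nn.Linear(n_tokens * dim, n_tokens * dim),
            nn.GELU(),
        )
        self.n_tokens, self.dim = n_tokens, dim

    def forward(self, v):                               # v: (B, n_voxels)
        h = self.proj(v)
        h = h + self.res(h)                             # residual block
        return h.view(-1, self.n_tokens, self.dim)      # z: (B, 257, 768)

class VoxelDecoder(nn.Module):
    """Mirror of the encoder: representation z -> reconstructed voxels."""
    def __init__(self, n_voxels: int, n_tokens: int = 257, dim: int = 768):
        super().__init__()
        self.out = nn.Linear(n_tokens * dim, n_voxels)

    def forward(self, z):                               # z: (B, 257, 768)
        return self.out(z.flatten(1))                   # v_hat: (B, n_voxels)

def autoencoder_loss(v, s_clip, encoder, decoder, align_weight=1.0):
    """Reconstruction (item 4) + alignment to CLIP ViT-L/14 image tokens (item 6)."""
    z = encoder(v)
    v_hat = decoder(z)
    rec = F.mse_loss(v_hat, v)          # reconstruction loss of the autoencoder block
    align = F.mse_loss(z, s_clip)       # alignment to the 257x768 CLIP tokens of the stimulus
    return rec + align_weight * align   # total autoencoder loss (item 7)
```

The 257×768 shape simply matches the CLIP ViT-L/14 token grid mentioned in the implementation details below; it is reused here so that the fMRI representation and the image features live in the same space.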
2.4.3. Diffusion Estimator
①They designed a diffusion estimator with $T$ time steps to obtain the noised fMRI representation $z_t$, in the standard forward-diffusion form $z_t = \sqrt{\bar{\alpha}_t}\, z_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$, where $\bar{\alpha}_t$ denotes the noise-schedule hyperparameter and $\epsilon$ denotes Gaussian noise
②Learning objective: (a generic training-step sketch follows this list)
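As a reference for ① and ②, here is a generic DDPM-style training step under a cosine noise schedule with condition dropping. The `estimator(z_t, t, cond)` signature and the noise-prediction objective are my assumptions; the paper may parameterize the target differently (e.g., predicting the clean representation).

```python
# Generic DDPM-style training step for the diffusion estimator (a sketch, not
# the paper's exact objective). `estimator` is assumed to be a Transformer with
# cross-attention taking (noised fMRI tokens, time, CLIP image tokens).
import math
import torch
import torch.nn.functional as F

def alpha_bar(t):
    """Cosine noise schedule: cumulative signal fraction at continuous time t in (0, 1)."""
    return torch.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

def diffusion_training_step(estimator, z0, clip_tokens, cond_drop=0.2):
    b = z0.shape[0]
    t = torch.rand(b, device=z0.device)                 # t sampled uniformly in (0, 1)
    ab = alpha_bar(t).view(b, 1, 1)
    eps = torch.randn_like(z0)                          # Gaussian noise
    z_t = ab.sqrt() * z0 + (1 - ab).sqrt() * eps        # noised fMRI representation
    # randomly drop the image condition with probability 0.2, as in the implementation details
    keep = (torch.rand(b, device=z0.device) > cond_drop).view(b, 1, 1)
    cond = clip_tokens * keep
    eps_pred = estimator(z_t, t, cond)                  # predict the added noise
    return F.mse_loss(eps_pred, eps)                    # assumed learning objective
```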
2.4.4. Inference Sampler
①Predicted fMRI representation:
②They generate fMRI signals from multiple starting noises and take the average of the resulting fMRI signals (a sketch of this averaging sampler follows this list)
③Noise generation: randomly sample two independent noises, then generate the remaining ones from them:
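A sketch of the averaging sampler described above. `sample_from_noise` stands in for the paper's reverse-diffusion procedure, and the linear interpolation used to build the remaining noises is a placeholder for whatever mixing rule the paper actually uses.

```python
# Hypothetical inference sampler: denoise from several noises and average.
import torch

@torch.no_grad()
def average_sampling(sample_from_noise, decoder, clip_tokens, shape, n=5):
    eps1 = torch.randn(shape)                          # first independent noise
    eps2 = torch.randn(shape)                          # second independent noise
    weights = torch.linspace(0.0, 1.0, n)
    fmris = []
    for w in weights:
        eps = (1 - w) * eps1 + w * eps2                # build the remaining noises (placeholder rule)
        eps = eps / eps.std()                          # keep roughly unit variance
        z0_hat = sample_from_noise(eps, clip_tokens)   # reverse diffusion (not shown here)
        fmris.append(decoder(z0_hat))                  # decode back to voxel space
    return torch.stack(fmris).mean(dim=0)              # average the synthetic fMRI signals
```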
2.5. Experiments Setup
2.5.1. Datasets
①Dataset: Natural Scenes Dataset (NSD)
②Subject: 8
③Image/stimuli set: MSCOCO
④Repetitions: each subject views 10,000 images, each presented 3 times, giving 30,000 fMRI recordings per subject
⑤Selected subjects: 1, 2, 5, and 7, who completed the full set of scanning sessions
⑥Data split: 9000 for training and 1000 for testing
⑦⭐During training, the authors treat the three recordings of a stimulus for one subject as three separate samples, but at test time they average the three fMRI recordings corresponding to the same stimulus into a single result (see the averaging sketch after this list)
⑧The authors utilized the GLMSingle tool to compute the beta-activations for each voxel, which reflect the strength of the brain's response to specific stimuli, and normalized these activations.
⑨Brain atlas: the automatic parcellation released officially with NSD, not an existing standard template
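A small NumPy sketch of the test-time averaging in ⑦, assuming the beta activations are stored as one row per trial together with the stimulus id of each trial (this layout is my assumption, not the paper's data format).

```python
# Assumed layout: `betas` has one row per trial and `image_ids` maps each trial
# to its stimulus; each test image appears 3 times for a subject in NSD.
import numpy as np

def average_test_repetitions(betas: np.ndarray, image_ids: np.ndarray):
    """Average the repeated fMRI trials of each stimulus (used only at test time)."""
    avg, kept_ids = [], []
    for img in np.unique(image_ids):
        trials = betas[image_ids == img]        # typically 3 repetitions per image
        avg.append(trials.mean(axis=0))
        kept_ids.append(img)
    return np.stack(avg), np.array(kept_ids)
```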
2.5.2. Implementation Details
①Image extractor: CLIP ViT-L/14 with 257×768 dimension
②Voxel encoder: MLPs with residual networks; the voxel decoder mirrors this structure in reverse
③Optimizer: AdamW
④Epoch: 300 for fMRI autoencoder and 150 for diffusion
⑤Cyclic learning-rate schedule, starting from 3e-4
⑥Diffusion estimator: adopts a cosine noise schedule and a condition-drop probability of 0.2
⑦Diffusion network: 6 Transformer blocks with 257 image tokens, 257 noised fMRI tokens, and 1 time embedding each
⑧The diffusion time step $t$ is randomly sampled in (0, 1)
2.5.3. Evaluation Metrics
①⭐Pearson correlation, voxel-wise mean squared error (MSE), and R-squared alone cannot accurately reflect performance (standard implementations of these metrics are sketched after this list):
②Evaluation method:
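For reference, standard implementations of the three voxel-wise metrics mentioned in ① (my own code, not the paper's); the paper's point is that these alone are not sufficient to judge synthetic fMRI quality.

```python
# Standard voxel-wise metrics between synthetic and ground-truth fMRI.
import numpy as np

def voxelwise_metrics(pred: np.ndarray, true: np.ndarray):
    """pred, true: (n_samples, n_voxels). Returns mean Pearson r, MSE, and R^2."""
    pred_c = pred - pred.mean(axis=0)
    true_c = true - true.mean(axis=0)
    denom = np.sqrt((pred_c ** 2).sum(axis=0) * (true_c ** 2).sum(axis=0)) + 1e-8
    pearson = ((pred_c * true_c).sum(axis=0) / denom).mean()   # mean Pearson r over voxels
    mse = ((pred - true) ** 2).mean()                          # voxel-wise MSE
    ss_res = ((true - pred) ** 2).sum(axis=0)
    ss_tot = (true_c ** 2).sum(axis=0) + 1e-8
    r2 = (1.0 - ss_res / ss_tot).mean()                        # mean R-squared over voxels
    return pearson, mse, r2
```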
2.6. Results
2.6.1. Evaluation for Synthetic fMRI
①Performance table:
②Image reconstruction:
2.6.2. Out-of-Distribution Generalization
①Semantic-level performance when MindSimulator generalizes to other image-only datasets:
②Image reconstruction performance:
2.6.3. Ablation
①Module ablation:
2.7. Localizing Concept-Selective Regions
2.7.1. Predict Empirical Regions
①The empirical findings on faces-, bodies-, places-, and words-selective regions in the NSD fLoc experiment:
②Subset of the top 100 image categories selected by a pre-trained CLIP model (a ranking sketch follows this list):
③Localization evaluation of places- and bodies-selective regions:
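A hedged sketch of how the CLIP-based category selection in ② might work: rank categories by the cosine similarity between a concept's text embedding and pre-computed CLIP image embeddings of the categories, then keep the top-k. The exact prompts and selection protocol in the paper may differ.

```python
# Hypothetical category ranking with CLIP features: given one text embedding for
# a concept and pre-computed CLIP image embeddings per category, keep the top-k.
import torch

def top_categories(concept_text_emb, category_image_embs, category_names, k=100):
    """concept_text_emb: (d,); category_image_embs: (n_categories, d)."""
    t = concept_text_emb / concept_text_emb.norm()
    imgs = category_image_embs / category_image_embs.norm(dim=-1, keepdim=True)
    sims = imgs @ t                                         # cosine similarity per category
    idx = sims.topk(min(k, len(category_names))).indices
    return [category_names[int(i)] for i in idx]
```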
2.7.2. Exploring Novel Regions
①Concept-selective regions localized from synthetic fMRI (a rough localization sketch follows this list): lower visual cortex is selective for colors and shapes, whereas higher visual cortex is selective for specific concepts
②Reconstruction after masking:
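A rough sketch of one way to turn synthetic fMRI into concept-selective regions as in ①: contrast synthetic activations for concept images against other images voxel by voxel and threshold the t-statistic. This contrast-and-threshold rule is my assumption, not necessarily the paper's exact localization procedure.

```python
# Assumed contrast-based localization: keep voxels whose synthetic activation
# for concept stimuli exceeds that for other stimuli by a t-statistic threshold.
import numpy as np
from scipy import stats

def localize_selective_voxels(concept_fmri, other_fmri, t_threshold=3.0):
    """concept_fmri, other_fmri: (n_images, n_voxels) synthetic activations."""
    t, _ = stats.ttest_ind(concept_fmri, other_fmri, axis=0)
    return t > t_threshold          # boolean mask of concept-selective voxels
```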
2.8. Conclusion
~