CryoSTAR: Leveraging Structural Prior and Constraints for Cryo-EM Heterogeneous Reconstruction (Translation)

Original paper: https://www.biorxiv.org/content/10.1101/2023.10.31.564872v2.full.pdf

CryoSTAR: Leveraging Structural Prior and Constraints for Cryo-EM Heterogeneous Reconstruction

Yilai Li$^{1\#}$, Yi Zhou$^{1\#}$, Jing Yuan$^{1\#}$, Fei Ye$^{1}$, Quanquan Gu$^{1*}$

$^{1}$ByteDance Research

$^{\#}$Contributed equally

$^{*}$Correspondence to: quanquan.gu@bytedance.com

Abstract

Resolving conformational heterogeneity in cryo-electron microscopy (cryo-EM) datasets remains a significant challenge in structural biology. Previous methods have often been restricted to working exclusively on volumetric densities, neglecting the potential of incorporating any pre-existing structural knowledge as prior or constraints. In this paper, we present a novel methodology, cryoSTAR, that harnesses atomic model information as structural regularization to elucidate such heterogeneity. Our method uniquely outputs both coarse-grained models and density maps, showcasing the molecular conformational changes at different levels. Validated against four diverse experimental datasets, spanning large complexes, a membrane protein, and a small single-chain protein, our results consistently demonstrate an efficient and effective solution to conformational heterogeneity with minimal human bias. By integrating atomic model insights with cryo-EM data, cryoSTAR represents a meaningful step forward, paving the way for a deeper understanding of dynamic biological processes.$^{1}$

Introduction

Single-particle cryo-electron microscopy (cryo-EM) is a structural biology tool that can directly observe the conformational heterogeneity of biomolecules, because each dataset contains many 2D projections of 3D structures in potentially different conformational states$^{1}$. Traditional algorithms (e.g., 3D classification) treat the heterogeneity in the dataset as discrete clusters and assign each particle to the best class$^{2-6}$. However, in many real datasets, heterogeneity often arises from conformational dynamics, which is a continuous process. Applying traditional algorithms to such data often yields 3D density maps that are blurry in the flexible regions.

$^{1}$ The short version of this paper has been accepted by the NeurIPS workshop on New Frontiers of AI for Drug Discovery and Development.

bioRxiv preprint doi: https://doi.org/10.1101/2023.10.31.564872; this version posted December 7, 2023. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

In recent years, a few algorithms have been developed to resolve continuous heterogeneity in cryo-EM datasets. For example, principal component analysis (PCA) and its variants have been used to describe the variability within a dataset by modeling the heterogeneity as a linear combination of a few bases$^{7-10}$. To gain expressive power through nonlinearity, deep learning-based methods were developed to map such heterogeneity onto nonlinear manifold embeddings. For example, cryoDRGN$^{11}$ and cryoDRGN2$^{12}$ use a variational autoencoder (VAE)$^{13}$-based approach to map the heterogeneity within the dataset to a latent space; a generative decoder then produces a 3D volume from a point sampled in that latent space. In contrast, 3DFlex$^{14}$ explicitly models the motion of flexible regions by learning a 3D deformation field and optimizing a canonical density, while encouraging local smoothness and rigidity. Nevertheless, these methods approach continuous heterogeneity solely from a computer vision perspective, without leveraging any prior knowledge that could serve as structural constraints.
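To make the idea of a linear model of heterogeneity concrete, here is a minimal NumPy sketch of describing a set of volumes as a mean plus a linear combination of a few bases. It is a toy illustration only, not the procedure of any cited method: it assumes the 3D volumes are directly available as flattened arrays, whereas real cryo-EM pipelines must work from noisy 2D projections with estimated poses. The function names `fit_linear_subspace` and `reconstruct`, the grid size, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_linear_subspace(volumes, n_components):
    """Fit a PCA-style linear model to flattened volumes.

    volumes: array of shape (n_samples, n_voxels).
    Returns the mean volume, the bases (n_components, n_voxels),
    and the per-sample coefficients (n_samples, n_components).
    """
    mean = volumes.mean(axis=0)
    centered = volumes - mean
    # Rows of vt are orthonormal directions sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    bases = vt[:n_components]
    coeffs = centered @ bases.T
    return mean, bases, coeffs


def reconstruct(mean, bases, coeffs):
    """Rebuild each volume as mean + linear combination of the bases."""
    return mean + coeffs @ bases


# Synthetic toy data (an assumption for illustration): 50 volumes on an
# 8x8x8 grid whose variability is spanned by exactly two hidden bases.
rng = np.random.default_rng(0)
n_voxels = 8 ** 3
hidden_bases = rng.normal(size=(2, n_voxels))
hidden_coeffs = rng.normal(size=(50, 2))
volumes = rng.normal(size=n_voxels) + hidden_coeffs @ hidden_bases

mean, bases, coeffs = fit_linear_subspace(volumes, n_components=2)
approx = reconstruct(mean, bases, coeffs)
max_error = float(np.abs(approx - volumes).max())
```

Because the toy variability is exactly two-dimensional, two bases reproduce every volume up to floating-point error. Motions that trace a curved path in volume space are not captured this compactly by a linear model, which is the limitation that motivates the nonlinear embeddings discussed in the text.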

Some recent works tried to incorporate information from the atomic model into the pipeline, or to directly output coarse-grained (CG) atomic models for better interpretation. For example, Chen et al.$^{15}$
