A chromosome-scale genome assembly of Isatis indigotica, an important medicinal plant used in traditional Chinese medicine: An Isatis genome

Abstract

Isatis indigotica (2n = 14) is an important medicinal plant in China. Its dried leaves and roots (called Isatidis Folium and Isatidis Radix, respectively) are broadly used in traditional Chinese medicine for curing diseases caused by bacteria and viruses such as influenza and viral pneumonia. Various classes of compounds isolated from this species have been identified as effective ingredients. Previous studies based on transcriptomes revealed only a few candidate genes for the biosynthesis of these active compounds in this medicinal plant. Here, we report a high-quality chromosome-scale genome assembly of I. indigotica with a total size of 293.88 Mb and scaffold N50 = 36.16 Mb using single-molecule real-time long reads and high-throughput chromosome conformation capture techniques. We annotated 30,323 high-confidence protein-coding genes. Based on homolog searching and functional annotations, we identified many candidate genes involved in the biosynthesis of main active components such as indoles, terpenoids, and phenylpropanoids. In addition, we found that some key enzyme-coding gene families related to the biosynthesis of these components were expanded due to tandem duplications, which likely drove the production of these major active compounds and explained why I. indigotica has excellent antibacterial and antiviral activities. Our results highlighted the importance of genome sequencing in identifying candidate genes for metabolite synthesis in medicinal plants.

Introduction

The plant family Brassicaceae (Cruciferae) comprises over 330 genera and ~3700 species with a worldwide distribution [1–5]. Numerous crops are derived from this family, including vegetables (Brassica and Raphanus), ornamentals (Matthiola, Hesperis, and Lobularia), spices (Eutrema and Armoracia), and medicines (Isatis). Based on sequenced genomes, several model species have been developed for diverse studies, including Arabidopsis thaliana for molecular function studies, Brassica for polyploidization and whole-genome duplication (WGD) studies, and Eutrema salsugineum for abiotic tolerance-related studies. However, the genetic basis of the biosynthesis of the major active compounds in medicinal plants of this family remains poorly investigated.

Isatis indigotica (2n = 14) belongs to tribe Isatideae in lineage II of the family [3,6–10]. This species is widely cultivated in China as an important medicinal plant because its dried leaves and roots are used in traditional Chinese medicine to treat diseases, including viral infections [11–13]. The major active compounds isolated from this species comprise terpenoids, lignans, and indole alkaloids [14–17]. These compounds were confirmed to have antiviral [18,19], antibacterial [20], anti-inflammatory [21,22], and antileukemia [23,24] functions. Previous studies based on transcriptomes revealed a few candidate genes involved in the biosynthesis of active compounds in this species [25–28]. However, the limited quality and completeness of the transcriptomes hinder the identification of all candidate biosynthesis-related genes.

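In recent years, progress has been made in high-quality genome sequencing of I. indigotica. Its genome is about 300 Mb in size, has a tPCK karyotype, and contains about 30,000 protein-coding genes. These studies have revealed the biosynthetic pathways of the major active components (such as indole alkaloids, phenylpropanoids, and terpenoids) and their candidate genes.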

In the present study, we used single-molecule sequencing combined with high-throughput chromosome conformation capture (Hi-C) technology to assemble the genome and construct the pseudochromosomes of I. indigotica. Based on homolog searching and functional annotations, we aimed to identify candidate gene sets involved in the biosynthesis of putative active components. The candidate genes and genomic resources recovered here will be critically important for further experimental verification and artificial syntheses of the active compounds of this medicinal plant in the future.


Results

Genome assembly and construction of pseudochromosomes

The genome size, genome repeat size, and heterozygosity rate of I. indigotica were estimated using K-mer analysis. The 19-mer frequency of Illumina short reads with the highest peak occurred at a depth of 94. The genome was estimated to be 279.90 Mb in size with 48.99% repeats, and the heterozygosity rate was estimated to be 0.44%.
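To make the K-mer estimate concrete, the following is a minimal Python sketch of the commonly used ratio: genome size ≈ total number of K-mers divided by the depth of the main peak. It is an illustration under stated assumptions, not the pipeline used in the study; the function name `estimate_genome_size`, the `min_depth` error cutoff, and the toy histogram are all hypothetical.

```python
def estimate_genome_size(histogram, min_depth=2):
    """Estimate genome size (in bases) from a k-mer depth histogram.

    histogram maps depth -> number of distinct k-mers seen at that depth.
    Depths below min_depth are dropped as probable sequencing errors
    (an assumed cutoff, not a value taken from the text).
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Depth of the highest peak among the retained depths.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences = sum over depths of depth * distinct k-mers.
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth, peak_depth


# Hypothetical toy histogram: error k-mers at depth 1, main peak at depth 10.
toy = {1: 500, 9: 20, 10: 100, 11: 20}
size, peak = estimate_genome_size(toy)
print(f"peak depth = {peak}, estimated size = {size:.0f} bases")
```

If this simple ratio were applied to the figures above, a peak depth of 94 and an estimate of 279.90 Mb would correspond to roughly 2.6 × 10^10 usable 19-mers. That count is a back-calculation for illustration only and is not stated in the source.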
