Go by Example: Arrays

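This article introduces arrays in Go: the basic concept, how to create, initialize, access, and modify them, and how to build multi-dimensional arrays, with a worked example showing how arrays are used in practice.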

In Go, an array is a fixed-length sequence whose elements all have the same type.

package main

import "fmt"

func main() {

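    // Here we create an array a that holds exactly 5 ints.
    // The element type and the length are both part of the array's type.
    // By default, every element of an array is the zero value,
    // which for ints is 0.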
    var a [5]int
    fmt.Println("emp:", a)

    // We can set the value at an index with the array[index] = value syntax,
    // and read a value back with array[index].
    a[4] = 100
    fmt.Println("set:", a)
    fmt.Println("get:", a[4])

    // The builtin len function returns the length of an array.
    fmt.Println("len:", len(a))

    // Use this syntax to declare and initialize an array in one line.
    b := [5]int{1, 2, 3, 4, 5}
    fmt.Println("dcl:", b)

    // Array types are one-dimensional, but an array's element type can itself
    // be an array, which gives a multi-dimensional structure.
    var twoD [2][3]int
    for i := 0; i < 2; i++ {
        for j := 0; j < 3; j++ {
            twoD[i][j] = i + j
        }
    }
    fmt.Println("2d: ", twoD)
}
Output

$ go run arrays.go
emp: [0 0 0 0 0]
set: [0 0 0 0 100]
get: 100
len: 5
dcl: [1 2 3 4 5]
2d:  [[0 1 2] [1 2 3]]
Note that when you print an array with fmt.Println, it appears in the form [v1 v2 v3 ...], with elements separated by spaces.
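As a small sketch of the printing behaviour described above, the following compares the default rendering with the Go-syntax rendering produced by the %#v verb. The helper name formatArray is hypothetical and exists only for this illustration.

```go
package main

import "fmt"

// formatArray is a hypothetical helper for this illustration. It returns the
// default (%v) rendering and the Go-syntax (%#v) rendering of the array.
func formatArray(a [3]int) (plain, goSyntax string) {
	plain = fmt.Sprintf("%v", a)     // same form that fmt.Println uses
	goSyntax = fmt.Sprintf("%#v", a) // includes the array type and length
	return plain, goSyntax
}

func main() {
	plain, goSyntax := formatArray([3]int{1, 2, 3})
	fmt.Println(plain)    // [1 2 3]
	fmt.Println(goSyntax) // [3]int{1, 2, 3}
}
```

The %#v form can be handy when debugging, because it shows that the length is part of the type.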


In typical Go code, you will encounter slices much more often than arrays.
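One practical difference behind that preference can be sketched as follows: an array is a value and is copied when passed to a function, while a slice refers to underlying storage shared with the caller. The function names copyAndModify and modifyThroughSlice are hypothetical and used only for this illustration.

```go
package main

import "fmt"

// copyAndModify receives the array by value, so the caller's array
// is left untouched. (Hypothetical helper for illustration.)
func copyAndModify(a [3]int) [3]int {
	a[0] = 99
	return a
}

// modifyThroughSlice receives a slice that refers to the caller's
// storage, so the write is visible to the caller.
func modifyThroughSlice(s []int) {
	s[0] = 99
}

func main() {
	arr := [3]int{1, 2, 3}

	changed := copyAndModify(arr)
	fmt.Println(arr, changed) // [1 2 3] [99 2 3]

	// arr[:] creates a slice backed by arr itself.
	modifyThroughSlice(arr[:])
	fmt.Println(arr) // [99 2 3]
}
```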

To learn more about arrays, see Learning the Go Language (5): Types - Arrays.

The next chapter covers slices.


Next example: Go by Example: Slices.

