Integrative Genomics Viewer (IGV)

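This document describes how to use the command-line version of the IGV tools (igvtools), including download and installation, startup methods, and memory settings. It also explains the core commands toTDF, Count, Index, and Sort, to help users process genomic data efficiently.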

http://software.broadinstitute.org/software/igv/igvtools_commandline

Running igvtools from the Command Line

Downloading igvtools

The igvtools utilities can be downloaded from the Downloads page on the IGV Web site.

          igvtools_<version #>.zip includes the jar file and shell scripts for running igvtools, as well as the genome files.
          igvtools_nogenomes_<version #>.zip includes the jar file and shell scripts for running igvtools, without the genome files.

Starting with shell scripts

The igvtools utilities can be invoked, with or without the graphical user interface (GUI), from one of the following scripts:

   igvtools (command-line version for Linux and Mac OS 10.x)
   igvtools_gui (GUI version for Linux and Mac OS 10.x)

   igvtools.bat (command-line version for Windows)
   igvtools_gui.bat (GUI version for Windows)

The general form of the command-line version is:

   igvtools [command] [options] [arguments]
or
   igvtools.bat [command] [options] [arguments]

Recognized commands, options, arguments, and file types are described below.

Starting with java

igvtools can also be started directly using Java. This option allows more control over Java parameters, such as the maximum memory to allocate. In this example, igvtools is started with a maximum of 1500 MB of memory:

   java -Xmx1500m -Djava.awt.headless=true -jar igvtools.jar [command] [options] [arguments]

To start with a GUI, the command is:

   java -Xmx1500m -jar igvtools.jar -g

Memory settings

The scripts above allocate a fixed amount of memory. If this amount is not available on your platform, you will get an error along the lines of "Could not start the Virtual Machine". If this happens, you will need to edit the scripts to reduce the amount of memory requested, or use the Java startup option. The memory is set via the "-Xmx" parameter. For example, -Xmx1500m requests 1500 MB and -Xmx1g requests 1 gigabyte.
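As a minimal sketch of the Java startup option, the helper below assembles the launch line for a chosen heap size and prints it rather than running it. The function name `igvtools_java_cmd` and the `IGVTOOLS_JAR` variable are assumptions made for illustration; they are not part of the igvtools distribution.

```shell
# Path to the jar; defaults to igvtools.jar in the current directory (assumption).
IGVTOOLS_JAR="${IGVTOOLS_JAR:-igvtools.jar}"

# Print a headless Java launch line for igvtools.
#   $1      heap size as accepted by -Xmx, e.g. 1500m or 1g
#   $2...   igvtools command, options, and arguments
igvtools_java_cmd() {
    heap="$1"
    shift
    printf 'java -Xmx%s -Djava.awt.headless=true -jar %s' "$heap" "$IGVTOOLS_JAR"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Show the command that would request a 1 gigabyte heap.
igvtools_java_cmd 1g index reads.sam
```

Printing the command first makes it easy to confirm the memory setting before committing to a long run.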

Genome

The genome argument in the toTDF and count commands can be either a genome id or a full path to a .chrom.sizes file or an IGV .genome file.
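A .chrom.sizes file is plain text with one sequence per line: the name and its length in bp, separated by a tab. The sketch below writes a toy file and checks its shape. The file name, sequence names, and lengths are invented for illustration.

```shell
# Write a minimal two-column, tab-separated .chrom.sizes file.
# chrA/chrB and their lengths are made-up values.
printf 'chrA\t1000000\nchrB\t750000\n' > toy.chrom.sizes

# Sanity check: each line needs exactly two fields, the second an integer.
awk -F '\t' 'NF != 2 || $2 !~ /^[0-9]+$/ { bad = 1 } END { exit bad + 0 }' toy.chrom.sizes \
    && echo "toy.chrom.sizes looks well-formed"
```

The full path to such a file can then be given wherever a genome id such as hg18 would otherwise appear.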
 

Commands

toTDF

The toTDF command converts a sorted data input file to a binary tiled data (.tdf) file. Use this command to pre-process large datasets for improved IGV performance.

Supported input file formats are: .wig, .cn, .snp, .igv, and .gct.

Note: This tool was previously known as "tile".

Usage:

          igvtools toTDF [options]  [inputFile] [outputFile] [genome]

Required arguments:

          inputFile    The input file (see supported formats above).

          outputFile   Binary output file.  Must end in ".tdf".

          genome      A genome id or path to a .chrom.sizes or .genome file.  Default is hg18.

Options:

  -z num     Specifies the maximum zoom level to precompute. The default
               value is 7 and is sufficient for most files. To reduce file
               size at the expense of IGV performance this value can be
               reduced.

  -f list    A comma-delimited list specifying window functions to use
               when reducing the data to precomputed tiles. Possible
               values are min, max, and mean. By default only the mean
               is calculated.

  -p file    Specifies a "bed" file to be used to map probe identifiers
               to locations. This option is useful when preprocessing .gct
               files. The bed file should contain 4 columns:
                           chr start end name
               where name is the probe name in the .gct file.

Example:

          igvtools toTDF -z 5  copyNumberFile.cn copyNumberFile.tdf hg18
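For the -p option, the four-column probe mapping file might be prepared as in the sketch below. The probe names, coordinates, and file names are hypothetical placeholders.

```shell
# One probe per line: chr <TAB> start <TAB> end <TAB> name
# (names and coordinates are invented for illustration).
printf 'chr1\t1000\t1025\tprobe_0001\nchr1\t5000\t5025\tprobe_0002\n' > probes.bed

# Confirm every row has the four expected columns.
awk -F '\t' 'NF != 4 { bad = 1 } END { exit bad + 0 }' probes.bed \
    && echo "probes.bed: 4 columns per row"

# The file would then be passed with -p (file names are placeholders):
#   igvtools toTDF -p probes.bed expression.gct expression.tdf hg18
```

The name column has to match the probe identifiers used in the .gct file, otherwise the probes cannot be mapped to locations.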

Notes:

Input files, with the exception of .gct files, must be sorted by start position. Files can be sorted with the sort command described below. Attempting to preprocess an unsorted file will result in an error.
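A quick pre-check can show whether a file needs sorting before toTDF is attempted. The sketch below assumes a tab-separated, BED-like layout with the chromosome in column 1 and the start position in column 2. `is_sorted_by_start` is a hypothetical helper and not the check igvtools itself performs.

```shell
# Succeed if start positions never decrease within a run of the same chromosome.
is_sorted_by_start() {
    awk -F '\t' '
        $1 == chrom && $2 + 0 < prev { bad = 1; exit }   # start went backwards
        { chrom = $1; prev = $2 + 0 }                     # remember previous row
        END { exit bad + 0 }
    ' "$1"
}

# Two toy files with invented coordinates.
printf 'chr1\t50\t80\nchr1\t100\t200\n' > sorted.bed
printf 'chr1\t100\t200\nchr1\t50\t80\n' > unsorted.bed

is_sorted_by_start sorted.bed && echo "sorted.bed: ok"
is_sorted_by_start unsorted.bed || echo "unsorted.bed: run igvtools sort first"
```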

Count

The count command computes average feature density over a specified window size across the genome. Common uses include computing coverage for alignment files and counting hits in ChIP-seq experiments. By default, the resulting file will be displayed as a bar chart when loaded into IGV.

Supported input file formats are: .sam, .bam, .aligned, .psl, .pslx, and .bed.

Usage:

          igvtools count [options] [inputFile] [outputFile] [genome]

Required arguments:

          inputFile    The input file (see supported formats above).

          outputFile   The output file, which can be binary "tdf" or ascii "wig" format. The filename must end in ".tdf" or ".wig", or be the special string "stdout". To indicate that you want to output both a .tdf and a .wig file, list both output filenames as a single string, separated by a comma with no other delimiters. If the output file is named "stdout" the output will be written to the standard output stream in wig format.

          genome      A genome id or path to a .chrom.sizes or .genome file.  Default is hg18.
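The outputFile rules above can be sketched as a small helper that builds the argument string, including the comma-joined form that requests both formats. `count_output_arg` is a hypothetical name used only for illustration.

```shell
# Build the outputFile argument for igvtools count.
#   $1  output path without extension
#   $2  tdf, wig, or both
count_output_arg() {
    base="$1"
    mode="$2"
    case "$mode" in
        tdf|wig) printf '%s.%s\n' "$base" "$mode" ;;
        both)    printf '%s.tdf,%s.wig\n' "$base" "$base" ;;   # comma, no spaces
        *)       echo "mode must be tdf, wig, or both" >&2; return 1 ;;
    esac
}

count_output_arg alignments.cov both
```

The comma-joined value is passed as a single argument, so it must not contain spaces.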

Options:

-z, --maxZoom num

Specifies the maximum zoom level to precompute.

-w, --windowSize num

The window size over which coverage is averaged. Defaults to 25 bp.

-e, --extFactor num

The read or feature is extended by the specified distance in bp prior to counting. This option is useful for ChIP-seq and RNA-seq applications. The value is generally set to the average fragment length of the library minus the average read length.

--preExtFactor num

The read is extended upstream from the 5' end by the specified distance.

--postExtFactor num

Effectively overrides the read length, defines the downstream extent from the 5' end. Intended for use with preExtFactor.

-f, --windowFunctions list

A comma delimited list specifying window functions to use when reducing the data to precomputed tiles. Possible values are min, max, mean, median, p2, p10, p90, and p98. The "p" values represent percentile, so p2=2nd percentile, etc.

--strands [arg]

By default, counts from both strands are combined. This setting outputs the count for each strand separately. Legal argument values are 'read' and 'first': 'read' separates counts by read strand, and 'first' uses the strand of the first read in the pair. Results are saved in a separate column for .wig output, and a separate track for TDF output.

--bases

Count the occurrence of each base (A,G,C,T,N). Takes no arguments. Results are saved in a separate column for .wig output, and a separate track for TDF output.

--query [querystring]

Only count a specific region. The query string has the syntax <chr>:<start>-<end>, e.g. chr1:100-1000. The input file must be indexed.

--minMapQuality [mqual]

Set the minimum mapping quality of reads to include. Default is 0.

--includeDuplicates

Include duplicate alignments in the count. By default, duplicates are excluded; if this flag is given, they are counted. Takes no arguments.

--pairs

Compute coverage from paired alignments counting the entire insert as covered. When using this option only reads marked "proper pairs" are used.
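The rule of thumb given for -e (average fragment length minus average read length) is simple arithmetic. The sketch below works it through with hypothetical library statistics; the two input numbers are assumptions, not values from igvtools.

```shell
# Hypothetical library statistics, in bp.
avg_fragment_len=300
avg_read_len=50

# Extension factor = average fragment length - average read length.
ext_factor=$(( avg_fragment_len - avg_read_len ))
echo "suggested -e value: ${ext_factor}"
```

With these inputs the result is 250, which is the value that would be passed as -e 250.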

Notes:

The input file must be sorted by start position. See the sort command below.

Example:
          igvtools count -z 5 -w 25 -e 250 alignments.bam  alignments.cov.tdf  hg18
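Because a malformed --query string is only discovered once the run starts, it can help to check the <chr>:<start>-<end> shape beforehand. The sketch below is one way to do that. `valid_query` is a hypothetical helper, and the requirement that start not exceed end is an assumption of this sketch, not documented igvtools behavior.

```shell
# Succeed if $1 looks like <chr>:<start>-<end> with start <= end.
valid_query() {
    q="$1"
    # Shape check: a name without ':', then ':', then digits-digits.
    printf '%s\n' "$q" | grep -Eq '^[^:]+:[0-9]+-[0-9]+$' || return 1
    range="${q##*:}"      # part after the colon, e.g. 100-1000
    start="${range%-*}"   # digits before the dash
    end="${range#*-}"     # digits after the dash
    [ "$start" -le "$end" ]
}

valid_query "chr1:100-1000" && echo "chr1:100-1000 is well-formed"
valid_query "chr1:1000-100" || echo "chr1:1000-100 is rejected"
```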

Index

Creates an index for an alignment or feature file. Index files are required for loading alignment files into IGV, and can significantly improve performance for large feature files. Note that the index file is not loaded into IGV directly; rather, IGV looks for the index file when the alignment or feature file is loaded. This command does not take an output file argument. Instead, the index filename is generated by appending ".sai" (for alignments) or ".idx" (for features) to the input filename, because IGV relies on this naming convention to find the index. The input file must be sorted by start position (see the sort command below).

Supported input file formats are: .sam, .aligned, .vcf, .psl, and .bed.

NOTES:

  • This command will not index a binary (BAM) file. Use the samtools package to sort and index BAM files.
  • The "sai" index is an IGV format; it does not work with samtools or any other application.

Usage:

  igvtools index [inputFile]
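Since the index command takes no output argument, a script that needs the index path has to derive it from the naming convention described above. The sketch below does so, with the caller stating whether the input is an alignment or a feature file. `index_file_for` is a hypothetical helper name.

```shell
# Print the index file name expected next to an input file.
#   $1  alignment or feature
#   $2  input file name
index_file_for() {
    kind="$1"
    input="$2"
    case "$kind" in
        alignment) printf '%s.sai\n' "$input" ;;   # ".sai" appended for alignments
        feature)   printf '%s.idx\n' "$input" ;;   # ".idx" appended for features
        *)         echo "kind must be alignment or feature" >&2; return 1 ;;
    esac
}

index_file_for alignment reads.sorted.sam
index_file_for feature peaks.sorted.bed
```

Keeping the index beside the data file under this name is what allows IGV to find it when the data file is loaded.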

Sort

Sorts the input file by start position, as required.

Supported input file formats are: .cn, .igv, .sam, .aligned, .psl, .bed, and .vcf.

NOTE:  This command does not work with BAM files.  The samtools package can be used to sort .bam files.

Usage:

          igvtools  sort [options] [inputFile]  [outputFile]

Required arguments:

          inputFile 

          outputFile 

The special string "stdout" can be used as [outputFile], in which case the output will be written to the standard output stream instead of a file.

Options:

  -t tmpdir 

Specify a temporary working directory. For large input files this directory will be used to store intermediate results of the sort. The default is the user's temp directory.

  -m maxRecords 

The maximum number of records to keep in memory during the sort. The default value is 500000. Increase this number if you receive "too many open files" errors. Decrease it if you experience "out of memory" errors.
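Sorting is usually followed by indexing, so the two steps can be planned together. The sketch below prints the pair of commands for a given input and an optional -m value. The `.sorted` naming scheme and the helper name `sort_then_index_cmds` are assumptions for illustration. The commands are printed rather than executed.

```shell
# Print the sort and index commands for one input file.
#   $1  input file, e.g. peaks.bed
#   $2  optional maxRecords for -m (defaults to 500000, the documented default)
sort_then_index_cmds() {
    input="$1"
    max_records="${2:-500000}"
    ext="${input##*.}"                   # file extension, e.g. bed
    base="${input%.*}"                   # name without extension
    sorted="${base}.sorted.${ext}"       # assumed naming scheme
    printf 'igvtools sort -m %s %s %s\n' "$max_records" "$input" "$sorted"
    printf 'igvtools index %s\n' "$sorted"
}

# Lower -m, as suggested above for "out of memory" errors.
sort_then_index_cmds peaks.bed 250000
```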

Reposted from: https://www.cnblogs.com/xiaofeiIDO/p/6567084.html

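### IGV Download, Installation, and Usage Guide

#### Overview

Integrative Genomics Viewer (IGV) is a high-performance interactive visualization tool for exploring large, integrated genomic datasets. It supports a variety of data types, including array-based data, next-generation sequencing data, and genomic annotations[^2].

#### Download and Installation

IGV can be downloaded from the official web site and installed as follows:

1. **Visit the official site**: go to the IGV download page (http://www.broadinstitute.org/software/igv/).
2. **Choose a version**: download the build that matches your operating system. IGV supports Windows, Mac, and Linux[^2].
3. **Install**: once the download finishes, follow the installer to complete the installation.

Linux users can install IGV manually with the following commands:

```bash
# Download IGV
wget https://data.broadinstitute.org/igv/projects/downloads/<version>/IGV_Linux_<version>.tar.gz
# Extract the archive
tar -xvzf IGV_Linux_<version>.tar.gz
# Enter the extracted directory (its name may differ by release) and run IGV
cd IGV_Linux_<version>
./igv.sh
```

#### Usage

IGV offers a range of features for analyzing genomic data. The main ones are:

- **Loading data**: use "File -> Load from File" in the menu bar to load local genomic data files.
- **Browsing data**: drag and zoom to view different regions of the genome.
- **Loading remote data**: data can be loaded directly from a server, for example via "File -> Load from Server"[^2].
- **Customizing the display**: track height, color, and other display properties can be adjusted as needed.

#### IGV on the Web

In addition to the desktop version of IGV, there is a browser-based IGV-Web App. It runs in a modern browser without installing any software. Users can open the hosted application (https://igv.org/app) or deploy their own copy by following the guide[^1].

#### Technical Details

The IGV-Web App is built on igv.js and provides a lightweight way to browse and analyze genomic data. Because it runs entirely on the client, data stays on the user's machine, which helps protect security and privacy[^1].

### Notes

When using IGV, make sure the system meets the minimum hardware requirements and that the required Java runtime is installed (desktop version only). It is also advisable to update the software regularly to get the latest features and security patches.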