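01: Preparing the mpileup files
samtools mpileup -d 1000 -Q 20 -q 30 -f /data2/references/Homo_sapiens/hg38.genomic.fa pa01_tumor1.bam > pa01_tumor1.mpileup
Here -d caps the read depth per position at 1000, -Q sets the minimum base quality to 20, and -q sets the minimum mapping quality to 30. The somatic command also needs a pileup of the matched normal sample, produced the same way from the normal BAM.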
02: Using the VarScan2 somatic command
USAGE: java -jar VarScan.jar somatic [normal_pileup] [tumor_pileup] [output] OPTIONS
normal_pileup - The SAMtools pileup file for Normal
tumor_pileup - The SAMtools pileup file for Tumor
output - Output base name for SNP and indel output
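The two steps so far can be chained as in the sketch below. It is only an illustration: pa01_normal.bam and the output base name pa01 are assumed names (only pa01_tumor1.bam appears in these notes), and the run helper is a hypothetical wrapper that prints each command unless DRY_RUN=0 is set. The -o option of samtools mpileup is used in place of shell redirection so that the command can be passed through the wrapper.

```shell
#!/bin/sh
# Sketch only. pa01_normal.bam and the output base name "pa01" are assumed
# for illustration; only pa01_tumor1.bam appears in the notes above.
REF=/data2/references/Homo_sapiens/hg38.genomic.fa
DRY_RUN="${DRY_RUN:-1}"   # 1 = print each command, 0 = execute it

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# One pileup per sample, written with -o instead of shell redirection
for sample in pa01_normal pa01_tumor1; do
    run samtools mpileup -d 1000 -Q 20 -q 30 -f "$REF" \
        -o "$sample.mpileup" "$sample.bam"
done

# Normal pileup first, tumor pileup second, then the output base name
run java -jar VarScan.jar somatic pa01_normal.mpileup pa01_tumor1.mpileup pa01
```

With the default DRY_RUN=1 the script only prints the three commands, which makes it easy to check the argument order before running anything for real.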
03: Optional parameters of the somatic command
OPTIONS:
--output-snp - Output file for SNP calls [default: output.snp]
--output-indel - Output file for indel calls [default: output.indel]
--min-coverage - Minimum coverage in normal and tumor to call variant [8]
--min-coverage-normal - Minimum coverage in normal to call somatic [8]
--min-coverage-tumor - Minimum coverage in tumor to call somatic [6]
--min-var-freq - Minimum variant frequency to call a heterozygote [0.10]
--min-freq-for-hom - Minimum frequency to call homozygote [0.75]
--normal-purity - Estimated purity (non-tumor content) of normal sample [1.00]
--tumor-purity - Estimated purity (tumor content) of tumor sample [1.00]
--p-value - P-value threshold to call a heterozygote [0.99]
--somatic-p-value - P-value threshold to call a somatic site [0.05]
--strand-filter - If set to 1, removes variants with >90% strand bias
--validation - If set to 1, outputs all compared positions even if non-variant
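As a rough illustration of how these options attach to the command, the sketch below prints a somatic call with a few of them spelled out. The function name somatic_cmd, the pileup file names, and the base name pa01 are assumptions made for the example; the option values are the defaults listed above, except that --strand-filter is switched on.

```shell
#!/bin/sh
# Sketch only: somatic_cmd is a hypothetical helper, and the file names
# used in the call at the bottom are assumed, not taken from the notes.
somatic_cmd() {
    # $1 = normal pileup, $2 = tumor pileup, $3 = output base name
    echo java -jar VarScan.jar somatic "$1" "$2" "$3" \
        --min-coverage-normal 8 \
        --min-coverage-tumor 6 \
        --min-var-freq 0.10 \
        --somatic-p-value 0.05 \
        --strand-filter 1
}

# Print the full command line for review
somatic_cmd pa01_normal.mpileup pa01_tumor1.mpileup pa01
```

Once the printed line looks right, it can be piped to sh to execute it, provided the file names contain no spaces.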
04: Variant Calling and Comparison
If tumor matches normal:
    If tumor and normal match the reference
        ==> Call Reference
    Else tumor and normal do not match the reference
        ==> Call Germline
Else tumor does not match normal:
    Calculate significance of allele frequency difference by Fisher's Exact Test
    If difference is significant (p-value < threshold):
        If normal matches reference
            ==> Call Somatic
        Else If normal is heterozygous
            ==> Call LOH
        Else normal and tumor are variant, but different
            ==> Call IndelFilter or Unknown
    Else difference is not significant:
        Combine tumor and normal read counts for each allele. Recalculate p-value.
            ==> Call Germline
variant_p_value - Variant p-value for Germline events
somatic_p_value - Somatic p-value for Somatic/LOH events
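The decision tree above can be written as a small function to make the branches easier to follow. This is a simplified sketch, not VarScan's own code: the function name classify is hypothetical, genotypes are assumed to be single bases or IUPAC codes for SNVs (R, Y, K, M, S, W meaning heterozygous), and the p-value is passed in already computed instead of being derived by Fisher's Exact Test. The "IndelFilter or Unknown" branch is reported simply as Unknown, and the non-significant branch skips the p-value recalculation.

```shell
#!/bin/sh
# Sketch only: classify is a hypothetical helper that mirrors the decision
# tree in the text. It does not run Fisher's Exact Test itself.
classify() {
    # $1 = reference base, $2 = normal genotype, $3 = tumor genotype,
    # $4 = p-value of the allele frequency difference,
    # $5 = significance threshold (assumed default 0.05)
    awk -v ref="$1" -v n="$2" -v t="$3" -v p="$4" -v thr="${5:-0.05}" 'BEGIN {
        if (t == n) {
            # Tumor matches normal
            if (n == ref) print "Reference"
            else          print "Germline"
        } else if (p + 0 < thr + 0) {
            # Tumor differs from normal and the difference is significant
            if (n == ref)                print "Somatic"
            else if (n ~ /^[RYKMSW]$/)   print "LOH"
            else                         print "Unknown"
        } else {
            # Difference is not significant
            print "Germline"
        }
    }'
}

classify A A R 0.001   # normal matches reference, tumor differs
classify A R G 0.001   # normal heterozygous, tumor differs
```

Under these assumptions the first call takes the Somatic branch and the second takes the LOH branch.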
05: Isolating Calls by Type and Confidence
The latest release of VarScan includes a new (undocumented) subcommand that will separate a somatic output file by somatic_status (Germline, Somatic, LOH). Somatic mutations will further be classified as high-confidence (.hc) or low-confidence (.lc).
The command: java -jar VarScan.jar processSomatic [output.snp]
The above command will produce 4 output files:
output.snp.Somatic.hc (high-confidence Somatic mutations)
output.snp.Somatic.lc (low-confidence Somatic mutations)
output.snp.Germline (sites called Germline)
output.snp.LOH (sites called loss-of-heterozygosity, or LOH)
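After running processSomatic, a quick check that the four files listed above were written can save some confusion later. The helper below is a hypothetical sketch (check_outputs is not part of VarScan); it only looks for the four file names given in these notes and reports a line count for each one it finds.

```shell
#!/bin/sh
# Sketch only: check_outputs is a hypothetical helper, not part of VarScan.
check_outputs() {
    # $1 = the SNP file that was passed to processSomatic, e.g. output.snp
    for suffix in Somatic.hc Somatic.lc Germline LOH; do
        f="$1.$suffix"
        if [ -f "$f" ]; then
            # The count includes a header line if the file has one
            printf '%s\t%s lines\n' "$f" "$(wc -l < "$f" | tr -d ' ')"
        else
            printf '%s\tmissing\n' "$f"
        fi
    done
}

check_outputs output.snp
```

Any file reported as missing suggests that processSomatic did not finish or was pointed at a different input path.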
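In summary, the latest version of VarScan2 adds a subcommand that separates variant calls by type (Somatic, Germline, LOH) and by confidence level (high or low). It writes four output files: high-confidence somatic mutations, low-confidence somatic mutations, germline variants, and LOH sites.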