Varscan2 Call snp_indel

VarScan2 的最新版本引入了一个新的子命令,用于按变异类型(体细胞、生殖细胞、LOH)和信心级别(高信心、低信心)隔离变异调用。这将生成四个输出文件,分别包含高信心和低信心的体细胞变异,生殖细胞变异以及LOH位点。

Varscan2官方文档

01:mpileup文件准备
samtools mpileup -d 1000 -Q 20 -q 30 -f /data2/references/Homo_sapiens/hg38.genomic.fa pa01_tumor1.bam >pa01_tumor1.mpileup 
02:使用Varscan2的 somatic 命令
USAGE: java -jar VarScan.jar somatic [normal_pileup] [tumor_pileup] [output] 
OPTIONS
        normal_pileup - The SAMtools pileup file for Normal
        tumor_pileup - The SAMtools pileup file for Tumor
        output - Output base name for SNP and indel output
03:somatic 命令 可选参数
OPTIONS:
	--output-snp - Output file for SNP calls [default: output.snp]
	--output-indel - Output file for indel calls [default: output.indel]
	--min-coverage - Minimum coverage in normal and tumor to call variant [8]
	--min-coverage-normal - Minimum coverage in normal to call somatic [8]
	--min-coverage-tumor - Minimum coverage in tumor to call somatic [6]
	--min-var-freq - Minimum variant frequency to call a heterozygote [0.10]
	--min-freq-for-hom	Minimum frequency to call homozygote [0.75]
	--normal-purity - Estimated purity (non-tumor content) of normal sample [1.00]
	--tumor-purity - Estimated purity (tumor content) of tumor sample [1.00]
	--p-value - P-value threshold to call a heterozygote [0.99]
	--somatic-p-value - P-value threshold to call a somatic site [0.05]
	--strand-filter - If set to 1, removes variants with >90% strand bias
	--validation - If set to 1, outputs all compared positions even if non-variant
Variant Calling and Comparison
If tumor matches normal: 
        If tumor and normal match the reference 
            ==> Call Reference 
        Else tumor and normal do not match the reference 
            ==> Call Germline 

    Else tumor does not match normal: 
    Calculate significance of allele frequency difference by Fisher's Exact Test 
        If difference is significant (p-value < threshold):
            If normal matches reference
                ==> Call Somatic
            Else If normal is heterozygous
                ==> Call LOH
            Else normal and tumor are variant, but different
                ==> Call IndelFilter or Unknown
        Else difference is not significant:
        Combined tumor and normal read counts for each allele. Recalculate p-value.
            ==> Call Germline 
variant_p_value:Variant p-value for Germline events somatic_p_value:Somatic p-value for Somatic/LOH events
Isolating Calls by Type and Confidence

The latest release of VarScan includes a new (undocumented) subcommand that will separate a somatic output file by somatic_status (Germline, Somatic, LOH). Somatic mutations will further be classified as high-confidence (.hc) or low-confidence (.lc).

The command: java -jar VarScan.jar processSomatic [output.snp]

The above command will produce 4 output files:
output.snp.Somatic.hc (high-confidence Somatic mutations)
output.snp.Somatic.lc (low-confidence Somatic mutations)
output.snp.Germline (sites called Germline)
output.snp.LOH (sites called loss-of-heterozygosity, or LOH)

data_dir='/public/work/Personal/wuxu/qiantao_17' for file1 in ${data_dir}/*.fasta; do for file2 in ${data_dir}/*.fasta; do if [ "$file1" != "$file2" ]; then touch snp_indel.end.sh && cat snp_indel.end.sh && \ export PATH=/public/work/Personal/pangshuai/software/conda/miniconda3/bin/:${PATH} && \ nucmer --mum -t 8 -g 1000 -p ${file1##*/}.${file2##*/}.ref_based.nucmer $file1 $file2 && \ delta-filter -1 -l 200 ${file1##*/}.${file2##*/}.ref_based.nucmer.delta > ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter && \ dnadiff -d ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter -p ${file1##*/}.${file2##*/}.ref_based.nucmer && \ show-coords -rcloT ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter > ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter.coords && \ show-coords -THrd ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter > ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter.syri.coords && \ show-snps -ClrTH ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter > ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter.snp && \ show-diff ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter > ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter.inv && \ perl /public/work/Pipline/Structural_Variation/pipeline/2.1.1/bin/filter_the_MUmmer_SNP_file.pl ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter.snp ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter.snp.SNPs ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter.snp.Insertions ${file1##*/}.${file2##*/}.ref_based.nucmer.delta.filter.snp.Deletions 10000000 && \ touch snp_indel.end.tmp && \ mv snp_indel.end.tmp snp_indel.end && \ sleep 10 fi done done ,增加一个判断,使/public/work/Personal/wuxu/qiantao_17路径下以.fasta结尾的文件两两一组不分前后只组合一次,然后再执行touch 后面的代码
06-03
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值