Merged paired-end reads, V3-V4 region sequencing.
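The data in 00.RawData have already been demultiplexed by sample, with barcodes and primers removed. Each sample folder contains 5 files: the first, extendedFrags.fastq, holds the merged sequences; raw_1.fq.gz and raw_2.fq.gz are the paired-end reads with barcodes and primers still attached; the last two files are the raw reads after primer and barcode removal.
extendedFrags.fastq is produced by FLASH, which merges the two reads of each pair (i.e. read joining).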
Processing steps:
1. Import data
1) Create the manifest file seq-list.tsv (columns are tab-separated; absolute paths are required)
sample-id absolute-filepath
A1 $PWD/data/A1_16S.fastq
A2 $PWD/data/A2_16S.fastq
A3 $PWD/data/A3_16S.fastq
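Writing the manifest by hand gets tedious with many samples. The following is a minimal sketch that generates it from a directory, assuming every file is named <sample-id>_16S.fastq as in the listing above. The function name make_manifest is hypothetical, not part of QIIME 2.

```shell
# make_manifest DIR
# Print a tab-separated single-end manifest for every DIR/*_16S.fastq file.
# The "_16S.fastq" suffix follows the example names above (an assumption);
# adjust it if your files are named differently.
make_manifest() {
  # Resolve DIR to an absolute path, as the manifest requires.
  dir=$(cd "$1" && pwd) || return 1
  printf 'sample-id\tabsolute-filepath\n'
  for f in "$dir"/*_16S.fastq; do
    [ -e "$f" ] || continue   # no match: the glob stays literal, so skip it
    printf '%s\t%s\n' "$(basename "$f" _16S.fastq)" "$f"
  done
}

# Example: make_manifest data > seq-list.tsv
```

The sample ID is simply the file name with the suffix removed, so files whose names do not end in that suffix are ignored.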
2) Import the data
qiime tools import \
--type 'SampleData[SequencesWithQuality]' \
--input-path seq-list.tsv \
--output-path seqs.qza \
--input-format SingleEndFastqManifestPhred33V2
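If the import fails because a listed file cannot be found, it can help to check the manifest on its own first. The sketch below assumes a two-column single-end manifest like the one above; check_manifest is a hypothetical helper name, and its handling of a leading $PWD is a simple stand-in for what QIIME 2 does, not a copy of it.

```shell
# check_manifest MANIFEST
# Return 0 if every file in the second column exists, 1 otherwise.
# A leading literal $PWD in a path is replaced by the current directory.
# Assumes a two-column (single-end) tab-separated manifest with one header line.
check_manifest() {
  status=0
  tab=$(printf '\t')
  while IFS=$tab read -r id path; do
    [ -n "$id" ] || continue            # skip blank lines
    case $path in
      '$PWD'/*) path=$PWD/${path#\$PWD/} ;;
    esac
    if [ ! -f "$path" ]; then
      echo "missing file for $id: $path" >&2
      status=1
    fi
  done <<EOF
$(tail -n +2 "$1")
EOF
  return "$status"
}

# Example: check_manifest seq-list.tsv && echo "manifest OK"
```

Run it from the same directory in which qiime tools import will be run, since $PWD depends on the current directory.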
2. Filter sequences by base quality score to obtain Clean Data
qiime quality-filter q-score \
--i-demux seqs.qza \
--o-filtered-sequences demux-filtered.qza \
--o-filter-stats demux-filter-stats.qza
###Saved SampleData[SequencesWithQuality] to: demux-filtered.qza
###Saved QualityFilterStats to: demux-filter-stats.qza
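3. Quality control and feature table construction (using deblur or vsearch)
1) Denoise 16S reads with deblur (chimera removal is built in)
deblur requires all input sequences to have the same length when denoising, so reads are trimmed to a common length with --p-trim-length; reads shorter than that length are discarded.
The deblur developers recommend setting this value to a length where the median quality score begins to drop too low.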
qiime deblur denoise-16S \
--i-demultiplexed-seqs demux-filtered.qza \
--p-trim-length 120 \
--o-representative-sequences new-seqs.qza \
--o-table new-table.qza \
--p-sample-stats \
--o-stats deblur-stats.qza
###Saved FeatureTable[Frequency] to: new-table.qza
###Saved FeatureData[Sequence] to: new-seqs.qza
###Saved DeblurStats to: deblur-stats.qza
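Because deblur discards reads shorter than the trim length, it is worth checking how many reads a candidate value would keep before running the denoising step. This is a rough sketch that works directly on an uncompressed FASTQ file such as extendedFrags.fastq, assuming standard four-line records; count_reads_at_least is a hypothetical helper name.

```shell
# count_reads_at_least FASTQ MINLEN
# Print "<kept> <total>": the number of reads at least MINLEN bases long,
# and the total number of reads. Assumes four lines per record
# (sequence on line 2 of each record, no line wrapping).
count_reads_at_least() {
  awk -v min="$2" '
    NR % 4 == 2 { total++; if (length($0) >= min) kept++ }
    END { printf "%d %d\n", kept, total }
  ' "$1"
}

# Example: count_reads_at_least extendedFrags.fastq 120
```

This only looks at read length. Choosing the trim length still requires the quality plot, for example from qiime demux summarize.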
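2) vsearch (dereplication only; it writes the same output file names as option 1, so run only one of the two)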
qiime vsearch dereplicate-sequences \
--i-sequences demux-filtered.qza \
--o-dereplicated-table new-table.qza \
--o-dereplicated-sequences new-seqs.qza
###Saved FeatureTable[Frequency] to: new-table.qza
###Saved FeatureData[Sequence] to: new-seqs.qza
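Dereplication collapses identical sequences into a single feature and records how often each one occurs. The sketch below illustrates the idea on one uncompressed FASTQ file with standard tools. It is not how vsearch is implemented, and it produces a plain count list for one file rather than a per-sample feature table; derep_counts is a hypothetical name.

```shell
# derep_counts FASTQ
# Print "<count><TAB><sequence>" for each distinct sequence in FASTQ,
# most abundant first (ties ordered by sequence).
# Assumes four-line FASTQ records; exact string identity only.
derep_counts() {
  awk '
    NR % 4 == 2 { n[$0]++ }
    END { for (s in n) printf "%d\t%s\n", n[s], s }
  ' "$1" | sort -k1,1nr -k2,2
}

# Example: derep_counts extendedFrags.fastq | head
```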
4. Cluster into OTUs
1) Closed-reference
# Convert the reference database rep_set/97_otus.fasta to .qza format
qiime tools import \
--input-path rep_set/97_otus.fasta \
--output-path 97_otus.qza \
--type 'FeatureData[Sequence]'
#Imported rep_set/97_otus.fasta as DNASequencesDirectoryFormat to 97_otus.qza
qiime vsearch cluster-features-closed-reference \
--i-table new-table.qza \
--i-sequences new-seqs.qza \
--i-reference-sequences 97_otus.qza \
--p-perc-identity 0.97 \
--o-clustered-table table-cr-97.qza \
--o-clustered-sequences seqs-cr-97.qza \
--o-unmatched-sequences unmatched-cr-97.qza
#Saved FeatureTable[Frequency] to: table-cr-97.qza
#Saved FeatureData[Sequence] to: seqs-cr-97.qza
#Saved FeatureData[Sequence] to: unmatched-cr-97.qza
2) De novo
qiime vsearch cluster-features-de-novo \
--i-table new-table.qza \
--i-sequences new-seqs.qza \
--p-perc-identity 0.99 \
--o-clustered-table table-dn-99.qza \
--o-clustered-sequences rep-seqs-dn-99.qza
3) Open-reference
qiime vsearch cluster-features-open-reference \
--i-table new-table.qza \
--i-sequences new-seqs.qza \
--i-reference-sequences 97_otus.qza \
--p-perc-identity 0.97 \
--o-clustered-table table-or-97.qza \
--o-clustered-sequences rep-seqs-or-97.qza \
--o-new-reference-sequences new-ref-seqs-or-97.qza
Note: merging paired-end reads with vsearch
Create the manifest file seq-list.tsv (tab-separated)
sample-id forward-absolute-filepath reverse-absolute-filepath
A1 $PWD/data/A1_16S_R1.fastq $PWD/data/A1_16S_R2.fastq
A2 $PWD/data/A2_16S_R1.fastq $PWD/data/A2_16S_R2.fastq
A3 $PWD/data/A3_16S_R1.fastq $PWD/data/A3_16S_R2.fastq
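The paired-end manifest has to be imported before the reads can be merged. The sketch below builds the manifest from a directory and wraps the import command. It assumes files named <sample-id>_16S_R1.fastq and <sample-id>_16S_R2.fastq as in the listing above; both function names are hypothetical, and using primer-trimmed-demux.qza as the output name is an assumption made to match the input of the merge step.

```shell
# make_paired_manifest DIR
# Print a tab-separated paired-end manifest for DIR/*_16S_R1.fastq and the
# matching *_16S_R2.fastq files. Fails if a reverse file is missing.
make_paired_manifest() {
  dir=$(cd "$1" && pwd) || return 1
  printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'
  for r1 in "$dir"/*_16S_R1.fastq; do
    [ -e "$r1" ] || continue              # no match: skip the literal glob
    r2=${r1%_R1.fastq}_R2.fastq           # derive the reverse-read file name
    if [ ! -e "$r2" ]; then
      echo "no reverse file for $r1" >&2
      return 1
    fi
    printf '%s\t%s\t%s\n' "$(basename "$r1" _16S_R1.fastq)" "$r1" "$r2"
  done
}

# import_paired MANIFEST OUTPUT.qza
# Import a paired-end manifest (needs an activated QIIME 2 environment).
import_paired() {
  qiime tools import \
    --type 'SampleData[PairedEndSequencesWithQuality]' \
    --input-path "$1" \
    --output-path "$2" \
    --input-format PairedEndFastqManifestPhred33V2
}

# Example:
#   make_paired_manifest data > seq-list.tsv
#   import_paired seq-list.tsv primer-trimmed-demux.qza
```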
Merge (the input must be an imported SampleData[PairedEndSequencesWithQuality] artifact):
qiime vsearch join-pairs \
--i-demultiplexed-seqs primer-trimmed-demux.qza \
--p-threads 4 \
--o-joined-sequences demux-joined.qza
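# QIIME 2 2022.11 and later: join-pairs was renamed, so use merge-pairs (the output option is --o-merged-sequences)
# Later releases may also require --o-unmerged-sequences; check qiime vsearch merge-pairs --help
qiime vsearch merge-pairs \
--i-demultiplexed-seqs primer-trimmed-demux.qza \
--p-threads 4 \
--o-merged-sequences demux-joined.qza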
# View the merge results
qiime demux summarize \
--i-data demux-joined.qza \
--o-visualization demux-joined.qzv
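In a script that has to run on more than one QIIME 2 release, the choice between the two subcommands can be made from the version number. This is a small sketch using the 2022.11 cut-off stated in the note above; vsearch_merge_subcommand is a hypothetical helper, and how the version string is obtained (for example by parsing the output of qiime --version) is left as an assumption.

```shell
# vsearch_merge_subcommand VERSION
# Print "merge-pairs" for QIIME 2 releases 2022.11 and later, otherwise
# "join-pairs". VERSION is a release string such as 2022.8 or 2023.2.0.
vsearch_merge_subcommand() {
  year=${1%%.*}          # text before the first dot
  rest=${1#*.}           # text after the first dot
  month=${rest%%.*}      # second component only (drops a patch number)
  if [ "$year" -gt 2022 ] || { [ "$year" -eq 2022 ] && [ "$month" -ge 11 ]; }; then
    echo merge-pairs
  else
    echo join-pairs
  fi
}

# Example: vsearch_merge_subcommand 2022.8
```

The components are compared as numbers rather than as text, so 2022.8 is correctly treated as older than 2022.11.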