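Before library construction, nucleic acids are randomly fragmented, and some (mRNA, for example) vary in length to begin with, so the inserts between the adapters also vary in length. Next-generation sequencing read lengths are generally fixed, so some inserts shorter than the read length will inevitably be sequenced. The resulting reads contain part or all of an adapter sequence, which has to be detected so that the affected reads can be filtered out or the adapter sequence clipped off.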
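The read-through effect described above can be sketched in a few lines of Python. This is only an illustration: the adapter string and the helper names `simulate_read` and `adapter_bases_in_read` are assumptions made for this example and are not part of Trimmomatic.

```python
# Illustrative adapter prefix (an assumption for this sketch; real adapter
# sequences come from the FASTA file passed to ILLUMINACLIP).
ADAPTER = "AGATCGGAAGAGC"


def simulate_read(insert: str, read_length: int, adapter: str = ADAPTER) -> str:
    """Return the bases a fixed-length read would cover: the insert first,
    then adapter bases if the insert is shorter than the read length."""
    return (insert + adapter)[:read_length]


def adapter_bases_in_read(insert: str, read_length: int, adapter: str = ADAPTER) -> int:
    """Count how many adapter bases end up inside the read."""
    return max(0, min(read_length - len(insert), len(adapter)))


# A 10-base insert sequenced with 15-base reads picks up 5 adapter bases.
print(simulate_read("ACGTACGTAC", 15))          # ACGTACGTACAGATC
print(adapter_bases_in_read("ACGTACGTAC", 15))  # 5
```

An insert at least as long as the read length yields zero adapter bases, which is why only the short fragments need clipping.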
Paired End:
You often don't need leading and trailing clipping. In general, keepBothReads can also be useful when working with paired-end data: you keep even redundant information, but this likely makes your pipelines more manageable. Note the additional :2 in front of keepBothReads; this is the minimum adapter length in palindrome mode, and you can even set it to 1. (The default is a very conservative 8.)
java -jar trimmomatic-0.39.jar PE input_forward.fq.gz input_reverse.fq.gz output_forward_paired.fq.gz output_forward_unpaired.fq.gz output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:2:keepBothReads LEADING:3 TRAILING:3 MINLEN:36
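To see which value in the ILLUMINACLIP step is which, the colon-separated fields can be labeled with a small Python helper. This is a sketch, not Trimmomatic code: `parse_illuminaclip` and the `IlluminaClip` tuple are hypothetical names, the field order follows the Trimmomatic manual, and the last field is kept as the raw string without interpreting it.

```python
from typing import NamedTuple, Optional


class IlluminaClip(NamedTuple):
    adapter_fasta: str
    seed_mismatches: int
    palindrome_clip_threshold: int
    simple_clip_threshold: int
    min_adapter_length: Optional[int]   # optional; None means "not given"
    keep_both_reads: Optional[str]      # optional; raw text, not interpreted


def parse_illuminaclip(step: str) -> IlluminaClip:
    """Split an ILLUMINACLIP step into named fields (hypothetical helper)."""
    parts = step.split(":")
    if parts[0] != "ILLUMINACLIP" or not 5 <= len(parts) <= 7:
        raise ValueError(f"not an ILLUMINACLIP step: {step!r}")
    return IlluminaClip(
        adapter_fasta=parts[1],
        seed_mismatches=int(parts[2]),
        palindrome_clip_threshold=int(parts[3]),
        simple_clip_threshold=int(parts[4]),
        min_adapter_length=int(parts[5]) if len(parts) > 5 else None,
        keep_both_reads=parts[6] if len(parts) > 6 else None,
    )


clip = parse_illuminaclip("ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:2:keepBothReads")
print(clip.min_adapter_length)  # 2
```

When the fifth and sixth values are omitted, as in the shorter form `ILLUMINACLIP:TruSeq3-PE.fa:2:30:10`, the helper returns `None` for both, and Trimmomatic falls back to its defaults.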
For reference only (less sensitive for adapters):
java -jar trimmomatic-0.35.jar PE -phred33 input_forward.fq.gz input_reverse.fq.gz output_forward_paired.fq.gz output_forward_unpaired.fq.gz output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
• Remove adapters (ILLUMINACLIP:TruSeq3-PE.fa:2:30:10)
• Remove leading low-quality or N bases (below quality 3) (LEADING:3)
• Remove trailing low-quality or N bases (below quality 3) (TRAILING:3)
• Scan the read with a 4-base-wide sliding window, cutting when the average quality per base drops below 15 (SLIDINGWINDOW:4:15)
• Drop reads shorter than 36 bases (MINLEN:36)
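The quality-based steps in the list above can be sketched in Python as follows. This is a simplified illustration, not Trimmomatic's implementation: the function names are hypothetical, quality scores are assumed to be already decoded to integers, adapter clipping is left out, and the sliding-window step simply cuts at the start of the first window whose mean quality falls below the threshold.

```python
from typing import List, Optional, Tuple


def leading_cut(quals: List[int], threshold: int) -> int:
    """Index of the first base with quality >= threshold."""
    start = 0
    while start < len(quals) and quals[start] < threshold:
        start += 1
    return start


def trailing_cut(quals: List[int], threshold: int) -> int:
    """Number of bases kept after dropping low-quality bases from the end."""
    end = len(quals)
    while end > 0 and quals[end - 1] < threshold:
        end -= 1
    return end


def sliding_window_cut(quals: List[int], window: int, required: int) -> int:
    """Number of bases kept: the start of the first window whose mean
    quality is below `required` (simplified rule for this sketch)."""
    for i in range(len(quals) - window + 1):
        if sum(quals[i:i + window]) < required * window:
            return i
    return len(quals)


def trim_read(seq: str, quals: List[int], leading: int = 3, trailing: int = 3,
              window: int = 4, required: int = 15,
              minlen: int = 36) -> Optional[Tuple[str, List[int]]]:
    """Apply LEADING, TRAILING, SLIDINGWINDOW, MINLEN in that order.
    Returns None when the read is dropped for being too short."""
    start = leading_cut(quals, leading)
    seq, quals = seq[start:], quals[start:]
    end = trailing_cut(quals, trailing)
    seq, quals = seq[:end], quals[:end]
    keep = sliding_window_cut(quals, window, required)
    seq, quals = seq[:keep], quals[:keep]
    if len(seq) < minlen:
        return None
    return seq, quals
```

For example, `trim_read("NNACGTACGTAN", [2, 2, 30, 30, 30, 30, 30, 10, 10, 10, 10, 2], minlen=3)` keeps only the five high-quality bases `"ACGTA"`, while the same call with the default `minlen=36` drops the read.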
Single End:
java -jar trimmomatic-0.35.jar SE -phred33 input.fq.gz output.fq.gz ILLUMINACLIP:TruSeq3-SE:2:30:10 LEADING:3 TRAILING:3 SLIDINGWIN

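Trimmomatic is a preprocessing tool for next-generation sequencing data, used mainly to remove adapter sequences, primer sequences, and low-quality bases. It uses two strategies, simple mode and palindrome mode, and handles both PE and SE data. Parameters such as ILLUMINACLIP, LEADING, TRAILING, SLIDINGWINDOW, and MINLEN allow flexible filtering of adapter contamination and low-quality sequence. In palindrome mode, Trimmomatic can detect and remove adapter fragments as short as 1 bp, which improves data quality. In practice, adjusting parameters, for example setting keepBothReads to true, can markedly increase the proportion of read pairs that are retained.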