Today I rather foolishly wrote my own Perl script to count the number of sequences in a FASTA file.
#!/usr/bin/perl -w
# Program name: detectDataNum.pl
# Author : SunChen
# Contact : bbsunchen@gmail.com
# Date : 04/21/2011
# Last Update : 04/21/2011
# Reference : Please cite our following papers when you are using this script.
# None
#
# Description : count the sequence records (header lines starting with '>') in a FASTA file.
#===========================================================================
use warnings;
use strict;
use Getopt::Long;
my %opts;
GetOptions(\%opts,"f:s");
my $usage= <<"USAGE";
Program : $0
INPUT:
-f input FASTA file
USAGE
die $usage unless $opts{f};
open DATA, '<', $opts{f} or die "Can't open file $opts{f}: $!"; # 3-arg open avoids mode injection via the file name
my $lines = 0;
while(<DATA>)
{
my $data = $_;
chomp($data);#deal with \n
$data=~s/\r//g;#deal with \r
next unless($data=~/\S+/); #deal with blank line here.
if($data =~ m/^>/)
{
$lines++;
}
}
close DATA;
print "data num is $lines\n";
Actually, a single line does the job...
grep -c "^>" s1.fa
Given several files, grep -c "^>" s1.fa s2.fa prints one count per file, for example 100 and 200 respectively.
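To make the grep approach concrete, here is a minimal sketch. The file contents and the temporary directory are made-up sample data, not from the original post. It shows the single-file count, the per-file output when several files are given, and one possible way to get a grand total by concatenating the files first.

```shell
#!/bin/sh
# Build two tiny FASTA files in a temporary directory (hypothetical sample data).
tmpdir=$(mktemp -d)
printf '>seq1\nACGT\n>seq2\nGGCC\n' > "$tmpdir/s1.fa"
printf '>seqA\nTTAA\n\n>seqB\nCCGG\n>seqC\nAACC\n' > "$tmpdir/s2.fa"

# One file: grep -c prints only the number of lines that start with '>'.
count_s1=$(grep -c '^>' "$tmpdir/s1.fa")

# Several files: grep -c prints one "name:count" line per file.
per_file=$(cd "$tmpdir" && grep -c '^>' s1.fa s2.fa)

# Grand total: concatenate first so grep sees a single input stream.
total=$(cat "$tmpdir/s1.fa" "$tmpdir/s2.fa" | grep -c '^>')

echo "$count_s1"
echo "$per_file"
echo "$total"

# Clean up the temporary files.
rm -rf "$tmpdir"
```

Like the Perl script, this counts header lines only, so blank lines and Windows line endings do not affect the result.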