【Question】
I have one file that looks like this:
>Unc14086
AGAGUUUGAU
>Unc35443
GCACGAGAAA
So after every n lines (n may vary), a line starting with “>” marks the beginning of a new block of information.
I have another tab-delimited file:
Unc14086	InformationalTextExample
Unc35443	InformationalTextExampleII
My goal is to look up, in the second file, the IDs found on the lines starting with “>” in the first file. Whenever a match occurs, I want to append the corresponding text (e.g. “InformationalTextExample”) to that line, possibly separated by “_”:
>Unc14086_InformationalTextExample
AGAGUUUGAU
>Unc35443_InformationalTextExampleII
GCACGAGAAA
How would that be possible?
Thank you!
【Answer】
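The Perl solution is clearly structured, but the script is still rather long. This kind of structured computation is simpler with esProc's loop functions. The SPL script is as follows: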
| | A |
| 1 | =file("one.txt").read@n() |
| 2 | =file("another.txt").import() |
| 3 | =A1.(if(left(~,1)!=">",~,A2.select@1(mid(A1.~,2)==_1).(">"+_1+"_"+_2))) |
For more details on how to use loop functions, see 【SQL 难点解决:循环计算】.
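For readers without esProc, the same lookup-and-rewrite idea can be sketched in Python. This is only an illustrative sketch, not the method from the original answer: the function names `load_annotations`, `annotate_headers` and `annotate_files` are hypothetical, the file names `one.txt` and `another.txt` are taken from the SPL script above, and the sketch assumes that a header line with no match in the second file should be left unchanged.

```python
from typing import Dict, Iterable, List


def load_annotations(lines: Iterable[str]) -> Dict[str, str]:
    """Build an ID -> text lookup from tab-delimited lines (hypothetical helper)."""
    table: Dict[str, str] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key, _, text = line.partition("\t")
        table[key] = text
    return table


def annotate_headers(lines: Iterable[str], table: Dict[str, str]) -> List[str]:
    """Append "_<text>" to every ">" line whose ID is found in the lookup."""
    out: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            key = line[1:].strip()
            # Assumption: headers without a match are kept as they are.
            if key in table:
                line = f">{key}_{table[key]}"
        out.append(line)
    return out


def annotate_files(seq_path: str, ann_path: str) -> List[str]:
    """Read both files and return the annotated lines of the first one."""
    with open(ann_path, encoding="utf-8") as ann:
        table = load_annotations(ann)
    with open(seq_path, encoding="utf-8") as seq:
        return annotate_headers(seq, table)
```

Calling `annotate_files("one.txt", "another.txt")` would return the rewritten lines, which can then be joined with newlines and written to an output file. Loading the second file into a dictionary first means each header costs one lookup rather than a scan of the whole annotation file.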
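This question is about matching and merging information from two files. The first file contains lines beginning with '>', and the second file is tab-delimited. The goal is to find the entries that match between the two files and add the information from the second file to the first. The proposed solution suggests using Perl or SPL loop functions to do this.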



