[Question]
I needed to extract every row of one file (Data.txt) in which a particular column contains an entry from a list (List.txt), and write those rows to a third file (output.txt).
Data.txt (tab delimited)
some_data more_data other_data here yet_more_data etc
A B 2 Gee;Whiz;Hello 13 12
A B 2 Gee;Whizz;Hi 56 32
E 4 Btm;Lol 16 2
T 3 Whizz 13 3
List.txt
Gee
Whiz
Lol
Ideally output.txt looks like
some_data more_data other_data here yet_more_data etc
A B 2 Gee;Whiz;Hello 13 12
A B 2 Gee;Whizz;Hi 56 32
E 4 Btm;Lol 16 2
So I tried a shell script
for ids in List.txt
do
grep $ids Data.txt >> output.txt
done
except that in the actual script I typed out (cut and pasted, actually) every entry of List.txt in place of the file name.
Unfortunately, the resulting output.txt included the last line of Data.txt, I assume because ‘Whizz’ contains ‘Whiz’.
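The substring problem described above can be avoided by asking grep for whole-word matches. The following is only a sketch under assumptions: the sample files are recreated in a temporary directory, the exact tab layout of Data.txt (including the empty second field on the last two rows) is guessed from the listing, and `-w` is an option of GNU and BSD grep rather than of POSIX grep. Note that it matches a list entry in any column, not only the intended one.

```shell
#!/bin/sh
# Recreate the sample files from the question (tab layout is an assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'some_data\tmore_data\tother_data\there\tyet_more_data\tetc\n' > Data.txt
printf 'A\tB\t2\tGee;Whiz;Hello\t13\t12\n' >> Data.txt
printf 'A\tB\t2\tGee;Whizz;Hi\t56\t32\n' >> Data.txt
printf 'E\t\t4\tBtm;Lol\t16\t2\n' >> Data.txt
printf 'T\t\t3\tWhizz\t13\t3\n' >> Data.txt
printf 'Gee\nWhiz\nLol\n' > List.txt

# -F: treat each list entry as a fixed string, not a regular expression
# -f: read the entries from List.txt, one per line
# -w: accept a match only if it is a whole word, so "Whiz" no longer hits "Whizz"
{
    head -n 1 Data.txt                              # always keep the header row
    tail -n +2 Data.txt | grep -w -F -f List.txt    # filter the data rows
} > output.txt

cat output.txt
```

With the sample data this keeps the header and the first three data rows and drops the `Whizz` row, because `;` and tab both count as word boundaries for `-w`.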
I also t

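To extract the rows of Data.txt that share values with List.txt, convert the relevant field of Data.txt into a set and intersect it with the entries of List.txt; rows with a non-empty intersection give the desired result. The post notes that the shell-script attempt did not succeed and suggests using the SPL language instead, whose set operator "^" performs the intersection and solves the problem concisely and efficiently.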
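The same set-intersection idea can also be sketched with awk instead of SPL. This is an illustration, not the SPL solution the post refers to. It assumes that the semicolon-separated values sit in the 4th tab-delimited column, that the first line of Data.txt is a header to be kept, and that the sample files (recreated here in a temporary directory) have the tab layout guessed below.

```shell
#!/bin/sh
# Recreate the sample files from the question (tab layout is an assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'some_data\tmore_data\tother_data\there\tyet_more_data\tetc\n' > Data.txt
printf 'A\tB\t2\tGee;Whiz;Hello\t13\t12\n' >> Data.txt
printf 'A\tB\t2\tGee;Whizz;Hi\t56\t32\n' >> Data.txt
printf 'E\t\t4\tBtm;Lol\t16\t2\n' >> Data.txt
printf 'T\t\t3\tWhizz\t13\t3\n' >> Data.txt
printf 'Gee\nWhiz\nLol\n' > List.txt

awk -F '\t' '
    # First file (List.txt): store every entry as a key of the set "wanted".
    NR == FNR { wanted[$0] = 1; next }
    # Second file (Data.txt): always print the header row.
    FNR == 1  { print; next }
    {
        # Split column 4 on ";" and print the row as soon as one item
        # is in the set, i.e. the intersection is non-empty.
        n = split($4, items, ";")
        for (i = 1; i <= n; i++)
            if (items[i] in wanted) { print; next }
    }
' List.txt Data.txt > output.txt

cat output.txt
```

Because each item is compared for exact equality with the list entries, `Whizz` does not match `Whiz`, and only the chosen column is inspected. For the sample data, output.txt contains the header and the first three data rows.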



