Finding the set difference of two files with Unix commands

Finding the differences between two files is easy to do in a database.

In Unix, the same can be done with sort and comm:

sort file1>file1.1
sort file2>file2.2

comm -13 file1.1 file2.2>oo
comm -23 file1.1 file2.2>xx
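
The steps above can be tried end to end with a minimal sketch like the following. The sample lines and the temporary directory are assumptions made for illustration; the names file1.1, file2.2, oo, and xx follow the commands above. Setting LC_ALL=C is a precaution so that sort and comm agree on the collation order.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical unsorted sample input.
printf '%s\n' cherry apple banana > file1
printf '%s\n' banana date apple > file2

# comm needs sorted input; sort and comm should use the same locale.
LC_ALL=C sort file1 > file1.1
LC_ALL=C sort file2 > file2.2

# oo receives the lines found only in file2,
# xx receives the lines found only in file1.
LC_ALL=C comm -13 file1.1 file2.2 > oo
LC_ALL=C comm -23 file1.1 file2.2 > xx
```

With this sample data, oo holds the line that exists only in file2 and xx holds the line that exists only in file1.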

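The comm command (Unix/Linux/Cygwin)
To compare two sorted files, use the comm command.
comm -12 file1 file2 shows only the lines that appear in both files;
comm -23 file1 file2 shows only the lines that appear in file1 but not in file2;
comm -123 file1 file2 shows nothing at all.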
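Full explanation
The comm command
Syntax: comm [-123] file1 file2
Description: comm compares two files that have already been sorted; file1 and file2 are the sorted files. It reads both files and produces three columns of output: lines that appear only in file1, lines that appear only in file2, and lines that appear in both files. A file name of "-" means read from standard input.
The options 1, 2, and 3 suppress the corresponding output column. For example:
comm -12 shows only the lines that appear in both files;
comm -23 shows only the lines that appear in file1 but not in file2;
comm -123 shows nothing at all.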

 

Note that both files must already be sorted before comm is used.
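
One way to act on this requirement is to check the input with sort -c and to sort on the fly, feeding the result to comm through standard input ("-"). The following is only a sketch; the sample data, the temporary directory, and the variable names file2_sorted and common are assumptions for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: file1 is already sorted, file2 is not.
printf '%s\n' a b c > file1
printf '%s\n' c a d > file2

# sort -c exits with a non-zero status when its input is not sorted.
if LC_ALL=C sort -c file2 2>/dev/null; then
    file2_sorted=yes
else
    file2_sorted=no
fi
echo "file2 sorted: $file2_sorted"

# Sort file2 on the fly and hand it to comm through standard input ("-").
common=$(LC_ALL=C sort file2 | LC_ALL=C comm -12 file1 -)
echo "$common"
```

Piping through sort avoids keeping an intermediate sorted copy of file2 when only one comparison is needed.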

The following content is reposted from another source.
#####################################
In our work, we often encounter the following questions:

I have two files: file1 and file2:
1) How can I print out the lines that are only contained in file1?
2) How can I print out the lines that are only contained in file2?
3) How can I print out the lines that are contained both in file1 and file2?

There is a powerful shell command that easily meets these needs: comm. When you face the questions above, comm should be your first choice. :-)

comm [ -123 ]  file1  file2

comm reads file1 and file2 and generates three columns of output: lines only in file1, lines only in file2, and lines in both files. Both files must already be sorted. For a detailed explanation, see man comm.

Example:

bash-2.03$ cat file1
11111111
22222222
33333333
44444444
55555555
66666666
77777777
88888888
99999999
bash-2.03$ cat file2
00000000
22222222
44444444
66666666
88888888

1)  Print the lines that are contained only in file1:
bash-2.03$ comm -23 file1 file2
11111111
33333333
55555555
77777777
99999999

2)  Print the lines that are contained only in file2:
bash-2.03$ comm -13 file1 file2
00000000

3)  Print the lines that are contained in both file1 and file2:
bash-2.03$ comm -12 file1 file2
22222222
44444444
66666666
88888888
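
To see how the options relate to the default three-column output, the example can be reproduced with a short script. This is a sketch that recreates file1 and file2 in a temporary directory; the directory and the variable names only1, only2, and both are assumptions for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two sample files from the example (already sorted).
printf '%s\n' 11111111 22222222 33333333 44444444 55555555 \
              66666666 77777777 88888888 99999999 > file1
printf '%s\n' 00000000 22222222 44444444 66666666 88888888 > file2

# With no options, comm prints all three columns, indented by tabs:
# column 1 has no leading tab, column 2 has one, column 3 has two.
LC_ALL=C comm file1 file2

# Each option suppresses one column; count the lines left in the other.
# The $(( )) wrapper strips any padding that wc -l may add.
only1=$(( $(LC_ALL=C comm -23 file1 file2 | wc -l) ))
only2=$(( $(LC_ALL=C comm -13 file1 file2 | wc -l) ))
both=$(( $(LC_ALL=C comm -12 file1 file2 | wc -l) ))
echo "only in file1: $only1, only in file2: $only2, in both: $both"
```

The three counts add up to the number of distinct lines across the two files, which is a quick sanity check on the comparison.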

Besides comm, there are other ways to accomplish the tasks above.

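1)  Print the lines that are contained only in file1:
diff file1 file2 | grep '^<' | sed 's/^< //'

# Test each line of file1 as a whole-line, fixed-string match against file2.
while IFS= read -r line; do grep -qxF -- "$line" file2 || printf '%s\n' "$line"; done < file1 > temp
cat temp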
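
Another alternative, which does not need sorted input, is to let grep read its patterns from the second file. This is not part of the original post; it is a sketch using the standard grep options -v, -x, -F, and -f, with hypothetical sample data and variable names.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical unsorted sample data.
printf '%s\n' 33333333 11111111 22222222 > file1
printf '%s\n' 22222222 00000000 > file2

# -F: treat patterns as fixed strings, -x: match whole lines only,
# -f file2: read the patterns from file2, -v: print non-matching lines.
only_in_file1=$(grep -vxFf file2 file1)

# Without -v, the same options print the lines present in both files.
in_both=$(grep -xFf file2 file1)

echo "$only_in_file1"
```

Unlike comm, this keeps the lines in the order in which they appear in file1.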


In comparison, comm is much easier to remember. :-)
