awk - group adjacent rows by identical columns

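This post describes a way to process tabular data with awk: rows that share the same first three columns are merged, and the values from later columns are joined with commas. It includes a concrete example and a detailed explanation of the awk command.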


Liang always brings me interesting quiz questions. Here is one:

If I have a table like the one below:

chr1	113438	114495	1	chr1	114142	114143
chr1	113438	114495	2	chr1	114171	114172
chr1	170977	174817	1	chr1	171511	171512
chr1	170977	174817	2	chr1	171514	171515
chr1	170977	174817	2	chr1	173545	173546

and I would like to collapse adjacent rows whose first 3 columns are identical, producing the following output:

chr1	113438	114495	114142,114143,114171,114172
chr1	170977	174817	171511,171512,171514,171515,173545,173546

Is there any easy awk approach to do it?

Since I am so rusty at awk, I had to google around to find the solution:

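awk -F '\t' '
# Same key as the previous row: append columns 6 and 7 to the current line.
$1 FS $2 FS $3 == x {
    printf ",%s,%s", $6, $7
    next
}
# New key: end the previous line (if there is one) and start a new line.
{
    x = $1 FS $2 FS $3
    printf "%s%s\t%s,%s", (NR > 1 ? "\n" : ""), x, $6, $7
}
# Terminate the last line, but print nothing for empty input.
END {
    if (NR > 0) printf "\n"
}' test.txt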

This assumes the input file is test.txt. Note that the input and output are both tab-separated.

Explanation:

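x=$1FS$2FS$3: the variable x stores the values of columns 1, 2, and 3, joined by the field separator FS (a tab here). It acts as the key of the current group.

When a row's key differs from x, the second block starts a new output line: it prints columns 1, 2, and 3, then a tab, then columns 6 and 7.

For each following row, if columns 1, 2, and 3 equal x, the first block appends that row's columns 6 and 7 to the current line, and next skips the second block.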
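The command above only merges rows that are adjacent, because it compares each row with the key of the previous row. If rows with the same key may be scattered through the file, one possible variation (not from the original post) is to collect the values in an awk array and print the groups at the end, in the order their keys first appeared. This is a minimal sketch; group_by_key is a hypothetical helper name, and it assumes the same 7-column tab-separated layout as the example.

```shell
# group_by_key is a hypothetical helper name; it reads tab-separated rows on stdin.
group_by_key() {
    awk -F '\t' '
    {
        key = $1 FS $2 FS $3
        if (key in vals) {
            # Key seen before (adjacent or not): append columns 6 and 7.
            vals[key] = vals[key] "," $6 "," $7
        } else {
            # First time this key appears: remember its position and start its list.
            order[++n] = key
            vals[key] = $6 "," $7
        }
    }
    END {
        # Emit groups in the order their keys first appeared.
        for (i = 1; i <= n; i++) print order[i] "\t" vals[order[i]]
    }'
}

# Example: the two rows for chr1 113438 114495 are separated by an unrelated row.
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    chr1 113438 114495 1 chr1 114142 114143 \
    chr1 170977 174817 1 chr1 171511 171512 \
    chr1 113438 114495 2 chr1 114171 114172 |
    group_by_key
```

The trade-off is memory: this version keeps every group in memory until the end of the input, while the adjacent-row version streams and only remembers the previous key.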

 

Group and then count:

https://stackoverflow.com/questions/14916826/awk-unix-group-by

Given this text file:

name, age
joe,42
jim,20
bob,15
mike,24
mike,15
mike,54
bob,21

The goal is to count the rows for each name:

joe 1
jim 1
bob 2
mike 3

 

awk -F, 'NR>1{arr[$1]++}END{for (a in arr) print a, arr[a]}' file.txt
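One caveat: awk does not define the order in which for (a in arr) visits the array, so the one-liner may print the names in a different order than the expected output shown above. If the order of first appearance matters, a possible sketch is to record each name in a second, numerically indexed array. The function name count_names is hypothetical and only used for illustration.

```shell
# count_names is a hypothetical helper name; it reads comma-separated rows on stdin
# and skips the header line.
count_names() {
    awk -F, '
    NR > 1 {
        # Record each name the first time it is seen so the output order is stable.
        if (!($1 in count)) order[++n] = $1
        count[$1]++
    }
    END {
        # Print names in first-seen order with their counts.
        for (i = 1; i <= n; i++) print order[i], count[order[i]]
    }'
}

# Example with the sample data from the question.
printf '%s\n' 'name, age' joe,42 jim,20 bob,15 mike,24 mike,15 mike,54 bob,21 | count_names
```

If any consistent order is acceptable, piping the output of the original one-liner through sort is a simpler alternative.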

 

 

References:

http://azaleasays.com/2014/10/06/awk-group-adjacent-rows-by-identical-columns/

 

Group rows in text file and aggregate corresponding rows to column

keeping last record among group of records with common fields (awk)

 

Reposted from: https://www.cnblogs.com/emanlee/p/7990097.html
