Linux: counting IP occurrences with awk and sort

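This article analyzes a website access log, counts the number of requests from each IP address, and sorts the results, showing how awk and sort can be combined to process large amounts of log data and better understand user behavior.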


visit.log
180.153.114.199 - - [03/Jul/2013:14:44:43 +0800] GET /wp-login.php?redirect_to=http%3A%2F%2Fdemo.catjia.com%2Fwp-admin%2Fplugin-install.php%3Ftab%3Dsearch%26s%3DVasiliki%26plugin-search-input%3D%25E6%2590%259C%25E7%25B4%25A2%25E6%258F%2592%25E4%25BB%25B6&reauth=1 HTTP/1.1 200 2355 - Mozilla/4.0 -
101.226.33.200 - - [03/Jul/2013:14:45:52 +0800] GET /wp-admin/plugin-install.php?tab=search&type=term&s=Photogram&plugin-search-input=%E6%90%9C%E7%B4%A2%E6%8F%92%E4%BB%B6 HTTP/1.1 302 0 - Mozilla/4.0 -
101.226.33.200 - - [03/Jul/2013:14:45:52 +0800] GET /wp-login.php?redirect_to=http%3A%2F%2Fdemo.catjia.com%2Fwp-admin%2Fplugin-install.php%3Ftab%3Dsearch%26type%3Dterm%26s%3DPhotogram%26plugin-search-input%3D%25E6%2590%259C%25E7%25B4%25A2%25E6%258F%2592%25E4%25BB%25B6&reauth=1 HTTP/1.1 200 2370 - Mozilla/4.0 -
101.226.33.200 - - [03/Jul/2013:14:45:52 +0800] GET /wp-login.php?redirect_to=http%3A%2F%2Fdemo.catjia.com%2Fwp-admin%2Fplugin-install.php%3Ftab%3Dsearch%26type%3Dterm%26s%3DPhotogram%26plugin-search-input%3D%25E6%2590%259C%25E7%25B4%25A2%25E6%258F%2592%25E4%25BB%25B6&reauth=1 HTTP/1.1 200 2370 - Mozilla/4.0 -
101.226.33.200 - - [03/Jul/2013:14:45:52 +0800] GET /wp-login.php?redirect_to=http%3A%2F%2Fdemo.catjia.com%2Fwp-admin%2Fplugin-install.php%3Ftab%3Dsearch%26type%3Dterm%26s%3DPhotogram%26plugin-search-input%3D%25E6%2590%259C%25E7%25B4%25A2%25E6%258F%2592%25E4%25BB%25B6&reauth=1 HTTP/1.1 200 2370 - Mozilla/4.0 -
101.226.33.200 - - [03/Jul/2013:14:45:52 +0800] GET /wp-login.php?redirect_to=http%3A%2F%2Fdemo.catjia.com%2Fwp-admin%2Fplugin-install.php%3Ftab%3Dsearch%26type%3Dterm%26s%3DPhotogram%26plugin-search-input%3D%25E6%2590%259C%25E7%25B4%25A2%25E6%258F%2592%25E4%25BB%25B6&reauth=1 HTTP/1.1 200 2370 - Mozilla/4.0 -
101.226.33.200 - - [03/Jul/2013:14:45:52 +0800] GET /wp-login.php?redirect_to=http%3A%2F%2Fdemo.catjia.com%2Fwp-admin%2Fplugin-install.php%3Ftab%3Dsearch%26type%3Dterm%26s%3DPhotogram%26plugin-search-input%3D%25E6%2590%259C%25E7%25B4%25A2%25E6%258F%2592%25E4%25BB%25B6&reauth=1 HTTP/1.1 200 2370 - Mozilla/4.0 -
101.226.33.200 - - [03/Jul/2013:14:45:52 +0800] GET /wp-login.php?redirect_to=http%3A%2F%2Fdemo.catjia.com%2Fwp-admin%2Fplugin-install.php%3Ftab%3Dsearch%26type%3Dterm%26s%3DPhotogram%26plugin-search-input%3D%25E6%2590%259C%25E7%25B4%25A2%25E6%258F%2592%25E4%25BB%25B6&reauth=1 HTTP/1.1 200 2370 - Mozilla/4.0 -
101.226.33.200 - - [03/Jul/2013:14:45:52 +0800] GET /wp-login.php?redirect_to=http%3A%2F%2Fdemo.catjia.com%2Fwp-admin%2Fplugin-install.php%3Ftab%3Dsearch%26type%3Dterm%26s%3DPhotogram%26plugin-search-input%3D%25E6%2590%259C%25E7%25B4%25A2%25E6%258F%2592%25E4%25BB%25B6&reauth=1 HTTP/1.1 200 2370 - Mozilla/4.0 -
101.226.33.200 - - [03/Jul/2013:14:45:52 +0800] GET /wp-login.php?redirect_to=http%3A%2F%2Fdemo.catjia.com%2Fwp-admin%2Fplugin-install.php%3Ftab%3Dsearch%26type%3Dterm%26s%3DPhotogram%26plugin-search-input%3D%25E6%2590%259C%25E7%25B4%25A2%25E6%258F%2592%25E4%25BB%25B6&reauth=1 HTTP/1.1 200 2370 - Mozilla/4.0 -
101.226.33.200 - - [03/Jul/2013:14:45:52 +0800] GET /wp-login.php?redirect_to=http%3A%2F%2Fdemo.catjia.com%2Fwp-admin%2Fplugin-install.php%3Ftab%3Dsearch%26type%3Dterm%26s%3DPhotogram%26plugin-search-input%3D%25E6%2590%259C%25E7%25B4%25A2%25E6%258F%2592%25E4%25BB%25B6&reauth=1 HTTP/1.1 200 2370 - Mozilla/4.0 -
113.110.176.131 - - [03/Jul/2013:15:03:57 +0800] GET /wp-content/themes/catjia-lio/images/menu_hover_bg.png HTTP/1.1 304 0 http://demo.catjia.com/wp-content/themes/catjia-lio/style.css Mozilla/5.0 (Windows NT 6.2; WOW64; rv:21.0) Gecko/20100101 Firefox/21.0 -
180.153.205.103 - - [03/Jul/2013:15:13:59 +0800] GET /wp-admin/options-general.php HTTP/1.1 302 0 - Mozilla/4.0 -
180.153.205.103 - - [03/Jul/2013:15:13:59 +0800] GET /wp-login.php?redirect_to=http%3A%2F%2Fdemo.catjia.com%2Fwp-admin%2Foptions-general.php&reauth=1 HTTP/1.1 200 2269 - Mozilla/4.0 -
101.226.51.227 - - [03/Jul/2013:15:14:07 +0800] GET /wp-admin/options-general.php?settings-updated=true HTTP/1.1 302 0 - Mozilla/4.0 -
101.226.51.227 - - [03/Jul/2013:15:14:07 +0800] GET /wp-login.php?redirect_to=http%3A%2F%2Fdemo.catjia.com%2Fwp-admin%2Foptions-general.php%3Fsettings-updated%3Dtrue&reauth=1 HTTP/1.1 200 2291 - Mozilla/4.0 -


Count the requests per IP
awk '{a[$1]+=1;} END {for(i in a){print a[i]" "i;}}' log/visit.log
2 180.153.205.103
10 101.226.33.200
1 180.153.114.199
1 113.110.176.131
2 101.226.51.227
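The counting step can be tried on a tiny input. The following is a minimal sketch, assuming a hypothetical three-line sample file in place of log/visit.log. It also pipes the result through sort on the IP column, because awk does not guarantee any particular order for `for (i in a)`, which is why the output above is not grouped or ordered.

```shell
# Hypothetical sample file standing in for log/visit.log (only field 1 matters here).
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - GET /a
10.0.0.2 - - GET /b
10.0.0.1 - - GET /c
EOF

# a[$1]++ creates the array entry on first sight of an IP and increments it afterwards.
# The for-in iteration order is unspecified, so sort on the IP column for stable output.
counts=$(awk '{a[$1]++} END {for (ip in a) print a[ip], ip}' "$log" | sort -k2,2)
echo "$counts"

rm -f "$log"
```

Here `a[$1]++` is equivalent to the `a[$1]+=1` used in the article.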


Sort the results
awk '{a[$1]+=1;} END {for(i in a){print a[i]" "i;}}' log/visit.log | sort
1 113.110.176.131
1 180.153.114.199
10 101.226.33.200
2 101.226.51.227
2 180.153.205.103


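sort is ascending by default, yet 10 does not come last. This is because sort, by default, compares lines as strings, character by character, so "10" sorts before "2".
Additional options are needed: -t sets the field separator, -k selects the key field, -n compares by the numeric value of the leading number, and -g compares by general numeric value (it also understands floating-point and exponent notation such as 1e3). Note that -k 1 means "from field 1 to the end of the line"; -k 1,1 restricts the key to the first field only. With -n the result is the same here, because numeric parsing stops at the first non-numeric character.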
awk '{a[$1]+=1;} END {for(i in a){print a[i]" "i;}}' log/visit.log | sort -t " " -k 1 -n
1 113.110.176.131
1 180.153.114.199
2 101.226.51.227
2 180.153.205.103
10 101.226.33.200
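The difference between the two comparisons can be reproduced without the log file. This is a small sketch, assuming three hand-written lines in the same "count ip" shape as the awk output; `LC_ALL=C` is set only to make the string comparison independent of the local locale.

```shell
# Three count lines in the same "count ip" shape as the awk output.
data=$(printf '%s\n' '10 101.226.33.200' '2 101.226.51.227' '1 180.153.114.199')

# Default: lines are compared as strings, so "10 ..." sorts before "2 ...".
lexical=$(printf '%s\n' "$data" | LC_ALL=C sort)

# -k1,1n: use only field 1 as the key and compare it numerically.
numeric=$(printf '%s\n' "$data" | LC_ALL=C sort -k1,1n)

echo "string order:";  echo "$lexical"
echo "numeric order:"; echo "$numeric"
```

Writing the key as `-k1,1n` attaches the numeric flag to that key alone, which becomes useful once a second key is added.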


Switch to descending order with -r
awk '{a[$1]+=1;} END {for(i in a){print a[i]" "i;}}' log/visit.log | sort -t " " -k 1 -n -r
10 101.226.33.200
2 180.153.205.103
2 101.226.51.227
1 180.153.114.199
1 113.110.176.131
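When several IPs have the same count, a second key makes the order of the ties explicit, and head can trim the list to the top entries. The following is a sketch under assumptions: the input lines are copied from the counts above, and the choice of "top 3" and of ascending IP order for ties is purely illustrative.

```shell
# Count lines copied from the awk output shown earlier.
data=$(printf '%s\n' '2 180.153.205.103' '10 101.226.33.200' '1 180.153.114.199' '2 101.226.51.227')

# -k1,1nr: field 1, numeric, descending.
# -k2,2:   ties on the count are broken by the IP string, ascending.
# head -n 3 keeps only the three busiest IPs.
top=$(printf '%s\n' "$data" | LC_ALL=C sort -k1,1nr -k2,2 | head -n 3)
echo "$top"
```

Per-key flags such as `nr` apply only to the key they are attached to, so the second key stays in ascending order even though the first is reversed.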