192. Word Frequency

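This post presents a bash script that counts how often each word occurs in a text file and prints the results in descending order of frequency. The script is intended for simple text files that contain only lowercase letters and spaces.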




Counting word frequency in a file
Write a bash script to calculate the frequency of each word in a text file words.txt.

For simplicity's sake, you may assume:

    words.txt contains only lowercase characters and space ' ' characters.
    Each word must consist of lowercase characters only.
    Words are separated by one or more whitespace characters.

For example, assume that words.txt has the following content:

the day is sunny the the
the sunny is is

Your script should output the following, sorted by descending frequency:

the 4
is 3
sunny 2
day 1


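#!/bin/bash
# Count every whitespace-separated word, then sort by count (column 2), descending.
# An unset array element evaluates to 0, so ++count[$i] needs no explicit initialization.
awk '{ for (i = 1; i <= NF; ++i) ++count[$i] }
     END { for (w in count) print w " " count[w] | "sort -r -n -k2" }' words.txt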
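
For comparison, the same counts can also be produced without an awk array, by chaining standard tools. The following is only a sketch of that alternative, not the script above; the function name word_freq_pipeline is hypothetical, and it assumes the input format stated in the problem (lowercase words separated by spaces).

```shell
#!/bin/bash
# word_freq_pipeline FILE  (hypothetical helper name)
# Prints "word count" lines, most frequent word first.
word_freq_pipeline() {
  tr -s ' ' '\n' < "$1" |      # put one word per line; -s squeezes repeated separators
    grep -v '^$' |             # drop an empty line left by leading spaces, if any
    sort |                     # group identical words so uniq can count them
    uniq -c |                  # prefix each distinct word with its count
    sort -k1,1nr -k2,2 |       # count descending, then word ascending for ties
    awk '{ print $2, $1 }'     # reorder the columns to "word count"
}
```

Calling word_freq_pipeline words.txt on the sample file should print the same four lines as the expected output. Unlike the awk version, ties are broken alphabetically, so the order is reproducible.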



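In the END block, awk pipes its output to sort: -r orders the result from largest to smallest, -n compares numerically, and -k2 sorts on the second column (the count). To sort by the key (the word) instead, change -k2 to -k1; -n is then unnecessary, since the words are not numbers.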

       -n, --numeric-sort
              compare according to string numerical value

       -r, --reverse
              reverse the result of comparisons

       -k, --key=POS1[,POS2]
              start a key at POS1 (origin 1), end it at POS2 (default end of line).  See POS syntax below
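
The options above can be tried in isolation on a few "word count" lines. The sketch below uses two hypothetical helper functions, sort_by_count and sort_by_word, which are not part of the original script. It writes the key as -k2,2 rather than -k2, so that the key ends at the second field instead of running to the end of the line, and attaches the n and r modifiers to that key only.

```shell
#!/bin/bash
# Hypothetical helpers; both read "word count" lines from standard input.

# Count (field 2) numeric and descending; ties are broken by the word (field 1).
sort_by_count() { sort -k2,2nr -k1,1; }

# Alphabetical by word; no -n, because field 1 is not numeric.
sort_by_word() { sort -k1,1; }
```

For example, printf 'day 1\nis 3\nthe 4\nsunny 2\n' | sort_by_count prints the lines in the order the, is, sunny, day, while piping the same input to sort_by_word prints them in the order day, is, sunny, the.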

