Problem B. Reverse Words

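This post walks through a programming problem from the Africa qualification round and implements reversed string output in C++. The code uses file input/output streams to read the test cases and prints each line's string in reverse order. It is suitable for beginners who want to understand string manipulation and file streams.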

This seems to be a problem from the qualification round over in Africa~

Enough talk, here is the code:
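As a rough illustration of the idea described above, here is a minimal sketch in Python rather than the post's C++. It assumes that "reverse" means reversing the order of the space-separated words on each line (as the title suggests), that the first input line holds the number of test cases, and that each answer is labelled `Case #k:`. The function names `reverse_words` and `solve` are hypothetical, and the sketch takes a list of lines instead of opening a file stream.

```python
def reverse_words(line):
    """Return the words of `line` in reverse order, joined by single spaces."""
    return " ".join(reversed(line.split()))


def solve(lines):
    """Assumed input layout: lines[0] is the case count N, followed by N lines."""
    count = int(lines[0])
    results = []
    for case_number in range(1, count + 1):
        reversed_line = reverse_words(lines[case_number])
        # The "Case #k:" label is an assumption about the expected output format.
        results.append(f"Case #{case_number}: {reversed_line}")
    return results


if __name__ == "__main__":
    for answer in solve(["3", "this is a test", "foobar", "all your base"]):
        print(answer)
```

To work with files as the post describes, the list of lines could come from `open(path).read().splitlines()` and the results could be written back out one per line.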

Unlike English, Chinese words are not separated by spaces, so it is hard for a computer to divide a sentence into words correctly. However, you can manage it with these steps:

1. Analyze a large number of articles, and count all the words and their occurrence frequencies.
2. Find a way to separate the sentence into words so that the product of the probabilities of all the words is largest.

That means a sentence s is divided into w1/w2/w3/.../wn because p(w1)*p(w2)*p(w3)*...*p(wn) is larger than any other p(v1)*p(v2)*...*p(vm). If the sentence is nihaoshijie, the best solution is nihao/shijie, because p(nihao)*p(shijie) is the largest of all solutions.

You are given the words and their frequencies. Do the division.

The first part of the input is the words and their frequencies. Every line holds a word wordi, a string containing only lowercase characters, and an integer f(wordi). To simplify the problem, f(wordi) is not the real frequency: if you think of the frequency as p(wordi), then f(wordi) = (int)(log(p(wordi))). So the sum of the f values should be largest, and the f values should not be 0. The length of a word is less than 50. The number of words is less than 40000. All the integers fit in an int. The first part ends with "#END".

The second part is the sentences to be divided. Each line is a string containing only lowercase characters. The length of a line is less than 5000. There are T sentences to be divided (T<5), one sentence per line.

Print each line with "/" separating the words. The solution is unique. Print a blank line after each sentence.

Sample input

an 1
chu 1
chuan 1
huan 1
hu 1
hua 1
chuang 5
qi 1
an 1
qian 5
mi 1
min 1
ming 5
yu 1
yue 5
e 1
gu 1
gua 1
guan 1
guang 5
#END
chuangqianmingyueguang

Sample output

chuang/qian/ming/yue/guang
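Because f is a logarithm of the probability, maximizing the product of probabilities is the same as maximizing the sum of the f values, which suggests a dynamic program over prefixes of the sentence. The following is a minimal sketch of that idea, not a reference solution; the names `parse_dictionary` and `segment` are hypothetical, and the sketch returns `None` when no split exists, a case the statement does not describe.

```python
def parse_dictionary(lines):
    """Read 'word score' lines until '#END' and return a {word: score} dict."""
    scores = {}
    for line in lines:
        line = line.strip()
        if line == "#END":
            break
        word, value = line.split()
        scores[word] = int(value)
    return scores


def segment(sentence, scores):
    """Split `sentence` into dictionary words so that the summed score is maximal."""
    n = len(sentence)
    max_len = max(map(len, scores), default=0)
    best = [None] * (n + 1)  # best[i]: top score for sentence[:i], None if unreachable
    prev = [0] * (n + 1)     # prev[i]: start index of the last word in that split
    best[0] = 0
    for end in range(1, n + 1):
        # Only look back as far as the longest dictionary word.
        for start in range(max(0, end - max_len), end):
            if best[start] is None:
                continue
            word = sentence[start:end]
            if word not in scores:
                continue
            candidate = best[start] + scores[word]
            if best[end] is None or candidate > best[end]:
                best[end] = candidate
                prev[end] = start
    if best[n] is None:
        return None  # assumption: signal "no valid split" with None
    # Walk the prev links backwards to recover the chosen words.
    words = []
    end = n
    while end > 0:
        words.append(sentence[prev[end]:end])
        end = prev[end]
    return "/".join(reversed(words))


SAMPLE = [
    "an 1", "chu 1", "chuan 1", "huan 1", "hu 1", "hua 1", "chuang 5",
    "qi 1", "an 1", "qian 5", "mi 1", "min 1", "ming 5", "yu 1", "yue 5",
    "e 1", "gu 1", "gua 1", "guan 1", "guang 5", "#END",
]

if __name__ == "__main__":
    sample_scores = parse_dictionary(SAMPLE)
    print(segment("chuangqianmingyueguang", sample_scores))
    # prints: chuang/qian/ming/yue/guang
```

With words shorter than 50 characters and lines shorter than 5000, the inner loop does at most about 50 dictionary lookups per position, so the look-back bound keeps the work small.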
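# 1. Define the stop-word set (common words that carry little meaning)
stop_words = {'the', 'and', 'is', 'to', 'of', 'in', 'a', 'that'}

# 2. Input text
text = "Python is a powerful language,Python is widely used in data science."

# 3. Preprocess the text (lowercase it and split it into words)
# Lowercase: avoids case differences ("Python" and "python" count as the same word)
lower_text = text.lower()
# Split the text on whitespace into a list of words
words = lower_text.split()

# 4. Clean the words (strip punctuation)
punctuation_str = ',.'  # can be extended with other punctuation
cleaned_words = []
for word in words:
    clean_word = word.strip(punctuation_str)
    if clean_word:  # only keep non-empty strings
        cleaned_words.append(clean_word)

# 5. Filter out the stop words
filtered_words = []
for word in cleaned_words:
    if word not in stop_words:
        filtered_words.append(word)

# 6. Count word frequencies (how many times each word occurs)
word_count = {}
for word in filtered_words:
    # If the word is not in the dict yet, add it with a count of 0
    if word not in word_count:
        word_count[word] = 0
    else:
        word_count[word] += 1

# 7. Sort by frequency (highest first)
# Convert the dict into a list of tuples: [('python', 2), ('powerful', 1), ...]
word_items = list(word_count.items())
# Sort in descending order of frequency
# About the key argument: x[1] sorts by the second element of each tuple (the count)
sorted_items = sorted(word_items, key=lambda x: x[1], reverse=True)

# 8. Take the 3 most frequent words
top3 = sorted_items[:3]  # first 3 elements

# 9. Print the result
print("【词频统计结果】")
for i, (word, count) in enumerate(top3, 1):
    print(f"{i}. {word}: {count}次")

Why does the program above fail to do what it is supposed to do? Please fix it, starting from the original program and changing it as little as possible.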
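One possible answer, offered as a sketch: two things appear to go wrong in the program above. The sample text joins "language" and "Python" with a full-width comma (,) and no space, so `split()` leaves "language,python" as a single token that `strip(',.')` cannot clean. In addition, a new word starts at a count of 0 instead of 1, so every count ends up one too low. The version below keeps the original steps but wraps them in a hypothetical helper `top_words` (not part of the original) and adds the full-width comma to the punctuation string, which is an assumption about which characters should count as separators.

```python
STOP_WORDS = {'the', 'and', 'is', 'to', 'of', 'in', 'a', 'that'}
PUNCTUATION = ',.,'  # assumption: the full-width comma is treated as punctuation too


def top_words(text, n=3):
    """Return the n most frequent non-stop words as (word, count) tuples."""
    lower_text = text.lower()
    # Fix 1: turn punctuation into spaces BEFORE splitting, so words glued
    # together by a comma without a following space are still separated.
    for ch in PUNCTUATION:
        lower_text = lower_text.replace(ch, ' ')
    words = lower_text.split()

    filtered_words = [word for word in words if word not in STOP_WORDS]

    word_count = {}
    for word in filtered_words:
        if word not in word_count:
            word_count[word] = 1  # Fix 2: the first occurrence counts as 1, not 0
        else:
            word_count[word] += 1

    # sorted() is stable, so words with equal counts keep first-seen order.
    sorted_items = sorted(word_count.items(), key=lambda x: x[1], reverse=True)
    return sorted_items[:n]


if __name__ == "__main__":
    text = "Python is a powerful language,Python is widely used in data science."
    for i, (word, count) in enumerate(top_words(text), 1):
        print(f"{i}. {word}: {count}")
    # prints:
    # 1. python: 2
    # 2. powerful: 1
    # 3. language: 1
```

The standard library's `collections.Counter` could replace the manual counting loop, but the loop is kept here to stay close to the original program.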