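Modern Software Engineering: Assignment, Individual Project

This is an MSRA Advanced Software Engineering project. It asks for a C# console application that counts how often each word of at least three letters occurs in all text files (.txt, .cpp, .h, .cs and similar) under a given directory, and outputs the results sorted by frequency and then alphabetically. The program must support two modes, simple and extended, and the work includes performance analysis.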


MSRA Advanced Software Engineering

Project:  Individual Project - Word frequency program

2010/11/1

Focus areas:

    Basic algorithm implementation; basic I/O; text processing; program performance analysis; simple test cases

 

Implement a console application to tally the frequency of words under a directory (based on 2 modes).

 

For all text files under a directory, searched recursively (file extensions: "txt", "cpp", "h", "cs"), calculate the frequency of each word and output the result to a text file. Write the code in C# using the .NET Framework; the running environment is 32-bit Windows 7.
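
The recursive file search described above could be sketched roughly as follows. This is only one possible approach, not the course's reference solution: the class name TextFileFinder and both method names are hypothetical, and silently skipping unreadable folders is an assumption the assignment does not state.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TextFileFinder
{
    // The four extensions named in the assignment, with the leading dot
    // that Path.GetExtension returns. Matching ignores case.
    private static readonly HashSet<string> Extensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".txt", ".cpp", ".h", ".cs" };

    // Hypothetical helper: true when the path ends in one of the four extensions.
    public static bool HasTextExtension(string path)
    {
        return Extensions.Contains(Path.GetExtension(path));
    }

    // Walks the tree with an explicit stack so that one unreadable
    // sub-directory does not abort the whole scan.
    public static IEnumerable<string> FindTextFiles(string root)
    {
        var pending = new Stack<string>();
        pending.Push(root);
        while (pending.Count > 0)
        {
            string dir = pending.Pop();
            string[] files;
            string[] subDirs;
            try
            {
                files = Directory.GetFiles(dir);
                subDirs = Directory.GetDirectories(dir);
            }
            catch (UnauthorizedAccessException)
            {
                continue; // assumption: skip folders we are not allowed to read
            }
            foreach (string file in files)
            {
                if (HasTextExtension(file)) yield return file;
            }
            foreach (string sub in subDirs) pending.Push(sub);
        }
    }
}
```

An empty sub-directory simply contributes no files here, which is the situation the sample test case in this assignment mentions.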

 

Run a performance analysis tool on your code, find the bottlenecks, and improve them.

 

Enable Code Quality Analysis for your code and get rid of all warnings.

 

Write 10 simple test cases to make sure your program handles them correctly (e.g. a good test case could be: one of the sub-directories is empty).

 

Submission:

·         Submit your source code and exe to the TA, who will run it in the TA's testing environment and check for a) correctness and b) performance.

·         Submit your test cases to the TA.

 

Definition:

·         A word: a string with at least 3 letters, separated by delimiters. If a string contains non-alphanumeric characters, it’s not a word. Words are case-insensitive, i.e. “file”, “FILE” and “File” are considered the same word.

·         Delimiter: space, non-alphanumeric characters (,.<>|\)[]{!@#$%^&*()_+=-}”).

·         Output text file: filename is <your email alias>.txt

o   Each line has this format

<word>: number

                where “number” is the number of times this word appears in the scan. The output should be sorted with the most frequent word first. If two words have the same frequency, list them in alphabetical order.
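
The definitions above leave some details open, so the following is only a sketch of one reading of them. It assumes that "letter" and "alphanumeric" mean ASCII characters only, that a word is a maximal alphanumeric run containing at least three letters (digits may appear anywhere in it), that words are reported in lower case, and that "alphabetical order" is plain ordinal order (so digits sort before letters). The class WordCounter and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class WordCounter
{
    // Assumption: only ASCII a-z and A-Z count as letters.
    private static bool IsAsciiLetter(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static bool IsAsciiDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    // Splits text on every non-alphanumeric character and yields the
    // lower-cased runs that contain at least three letters.
    public static IEnumerable<string> Tokenize(string text)
    {
        var run = new StringBuilder();
        int letters = 0;
        // Iterate one position past the end so the final run is flushed too.
        for (int i = 0; i <= text.Length; i++)
        {
            char c = i < text.Length ? text[i] : ' ';
            if (IsAsciiLetter(c) || IsAsciiDigit(c))
            {
                if (IsAsciiLetter(c)) letters++;
                run.Append(char.ToLowerInvariant(c));
                continue;
            }
            if (letters >= 3) yield return run.ToString();
            run.Clear();
            letters = 0;
        }
    }

    // Tallies how many times each word occurs.
    public static Dictionary<string, int> Count(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>(StringComparer.Ordinal);
        foreach (string word in words)
        {
            int n;
            counts.TryGetValue(word, out n); // n stays 0 for a new word
            counts[word] = n + 1;
        }
        return counts;
    }

    // Produces "<word>: number" lines, most frequent first, ties in ordinal order.
    public static List<string> FormatRanking(Dictionary<string, int> counts)
    {
        return counts
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key, StringComparer.Ordinal)
            .Select(p => p.Key + ": " + p.Value)
            .ToList();
    }
}
```

The lines returned by FormatRanking could then be written out with File.WriteAllLines to the <your email alias>.txt file.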

 

Requirements:

1)     Simple mode.   Simple word frequency.

Myapp.exe <directory-name>

This outputs a <your-alias>.txt file in the current directory; the text file contains the word ranking list.

2)     Extended mode. Two words that differ only in their trailing digits are counted as one word. For example, we consider “win”, “win95” and “win7” to be ONE WORD; “Office” and “Office15” are the same. “win” and “win32a” are DIFFERENT words, as they differ by more than just trailing digits. “21century” and “century” are DIFFERENT words too.

 

When run with the “-e” command-line parameter,

Myapp.exe -e <directory-name>

 

the program outputs a <your-alias>.txt file in the current directory; the text file contains the word ranking list, but the frequency is calculated based on the extended mode definition.
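
The extended-mode rule and the two command lines could be sketched as below. This is an illustration under assumptions, not the expected solution: the class ExtendedMode and its methods are hypothetical names, trailing digits are taken to be ASCII 0-9, the merged word is reported without its trailing digits, and any argument list other than the two documented forms is rejected.

```csharp
using System;

public static class ExtendedMode
{
    // Drops trailing ASCII digits so that words differing only in their
    // ending numbers share one key: "win95" -> "win", "win32a" -> "win32a".
    public static string StripTrailingDigits(string word)
    {
        int end = word.Length;
        while (end > 0 && word[end - 1] >= '0' && word[end - 1] <= '9')
        {
            end--;
        }
        return word.Substring(0, end);
    }

    // Accepts "<directory-name>" or "-e <directory-name>".
    // Returns false for anything else so the caller can print usage.
    public static bool TryParseArgs(string[] args, out bool extended, out string directory)
    {
        extended = false;
        directory = null;
        if (args.Length == 1 && args[0] != "-e")
        {
            directory = args[0];
            return true;
        }
        if (args.Length == 2 && args[0] == "-e")
        {
            extended = true;
            directory = args[1];
            return true;
        }
        return false;
    }
}
```

In extended mode each word would be passed through StripTrailingDigits before it is counted; in simple mode the word is counted unchanged. Leading digits are left alone, which keeps “21century” and “century” apart as the examples require.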
