1. Supervised learning and unsupervised learning, with examples of each.
2. What is the procedure for text mining? (feature extraction, stop-word removal, stemming, feature weighting or frequency calculation)
3. How do you deal with a high-dimensional feature space?
4. Kernel transformation? (Not sure whether this was the exact question.)
5. C++ experience
6. There was one more question about text mining that I can no longer remember clearly; it seemed to be about "term" something.
7. A scenario analysis:
You have been provided with a computer that has a DVD reader and a USB port. Some software has been installed on it, including Eclipse, Java, and the JVM. There is also software that converts files of any format into plain text files.
You have a DVD and an external hard drive on which some bank statements are stored in formats such as email, PDF, etc.
You are asked to create an Excel file that lists the names of all bank statements on the DVD and the external hard drive.
What steps and methodologies would you use to create this Excel file?
8. All sorts of behavioral questions.
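For the scenario in question 7, one possible first step is simply to inventory the files on both media. The sketch below is only an illustration under several assumptions: it is written in Python rather than the Java toolchain the scenario mentions, the extension list and the function names `collect_statement_names` and `write_inventory` are hypothetical, and it writes a CSV file (which Excel opens directly) instead of a native .xlsx file. Telling real bank statements apart from other files would still need the text-conversion and text-mining steps from the earlier questions.

```python
import csv
from pathlib import Path

# Assumed extensions; the scenario only says "email, PDF, etc."
STATEMENT_EXTENSIONS = {".pdf", ".eml", ".msg"}


def collect_statement_names(roots):
    """Return sorted (file name, full path) pairs for matching files under each root."""
    found = []
    for root in roots:
        for path in Path(root).rglob("*"):
            if path.is_file() and path.suffix.lower() in STATEMENT_EXTENSIONS:
                found.append((path.name, str(path)))
    return sorted(found)


def write_inventory(rows, out_path):
    """Write the inventory as CSV with a header row."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["statement_name", "location"])
        writer.writerows(rows)
```

A call might look like `write_inventory(collect_statement_names(["/media/dvd", "/media/usb_drive"]), "statements.csv")`, where both mount points are placeholders for wherever the DVD and the external drive actually appear.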
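This post covers the concepts and examples of supervised and unsupervised learning, and outlines the basic text mining workflow, including feature extraction, stop-word handling, stemming, and feature weighting. It also touches on how to handle high-dimensional feature spaces.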
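The text mining steps named in question 2 (feature extraction, stop-word removal, stemming, feature weighting) can be sketched with the standard library alone. This is a minimal illustration, not a reference answer: the stop-word list is a tiny made-up sample, `crude_stem` is a hypothetical suffix stripper standing in for a real stemmer such as Porter's, and the weighting is one common TF-IDF variant, tf * log(N / df).

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "are", "on"}


def crude_stem(token):
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def extract_terms(text):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    term_lists = [extract_terms(doc) for doc in documents]
    n_docs = len(term_lists)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for terms in term_lists for term in set(terms))
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for empty documents
        weights.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return weights
```

With this weighting, a term that occurs in every document gets weight 0, so only terms that distinguish documents carry weight. Each distinct term is still one dimension, which is one reason the high-dimensional feature space in question 3 comes up in text mining.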



