Motivating Challenges in Data Mining

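This post discusses five major challenges in data mining: scalability, high dimensionality, heterogeneous and complex data, data ownership and distribution, and non-traditional analysis. Algorithms must scale to massive data sets; computational complexity rises sharply as dimensionality grows; and handling data of different types, data whose ownership is geographically distributed, and the automated generation and evaluation of large numbers of hypotheses have all become important concerns for today's data analysts.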

1. Scalability

If data mining algorithms are to handle massive data sets, they must be scalable.

2. High Dimensionality

For some data analysis algorithms, the computational complexity increases rapidly as the dimensionality increases.
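One way to see this growth is a minimal sketch that assumes a hypothetical grid-based method splitting every dimension into a fixed number of bins. The source does not name a specific algorithm, so the function `grid_cells` and the choice of 10 bins are illustrative assumptions only. The number of cells such a method would have to consider grows exponentially with the number of dimensions:

```python
def grid_cells(bins_per_dim, num_dims):
    """Number of cells in a grid with `bins_per_dim` bins along each of
    `num_dims` dimensions (hypothetical grid-based method)."""
    return bins_per_dim ** num_dims


if __name__ == "__main__":
    # With 10 bins per dimension, each added dimension multiplies
    # the number of cells by 10.
    for d in (1, 2, 3, 10, 20):
        print(d, grid_cells(10, d))
```

Not every algorithm scales this badly; the sketch only illustrates the kind of growth the sentence above refers to.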

3. Heterogeneous and Complex Data

Data mining must deal with data whose attributes are not all of the same type.

4. Data Ownership and Distribution

Data is geographically distributed among resources belonging to multiple entities.

5. Non-traditional Analysis

The traditional statistical approach is based on a hypothesize-and-test paradigm.

Current data analysis tasks often require the generation and evaluation of thousands of hypotheses, and consequently, the development of some data mining techniques has been motivated by the desire to automate the process of hypothesis generation and evaluation.
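As a rough illustration of automated hypothesis generation and evaluation, the sketch below enumerates one hypothesis per pair of attributes ("these two attributes are strongly linearly related") and scores each with the Pearson correlation coefficient. The helper names, the 0.9 threshold, and the choice of correlation as the test are all assumptions made for this example and do not come from the source.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear relationship
    return cov / sqrt(var_x * var_y)


def count_pair_hypotheses(num_attributes):
    """Number of pairwise hypotheses for the given number of attributes."""
    return num_attributes * (num_attributes - 1) // 2


def generate_and_evaluate(columns, threshold=0.9):
    """Generate one hypothesis per attribute pair and keep the pairs whose
    absolute correlation reaches the (assumed) threshold."""
    results = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            results.append((a, b, r))
    return results


if __name__ == "__main__":
    columns = {
        "x": [1, 2, 3, 4],
        "y": [2, 4, 6, 8],
        "z": [1, -1, 1, -1],
    }
    print(generate_and_evaluate(columns))
    print(count_pair_hypotheses(100))
```

Even this simple pairwise scheme produces 4,950 hypotheses for 100 attributes, which is why checking them by hand is impractical. A real analysis would also need to correct for testing many hypotheses at once, which this sketch omits.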

Reposted from: https://www.cnblogs.com/johnpher/archive/2013/01/18/2866971.html
