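Everyone runs into this problem. You have collected a large amount of data and want to look at its distribution, but the plot comes out ragged because of noise, and the underlying pattern gets buried. A simple remedy is to average: a mean is statistically more meaningful than the individual points, and with enough samples it can be treated as an unbiased estimate. So I took some time to write bucketing functions. I wrote two: one with equal-size buckets, and one where the bucket size is set by a base and an exponent.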
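As a minimal illustration of why bucket means help, the sketch below uses made-up data: a linear trend plus alternating +1/-1 "noise". The variable names and the data are assumptions chosen so that the noise cancels exactly inside each bucket; real noise only shrinks on average and does not vanish.

```matlab
% Hypothetical data: a linear trend plus alternating +1/-1 "noise".
n = 12;
trend = (1:n)';
noise = repmat([1; -1], n/2, 1);
y = trend + noise;

bucketsize = 4;                          % assumed to divide n exactly
buckets = reshape(y, bucketsize, []);    % one bucket per column
bucketMeans = mean(buckets, 1)';         % column vector of bucket means
trendMeans = mean(reshape(trend, bucketsize, []), 1)';

% Every raw point is off the trend by 1; the bucket means sit on it.
disp(max(abs(y - trend)));
disp(max(abs(bucketMeans - trendMeans)));
```

The `reshape` shortcut only works when the list length is a multiple of the bucket size.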
The equal-size method is as follows:
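function aimlist = AvgBucket(arg_list, arg_bucketsize)
% Divide arg_list into ceil(length(arg_list)/arg_bucketsize) buckets
% and return the mean of each bucket.
% arg_list        is the input list.
% arg_bucketsize  is the number of elements per bucket.
% aimlist         is the returned column vector of bucket means.
% The last bucket may hold fewer than arg_bucketsize elements; its mean
% is taken over the elements it actually contains.
list = arg_list;
bucketsize = arg_bucketsize;
% ceil keeps the size an integer when the length is not a multiple of bucketsize.
aimlist = zeros(ceil(length(list) / bucketsize), 1);
sumCurrentBucket = 0;    % running sum of the current bucket
countCurrentBucket = 0;  % number of elements in the current bucket
aimorder = 1;            % index of the next bucket to write
for i = 1:length(list)
    sumCurrentBucket = sumCurrentBucket + list(i);
    countCurrentBucket = countCurrentBucket + 1;
    if countCurrentBucket == bucketsize
        aimlist(aimorder) = sumCurrentBucket / bucketsize;
        sumCurrentBucket = 0;
        countCurrentBucket = 0;
        aimorder = aimorder + 1;
    end
end
% Flush the trailing partial bucket, if any. Testing the count rather than
% the sum also handles a partial bucket whose elements sum to zero.
if countCurrentBucket > 0
    aimlist(aimorder) = sumCurrentBucket / countCurrentBucket;
end
%end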
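For comparison, equal-size bucket averaging can also be sketched without an explicit loop by using `accumarray`. This is an alternative formulation, not the function above; the input data is hypothetical, and the trailing partial bucket is averaged over the elements it actually holds.

```matlab
list = [1 2 3 4 5 6 7 8 9 10]';               % hypothetical input (column vector)
bucketsize = 4;

% Bucket index of each element: 1 1 1 1 2 2 2 2 3 3
idx = ceil((1:length(list))' / bucketsize);

% Mean of the elements that share a bucket index.
bucketMeans = accumarray(idx, list, [], @mean);
disp(bucketMeans');
```

Here the third bucket contains only 9 and 10, so its mean is 9.5 rather than 19 divided by the nominal bucket size.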
The exponent-based method is as follows:
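function aimlist = ExpBucket(arg_list, arg_base, arg_exponent)
% Divide arg_list into buckets of growing size and return each bucket's mean.
% arg_list      is the input list.
% arg_exponent  is the exponent applied to the base.
% arg_base      is the starting base; it increases by 1 after each bucket.
% The bucket size is determined by the exponent and the bucket's ordinal:
% for positive integer arguments, bucket k ends at index
% (arg_base + k - 1)^arg_exponent.
% aimlist       is the returned column vector of bucket means.
list = arg_list;
base = arg_base;
exponent = arg_exponent;
aimlist = zeros(length(list), 1);  % upper bound; trimmed before returning
sumCurrentBucket = 0;              % running sum of the current bucket
bucketsize = 0;                    % number of elements in the current bucket
aimorder = 1;                      % index of the next bucket to write
for i = 1:length(list)
    sumCurrentBucket = sumCurrentBucket + list(i);
    bucketsize = bucketsize + 1;
    if mod(i, base ^ exponent) == 0
        aimlist(aimorder) = sumCurrentBucket / bucketsize;
        sumCurrentBucket = 0;
        bucketsize = 0;
        aimorder = aimorder + 1;
        base = base + 1;
    end
end
% Flush the trailing partial bucket, if any.
if bucketsize > 0
    aimlist(aimorder) = sumCurrentBucket / bucketsize;
    aimorder = aimorder + 1;
end
% aimorder now points one past the last bucket written.
aimlist = aimlist(1:aimorder - 1);
%end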
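To make the bucket boundaries concrete, here is a small sketch based on my reading of the loop above: for positive integer `arg_base` and `arg_exponent`, the `mod` test first fires at `base^exponent`, then at `(base+1)^exponent`, and so on. The input and parameter values are hypothetical, and the edge computation is an illustration rather than part of the original function.

```matlab
list = (1:20)';            % hypothetical input
base = 1; exponent = 2;    % hypothetical parameters (positive integers)
n = length(list);

% Last index of each bucket: base^exponent, (base+1)^exponent, ... up to n.
edges = [];
k = base;
while k ^ exponent <= n
    edges(end + 1) = k ^ exponent;
    k = k + 1;
end
if isempty(edges) || edges(end) < n
    edges(end + 1) = n;    % trailing partial bucket
end

% Assign a bucket index to every element, then average per bucket.
idx = zeros(n, 1);
first = 1;
for b = 1:length(edges)
    idx(first:edges(b)) = b;
    first = edges(b) + 1;
end
bucketMeans = accumarray(idx, list, [], @mean);
disp(edges);
disp(bucketMeans');
```

With these values the buckets end at indices 1, 4, 9, 16 and 20, so later buckets average over more points, which smooths the tail of the data more heavily than the head.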