Multithreading: The K-Nearest Neighbors Algorithm (Part 2), Coarse-Grained Concurrent Version

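  The previous article, "Multithreading: The K-Nearest Neighbors Algorithm (Part 2), Fine-Grained Concurrent Version", briefly described a fine-grained version of the K-nearest neighbors algorithm built on an executor. That concurrency scheme has an obvious weakness: it executes too many tasks. The executor is created with at most numThreads worker threads, so an alternative is to start only numThreads tasks and split the training data into numThreads groups, with each task computing the distances between the input example and the training examples in its own group.
  Following this design, we can adapt the earlier KnnClassifierParallelIndividual algorithm. The main change is in the classify method: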

import com.Knnclassifier.Distance;
import com.Knnclassifier.Sample;

import java.util.*;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

public class KnnClassifierParallelGroup {
    private final List<? extends Sample> dataSet;
    private final int k;
    private final ThreadPoolExecutor executor;
    private final int numThreads;
    private final boolean parallelSort;

    public KnnClassifierParallelGroup(List<? extends Sample> dataSet, int k, int factor, boolean parallelSort) {
        this.dataSet = dataSet;
        this.k = k;
        this.numThreads = factor * (Runtime.getRuntime().availableProcessors());
        this.executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(numThreads);
        this.parallelSort = parallelSort;
    }

    public String classify(Sample sample) throws Exception {
        Distance[] distances = new Distance[dataSet.size()];
        // One latch count per task; await() below returns once every group has finished.
        CountDownLatch endController = new CountDownLatch(numThreads);

        // Size of each group; the last group also takes the remainder.
        int length = dataSet.size() / numThreads;
        int startIndex = 0, endIndex = length;

        for (int i = 0; i < numThreads; i++) {
            GroupDistanceTask task = new GroupDistanceTask(distances, startIndex, endIndex,
                    sample, dataSet, endController);
            // Advance the bounds for the NEXT task; the final task runs to the end of the data set.
            startIndex = endIndex;
            if (i < numThreads - 2) {
                endIndex = endIndex + length;
            } else {
                endIndex = dataSet.size();
            }
            executor.execute(task);
        }
        endController.await();
        if(parallelSort) {
            Arrays.parallelSort(distances);
        } else {
            Arrays.sort(distances);
        }

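        // The executor is not shut down here, so classify() can be called again;
        // call destroy() once after the last classification.

        // Majority vote among the k nearest training examples.
        Map<String, Integer> results = new HashMap<>();
        for (int i = 0; i < k; i++) {
            Sample localExample = dataSet.get(distances[i].getIndex());
            String tag = localExample.getTag();
            results.merge(tag, 1, Integer::sum);
        }
        return Collections.max(results.entrySet(),
                Map.Entry.comparingByValue()).getKey();
    }

    // Releases the worker threads; call once when no more samples need classifying.
    public void destroy() {
        executor.shutdown();
    }
}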
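
To make the index arithmetic in classify easier to follow, here is a minimal sketch that reproduces only the loop's start/end bookkeeping and prints the resulting ranges. The class name PartitionSketch and the method ranges are hypothetical helpers introduced for illustration; they are not part of the original classifier.

```java
import java.util.Arrays;

class PartitionSketch {
    // Hypothetical helper: returns {start, end} pairs (end exclusive),
    // stepping the bounds the same way the loop in classify() does.
    public static int[][] ranges(int size, int numThreads) {
        int[][] result = new int[numThreads][2];
        int length = size / numThreads;
        int startIndex = 0, endIndex = length;
        for (int i = 0; i < numThreads; i++) {
            result[i][0] = startIndex;
            result[i][1] = endIndex;
            // From here on the bounds describe the next group.
            startIndex = endIndex;
            if (i < numThreads - 2) {
                endIndex = endIndex + length;
            } else {
                endIndex = size; // the last group absorbs the remainder
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [[0, 2], [2, 4], [4, 6], [6, 10]]
        System.out.println(Arrays.deepToString(ranges(10, 4)));
    }
}
```

With 10 examples and 4 groups the last group receives 4 elements instead of 2, so the split is only roughly balanced when the data set size is not a multiple of numThreads.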

The computation task, GroupDistanceTask, has to be adapted in the same way:

import com.Knnclassifier.Distance;
import com.Knnclassifier.EuclideanDistanceCalculator;
import com.Knnclassifier.Sample;

import java.util.List;
import java.util.concurrent.CountDownLatch;

public class GroupDistanceTask implements Runnable {
    private final Distance[] distances;
    private final int startIndex, endIndex;
    private final Sample example;
    private final List<? extends Sample> dataSet;
    private final CountDownLatch endController;

    public GroupDistanceTask(Distance[] distances, int startIndex,
                             int endIndex, Sample example,
                             List<? extends Sample> dataSet, CountDownLatch endController) {
        this.distances = distances;
        this.startIndex = startIndex;
        this.endIndex = endIndex;
        this.example = example;
        this.dataSet = dataSet;
        this.endController = endController;
    }

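    @Override
    public void run() {
        try {
            // Compute the distance from the input example to every training example in this group.
            for (int index = startIndex; index < endIndex; index++) {
                Sample localExample = dataSet.get(index);
                distances[index] = new Distance();
                distances[index].setIndex(index);
                distances[index].setDistance(EuclideanDistanceCalculator
                        .calculate(localExample, example));
            }
        } finally {
            // Count down even if a calculation throws, so classify() is never left waiting.
            endController.countDown();
        }
    }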
}
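
The coordination pattern used here (a fixed thread pool, one task per group, and a CountDownLatch that the caller waits on) can be tried in isolation. The following sketch is a stand-in under stated assumptions: LatchSketch and squares are hypothetical names, and squaring integers replaces the distance calculation so that the example needs no Sample or Distance classes.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class LatchSketch {
    // Hypothetical helper: fills out[i] = in[i] * in[i] using `groups` tasks,
    // each task writing only to its own slice of the shared array.
    public static long[] squares(int[] in, int groups) {
        long[] out = new long[in.length];
        ExecutorService pool = Executors.newFixedThreadPool(groups);
        CountDownLatch done = new CountDownLatch(groups);
        int length = in.length / groups;
        try {
            for (int g = 0; g < groups; g++) {
                final int start = g * length;
                // The last group also takes the remainder.
                final int end = (g == groups - 1) ? in.length : start + length;
                pool.execute(() -> {
                    try {
                        for (int i = start; i < end; i++) {
                            out[i] = (long) in[i] * in[i];
                        }
                    } finally {
                        done.countDown(); // always signal completion
                    }
                });
            }
            done.await(); // blocks until every group has counted down
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdown();
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [1, 4, 9, 16, 25, 36, 49]
        System.out.println(Arrays.toString(squares(new int[] {1, 2, 3, 4, 5, 6, 7}, 3)));
    }
}
```

Because the slices do not overlap, the tasks need no locking, and the latch guarantees that writes made before countDown() are visible to the thread that returns from await().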

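The launcher class only needs to be switched over to the new classifier, so it is not shown here. The result of running the code is shown below.
[Figure: screenshot of the execution output]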
