Counting Sort Notes

License: CC 4.0 BY-SA

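Basic idea (from Introduction to Algorithms, 3rd edition, p. 108):

For each input element x, determine the number of elements less than x. With this information, x can be placed directly into its position in the output array.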
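The idea above can be sketched directly: count each value, turn the counts into "number of elements less than v" with a prefix sum, then drop every element into its slot. This is a minimal sketch, assuming all values lie in [0, k); the name countingSortStable is hypothetical, and the forward placement pass is a variant of the textbook formulation rather than a copy of it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: stable counting sort for values in [0, k).
// Assumption: every element of input satisfies 0 <= v < k.
std::vector<int> countingSortStable(const std::vector<int>& input, std::size_t k)
{
    // count[v + 1] holds the number of occurrences of v
    std::vector<std::size_t> count(k + 1, 0);
    for (int v : input)
        ++count[static_cast<std::size_t>(v) + 1];

    // after the prefix sum, count[v] is the number of elements less than v,
    // which is the output index of the first element equal to v
    for (std::size_t v = 1; v <= k; ++v)
        count[v] += count[v - 1];

    // place each element at its position; scanning the input in order and
    // bumping the slot keeps equal elements in their original order
    std::vector<int> output(input.size());
    for (int v : input)
        output[count[static_cast<std::size_t>(v)]++] = v;

    return output;
}
```

Unlike the in-place fill used in the listings that follow, this version moves the original elements, so it also works when each key carries extra data that must travel with it.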

Serial code for sorting an array of 8-bit unsigned numbers:

void CountingSort( unsigned char* a, unsigned long a_size )
{ 
    const unsigned long numberOfCounts = 256;
 
    // one count for each possible value of an 8-bit element (0-255)
    unsigned long count[ numberOfCounts ] = { 0 };      // count array is initialized to zero by the compiler
 
    // Scan the array and count the number of times each value appears
    for( unsigned long i = 0; i < a_size; i++ )
        count[ a[ i ] ]++;
 
    // Fill the array with the number of 0's that were counted, followed by the number of 1's, and then 2's and so on
    unsigned long n = 0;
    for( unsigned long i = 0; i < numberOfCounts; i++ )
        for( unsigned long j = 0; j < count[ i ]; j++ )
            a[ n++ ] = (unsigned char)i;
}

Test code:

#include <vector>
#include <iostream>

using namespace std;

typedef unsigned int uint;

const int numData = 100;
const uint k = 256;

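// Sorts inputVec in place. Precondition: every element lies in [0, k);
// a negative or too-large value would index outside countVec.
void countSort(vector<int>& inputVec)
{
	const size_t n = inputVec.size();

	// initialize count array: one zeroed counter per possible value
	vector<uint> countVec(k, 0);

	// count the occurrences of each value
	for (size_t i = 0; i < n; ++i)
		++countVec[inputVec[i]];

	// write each value i back countVec[i] times, in ascending order
	size_t z = 0;
	for (uint i = 0; i < k; ++i)
		for (uint j = 0; j < countVec[i]; ++j)
			inputVec[z++] = static_cast<int>(i);
}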

void printData(const vector<int>& testVec)
{
	// size_t index matches the type returned by testVec.size()
	const size_t count = testVec.size();
	for (size_t i = 0; i < count; ++i)
		cout << testVec[i] << "\t";

	cout << endl;
}

int main()
{
	// test data
	vector<int> testVec;
	testVec.reserve(numData);
	for (int i = 0; i < numData; ++i)
	{
		uint tempVal = (i * i) % k;
		testVec.push_back(tempVal);
	}
		
	cout << "test data before sorting : " << endl;
	printData(testVec);

	// counting sort
	countSort(testVec);

	cout << "test data after sorting : " << endl;
	printData(testVec);

	return 0;
}

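Summary:

Counting sort works best when the value range k is small and the amount of data n is large (k << n): it runs in O(n + k) time and needs O(k) extra space for the counters.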
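To make the dependence on k concrete, here is a minimal sketch that derives the range from the data instead of fixing it at 256. The helper name countingSortRange is hypothetical. The counter array has max - min + 1 entries, so two values such as 0 and 1000000000 would already require about a billion counters, which is why the method only pays off when k is small relative to n.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: in-place counting sort over the range [min, max]
// found in the data. Time O(n + k), extra space O(k), with k = max - min + 1.
void countingSortRange(std::vector<int>& data)
{
    if (data.empty())
        return;

    const auto mm = std::minmax_element(data.begin(), data.end());
    const long long lo = *mm.first;   // smallest value, widened to avoid overflow
    const long long hi = *mm.second;  // largest value
    const std::size_t k = static_cast<std::size_t>(hi - lo + 1);

    // one counter per possible value, indexed by (value - lo)
    std::vector<std::size_t> count(k, 0);
    for (int v : data)
        ++count[static_cast<std::size_t>(v - lo)];

    // write the values back in ascending order
    std::size_t out = 0;
    for (std::size_t i = 0; i < k; ++i)
        for (std::size_t j = 0; j < count[i]; ++j)
            data[out++] = static_cast<int>(lo + static_cast<long long>(i));
}
```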

Parallel Counting Sort reference:

http://www.drdobbs.com/architecture-and-design/parallel-counting-sort/224700144

CUDA Counting Sort:

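1. Copy the data to be sorted from the CPU to the GPU (in a GPU-only pipeline this transfer cost can be ignored; recommendation: use host pinned memory)

2. Initialize the count array (in a GPU-only scenario, allocate it once, then use cudaMemset to reset it before each sort)

3. Count the occurrences of each value

4. Fill the output array

5. Copy the sorted data from the GPU back to the CPU (in a GPU-only pipeline this transfer cost can be ignored; recommendation: use host pinned memory)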
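Steps 2 to 4 can be illustrated on the CPU. The sketch below is plain C++, not CUDA code, and it runs sequentially; it only mimics one common decomposition in which each worker counts into a private histogram, the histograms are merged, and every value then fills its own disjoint slice of the output. The name chunkedCountingSort and the numChunks parameter (standing in for the number of parallel workers) are assumptions made for illustration, and the steps above do not prescribe this particular scheme.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side sketch of the count / merge / fill decomposition.
// Each chunk stands in for one parallel worker; the loops over chunks and
// over values have independent iterations.
void chunkedCountingSort(std::vector<unsigned char>& data, std::size_t numChunks)
{
    const std::size_t k = 256;  // one counter per 8-bit value
    const std::size_t n = data.size();
    if (numChunks == 0)
        numChunks = 1;
    const std::size_t chunkSize = (n + numChunks - 1) / numChunks;

    // step 2: zero-initialized counters, one private histogram per chunk
    std::vector<std::vector<std::size_t>> local(
        numChunks, std::vector<std::size_t>(k, 0));

    // step 3: every chunk counts only its own slice of the input
    for (std::size_t c = 0; c < numChunks; ++c) {
        const std::size_t begin = std::min(n, c * chunkSize);
        const std::size_t end = std::min(n, begin + chunkSize);
        for (std::size_t i = begin; i < end; ++i)
            ++local[c][data[i]];
    }

    // merge the private histograms into one global histogram
    std::vector<std::size_t> count(k, 0);
    for (std::size_t c = 0; c < numChunks; ++c)
        for (std::size_t v = 0; v < k; ++v)
            count[v] += local[c][v];

    // exclusive prefix sum: start[v] is the first output index of value v
    std::vector<std::size_t> start(k, 0);
    for (std::size_t v = 1; v < k; ++v)
        start[v] = start[v - 1] + count[v - 1];

    // step 4: each value owns a disjoint output range, so fills are independent
    for (std::size_t v = 0; v < k; ++v) {
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(start[v]);
        std::fill(first, first + static_cast<std::ptrdiff_t>(count[v]),
                  static_cast<unsigned char>(v));
    }
}
```

Private histograms trade extra memory (numChunks times 256 counters) for the absence of contention on shared counters; the exclusive prefix sum is what lets the fill step run without any coordination between values.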

to be continued...