An Overview of Thirteen Classic Sorting Algorithms

你先画个包络面

Article link: https://blog.youkuaiyun.com/qq_43549984/article/details/89426394

We start with a comparison of the performance of each sorting algorithm:¹

Algorithm	Best-case time	Average-case time	Worst-case time	Space
`Bubble Sort`	$\Omega(n)$	$\Theta(n^2)$	$O(n^2)$	$O(1)$
`Selection Sort`	$\Omega(n^2)$	$\Theta(n^2)$	$O(n^2)$	$O(1)$
`Insertion Sort`	$\Omega(n)$	$\Theta(n^2)$	$O(n^2)$	$O(1)$
`Mergesort`	$\Omega(n \log n)$	$\Theta(n \log n)$	$O(n \log n)$	$O(n)$
`Quicksort`	$\Omega(n \log n)$	$\Theta(n \log n)$	$O(n^2)$	$O(\log n)$
`Heapsort`	$\Omega(n \log n)$	$\Theta(n \log n)$	$O(n \log n)$	$O(1)$
`Timsort`	$\Omega(n)$	$\Theta(n \log n)$	$O(n \log n)$	$O(n)$
`Tree Sort`	$\Omega(n \log n)$	$\Theta(n \log n)$	$O(n^2)$	$O(n)$
`Shell Sort`	$\Omega(n \log n)$	$\Theta(n(\log n)^2)$	$O(n(\log n)^2)$	$O(1)$
`Bucket Sort`	$\Omega(n + k)$	$\Theta(n + k)$	$O(n^2)$	$O(n)$
`Radix Sort`	$\Omega(nk)$	$\Theta(nk)$	$O(nk)$	$O(n + k)$
`Counting Sort`	$\Omega(n + k)$	$\Theta(n + k)$	$O(n + k)$	$O(k)$
`Cubesort`	$\Omega(n)$	$\Theta(n \log n)$	$O(n \log n)$	$O(n)$

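Roughly speaking, sorting algorithms fall into two groups. Comparison-based algorithms have a worst-case lower bound of $\Omega(n\log n)$ comparisons (there is a rigorous mathematical proof; see this article). Non-comparison algorithms can reach $O(n)$ in certain special cases. In practice, comparison-based sorting algorithms tend to be the more significant ones from an engineering point of view.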
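The lower bound for comparison sorts can be sketched with the usual decision-tree argument. This is only an outline, not a replacement for a full proof: a comparison sort on $n$ distinct keys must distinguish all $n!$ input orderings, so its decision tree has at least $n!$ leaves, and a binary tree of height $h$ has at most $2^h$ leaves. Here $h$ (the worst-case number of comparisons) is a symbol introduced for this sketch.

$$
\begin{aligned}
2^{h} &\ge n! \\
h &\ge \log_2 (n!) = \sum_{i=1}^{n} \log_2 i \\
  &\ge \sum_{i=\lceil n/2 \rceil}^{n} \log_2 i \;\ge\; \frac{n}{2}\,\log_2 \frac{n}{2} \\
  &= \Omega(n \log n)
\end{aligned}
$$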

Bubble Sort

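Bubble Sort is a simple sorting algorithm that repeatedly passes through the list, compares each pair of adjacent elements, and swaps them if they are in the wrong order. It is a comparison sort, named for the way smaller or larger elements "bubble" to the top of the list. Although the algorithm is simple, it is too slow for most problems.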

The pseudocode of the algorithm is as follows:

algorithm BubbleSort(A) is
    n := A.length
    do
        swapped := false
        for i := 1 to n - 1 do
            if A[i - 1] > A[i] then
                swap A[i - 1] with A[i]
                swapped := true
    while swapped
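
The pseudocode above translates almost line by line into Python. The following is a minimal sketch; the function name `bubble_sort` and the choice to return a sorted copy rather than sort in place are assumptions made for illustration.

```python
def bubble_sort(items):
    """Return a sorted copy of items using bubble sort."""
    a = list(items)          # work on a copy so the caller's list is untouched
    n = len(a)
    swapped = True
    while swapped:           # repeat passes until one pass makes no swap
        swapped = False
        for i in range(1, n):
            if a[i - 1] > a[i]:
                a[i - 1], a[i] = a[i], a[i - 1]
                swapped = True
    return a


print(bubble_sort([5, 1, 4, 2, 8]))
```

On input that is already sorted, the first pass makes no swap and the loop ends after $n - 1$ comparisons, which is where the $\Omega(n)$ best case in the table comes from.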

Selection Sort

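Selection Sort is an in-place comparison sort with $O(n^2)$ time complexity, which makes it inefficient on large lists.
However, it is noted for its simplicity, and in some situations it has performance advantages over more complicated algorithms, particularly when auxiliary memory is limited.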

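Selection Sort divides the input list into two parts: a sorted sublist at the front of the list and a sublist of elements still to be sorted. Initially the sorted sublist is empty and the unsorted sublist is the whole list. The algorithm proceeds by finding the smallest element in the unsorted sublist, swapping it with the leftmost unsorted element, and moving the boundary of the sorted sublist one element to the right.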

The pseudocode of the algorithm is as follows:

algorithm SelectionSort(A) is
	n := A.length
	for j := 1 to n - 1 do
		iMin := j
		for i := j + 1 to n do
			if A[i] < A[iMin] then
				iMin := i
		if iMin != j then
			swap A[j] with A[iMin]
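
As a rough Python counterpart to the pseudocode above (0-indexed instead of 1-indexed), the sketch below sorts a copy of the input. The name `selection_sort` is chosen here for illustration.

```python
def selection_sort(items):
    """Return a sorted copy of items using selection sort."""
    a = list(items)
    n = len(a)
    for j in range(n - 1):
        # a[:j] is already sorted; find the minimum of the unsorted part a[j:]
        i_min = j
        for i in range(j + 1, n):
            if a[i] < a[i_min]:
                i_min = i
        if i_min != j:
            a[j], a[i_min] = a[i_min], a[j]
    return a


print(selection_sort([64, 25, 12, 22, 11]))
```

Note that the algorithm performs at most $n - 1$ swaps regardless of the input, which is one reason it can be attractive when writes are expensive.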

Insertion Sort

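Insertion Sort is a simple sorting algorithm that builds the final sorted array one element at a time. On large lists it is less efficient than more advanced algorithms, but it has the following advantages:
1. Simple to implement: a few lines of code suffice.
2. Efficient for (quite) small data sets.
3. More efficient in practice than the other simple $O(n^2)$ algorithms.
4. Adaptive: when every element of the input is at most $k$ places away from its final position, the time complexity is only $O(kn)$.
5. Stable: it does not change the relative order of elements with equal values.
6. Low space complexity, only $O(1)$.
7. Online: it can sort as it reads and does not need the whole array up front.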

Its pseudocode is as follows:

algorithm InsertionSort(A) is
	i := 1
	while i < A.length do
		j := i
		while j > 0 and A[j-1] > A[j] do
			swap A[j] with A[j-1]
			j := j - 1
		i := i + 1
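
A possible Python version is sketched below. It differs slightly from the pseudocode: instead of swapping at every step it shifts larger elements to the right and writes the saved element once, a common variant. The name `insertion_sort` is an assumption for illustration.

```python
def insertion_sort(items):
    """Return a sorted copy of items using insertion sort (stable)."""
    a = list(items)
    for i in range(1, len(a)):
        key = a[i]                      # element to insert into the sorted prefix a[:i]
        j = i
        while j > 0 and a[j - 1] > key:
            a[j] = a[j - 1]             # shift larger elements one slot to the right
            j -= 1
        a[j] = key
    return a


print(insertion_sort([12, 11, 13, 5, 6]))
```

Because the inner loop uses a strict `>` comparison, equal elements are never moved past each other, which is what makes the sort stable.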

Merge sort

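Merge sort is an efficient, general-purpose, comparison-based sorting algorithm invented by John von Neumann in 1945. It uses the divide-and-conquer idea, reducing a large sorting problem to smaller ones.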

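The steps of Merge sort are:
1. Divide the unsorted list into two subarrays of (nearly) equal size.
2. Sort the two subarrays, then merge them.
The pseudocode is as follows: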

algorithm MergeSort(A, p, r) is
	if p < r then
		q := ⌊(p + r) / 2⌋
		MergeSort(A, p, q)
		MergeSort(A, q + 1, r)
		Merge(A, p, q, r)

algorithm Merge(A, p, q, r) is
	n1 := q - p + 1
	n2 := r - q
	let L[1..n1 + 1] and R[1..n2 + 1] be new arrays
	for i := 1 to n1 do
		L[i] := A[p + i - 1]
	for j := 1 to n2 do
		R[j] := A[q + j]
	L[n1 + 1] := ∞
	R[n2 + 1] := ∞
	i := 1
	j := 1
	for k := p to r do
		if L[i] <= R[j] then
			A[k] := L[i]
			i := i + 1
		else
			A[k] := R[j]
			j := j + 1
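
The same idea can be sketched in Python. Unlike the pseudocode above, this version does not use $\infty$ sentinels; it copies whatever remains of either half after the main loop. It also returns a new list rather than sorting in place. The names `merge` and `merge_sort` are illustrative.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list (stable)."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:     # <= keeps equal elements in their original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])         # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a sorted copy of items using top-down merge sort."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


print(merge_sort([38, 27, 43, 3, 9, 82, 10]))
```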

Quick sort

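Quick sort is an efficient comparison sort, developed by the British computer scientist Tony Hoare in 1959 and published in 1961. It is a very commonly used sorting algorithm; when implemented well, it can be two to three times faster than Merge sort and Heap sort.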

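Quick sort also uses the divide-and-conquer idea. Its steps are:
1. Pick an element from the array, called the pivot.
2. Reorder the array so that all elements smaller than the pivot come before it and all elements greater than the pivot come after it (equal values can go either way). This step is usually called partition.
3. Recursively apply the above steps to the subarrays before and after the pivot.
The base case of the recursion is an array of size 0 or 1, which is sorted by definition and needs no further work.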

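Quick sort can be carried out in several different ways, and the choice of implementation scheme has a large impact on performance.
Two schemes are outlined below: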

Lomuto partition scheme
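This scheme is attributed to Nico Lomuto and was popularized by Bentley and by Cormen et al. It usually picks the last element of the array as the pivot and then scans from front to back; each value smaller than the pivot is swapped, in turn, into the front part of the array. The pseudocode is as follows: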

algorithm QuickSort(A, lo, hi) is
    if lo < hi then
        p := Partition(A, lo, hi)
        QuickSort(A, lo, p - 1)
        QuickSort(A, p + 1, hi)

algorithm Partition(A, lo, hi) is
    pivot := A[hi]
    i := lo
    for j := lo to hi - 1 do
        if A[j] < pivot then
            swap A[i] with A[j]
            i := i + 1
    swap A[i] with A[hi]
    return i
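
A direct Python rendering of the Lomuto scheme might look like the sketch below. It sorts the list in place; the function names and the `hi=None` default (meaning "up to the last index") are conveniences assumed for this example.

```python
def partition_lomuto(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[hi]; return the pivot's final index."""
    pivot = a[hi]
    i = lo                               # a[lo:i] holds the elements smaller than the pivot
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]            # put the pivot between the two parts
    return i


def quicksort_lomuto(a, lo=0, hi=None):
    """Sort the list a in place with the Lomuto partition scheme."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition_lomuto(a, lo, hi)
        quicksort_lomuto(a, lo, p - 1)
        quicksort_lomuto(a, p + 1, hi)


data = [3, 7, 8, 5, 2, 1, 9, 5, 4]
quicksort_lomuto(data)
print(data)
```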

Hoare partition scheme
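The original scheme given by Hoare selects the value in the middle of the array as the pivot and then scans from both ends towards the middle. Whenever the scan finds a pair in which the left element is greater than the pivot and the right element is smaller than the pivot, the two elements are swapped.
The pseudocode is as follows: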

algorithm QuickSort(A, lo, hi) is
    if lo < hi then
        p := Partition(A, lo, hi)
        QuickSort(A, lo, p)
        QuickSort(A, p + 1, hi)

algorithm Partition(A, lo, hi) is
    pivot := A[⌊(lo + hi) / 2⌋]
    i := lo - 1
    j := hi + 1
    loop forever
        do
            i := i + 1
        while A[i] < pivot
        do
            j := j - 1
        while A[j] > pivot
        if i >= j then
            return j
        swap A[i] with A[j]
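
The Hoare scheme can be sketched in Python in the same way. The `do ... while` loops of the pseudocode become "step once, then keep stepping while the condition holds". Function names are illustrative; note that the pivot index uses floor division, which this recursion pattern (`lo..p` and `p + 1..hi`) relies on to terminate.

```python
def partition_hoare(a, lo, hi):
    """Partition a[lo..hi]; return j such that a[lo..j] <= pivot <= a[j+1..hi]."""
    pivot = a[(lo + hi) // 2]
    i, j = lo - 1, hi + 1
    while True:
        i += 1
        while a[i] < pivot:      # advance from the left past small elements
            i += 1
        j -= 1
        while a[j] > pivot:      # advance from the right past large elements
            j -= 1
        if i >= j:
            return j
        a[i], a[j] = a[j], a[i]


def quicksort_hoare(a, lo=0, hi=None):
    """Sort the list a in place with the Hoare partition scheme."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition_hoare(a, lo, hi)
        quicksort_hoare(a, lo, p)        # the pivot is not excluded here
        quicksort_hoare(a, p + 1, hi)


values = [9, 4, 7, 4, 1, 8, 2, 2]
quicksort_hoare(values)
print(values)
```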

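It is not hard to see that Quick sort has two drawbacks:
1. It is not stable: equal values may change their relative order during sorting.
2. It copes poorly with input that is already sorted when the pivot is chosen naively. With the Lomuto scheme above, which always takes the last element as the pivot, sorted input produces maximally unbalanced partitions and the running time degrades to $O(n^2)$.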

Heap sort

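Heap sort was invented by J. W. J. Williams in 1964, which is also when the heap itself was introduced. The algorithm can be seen as an improved Selection Sort, the improvement being the use of a heap data structure to find the maximum.
Although Heap sort is somewhat slower in practice than Quick sort on most machines, its worst-case time complexity is only $O(n \log n)$, better than the $O(n^2)$ of Quick sort.
Heap sort is an in-place sort, but it is not stable.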

The algorithm is described as follows:

algorithm HeapSort(A) is
	BuildMaxHeap(A)
	for i := A.length downto 2 do
		swap A[1] with A[i]
		A.heap_size := A.heap_size - 1
		MaxHeapify(A, 1)

algorithm BuildMaxHeap(A) is
	A.heap_size := A.length
	for i := parent(A.length) downto 1 do
		MaxHeapify(A, i)

algorithm MaxHeapify(A, i) is
	l := left(i)
	r := right(i)
	if l <= A.heap_size and A[l] > A[i] then
		largest := l
	else
		largest := i
	if r <= A.heap_size and A[r] > A[largest] then
		largest := r
	if largest != i then
		swap A[i] with A[largest]
		MaxHeapify(A, largest)

algorithm parent(i) is
	return ⌊i / 2⌋

algorithm left(i) is
	return 2 * i

algorithm right(i) is
	return 2 * i + 1
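
For comparison, here is a 0-indexed Python sketch of the same algorithm. With 0-based indices the children of node `i` are `2*i + 1` and `2*i + 2`, and `max_heapify` is written iteratively instead of recursively; both are choices made for this example rather than part of the pseudocode above.

```python
def max_heapify(a, i, heap_size):
    """Sift a[i] down until the subtree rooted at i is a max-heap."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < heap_size and a[left] > a[largest]:
            largest = left
        if right < heap_size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heap_sort(a):
    """Sort the list a in place using heap sort."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # build the max-heap bottom-up
        max_heapify(a, i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]       # move the current maximum to its final slot
        max_heapify(a, 0, end)            # restore the heap on the shrunken prefix


nums = [12, 11, 13, 5, 6, 7]
heap_sort(nums)
print(nums)
```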

Tim sort

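Tim sort is a hybrid stable sorting algorithm implemented by Tim Peters in 2002 for the Python programming language. It is derived from Merge sort and Insertion sort and is designed to perform well on many kinds of real-world data. Its advantage is not a better worst-case bound (that remains $O(n \log n)$) but its adaptivity: it finds subsequences of the data that are already ordered and uses that information to sort the remainder more efficiently, approaching $O(n)$ on nearly sorted input. Tim sort has been Python's standard sorting algorithm since version 2.3. It is also used elsewhere, for example in Java and on the Android platform.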

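Tim sort works by splitting the input sequence into small blocks called runs, sorting each of them with Insertion sort (unless they are already in order), and then combining the runs with the Merge operation. The run size is usually chosen between 32 and 64.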

Below is a simplified Python version of Tim sort:²

# Python3 program to perform TimSort. 
RUN = 32
	
# This function sorts the array from the left index 
# to the right index, a range of size at most RUN 
def insertionSort(arr, left, right): 

	for i in range(left + 1, right+1): 
	
		temp = arr[i] 
		j = i - 1
		while j >= left and arr[j] > temp: 
		
			arr[j+1] = arr[j] 
			j -= 1
		
		arr[j+1] = temp 
	
# merge function merges the sorted runs 
def merge(arr, l, m, r): 

	# original array is broken in two parts 
	# left and right array 
	len1, len2 = m - l + 1, r - m 
	left, right = [], [] 
	for i in range(0, len1): 
		left.append(arr[l + i]) 
	for i in range(0, len2): 
		right.append(arr[m + 1 + i]) 
	
	i, j, k = 0, 0, l 
	# after comparing, we merge those two array 
	# in larger sub array 
	while i < len1 and j < len2: 
	
		if left[i] <= right[j]: 
			arr[k] = left[i] 
			i += 1
		
		else: 
			arr[k] = right[j] 
			j += 1
		
		k += 1
	
	# copy remaining elements of left, if any 
	while i < len1: 
	
		arr[k] = left[i] 
		k += 1
		i += 1
	
	# copy remaining element of right, if any 
	while j < len2: 
		arr[k] = right[j] 
		k += 1
		j += 1
	
# iterative Timsort function to sort the 
# array[0...n-1] (similar to merge sort) 
def timSort(arr, n): 

	# Sort individual subarrays of size RUN 
	for i in range(0, n, RUN): 
		insertionSort(arr, i, min((i + RUN - 1), (n-1))) 
	
	# start merging from size RUN (or 32). It will merge 
	# to form size 64, then 128, 256 and so on .... 
	size = RUN 
	while size < n: 
	
		# pick starting point of left sub array. We 
		# are going to merge arr[left..left+size-1] 
		# and arr[left+size, left+2*size-1] 
		# After every merge, we increase left by 2*size 
		for left in range(0, n, 2*size): 
		
			# find ending point of left sub array 
			# mid+1 is starting point of right sub array 
			# (both are clamped so they never run past the array) 
			mid = min((left + size - 1), (n-1)) 
			right = min((left + 2*size - 1), (n-1)) 
	
			# merge sub array arr[left.....mid] & 
			# arr[mid+1....right], if the right part is non-empty 
			if mid < right: 
				merge(arr, left, mid, right) 
		
		size = 2*size 
		
# utility function to print the Array 
def printArray(arr, n): 

	for i in range(0, n): 
		print(arr[i], end = " ") 
	print() 

	
# Driver program to test above function 
if __name__ == "__main__": 

	arr = [5, 21, 7, 23, 19] 
	n = len(arr) 
	print("Given Array is") 
	printArray(arr, n) 
	
	timSort(arr, n) 
	
	print("After Sorting Array is") 
	printArray(arr, n) 
	
# This code is contributed by Rituraj Jain
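
The listing above uses fixed-size blocks and does not look for subsequences that are already ordered, which the description of Tim sort mentions. The sketch below illustrates only that step, under stated assumptions: `count_run` and `natural_runs` are names chosen here, and a real implementation would additionally extend short runs to the minimum run length and merge runs according to its own policy.

```python
def count_run(a, lo):
    """Return hi so that a[lo:hi] is a maximal natural run, made ascending in place."""
    hi = lo + 1
    if hi >= len(a):
        return len(a)
    if a[hi] < a[lo]:
        # strictly descending run: extend it, then reverse it
        # (strictness means no equal elements are reordered, so stability is kept)
        while hi < len(a) and a[hi] < a[hi - 1]:
            hi += 1
        a[lo:hi] = a[lo:hi][::-1]
    else:
        # non-decreasing run: just extend it
        while hi < len(a) and a[hi] >= a[hi - 1]:
            hi += 1
    return hi


def natural_runs(a):
    """Split a into consecutive natural runs; return their (start, end) index pairs."""
    runs = []
    lo = 0
    while lo < len(a):
        hi = count_run(a, lo)
        runs.append((lo, hi))
        lo = hi
    return runs


seq = [1, 2, 3, 9, 7, 5, 4, 6, 6]
print(natural_runs(seq))
print(seq)
```

On input that is already sorted, this step finds a single run covering the whole array, so no merging is needed at all.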

Tree Sort

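Tree Sort is a comparison sort built on a data structure: it builds a binary search tree from the elements to be sorted and then traverses the tree in order, which yields the elements as a non-decreasing sequence. Its typical use is sorting elements online, so that after every insertion all elements seen so far are available in sorted order. The complexity of the algorithm is not high, but compared with the similar Heap Sort it is somewhat more complicated to write (which is reportedly also the motivation behind the invention of the heap).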

The pseudocode is as follows:

struct Node
	integer: key
	Node: left, right

algorithm NewNode(item) is
	let temp be a new Node struct
	temp.key := item
	temp.left := temp.right := NULL
	return temp

algorithm insert(node, key) is
	if node == NULL then
		return NewNode(key)
	if key < node.key then
		node.left := insert(node.left, key)
	else
		// equal keys go to the right so that duplicates are not lost
		node.right := insert(node.right, key)
	return node

algorithm inorder(root, A, i) is
	// writes the keys into A starting at index i and returns the next free index
	if root != NULL then
		i := inorder(root.left, A, i)
		A[i] := root.key
		i := i + 1
		i := inorder(root.right, A, i)
	return i

algorithm TreeSort(A) is
	root := NULL
	for i := 0 to A.length - 1 do
		root := insert(root, A[i])
	inorder(root, A, 0)
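
A compact Python sketch of Tree Sort follows. It uses a plain, unbalanced binary search tree, so sorted input degenerates into a chain and gives the $O(n^2)$ worst case from the table (and, with this recursive formulation, a deep recursion). The class and function names are assumptions for illustration.

```python
class Node:
    """A node of an unbalanced binary search tree."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(node, key):
    """Insert key into the subtree rooted at node and return the subtree's root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)   # duplicates go to the right
    return node


def inorder(node, out):
    """Append the keys of the subtree to out in non-decreasing order."""
    if node is not None:
        inorder(node.left, out)
        out.append(node.key)
        inorder(node.right, out)


def tree_sort(items):
    """Return a sorted copy of items using a binary search tree."""
    root = None
    for key in items:
        root = insert(root, key)
    out = []
    inorder(root, out)
    return out


print(tree_sort([5, 4, 7, 2, 11, 4]))
```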

Shell Sort

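Shell Sort is an in-place comparison sort. It can be seen as a generalization of Bubble Sort or Insertion Sort. The method first sorts pairs of elements that are far apart and then progressively reduces the gap between the elements being compared. Donald Shell published the first version of this sort in 1959. The running time of Shell Sort depends heavily on the gap sequence it uses. For many practical variants, determining the time complexity remains an open problem.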

The pseudocode is as follows:

algorithm ShellSort(A) is
	// using Marcin Ciura's gap sequence (published in 2001)
	n := A.length
	let gaps be a sequence [701, 301, 132, 57, 23, 10, 4, 1]
	for each gap in gaps do
		for i := gap to n - 1 do
			temp := A[i]
			j := i
			while j >= gap and A[j - gap] > temp do
				A[j] := A[j - gap]
				j := j - gap
			A[j] := temp
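
In Python the same procedure can be sketched as below, sorting the list in place with the gap sequence from the pseudocode. The names `CIURA_GAPS` and `shell_sort` are chosen for this example. Gaps larger than the list are simply skipped because the inner `range` is empty.

```python
CIURA_GAPS = [701, 301, 132, 57, 23, 10, 4, 1]


def shell_sort(a):
    """Sort the list a in place using Shell sort with Ciura's gap sequence."""
    n = len(a)
    for gap in CIURA_GAPS:
        # a gapped insertion sort: afterwards every gap-th subsequence is sorted
        for i in range(gap, n):
            temp = a[i]
            j = i
            while j >= gap and a[j - gap] > temp:
                a[j] = a[j - gap]
                j -= gap
            a[j] = temp


sample_data = [62, 83, 18, 53, 7, 17, 95, 86, 47, 69, 25, 28]
shell_sort(sample_data)
print(sample_data)
```

The final pass with gap 1 is an ordinary insertion sort, which guarantees correctness; the earlier passes only serve to bring elements close to their final positions.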

The table below compares some commonly used gap sequences.³

Gap sequence	Worst-case time complexity
$\left\{\left\lfloor {\frac {N}{2^{k}}}\right\rfloor \bigg\vert1\leqslant k\leqslant\log_2N\right\}$	$\Theta \left(N^{2}\right)$
$\left\{2\left\lfloor {\frac {N}{2^{k+1}}}\right\rfloor +1 \bigg\vert1\leqslant k\leqslant\log_2N\right\}$	$\Theta \left(N^{\frac {3}{2}}\right)$
$\{2^k - 1\vert k \geqslant 1\}$	$\Theta \left(N^{\frac {3}{2}}\right)$
$\{2^k + 1\vert k \geqslant 1\}\cup\{1\}$	$\Theta \left(N^{\frac {3}{2}}\right)$
$\{x\vert \exists p,q\in \mathbb{N},2^p 3^q=x\}$	$\Theta \left(N\log ^{2}N\right)$
$\left\{{x=\frac {3^{k}-1}{2}\bigg\vert x \leqslant \left\lceil {\frac {N}{3}}\right\rceil}\right\}$	$\Theta \left(N^{\frac {3}{2}}\right)$
$\cdots$	$\cdots$

Bucket Sort

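Bucket Sort is a distribution sort: it first distributes the elements of the array into a number of buckets and then sorts each bucket individually, either with a different sorting algorithm or by applying Bucket Sort recursively. Bucket Sort can be implemented with comparisons and can therefore also be regarded as a comparison sort. Its computational complexity depends on the algorithm used to sort each bucket, the number of buckets, and whether the input is uniformly distributed.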

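Bucket Sort works as follows:

1. Set up an array of initially empty buckets.
2. Go over the original array, putting each object into its corresponding bucket.
3. Sort each non-empty bucket.
4. Visit the buckets in order and put all elements back into the original array.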

algorithm BucketSort(A, k) is
	let buckets[0..k - 1] be a new array of k empty lists
	M := 1 + the maximum key value in the array   // keys are assumed non-negative
	for i := 1 to A.length do
		insert A[i] into buckets[⌊k * A[i] / M⌋]
	for i := 0 to k - 1 do
		nextSort(buckets[i])
	return the concatenation of buckets[0], ..., buckets[k - 1]
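
The steps can be sketched in Python as follows. This is a minimal version under explicit assumptions: keys are non-negative integers, the bucket index is computed as `k * v // m` with `m` one more than the largest key (so the index is always below `k`), and Python's built-in `sorted` stands in for the per-bucket `nextSort`.

```python
def bucket_sort(values, k=10):
    """Return a sorted copy of non-negative integers using k buckets."""
    if not values:
        return []
    if min(values) < 0:
        raise ValueError("this sketch only handles non-negative keys")
    m = 1 + max(values)
    buckets = [[] for _ in range(k)]
    for v in values:
        buckets[k * v // m].append(v)      # index lies in 0 .. k - 1
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))      # stand-in for nextSort
    return result


print(bucket_sort([29, 25, 3, 49, 9, 37, 21, 43], k=5))
```

Since the bucket index never decreases as the key grows, concatenating the sorted buckets in index order yields a fully sorted result. If all keys fall into one bucket, the cost is dominated by the per-bucket sort, which is the source of the $O(n^2)$ worst case when that sort is quadratic.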

Radix Sort

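Radix Sort is a non-comparative integer sorting algorithm. It sorts data with integer keys by grouping the keys by the individual digits that share the same significant position and value. Because integers can also represent some strings (such as names or dates) and specially formatted floating-point numbers, Radix Sort is not limited to integers. Radix Sort dates back to 1887, to Herman Hollerith's work on tabulating machines.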

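Radix Sort can be implemented starting from either the most significant digit (MSD) or the least significant digit (LSD). LSD radix sort orders short keys before long keys, and keys of the same length lexicographically, which coincides with the normal order of integer representations. MSD radix sort uses lexicographic order, which is suitable for sorting strings (such as words) or fixed-length integer representations.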

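Counting Sort (described below) runs in linear time when the keys lie in the range 1 to $n$, whereas Radix Sort can still sort in linear time when the keys range from 1 to $n^2$ (for example by treating each key as a number with a constant number of base-$n$ digits).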

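LSD radix sort works by sorting the numbers several times, each time keyed on one digit, from the lowest digit to the highest, until all digit positions have been processed. Each pass uses Counting Sort, which is stable, so the order established by earlier passes is preserved.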

A C++ implementation of the algorithm is as follows:⁴

// C++ implementation of Radix Sort 
// (assumes non-negative integers) 
#include <iostream> 
#include <vector> 
using namespace std; 

// A utility function to get maximum value in arr[] 
int getMax(int arr[], int n) 
{ 
	int mx = arr[0]; 
	for (int i = 1; i < n; i++) 
		if (arr[i] > mx) 
			mx = arr[i]; 
	return mx; 
} 

// A function to do counting sort of arr[] according to 
// the digit represented by exp. 
void countSort(int arr[], int n, int exp) 
{ 
	vector<int> output(n); // output array (int output[n] is a non-standard VLA) 
	int i, count[10] = {0}; 

	// Store count of occurrences in count[] 
	for (i = 0; i < n; i++) 
		count[ (arr[i]/exp)%10 ]++; 

	// Change count[i] so that count[i] now contains actual 
	// position of this digit in output[] 
	for (i = 1; i < 10; i++) 
		count[i] += count[i - 1]; 

	// Build the output array 
	for (i = n - 1; i >= 0; i--) 
	{ 
		output[count[ (arr[i]/exp)%10 ] - 1] = arr[i]; 
		count[ (arr[i]/exp)%10 ]--; 
	} 

	// Copy the output array to arr[], so that arr[] now 
	// contains sorted numbers according to current digit 
	for (i = 0; i < n; i++) 
		arr[i] = output[i]; 
} 

// The main function to that sorts arr[] of size n using 
// Radix Sort 
void radixsort(int arr[], int n) 
{ 
	// Find the maximum number to know number of digits 
	int m = getMax(arr, n); 

	// Do counting sort for every digit. Note that instead 
	// of passing digit number, exp is passed. exp is 10^i 
	// where i is current digit number 
	for (int exp = 1; m/exp > 0; exp *= 10) 
		countSort(arr, n, exp); 
} 

// A utility function to print an array 
void print(int arr[], int n) 
{ 
	for (int i = 0; i < n; i++) 
		cout << arr[i] << " "; 
} 

// Driver program to test above functions 
int main() 
{ 
	int arr[] = {170, 45, 75, 90, 802, 24, 2, 66}; 
	int n = sizeof(arr)/sizeof(arr[0]); 
	radixsort(arr, n); 
	print(arr, n); 
	return 0; 
}
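
To make the role of the base explicit, here is a short Python sketch of LSD radix sort with a configurable `base` parameter (an assumption of this example; the C++ code above fixes base 10). Each pass distributes the keys into per-digit lists in input order, which is a stable pass equivalent in effect to the counting sort used above. With `base` set to about $n$, keys up to $n^2$ need only two or three passes.

```python
def radix_sort_lsd(values, base=10):
    """Return a sorted copy of non-negative integers using LSD radix sort."""
    if not values:
        return []
    if base < 2 or min(values) < 0:
        raise ValueError("this sketch needs base >= 2 and non-negative keys")
    result = list(values)
    largest = max(result)
    exp = 1                                    # base ** (index of the current digit)
    while largest // exp > 0:
        buckets = [[] for _ in range(base)]
        for v in result:                       # appending in order keeps the pass stable
            buckets[(v // exp) % base].append(v)
        result = [v for bucket in buckets for v in bucket]
        exp *= base
    return result


print(radix_sort_lsd([170, 45, 75, 90, 802, 24, 2, 66]))
```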

Counting Sort

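Counting Sort is a sorting technique based on keys that lie within a specific range. It works by counting the number of objects with each distinct key value and then doing some arithmetic on those counts to compute the position of each object in the output sequence.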

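For example, for the input sequence [1, 4, 1, 2, 7, 5, 2], we first use an array of size 8 to count how many times each number occurs, which gives [0, 2, 2, 0, 1, 1, 0, 1]. Taking the partial sums (partial_sum) of this array gives [0, 2, 4, 4, 5, 6, 6, 7], and this array gives the position of each number in the final sorted sequence (counting from 1). To produce the output, we traverse the input sequence, place each number at its recorded position, and then decrease that position by 1. In this example we first place 1 at position 2 and decrease its position from 2 to 1, then place the number 4 at position 5 and decrease its position from 5 to 4, and so on.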
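
The worked example can be reproduced with a few lines of Python. This is a sketch for keys in the range 0 to `k`; the function name and the decision to return the intermediate arrays are assumptions made so the numbers above can be checked. It walks the input from the end, which keeps equal keys in their original order.

```python
from itertools import accumulate


def counting_sort(values, k):
    """Sort integers in 0..k; return (counts, positions, sorted_output)."""
    counts = [0] * (k + 1)
    for v in values:
        counts[v] += 1
    positions = list(accumulate(counts))   # positions[v] = number of keys <= v
    end = positions[:]                     # working copy that gets decremented
    output = [None] * len(values)
    for v in reversed(values):             # reverse traversal makes the sort stable
        end[v] -= 1                        # 1-based position -> 0-based index
        output[end[v]] = v
    return counts, positions, output


counts, positions, output = counting_sort([1, 4, 1, 2, 7, 5, 2], 7)
print(counts)
print(positions)
print(output)
```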

A C++ implementation of this algorithm is as follows:⁵

//Counting sort which takes negative numbers as well 
#include <iostream> 
#include <vector> 
#include <algorithm> 
using namespace std; 

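void countSort(vector<int>& arr) 
{ 
	if (arr.empty()) 
		return; // nothing to sort; max_element would be invalid 

	int max = *max_element(arr.begin(), arr.end()); 
	int min = *min_element(arr.begin(), arr.end()); 
	int range = max - min + 1; 

	// count[v - min] = number of occurrences of value v 
	vector<int> count(range), output(arr.size()); 
	for (size_t i = 0; i < arr.size(); i++) 
		count[arr[i] - min]++; 

	// prefix sums: count[v - min] = number of elements <= v 
	for (size_t i = 1; i < count.size(); i++) 
		count[i] += count[i - 1]; 

	// walk backwards so that equal keys keep their order (stable) 
	for (int i = (int)arr.size() - 1; i >= 0; i--) 
	{ 
		output[count[arr[i] - min] - 1] = arr[i]; 
		count[arr[i] - min]--; 
	} 

	for (size_t i = 0; i < arr.size(); i++) 
		arr[i] = output[i]; 
} 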

void printArray(vector <int> & arr) 
{ 
	for (int i=0; i < arr.size(); i++) 
		cout << arr[i] << " "; 
	cout << "\n"; 
} 

int main() 
{ 
	vector<int> arr = {-5, -10, 0, -3, 8, 5, -1, 10}; 
	countSort (arr); 
	printArray (arr); 
	return 0; 
}