KNOW: Sort & Search

Latest recommended article published on 2025-10-29 22:36:13


License: CC 4.0 BY-SA


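This article covers the basic concepts of sorting algorithms, their performance characteristics, implementations of the classic algorithms, and worked examples, with the aim of giving developers a complete reference framework for sorting.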

Sort

http://en.wikipedia.org/wiki/Sorting_algorithm

Sorting Algorithms
Name	Best	Avg.	Worst	Memory	Stable	Note	Chapter
Insertion Sort	n	n^2	n^2	1	Yes		7.2
Bubble Sort	n	n^2	n^2	1	Yes		Wiki
Selection Sort	n^2	n^2	n^2	1	No
Heap Sort	nlogn	nlogn	nlogn	1	No		7.5
Merge Sort	nlogn	nlogn	nlogn	n	Yes		7.6
Quick Sort	nlogn	nlogn	n^2	log(n)	Depends		7.7
Counting Sort	n + k	n + k	n + k	n + k	Yes	k is data range	7.10
LSD Radix Sort			n*k/d	n	Yes	k/d is the number of rounds

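In the table, n is the number of items to be sorted. For counting sort, k is the range of the key values. For LSD radix sort, k is the size of each key and d is the digit size used by the implementation, so k/d is the number of passes.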

Insertion Sort

Wiki

void InsertionSort(int a[], int n)
{
    int i,p;
    int tmp;
    for(p = 1; p < n; ++p)
    {
        tmp = a[p];
        for(i = p; i > 0 && a[i-1] > tmp; --i)
            a[i] = a[i-1];
        a[i] = tmp;
    }
}

Bubble Sort

Wiki

void BubbleSort(int a[], int n)
{
    bool swapped = true;
    int j = 0;
    int tmp;
    while (swapped) 
    {
        swapped = false;
        ++j;
        for (int i = 0; i < n - j; ++i) 
        {
            if (a[i] > a[i+1]) 
            {
                tmp = a[i];
                a[i] = a[i+1];
                a[i+1] = tmp;
                swapped = true;
            }
        }
    }
}

Selection Sort

Wiki

void SelectionSort(int a[], int n)
{
    int i, j;
    int iMin;
    for (j = 0; j < n-1; ++j) 
    {
        iMin = j;
        for (i = j+1; i < n; i++) 
        {
            if (a[i] < a[iMin]) 
            {
                iMin = i;
            }
        }
        if (iMin != j) 
        {
            int tmp = a[j];
            a[j] = a[iMin];
            a[iMin] = tmp;
        }
    }
}

Merge Sort

Wiki

void merge(int a[], int tmp[], int left, int right, int rightEnd)
{
    int leftEnd = right - 1;
    int tmpPosition = left;
    int elementNumber = rightEnd - left + 1;
    
    while (left <= leftEnd && right <= rightEnd)
    {
        if (a[left] <= a[right])    // <= takes the left element on ties, which keeps the sort stable
            tmp[tmpPosition++] = a[left++];
        else
            tmp[tmpPosition++] = a[right++];
    }
    
    while (left <= leftEnd)
        tmp[tmpPosition++] = a[left++];
    
    while (right <= rightEnd)
        tmp[tmpPosition++] = a[right++];
    
    for (int i = 0;  i < elementNumber; ++i, rightEnd--)   // the first time, rightEnd will not decrease
        a[rightEnd] = tmp[rightEnd];
}

void mSort(int a[], int tmp[], int left, int right)
{
    int center;
    if (left < right)
    {
        center = left + (right - left)/2;
        mSort(a, tmp, left, center);
        mSort(a, tmp, center + 1, right);
        merge(a, tmp, left, center + 1, right);
    }
}

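void mergeSort(int a[], int n)
{
    if (n < 2)
        return;                              // nothing to sort
    int *tmp = new (std::nothrow) int[n];    // nothrow form (needs <new>): returns NULL on failure instead of throwing
    if (tmp != NULL)
    {
        mSort(a, tmp, 0, n-1);
        delete[] tmp;
    }
}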

Quick Sort

Wiki

CSDN (how to pick the pivot, partitioning strategy)

Optimize

For small subarrays, fall back to insertion sort: it has a smaller constant factor and is therefore faster than quick sort on small inputs.

#include <cstdlib>   // rand, srand
#include <ctime>     // time

void swap(int *a, int *b)
{
    int c = *a;
    *a = *b;
    *b = c;
}

int partition(int a[], int left, int right)
{
    int pivot, pivotPosition;
    pivotPosition = left + rand()%(right - left+1);
    pivot = a[pivotPosition];
    swap(&a[pivotPosition], &a[right]);           // park the pivot at the right end
    
    int i = left - 1;
    int j = right;
    while(1)
    {
        while (a[++i] < pivot) ;                  // stops at a[right] (the pivot) at the latest
        while (j > left && a[--j] > pivot) ;      // guard: without it j can run below left
        if (i < j)
            swap(&a[i], &a[j]);
        else
            break;
    }
    
    swap(&a[i], &a[right]);                       // pivot was parked on the right, so swap with i (if parked on the left, swap with j)
    return i;
}

void qSort(int a[], int left, int right)
{
    if (left < right) {
        int i = partition(a, left, right);
        qSort(a, left, i - 1);
        qSort(a, i + 1, right);
    }
}

void quickSort(int a[], int n)
{
    srand(time(NULL));
    qSort(a, 0, n-1);
}
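
The insertion-sort fallback described under Optimize could look roughly like the sketch below. It is self-contained rather than built on the functions above: the name hybridQuickSort, the helper names, and the CUTOFF value of 16 are all assumptions made for illustration, and the partition step uses the simpler Lomuto scheme with a random pivot instead of the two-pointer scheme shown earlier.

```cpp
#include <cassert>
#include <cstdlib>   // rand
#include <utility>   // std::swap

// Hypothetical threshold: ranges of at most CUTOFF elements go to insertion sort.
// The best value depends on the machine and should be measured.
const int CUTOFF = 16;

// Insertion sort on the inclusive range a[left..right].
static void insertionSortRange(int a[], int left, int right)
{
    for (int p = left + 1; p <= right; ++p) {
        int tmp = a[p];
        int i = p;
        for (; i > left && a[i-1] > tmp; --i)
            a[i] = a[i-1];
        a[i] = tmp;
    }
}

// Lomuto partition with a random pivot; returns the pivot's final index.
static int partitionRange(int a[], int left, int right)
{
    int p = left + rand() % (right - left + 1);
    std::swap(a[p], a[right]);               // park the pivot at the right end
    int pivot = a[right];
    int store = left;                        // a[left..store-1] holds keys < pivot
    for (int i = left; i < right; ++i)
        if (a[i] < pivot)
            std::swap(a[i], a[store++]);
    std::swap(a[store], a[right]);           // move the pivot into place
    return store;
}

static void hybridQSort(int a[], int left, int right)
{
    if (right - left + 1 <= CUTOFF) {        // small (or empty) range: no more recursion
        insertionSortRange(a, left, right);
        return;
    }
    int i = partitionRange(a, left, right);
    hybridQSort(a, left, i - 1);
    hybridQSort(a, i + 1, right);
}

void hybridQuickSort(int a[], int n)
{
    assert(n >= 0);
    hybridQSort(a, 0, n - 1);
}
```

The cutoff check also serves as the recursion's base case, so no separate left < right test is needed.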

Counting Sort

Wiki

YouTube Video

In computer science, counting sort is an algorithm for sorting a collection of objects according to keys that are small integers; that is, it is an integer sorting algorithm. It operates by counting the number of objects that have each distinct key value, and using arithmetic on those counts to determine the positions of each key value in the output sequence.

Its running time is linear in the number of items and the difference between the maximum and minimum key values, so it is only suitable for direct use in situations where the variation in keys is not significantly greater than the number of items. However, it is often used as a subroutine in another sorting algorithm, radix sort, that can handle larger keys more efficiently.

// Requires 0 <= a[i] <= k for every element.
void countingSort(int a[], int n, int k)     // n is array size, k is maximum value
{
    int *count = new int[k+1]();             // value-initialized to 0
    int *result = new int[n]();
    for (int i = 0; i < n; ++i) 
        count[a[i]] += 1;
    
    for (int i = 1; i <= k; ++i)             // prefix sums run over the k+1 key values, not over n
        count[i] += count[i - 1];
    
    for (int i = n - 1; i >= 0; --i) {       // scan backwards so equal keys keep their order (stable)
        result[ count[a[i]] - 1] = a[i];
        count[a[i]]--;
    }
    
    for (int i = 0; i < n; ++i)
        a[i] = result[i];
    
    delete[] count;
    delete[] result;
}
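
Since the running time depends on the difference between the maximum and minimum key, the same idea can be applied to keys that are negative or that start far from zero by shifting every key by the minimum. The following is a minimal sketch of that variant, not part of the original code: the name countingSortRange is an assumption, and it assumes that max - min is small enough to fit in an int and to allocate a count array for.

```cpp
#include <algorithm> // std::min_element, std::max_element
#include <cstddef>   // std::size_t
#include <vector>

// Stable counting sort for keys in an arbitrary integer range [lo, hi].
// Assumption: hi - lo fits in an int and is small enough to allocate.
void countingSortRange(std::vector<int>& a)
{
    if (a.empty())
        return;
    const int lo = *std::min_element(a.begin(), a.end());
    const int hi = *std::max_element(a.begin(), a.end());
    const std::size_t range = static_cast<std::size_t>(hi - lo) + 1;

    std::vector<std::size_t> count(range, 0);
    for (std::size_t i = 0; i < a.size(); ++i)
        ++count[a[i] - lo];                  // shift each key so that lo maps to index 0

    for (std::size_t i = 1; i < range; ++i)  // count[i] = number of keys <= lo + i
        count[i] += count[i - 1];

    std::vector<int> result(a.size());
    for (std::size_t i = a.size(); i-- > 0; )    // backwards scan keeps equal keys in order
        result[--count[a[i] - lo]] = a[i];

    a.swap(result);
}
```

For example, the input -2, 7, 3, -2, 0 gives lo = -2 and a count array of 10 entries, regardless of how far the keys are from zero.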

Radix Sort (LSD)

Each key is first figuratively dropped into one level of buckets corresponding to the value of the rightmost digit. Each bucket preserves the original order of the keys as the keys are dropped into the bucket. There is a one-to-one correspondence between the number of buckets and the number of values that can be represented by a digit. Then, the process repeats with the next neighboring digit until there are no more digits to process. In other words:

1. Take the least significant digit (or group of bits, both being examples of radices) of each key.
2. Group the keys based on that digit, but otherwise keep the original order of keys. (This is what makes the LSD radix sort a stable sort.)
3. Repeat the grouping process with each more significant digit.

The sort in step 2 is usually done using bucket sort or counting sort, which are efficient in this case since there are usually only a small number of digits.
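
As a rough illustration of using counting sort for the grouping step, the sketch below sorts 32-bit unsigned keys one byte at a time. The function name radixSortLSD and the choice of 8-bit digits are assumptions for this example; with k = 32 bits per key and d = 8 bits per digit it performs k/d = 4 passes.

```cpp
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uint32_t
#include <vector>

// LSD radix sort for 32-bit unsigned keys, one byte (radix 256) per pass.
void radixSortLSD(std::vector<std::uint32_t>& a)
{
    std::vector<std::uint32_t> buffer(a.size());
    for (int shift = 0; shift < 32; shift += 8) {
        // count[d + 1] first holds how many keys have digit d in this pass.
        std::size_t count[257] = {0};
        for (std::size_t i = 0; i < a.size(); ++i)
            ++count[((a[i] >> shift) & 0xFF) + 1];

        // After the prefix sums, count[d] is the first output index for digit d.
        for (int d = 0; d < 256; ++d)
            count[d + 1] += count[d];

        // A forward scan keeps keys with equal digits in their current order,
        // which is the stability that each pass relies on.
        for (std::size_t i = 0; i < a.size(); ++i)
            buffer[count[(a[i] >> shift) & 0xFF]++] = a[i];

        a.swap(buffer);                      // this pass's output is the next pass's input
    }
}
```

A larger digit means fewer passes but a larger count array per pass, which is the trade-off behind the choice of d.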

A simple version of an LSD radix sort can be achieved using queues as buckets. The following process is repeated for a number of times equal to the length of the longest key:

The integers are enqueued into an array of ten separate queues based on their digits from right to left. Computers often represent integers internally as fixed-length binary digits. Here, we will do something analogous with fixed-length decimal digits. So, for the eight example keys shown in the queues below, each padded to three digits, the queues for the 1st pass would be:

(The digit examined in each pass is shown in brackets.)

0: 17[0], 09[0]
1: none
2: 00[2], 80[2]
3: none
4: 02[4]
5: 04[5], 07[5]
6: 06[6]
7–9: none

The queues are dequeued back into an array of integers, in increasing order. Using the same numbers, the array will look like this after the first pass:

170, 090, 002, 802, 024, 045, 075, 066

For the second pass:

Queues:

0: 0[0]2, 8[0]2
1: none
2: 0[2]4
3: none
4: 0[4]5
5: none
6: 0[6]6
7: 1[7]0, 0[7]5
8: none
9: 0[9]0

Array:

002, 802, 024, 045, 066, 170, 075, 090
(note that at this point only 802 and 170 are out of order)

For the third pass:

Queues:

0: [0]02, [0]24, [0]45, [0]66, [0]75, [0]90
1: [1]70
2–7: none
8: [8]02
9: none

Array:

002, 024, 045, 066, 075, 090, 170, 802 (sorted)
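
The queue-based passes walked through above can be sketched as follows. The function name radixSortQueues and its digits parameter (the length of the longest key in decimal digits) are assumptions for illustration, and the sketch handles non-negative integers only.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t
#include <queue>
#include <vector>

// Queue-based LSD radix sort in base 10 for non-negative ints.
// digits: number of decimal digits in the longest key (at most 10 for a 32-bit int).
void radixSortQueues(std::vector<int>& a, int digits)
{
    assert(digits >= 0 && digits <= 10);
    std::queue<int> buckets[10];             // one FIFO queue per decimal digit
    int divisor = 1;                         // 1, 10, 100, ... selects the current digit
    for (int pass = 0; pass < digits; ++pass) {
        // Enqueue every key into the queue for its current digit.
        for (std::size_t i = 0; i < a.size(); ++i)
            buckets[(a[i] / divisor) % 10].push(a[i]);

        // Dequeue queue 0, then queue 1, ... back into the array.
        std::size_t pos = 0;
        for (int d = 0; d < 10; ++d) {
            while (!buckets[d].empty()) {
                a[pos++] = buckets[d].front();
                buckets[d].pop();
            }
        }
        if (pass + 1 < digits)
            divisor *= 10;                   // skipped after the last pass to avoid overflow
    }
}
```

Calling it with digits set to 1 performs only the first pass. On the input order 170, 45, 75, 90, 2, 24, 802, 66 (an assumed order, chosen because it is consistent with the queues shown above), that single pass yields 170, 90, 2, 802, 24, 45, 75, 66, matching the first-pass array in the walkthrough.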

External Sort

Wiki

External sorting is a term for a class of sorting algorithms that can handle massive amounts of data. External sorting is required when the data being sorted do not fit into the main memory of a computing device (usually RAM) and instead must reside in the slower external memory (usually a hard drive). External sorting typically uses a hybrid sort-merge strategy. In the sorting phase, chunks of data small enough to fit in main memory are read, sorted, and written out to a temporary file. In the merge phase, the sorted subfiles are combined into a single larger file.

One example of external sorting is the external merge sort algorithm, which sorts chunks that each fit in RAM, then merges the sorted chunks together. For example, for sorting 900 megabytes of data using only 100 megabytes of RAM:

1. Read 100 MB of the data in main memory and sort by some conventional method, like quicksort.
2. Write the sorted data to disk.
3. Repeat steps 1 and 2 until all of the data is in sorted 100 MB chunks (there are 900MB / 100MB = 9 chunks), which now need to be merged into one single output file.
4. Read the first 10 MB (= 100MB / (9 chunks + 1)) of each sorted chunk into input buffers in main memory and allocate the remaining 10 MB for an output buffer. (In practice, it might provide better performance to make the output buffer larger and the input buffers slightly smaller.)
5. Perform a 9-way merge and store the result in the output buffer. Whenever the output buffer fills, write it to the final sorted file and empty it. Whenever any of the 9 input buffers empties, fill it with the next 10 MB of its associated 100 MB sorted chunk until no more data from the chunk is available. This is the key step that makes external merge sort work externally -- because the merge algorithm only makes one pass sequentially through each of the chunks, each chunk does not have to be loaded completely; rather, sequential parts of the chunk can be loaded as needed.
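
The core of the merge step can be sketched with a min-heap that always holds the next unread key of each chunk. In the sketch below, in-memory vectors stand in for the sorted chunk files and for the output file, so the buffer refilling and disk writes are left out; the name kWayMerge and this simplification are assumptions for illustration only.

```cpp
#include <cstddef>     // std::size_t
#include <functional>  // std::greater
#include <queue>
#include <utility>     // std::pair
#include <vector>

// Merges k sorted runs into one sorted sequence.
// Each inner vector stands in for one sorted chunk on disk.
std::vector<int> kWayMerge(const std::vector<std::vector<int> >& runs)
{
    typedef std::pair<int, std::size_t> Entry;   // (key, index of the run it came from)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > heap;  // min-heap

    std::vector<std::size_t> next(runs.size(), 0);   // next unread position in each run
    for (std::size_t r = 0; r < runs.size(); ++r) {
        if (!runs[r].empty()) {
            heap.push(Entry(runs[r][0], r));
            next[r] = 1;
        }
    }

    std::vector<int> out;
    while (!heap.empty()) {
        Entry e = heap.top();                // smallest key among the heads of all runs
        heap.pop();
        out.push_back(e.first);
        const std::size_t r = e.second;
        if (next[r] < runs[r].size())        // replace it with the next key of the same run
            heap.push(Entry(runs[r][next[r]++], r));
    }
    return out;
}
```

Each run is read strictly front to back, which is why a real implementation only needs a small input buffer per chunk instead of the whole chunk in memory.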

Binary Search

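// a[] must be sorted in ascending order; searches the inclusive range [begin, end]
// and returns the position of the desired value
int BinarySearchR(int a[], int begin, int end, int value)
{
    if (begin > end) 
        return -1;       // return -1 if the value is not in the array

    int mid = begin + (end - begin)/2;   // avoids the int overflow possible in (begin + end)/2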
    if (value < a[mid])
        return BinarySearchR(a, begin, mid - 1, value);
    else if (value > a[mid])
        return BinarySearchR(a, mid + 1, end, value);
    else
        return mid;
}


int BinarySearchNR(int a[], int begin, int end, int value)
{
    while (begin <= end) {
        int mid = begin + (end - begin)/2;   // avoids the int overflow possible in (begin + end)/2
        if (value < a[mid]) 
            end = mid - 1;
        else if(value > a[mid])
            begin = mid + 1;
        else
            return mid;
    }
    return -1;               // the value is not found
}