Collections.sort and Arrays.sort

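This article analyzes in detail how Collections.sort and List.sort are implemented in Java 8; both rely mainly on the TimSort algorithm. TimSort is a stable sort that combines merge sort with insertion sort and performs especially well on partially ordered data. Small arrays are sorted with binary insertion sort; for large arrays it first finds naturally ordered sequences (runs) and then merges them efficiently. During merging, gallop mode cuts down unnecessary comparisons and element moves. TimSort also merges adjacent runs as it goes, so that a long run is not merged with a much shorter one. In short, TimSort keeps the sort stable while improving performance through these optimizations.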

Overview

Let's look at the sorting algorithms behind Collections.sort and Arrays.sort. This article is based on JDK 1.8.

Collections.sort

The collection utility class Collections provides the following two sort overloads:

public static <T extends Comparable<? super T>> void sort(List<T> list) {
    list.sort(null);
}

public static <T> void sort(List<T> list, Comparator<? super T> c) {
    list.sort(c);
}

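As shown above, the two static methods share the same underlying algorithm (both call list.sort); they differ only in how elements are compared:

  1. The first method takes no comparator, so the elements of the List must implement Comparable; their compareTo method is used for comparison.
  2. The second method takes a Comparator, whose compare method is used for comparison.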
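
To make the difference between the two overloads concrete, here is a minimal sketch. The class name CollectionsSortDemo and its helper methods are made up for illustration; only Collections.sort and Comparator.comparingInt come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustrative only: the class and method names are assumptions, not JDK code.
class CollectionsSortDemo {
    // Overload 1: no comparator, so String's own compareTo decides the order.
    static List<String> sortNatural(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    // Overload 2: the order comes from the Comparator (shortest string first).
    static List<String> sortByLength(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy, Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "banana");
        System.out.println(sortNatural(words));  // [banana, fig, pear]
        System.out.println(sortByLength(words)); // [fig, pear, banana]
    }
}
```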

List.sort

Collections.sort delegates the sorting to List.sort:

default void sort(Comparator<? super E> c) {
    Object[] a = this.toArray();
    Arrays.sort(a, (Comparator) c);
    ListIterator<E> i = this.listIterator();
    for (Object e : a) {
        i.next();
        i.set((E) e);
    }
}
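
The three steps of this default method (copy to an array, sort the array, write the elements back through a ListIterator) can be replayed by hand. The sketch below does that on a LinkedList; sortLikeDefault is a hypothetical helper written for this article, not a JDK method.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

// Illustrative only: replays the steps of the default List.sort shown above.
class ListSortSketch {
    @SuppressWarnings({"unchecked", "rawtypes"})
    static <E> void sortLikeDefault(List<E> list, Comparator<? super E> c) {
        Object[] a = list.toArray();              // 1. copy the elements into an array
        Arrays.sort(a, (Comparator) c);           // 2. sort the array (null means natural ordering)
        ListIterator<E> it = list.listIterator(); // 3. write the sorted elements back in place
        for (Object e : a) {
            it.next();
            it.set((E) e);
        }
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>(Arrays.asList(3, 1, 2));
        sortLikeDefault(list, null);
        System.out.println(list); // [1, 2, 3]
        sortLikeDefault(list, Comparator.<Integer>reverseOrder());
        System.out.println(list); // [3, 2, 1]
    }
}
```

Writing back through the iterator keeps the cost linear even for lists without fast random access, which is presumably why the default method does not call list.set(index, e).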

Now look at Arrays.sort(T[] a, Comparator<? super T> c):

public static <T> void sort(T[] a, Comparator<? super T> c) {
    if (c == null) {
        sort(a);
    } else {
        if (LegacyMergeSort.userRequested)
            legacyMergeSort(a, c);
        else
            TimSort.sort(a, 0, a.length, c, null, 0, 0);
    }
}

When no comparator is passed, Arrays.sort(Object[] a) is called:

public static void sort(Object[] a) {
    if (LegacyMergeSort.userRequested)
        legacyMergeSort(a);
    else
        ComparableTimSort.sort(a, 0, a.length, null, 0, 0);
}

Here LegacyMergeSort.userRequested is a static field of LegacyMergeSort, a nested class of Arrays:

/**
 * Old merge sort implementation can be selected (for
 * compatibility with broken comparators) using a system property.
 * Cannot be a static boolean in the enclosing class due to
 * circular dependencies. To be removed in a future release.
 */
static final class LegacyMergeSort {
    private static final boolean userRequested =
        java.security.AccessController.doPrivileged(
            new sun.security.action.GetBooleanAction(
                "java.util.Arrays.useLegacyMergeSort")).booleanValue();
}

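This system property can be set with

System.setProperty("java.util.Arrays.useLegacyMergeSort", "true");

which selects the old merge sort, legacyMergeSort. Because userRequested is a static final field that is read once when LegacyMergeSort is initialized, the property has to be set before that happens; passing -Djava.util.Arrays.useLegacyMergeSort=true at JVM startup is the more reliable way. The old implementation: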

/** To be removed in a future release. */
private static void legacyMergeSort(Object[] a) {
    Object[] aux = a.clone();
    mergeSort(aux, a, 0, a.length, 0);
}

/**
 * Src is the source array that starts at index 0
 * Dest is the (possibly larger) array destination with a possible offset
 * low is the index in dest to start sorting
 * high is the end index in dest to end sorting
 * off is the offset to generate corresponding low, high in src
 * To be removed in a future release.
 */
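//src: a copy of the array to sort, used to hold intermediate merge results
//dest: the array actually being sorted; it holds the sorted result at the end
//low: start index of the range to sort (inclusive)
//high: end index of the range to sort (exclusive)
//off: offset; always 0 in this call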
@SuppressWarnings({"unchecked", "rawtypes"})
private static void mergeSort(Object[] src,
                              Object[] dest,
                              int low,
                              int high,
                              int off) {
    int length = high - low;

    // When fewer than INSERTIONSORT_THRESHOLD (7) elements remain, use simple insertion sort
    if (length < INSERTIONSORT_THRESHOLD) {
        for (int i=low; i<high; i++)
            for (int j=i; j>low &&
                     ((Comparable) dest[j-1]).compareTo(dest[j])>0; j--)
                swap(dest, j, j-1);
        return;
    }

    // Recursively sort halves of dest into src
    int destLow  = low;
    int destHigh = high;
    low  += off;
    high += off;
    int mid = (low + high) >>> 1;
    //recursively sort the two halves
    mergeSort(dest, src, low, mid, -off);
    mergeSort(dest, src, mid, high, -off);

    // If list is already sorted, just copy from src to dest.  This is an
    // optimization that results in faster sorts for nearly ordered lists.
    if (((Comparable)src[mid-1]).compareTo(src[mid]) <= 0) {
        System.arraycopy(src, low, dest, destLow, length);
        return;
    }

    // Merge sorted halves (now in src) into dest
    for(int i = destLow, p = low, q = mid; i < destHigh; i++) {
        //the condition parses as: q >= high || (p < mid && src[p] <= src[q])
        if (q >= high || p < mid && ((Comparable)src[p]).compareTo(src[q])<=0)
            //reached when q >= high, or src[p] <= src[q]
            dest[i] = src[p++];
        else
            //reached when p >= mid, or src[p] > src[q]
            dest[i] = src[q++];
    }
}

As shown above, this is an improved merge sort: when fewer than 7 elements remain it uses simple insertion sort, otherwise it uses merge sort.

When java.util.Arrays.useLegacyMergeSort=false (the default), ComparableTimSort.sort(Object[] a, int lo, int hi, Object[] work, int workBase, int workLen) is called:

static void sort(Object[] a, int lo, int hi, Object[] work, int workBase, int workLen) {
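    //validate the arguments
    //assert <boolean expression>: if the expression is true the assertion passes and execution continues; otherwise, when the JVM runs with -ea, java.lang.AssertionError is thrown
    assert a != null && lo >= 0 && lo <= hi && hi <= a.length;

    //number of elements to sort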
    int nRemaining  = hi - lo;
    if (nRemaining < 2)
        return;  // Arrays of size 0 and 1 are always sorted

    // If array is small, do a "mini-TimSort" with no merges
    if (nRemaining < MIN_MERGE) {
        //length of the already-ordered run of a that starts at index lo
        int initRunLen = countRunAndMakeAscending(a, lo, hi);
        binarySort(a, lo, hi, lo + initRunLen);//binary insertion sort
        return;
    }

    /**
     * March over the array once, left to right, finding natural runs,
     * extending short natural runs to minRun elements, and merging runs
     * to maintain stack invariant.
     */
    ComparableTimSort ts = new ComparableTimSort(a, work, workBase, workLen);
    int minRun = minRunLength(nRemaining);
    do {
        // Identify next run
        int runLen = countRunAndMakeAscending(a, lo, hi);

        // If run is short, extend to min(minRun, nRemaining)
        if (runLen < minRun) { //run is shorter than the minimum run length; extend it with binary insertion sort
            int force = nRemaining <= minRun ? nRemaining : minRun;
            binarySort(a, lo, lo + force, lo + runLen);
            runLen = force;
        }

        // Push run onto pending-run stack, and maybe merge
        ts.pushRun(lo, runLen);
        ts.mergeCollapse(); //merge some of the pending runs where that is efficient

        // Advance to find next run
        lo += runLen;
        nRemaining -= runLen;
    } while (nRemaining != 0);

    // Merge all remaining runs to complete sort
    assert lo == hi;
    ts.mergeForceCollapse(); //force-merge all remaining runs
    assert ts.stackSize == 1;
}

//Returns the length of the ordered run in a[lo] ~ a[hi-1] that starts at index lo
//A strictly descending run also counts; reverseRange reverses it in place
private static int countRunAndMakeAscending(Object[] a, int lo, int hi) {
    assert lo < hi;
    int runHi = lo + 1;
    //if there is only one element, the run length is 1
    if (runHi == hi)
        return 1;

    // Find end of run, and reverse range if descending
    //descending
    if (((Comparable) a[runHi++]).compareTo(a[lo]) < 0) { // Descending
        //count the leading strictly descending elements
        while (runHi < hi && ((Comparable) a[runHi]).compareTo(a[runHi - 1]) < 0)
            runHi++;
        reverseRange(a, lo, runHi); //the run is descending, so reverse it into ascending order
    } else {                              // Ascending
        //count the leading non-descending elements
        while (runHi < hi && ((Comparable) a[runHi]).compareTo(a[runHi - 1]) >= 0)
            runHi++;
    }

    return runHi - lo;
}

//reverse a[lo] ~ a[hi-1]
private static void reverseRange(Object[] a, int lo, int hi) {
    hi--;
    while (lo < hi) {
        Object t = a[lo];
        a[lo++] = a[hi];
        a[hi--] = t;
    }
}

//Sorts a[lo] ~ a[hi-1] with binary insertion sort; a[lo] ~ a[start-1] is already sorted, so insertion begins at index start
private static void binarySort(Object[] a, int lo, int hi, int start) {
    assert lo <= start && start <= hi;
    if (start == lo)
        start++;
    for ( ; start < hi; start++) { //insert elements one by one, beginning at start
        Comparable pivot = (Comparable) a[start]; //element to insert

        // Set left (and right) to the index where a[start] (pivot) belongs
        int left = lo;
        int right = start;
        assert left <= right;
        /*
         * Invariants:
         *   pivot >= all in [lo, left).
         *   pivot <  all in [right, start).
         */
        //binary search for pivot's insertion point; on exit, left is pivot's final index in the array
        while (left < right) {
            int mid = (left + right) >>> 1;
            if (pivot.compareTo(a[mid]) < 0)
                right = mid;
            else
                left = mid + 1;
        }
        assert left == right;

        /*
         * The invariants still hold: pivot >= all in [lo, left) and
         * pivot < all in [left, start), so pivot belongs at left.  Note
         * that if there are elements equal to pivot, left points to the
         * first slot after them -- that's why this sort is stable.
         * Slide elements over to make room for pivot.
         */
        //number of elements that must shift one slot to the right
        int n = start - left;  // The number of elements to move
        // Switch is just an optimization for arraycopy in default case
        //for 1 or 2 elements move them directly; otherwise use arraycopy
        switch (n) {
            case 2:  a[left + 2] = a[left + 1];
            case 1:  a[left + 1] = a[left];
                     break;
            default: System.arraycopy(a, left, a, left + 1, n); //copy the n elements starting at left to left+1
        }
        a[left] = pivot;
    }
}
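
For small inputs the whole sort is just these two helpers chained together. The sketch below is a simplified version for int[] (an assumption made to keep it short; the JDK code works on Object[] and Comparable), showing how a descending prefix is detected, reversed, and then extended by binary insertion.

```java
import java.util.Arrays;

// Illustrative only: an int[] simplification of the "mini-TimSort" path shown above.
class MiniTimSortSketch {
    // Length of the run starting at lo; a strictly descending run is reversed in place.
    static int countRunAndMakeAscending(int[] a, int lo, int hi) {
        int runHi = lo + 1;
        if (runHi == hi) return 1;
        if (a[runHi++] < a[lo]) {                         // strictly descending
            while (runHi < hi && a[runHi] < a[runHi - 1]) runHi++;
            for (int i = lo, j = runHi - 1; i < j; i++, j--) {
                int t = a[i]; a[i] = a[j]; a[j] = t;      // reverse a[lo, runHi)
            }
        } else {                                          // non-descending
            while (runHi < hi && a[runHi] >= a[runHi - 1]) runHi++;
        }
        return runHi - lo;
    }

    // Binary insertion sort of a[lo, hi); a[lo, start) must already be sorted.
    static void binarySort(int[] a, int lo, int hi, int start) {
        if (start == lo) start++;
        for (; start < hi; start++) {
            int pivot = a[start];
            int left = lo, right = start;
            while (left < right) {                        // find the slot after any equal elements
                int mid = (left + right) >>> 1;
                if (pivot < a[mid]) right = mid; else left = mid + 1;
            }
            System.arraycopy(a, left, a, left + 1, start - left);
            a[left] = pivot;
        }
    }

    // Whole-array version of the small-input path: find the first run, then insert the rest.
    static void sortSmall(int[] a) {
        if (a.length < 2) return;
        int runLen = countRunAndMakeAscending(a, 0, a.length);
        binarySort(a, 0, a.length, runLen);
    }

    public static void main(String[] args) {
        int[] a = {5, 4, 3, 1, 2};
        int run = countRunAndMakeAscending(a, 0, a.length);
        System.out.println(run + " " + Arrays.toString(a)); // 4 [1, 3, 4, 5, 2]
        binarySort(a, 0, a.length, run);
        System.out.println(Arrays.toString(a));             // [1, 2, 3, 4, 5]
    }
}
```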

Now pull the TimSort part of the sort method above out for a closer look:

static void sort(Object[] a, int lo, int hi, Object[] work, int workBase, int workLen) {
    ......
    //above: binary insertion sort for small inputs
    //below: TimSort proper
    //in TimSort, a run is a subsequence that is already ordered

    /**
     * March over the array once, left to right, finding natural runs,
     * extending short natural runs to minRun elements, and merging runs
     * to maintain stack invariant.
     */
    ComparableTimSort ts = new ComparableTimSort(a, work, workBase, workLen); //here work=null, workBase=workLen=0
    //compute the minimum run length from the number of elements, so that most runs end up similar in length, which helps the later merges
    //nRemaining<32: minRun=nRemaining
    //nRemaining>=32 and nRemaining is a power of 2: minRun=16
    //nRemaining>=32 and nRemaining is not a power of 2: 16<=minRun<=32
    int minRun = minRunLength(nRemaining); //nRemaining=hi-lo, i.e. initially the number of elements to sort
    do {
        //get the length of the next run:
        //the ordered subsequence of a starting at lo (a strictly descending one qualifies once reversed)
        int runLen = countRunAndMakeAscending(a, lo, hi);

        // If run is short, extend to min(minRun, nRemaining)
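        if (runLen < minRun) { //run is shorter than minRun; top it up with binary insertion sort
            int force = nRemaining <= minRun ? nRemaining : minRun;//handles the special case of the last run in the array
            binarySort(a, lo, lo + force, lo + runLen);//binary-insert the force-runLen elements that are still missing
            runLen = force; //force is not necessarily minRun; the last run may be shorter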
        }

        // Push run onto pending-run stack, and maybe merge
        //push the run's start index and length onto the stack
        ts.pushRun(lo, runLen);
        //merge some of the pending runs where that is efficient
        ts.mergeCollapse();

        // Advance to find next run
        lo += runLen;//start position of the next run
        nRemaining -= runLen; //number of elements not yet processed
    } while (nRemaining != 0);

    // Merge all remaining runs to complete sort
    assert lo == hi;
    ts.mergeForceCollapse(); //force-merge all remaining runs
    assert ts.stackSize == 1; //after all merges only one run, holding every element in order, should remain
}

//save the run's start index and length on the stack
private void pushRun(int runBase, int runLen) {
    this.runBase[stackSize] = runBase;
    this.runLen[stackSize] = runLen;
    stackSize++; //stackSize is the number of runs on the stack
}

/**
 * Examines the stack of runs waiting to be merged and merges adjacent runs
 * until the stack invariants are reestablished:
 *
 *     1. runLen[i - 3] > runLen[i - 2] + runLen[i - 1]
 *     2. runLen[i - 2] > runLen[i - 1]
 *
 * This method is called each time a new run is pushed onto the stack,
 * so the invariants are guaranteed to hold for i < stackSize upon
 * entry to the method.
 */
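//This is about merge efficiency: merging a long run with a short one is poor value for the cost, whereas merging two runs of similar length is reasonable and efficient
private void mergeCollapse() {
    //merging is only needed while the stack holds more than one run
    while (stackSize > 1) {
        int n = stackSize - 2;
        //more than 2 runs on the stack and invariant 1 is violated
        if (n > 0 && runLen[n-1] <= runLen[n] + runLen[n+1]) {
            if (runLen[n - 1] < runLen[n + 1])
                n--;
            mergeAt(n); //merge run[n] and run[n+1]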
        } else if (runLen[n] <= runLen[n + 1]) {
            mergeAt(n);
        } else {
            break; // Invariant is established
        }
    }
}

/**
 * Returns the minimum acceptable run length for an array of the specified
 * length. Natural runs shorter than this will be extended with
 * {@link #binarySort}.
 *
 * Roughly speaking, the computation is:
 *
 *  If n < MIN_MERGE, return n (it's too small to bother with fancy stuff).
 *  Else if n is an exact power of 2, return MIN_MERGE/2.
 *  Else return an int k, MIN_MERGE/2 <= k <= MIN_MERGE, such that n/k
 *   is close to, but strictly less than, an exact power of 2.
 *
 * For the rationale, see listsort.txt.
 *
 * @param n the length of the array to be sorted
 * @return the length of the minimum run to be merged
 */
//compute the minimum run length from the number of elements to sort
private static int minRunLength(int n) {
    assert n >= 0;
    int r = 0;      // Becomes 1 if any 1 bits are shifted off
    while (n >= MIN_MERGE) {
        r |= (n & 1);
        n >>= 1;
    }
    return n + r;
}
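
A few concrete values make the three cases in the Javadoc easier to see. The sketch below copies the loop into a standalone class; MinRunSketch is a name chosen for this article, and MIN_MERGE is set to 32, the value JDK 8 uses.

```java
// Illustrative only: the minRunLength loop lifted out of TimSort so it can be run directly.
class MinRunSketch {
    static final int MIN_MERGE = 32; // same value as in JDK 8's TimSort

    static int minRunLength(int n) {
        int r = 0;                   // becomes 1 if any 1 bit is shifted off
        while (n >= MIN_MERGE) {
            r |= (n & 1);
            n >>= 1;
        }
        return n + r;
    }

    public static void main(String[] args) {
        System.out.println(minRunLength(31));   // 31  (n < 32: returned unchanged)
        System.out.println(minRunLength(32));   // 16  (exact power of 2)
        System.out.println(minRunLength(64));   // 16  (exact power of 2)
        System.out.println(minRunLength(65));   // 17
        System.out.println(minRunLength(100));  // 25
        System.out.println(minRunLength(1000)); // 32
    }
}
```

In effect the method keeps the top five significant bits of n and adds 1 if any lower bit was set, which is why the result stays between 16 and 32 for n >= 32.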

Now look at the mergeAt(int) method:

/**
 * Merges the two runs at stack indices i and i+1.  Run i must be
 * the penultimate or antepenultimate run on the stack.  In other words,
 * i must be equal to stackSize-2 or stackSize-3.
 *
 * @param i stack index of the first of the two runs to merge
 */
@SuppressWarnings("unchecked")
//merge runs i and i+1, where i = stackSize-2 or stackSize-3
private void mergeAt(int i) {
    assert stackSize >= 2;
    assert i >= 0;
    assert i == stackSize - 2 || i == stackSize - 3;

    int base1 = runBase[i];
    int len1 = runLen[i];
    int base2 = runBase[i + 1];
    int len2 = runLen[i + 1];
    assert len1 > 0 && len2 > 0;
    assert base1 + len1 == base2; //the two runs being merged must be adjacent

    /*
     * Record the length of the combined runs; if i is the 3rd-last
     * run now, also slide over the last run (which isn't involved
     * in this merge).  The current run (i+1) goes away in any case.
     */
    runLen[i] = len1 + len2;
    //if i = stackSize - 3: runs i and i+1 are being merged, so drop entry i+1 first (entry i+2 takes its place)
    if (i == stackSize - 3) {
        runBase[i + 1] = runBase[i + 2];
        runLen[i + 1] = runLen[i + 2];
    }
    stackSize--; //two runs became one, so there is one run fewer

    /*
     * Find where the first element of run2 goes in run1. Prior elements
     * in run1 can be ignored (because they're already in place).
     */
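    //gallop right from a[base1] to find where a[base2] belongs in run1
    //k is the offset of that position from a[base1]
    int k = gallopRight((Comparable<Object>) a[base2], a, base1, len1, 0);
    assert k >= 0;
    base1 += k;
    len1 -= k; //len1 is now the number of run1 elements from the new base1 to the end of run1
    if (len1 == 0) //every element of run1 is <= a[base2]; the two runs are already in order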
        return;

    /*
     * Find where the last element of run1 goes in run2. Subsequent elements
     * in run2 can be ignored (because they're already in place).
     */
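    //gallop left from a[base2+len2-1] to find where a[base1+len1-1] belongs in run2
    //len2 becomes the offset of that position from a[base2]
    len2 = gallopLeft((Comparable<Object>) a[base1 + len1 - 1], a,
            base2, len2, len2 - 1);
    assert len2 >= 0;
    if (len2 == 0) //the last element of run1 is <= every element of run2; the two runs are already in order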
        return;

    // Merge remaining runs, using tmp array with min(len1, len2) elements
    //merge what is left of run1 and run2
    if (len1 <= len2)
        mergeLo(base1, len1, base2, len2);
    else
        mergeHi(base1, len1, base2, len2);
}

/**
 * Like gallopLeft, except that if the range contains an element equal to
 * key, gallopRight returns the index after the rightmost equal element.
 *
 * @param key the key whose insertion point to search for
 * @param a the array in which to search
 * @param base the index of the first element in the range
 * @param len the length of the range; must be > 0
 * @param hint the index at which to begin the search, 0 <= hint < n.
 *     The closer hint is to the result, the faster this method will run.
 * @return the int k,  0 <= k <= n such that a[b + k - 1] <= key < a[b + k]
 */
//Gallop mode: gallop right to find key's position in a[base]~a[base+len-1]; the notes below assume the call from mergeAt, where hint=0
private static int gallopRight(Comparable<Object> key, Object[] a,
        int base, int len, int hint) {
    assert len > 0 && hint >= 0 && hint < len;

    int ofs = 1;
    int lastOfs = 0;
    //with hint=0: when key<a[base], the result is ofs=0
    if (key.compareTo(a[base + hint]) < 0) {
        // Gallop left until a[b+hint - ofs] <= key < a[b+hint - lastOfs]
        int maxOfs = hint + 1;
        while (ofs < maxOfs && key.compareTo(a[base + hint - ofs]) < 0) {//with hint=0: ofs=maxOfs=1, so the loop body never runs
            lastOfs = ofs;
            ofs = (ofs << 1) + 1;
            if (ofs <= 0)   // int overflow
                ofs = maxOfs;
        }
        if (ofs > maxOfs)
            ofs = maxOfs;

        // Make offsets relative to b
        int tmp = lastOfs;
        lastOfs = hint - ofs; //with hint=0: -1
        ofs = hint - tmp;//with hint=0: 0
    } else { // a[b + hint] <= key; with hint=0 this handles key>=a[base]
        // Gallop right until a[b+hint + lastOfs] <= key < a[b+hint + ofs]
        int maxOfs = len - hint; //with hint=0: maxOfs=len
        //gallop with ofs = 2*ofs+1 at each step until a[base+lastOfs]<=key<a[base+ofs] or ofs>=maxOfs (in the latter case key may be >= every remaining element)
        while (ofs < maxOfs && key.compareTo(a[base + hint + ofs]) >= 0) {
            lastOfs = ofs;
            ofs = (ofs << 1) + 1;
            if (ofs <= 0)   // int overflow
                ofs = maxOfs;
        }
        if (ofs > maxOfs)
            ofs = maxOfs;

        // Make offsets relative to b
        lastOfs += hint;
        ofs += hint;
    }
    assert -1 <= lastOfs && lastOfs < ofs && ofs <= len;

    /*
     * Now a[b + lastOfs] <= key < a[b + ofs], so key belongs somewhere to
     * the right of lastOfs but no farther right than ofs.  Do a binary
     * search, with invariant a[b + lastOfs - 1] <= key < a[b + ofs].
     */
    lastOfs++;
    //here a[b + lastOfs - 1] <= key < a[b + ofs], where an out-of-range index on either side means there is no bound on that side
    //binary search for key's exact position in the range
    while (lastOfs < ofs) {
        int m = lastOfs + ((ofs - lastOfs) >>> 1);

        if (key.compareTo(a[base + m]) < 0)
            ofs = m;          // key < a[b + m]
        else //the equal case lands here: among equal elements take the position after the last one, which keeps the sort stable
            lastOfs = m + 1;  // a[b + m] <= key
    }
    assert lastOfs == ofs;    // so a[b + ofs - 1] <= key < a[b + ofs]
    return ofs;
}

//Analogous to gallopRight
//Gallop mode: gallop left to find key's position in a[base]~a[base+len-1]; the notes below assume the call from mergeAt, where hint=len-1
private static int gallopLeft(Comparable<Object> key, Object[] a,
        int base, int len, int hint) {
    assert len > 0 && hint >= 0 && hint < len;

    int lastOfs = 0;
    int ofs = 1;
    //with hint=len-1: key > a[base+len-1]
    if (key.compareTo(a[base + hint]) > 0) {
        // Gallop right until a[base+hint+lastOfs] < key <= a[base+hint+ofs]
        int maxOfs = len - hint; //with hint=len-1: maxOfs=1
        //with hint=len-1 the loop body never runs
        while (ofs < maxOfs && key.compareTo(a[base + hint + ofs]) > 0) {
            lastOfs = ofs;
            ofs = (ofs << 1) + 1;
            if (ofs <= 0)   // int overflow
                ofs = maxOfs;
        }
        if (ofs > maxOfs)
            ofs = maxOfs;

        // Make offsets relative to base
        lastOfs += hint; //with hint=len-1: lastOfs=len-1
        ofs += hint; //with hint=len-1: ofs=len
    } else { // key <= a[base + hint]; with hint=len-1: key<=a[base+len-1]
        // Gallop left until a[base+hint-ofs] < key <= a[base+hint-lastOfs]
        final int maxOfs = hint + 1; //with hint=len-1: maxOfs=len
        //gallop with ofs = 2*ofs+1 at each step until a[base+len-1-ofs]<key<=a[base+len-1-lastOfs] or ofs>=maxOfs (in the latter case key may be <= every remaining element)
        while (ofs < maxOfs && key.compareTo(a[base + hint - ofs]) <= 0) {
            lastOfs = ofs;
            ofs = (ofs << 1) + 1;
            if (ofs <= 0)   // int overflow
                ofs = maxOfs;
        }
        if (ofs > maxOfs)
            ofs = maxOfs;

        // Make offsets relative to base
        int tmp = lastOfs;
        lastOfs = hint - ofs;//with hint=len-1: len-1-ofs
        ofs = hint - tmp; //with hint=len-1: len-1-lastOfs
    }
    assert -1 <= lastOfs && lastOfs < ofs && ofs <= len;

    /*
     * Now a[base+lastOfs] < key <= a[base+ofs], so key belongs somewhere
     * to the right of lastOfs but no farther right than ofs.  Do a binary
     * search, with invariant a[base + lastOfs - 1] < key <= a[base + ofs].
     */
    lastOfs++;
    //here a[base + lastOfs - 1] < key <= a[base + ofs], where an out-of-range index on either side means there is no bound on that side
    //binary search for key's exact position in the range
    while (lastOfs < ofs) {
        int m = lastOfs + ((ofs - lastOfs) >>> 1);

        if (key.compareTo(a[base + m]) > 0)
            lastOfs = m + 1;  // a[base + m] < key
        else //the equal case lands here: among equal elements take the position of the first one, which keeps the sort stable
            ofs = m;          // key <= a[base + m]
    }
    assert lastOfs == ofs;    // so a[base + ofs - 1] < key <= a[base + ofs]
    return ofs;
}

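//len1 is the number of run1 elements from a[base2]'s insertion point to the right end of run1
//len2 is the number of run2 elements from the left end of run2 to the insertion point of a[base1+len1-1]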
private void mergeLo(int base1, int len1, int base2, int len2) {
    assert len1 > 0 && len2 > 0 && base1 + len1 == base2;

    // Copy first run into temp array
    Object[] a = this.a; // For performance
    Object[] tmp = ensureCapacity(len1);

    int cursor1 = tmpBase; // Indexes into tmp array
    int cursor2 = base2;   // Indexes int a
    int dest = base1;      // Indexes int a
    System.arraycopy(a, base1, tmp, cursor1, len1); //copy a[base1] ~ a[base1 + len1 - 1] into tmp

    // Move first element of second run and deal with degenerate cases
    a[dest++] = a[cursor2++];
    if (--len2 == 0) {
        System.arraycopy(tmp, cursor1, a, dest, len1);
        return;
    }
    if (len1 == 1) {
        System.arraycopy(a, cursor2, a, dest, len2);
        a[dest + len2] = tmp[cursor1]; // Last elt of run 1 to end of merge
        return;
    }
    //the two branches above handle the simple cases where run2 had a single element on entry or run1 has a single element
    
    int minGallop = this.minGallop;  // Use local variable for performance
outer:
    //at this point: dest=base1+1, cursor2=base2+1, len2 is one less than on entry, cursor1=tmpBase
    while (true) {
        int count1 = 0; // Number of times in a row that first run won
        int count2 = 0; // Number of times in a row that second run won

        /*
         * Do the straightforward thing until (if ever) one run starts
         * winning consistently.
         */
        do {
            assert len1 > 1 && len2 > 0;
            if (((Comparable) a[cursor2]).compareTo(tmp[cursor1]) < 0) {
                a[dest++] = a[cursor2++];
                count2++;
                count1 = 0;
                if (--len2 == 0)
                    break outer;
            } else {
                a[dest++] = tmp[cursor1++];
                count1++;
                count2 = 0;
                if (--len1 == 1)
                    break outer;
            }
        } while ((count1 | count2) < minGallop);

        /*
         * One run is winning so consistently that galloping may be a
         * huge win. So try that, and continue galloping until (if ever)
         * neither run appears to be winning consistently anymore.
         */
        do {
            assert len1 > 1 && len2 > 0;
            count1 = gallopRight((Comparable) a[cursor2], tmp, cursor1, len1, 0);
            if (count1 != 0) {
                System.arraycopy(tmp, cursor1, a, dest, count1);
                dest += count1;
                cursor1 += count1;
                len1 -= count1;
                if (len1 <= 1)  // len1 == 1 || len1 == 0
                    break outer;
            }
            a[dest++] = a[cursor2++];
            if (--len2 == 0)
                break outer;

            count2 = gallopLeft((Comparable) tmp[cursor1], a, cursor2, len2, 0);
            if (count2 != 0) {
                System.arraycopy(a, cursor2, a, dest, count2);
                dest += count2;
                cursor2 += count2;
                len2 -= count2;
                if (len2 == 0)
                    break outer;
            }
            a[dest++] = tmp[cursor1++];
            if (--len1 == 1)
                break outer;
            minGallop--;
        } while (count1 >= MIN_GALLOP | count2 >= MIN_GALLOP);
        if (minGallop < 0)
            minGallop = 0;
        minGallop += 2;  // Penalize for leaving gallop mode
    }  // End of "outer" loop
    this.minGallop = minGallop < 1 ? 1 : minGallop;  // Write back to field

    if (len1 == 1) {
        assert len2 > 0;
        System.arraycopy(a, cursor2, a, dest, len2);
        a[dest + len2] = tmp[cursor1]; //  Last elt of run 1 to end of merge
    } else if (len1 == 0) {
        throw new IllegalArgumentException(
            "Comparison method violates its general contract!");
    } else {
        assert len2 == 0;
        assert len1 > 1;
        System.arraycopy(tmp, cursor1, a, dest, len1);
    }
}

private void mergeHi(int base1, int len1, int base2, int len2) {
    assert len1 > 0 && len2 > 0 && base1 + len1 == base2;

    // Copy second run into temp array
    Object[] a = this.a; // For performance
    Object[] tmp = ensureCapacity(len2);
    int tmpBase = this.tmpBase;
    System.arraycopy(a, base2, tmp, tmpBase, len2);

    int cursor1 = base1 + len1 - 1;  // Indexes into a
    int cursor2 = tmpBase + len2 - 1; // Indexes into tmp array
    int dest = base2 + len2 - 1;     // Indexes into a

    // Move last element of first run and deal with degenerate cases
    a[dest--] = a[cursor1--];
    if (--len1 == 0) {
        System.arraycopy(tmp, tmpBase, a, dest - (len2 - 1), len2);
        return;
    }
    if (len2 == 1) {
        dest -= len1;
        cursor1 -= len1;
        System.arraycopy(a, cursor1 + 1, a, dest + 1, len1);
        a[dest] = tmp[cursor2];
        return;
    }

    int minGallop = this.minGallop;  // Use local variable for performance
outer:
    while (true) {
        int count1 = 0; // Number of times in a row that first run won
        int count2 = 0; // Number of times in a row that second run won

        /*
         * Do the straightforward thing until (if ever) one run
         * appears to win consistently.
         */
        do {
            assert len1 > 0 && len2 > 1;
            if (((Comparable) tmp[cursor2]).compareTo(a[cursor1]) < 0) {
                a[dest--] = a[cursor1--];
                count1++;
                count2 = 0;
                if (--len1 == 0)
                    break outer;
            } else {
                a[dest--] = tmp[cursor2--];
                count2++;
                count1 = 0;
                if (--len2 == 1)
                    break outer;
            }
        } while ((count1 | count2) < minGallop);

        /*
         * One run is winning so consistently that galloping may be a
         * huge win. So try that, and continue galloping until (if ever)
         * neither run appears to be winning consistently anymore.
         */
        do {
            assert len1 > 0 && len2 > 1;
            count1 = len1 - gallopRight((Comparable) tmp[cursor2], a, base1, len1, len1 - 1);
            if (count1 != 0) {
                dest -= count1;
                cursor1 -= count1;
                len1 -= count1;
                System.arraycopy(a, cursor1 + 1, a, dest + 1, count1);
                if (len1 == 0)
                    break outer;
            }
            a[dest--] = tmp[cursor2--];
            if (--len2 == 1)
                break outer;

            count2 = len2 - gallopLeft((Comparable) a[cursor1], tmp, tmpBase, len2, len2 - 1);
            if (count2 != 0) {
                dest -= count2;
                cursor2 -= count2;
                len2 -= count2;
                System.arraycopy(tmp, cursor2 + 1, a, dest + 1, count2);
                if (len2 <= 1)
                    break outer; // len2 == 1 || len2 == 0
            }
            a[dest--] = a[cursor1--];
            if (--len1 == 0)
                break outer;
            minGallop--;
        } while (count1 >= MIN_GALLOP | count2 >= MIN_GALLOP);
        if (minGallop < 0)
            minGallop = 0;
        minGallop += 2;  // Penalize for leaving gallop mode
    }  // End of "outer" loop
    this.minGallop = minGallop < 1 ? 1 : minGallop;  // Write back to field

    if (len2 == 1) {
        assert len1 > 0;
        dest -= len1;
        cursor1 -= len1;
        System.arraycopy(a, cursor1 + 1, a, dest + 1, len1);
        a[dest] = tmp[cursor2];  // Move first elt of run2 to front of merge
    } else if (len2 == 0) {
        throw new IllegalArgumentException(
            "Comparison method violates its general contract!");
    } else {
        assert len1 == 0;
        assert len2 > 0;
        System.arraycopy(tmp, tmpBase, a, dest - (len2 - 1), len2);
    }
}
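
The galloping search itself is easier to follow without the hint bookkeeping. The sketch below is a reduced gallopRight for int[] with the hint fixed at 0, which is an assumption made here for brevity; the JDK version above also handles arbitrary hints and galloping in both directions. It probes offsets 1, 3, 7, 15, ... and then finishes with a binary search inside the last interval.

```java
// Illustrative only: gallopRight reduced to int[] with hint = 0.
class GallopSketch {
    // Returns k in [0, len] such that a[base + k - 1] <= key < a[base + k],
    // i.e. the insertion point after any elements equal to key.
    static int gallopRight(int key, int[] a, int base, int len) {
        if (key < a[base]) return 0;
        int lastOfs = 0;
        int ofs = 1;
        // Exponential phase: grow ofs as 2*ofs + 1 while a[base + ofs] <= key.
        while (ofs < len && key >= a[base + ofs]) {
            lastOfs = ofs;
            ofs = (ofs << 1) + 1;
            if (ofs <= 0) ofs = len;   // int overflow
        }
        if (ofs > len) ofs = len;

        // Binary phase: the answer lies in (lastOfs, ofs].
        lastOfs++;
        while (lastOfs < ofs) {
            int m = lastOfs + ((ofs - lastOfs) >>> 1);
            if (key < a[base + m]) ofs = m;
            else lastOfs = m + 1;      // equal elements push the result to the right
        }
        return ofs;
    }

    public static void main(String[] args) {
        int[] run = {1, 3, 3, 5, 7, 9, 11};
        System.out.println(gallopRight(0, run, 0, run.length));  // 0
        System.out.println(gallopRight(3, run, 0, run.length));  // 3
        System.out.println(gallopRight(6, run, 0, run.length));  // 4
        System.out.println(gallopRight(20, run, 0, run.length)); // 7
    }
}
```

When the answer is near the start of the range the exponential phase stops after a handful of comparisons, which is the case the merge code is betting on when it enters gallop mode.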

Now look at mergeForceCollapse, the method that force-merges all remaining runs:

/**
 * Merges all runs on the stack until only one remains.  This method is
 * called once, to complete the sort.
 */
private void mergeForceCollapse() {
    while (stackSize > 1) {
        int n = stackSize - 2;
        if (n > 0 && runLen[n - 1] < runLen[n + 1])
            n--;
        mergeAt(n);
    }
}
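
How mergeCollapse and mergeForceCollapse pick which runs to merge can be watched by tracking run lengths only. The sketch below is a simulation written for this article: RunStackSketch and its List-based stack are assumptions, and mergeAt just adds two lengths instead of merging array data. The decision logic follows the JDK 8 code shown above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: simulates the pending-run stack using run lengths alone.
class RunStackSketch {
    final List<Integer> runLen = new ArrayList<>();

    // "Merge" runs i and i+1: only the combined length is recorded.
    void mergeAt(int i) {
        runLen.set(i, runLen.get(i) + runLen.get(i + 1));
        runLen.remove(i + 1); // i + 1 is an int, so this removes by index
    }

    void pushRun(int len) {
        runLen.add(len);
        mergeCollapse();
    }

    // Same decisions as the JDK 8 mergeCollapse shown above.
    void mergeCollapse() {
        while (runLen.size() > 1) {
            int n = runLen.size() - 2;
            if (n > 0 && runLen.get(n - 1) <= runLen.get(n) + runLen.get(n + 1)) {
                if (runLen.get(n - 1) < runLen.get(n + 1)) n--;
                mergeAt(n);
            } else if (runLen.get(n) <= runLen.get(n + 1)) {
                mergeAt(n);
            } else {
                break; // invariant is established
            }
        }
    }

    void mergeForceCollapse() {
        while (runLen.size() > 1) {
            int n = runLen.size() - 2;
            if (n > 0 && runLen.get(n - 1) < runLen.get(n + 1)) n--;
            mergeAt(n);
        }
    }

    public static void main(String[] args) {
        RunStackSketch s = new RunStackSketch();
        s.pushRun(100);
        s.pushRun(40);
        s.pushRun(20);
        System.out.println(s.runLen); // [100, 40, 20]  both invariants hold, nothing merged
        s.pushRun(30);
        System.out.println(s.runLen); // [100, 90]      20+30 merged first, then 40+50
        s.mergeForceCollapse();
        System.out.println(s.runLen); // [190]
    }
}
```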

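That completes the analysis of sorting without a comparator. Sorting with a comparator works the same way; only the way elements are compared changes.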
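
A quick way to see the comparator path and the stability guarantee together is to sort on a key that produces ties. In the sketch below (StableSortDemo is a made-up name) strings are compared by first letter only, so strings sharing a letter should keep their original relative order, since Collections.sort is documented as stable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustrative only: shows that ties under the comparator keep their input order.
class StableSortDemo {
    static List<String> sortByFirstLetter(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        // Only the first character is compared, so "bob", "bill" and "ben" are ties.
        Collections.sort(copy, Comparator.comparing((String s) -> s.charAt(0)));
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("bob", "alice", "bill", "amy", "ben");
        System.out.println(sortByFirstLetter(names)); // [alice, amy, bob, bill, ben]
    }
}
```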

Summary

The overall flow of Collections.sort is shown in the figure:
image

Next, the TimSort algorithm that is used by default:
image
Two more points are worth noting here:

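  1. The mergeCollapse() method in the figure above merges runs as early as it can, to avoid merging a long ordered segment with a much shorter one, which is inefficient.
    That is, whenever the inequalities
    runLen[i - 3] > runLen[i - 2] + runLen[i - 1]
    runLen[i - 2] > runLen[i - 1]
    are not both satisfied, it merges.
  2. Merging two runs is optimized as well, mainly through the so-called gallop mode, which shortens the data that actually takes part in the merge.
    Suppose the two ordered segments to merge are X and Y. If the first m elements of X are all no greater than the first element of Y, those m elements need not take part in the merge, because they stay where they are afterwards. Likewise, if the last n elements of Y are all no smaller than the last element of X, those n elements need not take part either. This shortens the arrays being merged and also reduces how much data is copied back and forth between the array being sorted and the auxiliary array, which makes the merge more efficient.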
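
The trimming described in point 2 can be sketched with two plain scans. This is a simplification: the JDK locates the same two boundaries with gallopRight and gallopLeft rather than linear scans, and TrimBeforeMergeSketch with its trim method is a hypothetical helper that assumes both arrays are sorted and non-empty.

```java
import java.util.Arrays;

// Illustrative only: finds how much of X and Y can be skipped before merging.
class TrimBeforeMergeSketch {
    // Returns {skipX, keepY}: X[0, skipX) and Y[keepY, y.length) are already in
    // their final positions, so only X[skipX, ..) and Y[0, keepY) must be merged.
    static int[] trim(int[] x, int[] y) {
        int skipX = 0;
        while (skipX < x.length && x[skipX] <= y[0]) skipX++;         // prefix of X stays put
        int keepY = y.length;
        while (keepY > 0 && y[keepY - 1] >= x[x.length - 1]) keepY--; // suffix of Y stays put
        return new int[] {skipX, keepY};
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 3, 10, 12};
        int[] y = {4, 5, 11, 20, 30};
        // Only {10, 12} from X and {4, 5, 11} from Y actually need merging.
        System.out.println(Arrays.toString(trim(x, y))); // [3, 3]
    }
}
```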

