[Illustrated] Understand Heap Sort Once and for All with 9 Diagrams


Author | CoderJed

Editor | 阿秃


1. Illustrated Process

Properties of a max-heap:

  1. The number at the top of the heap is always the maximum of all elements.

  2. The root of any subtree is always the maximum element of that subtree.

  3. The left and right children of a node have no fixed order relative to each other.

Relationship between a max-heap and an array: a computer has no built-in notion of a heap or a tree; these have to be implemented on top of basic data structures. The rules for representing a max-heap with an array are as follows:

  1. Index 0 of the array holds the top (root) element of the heap.

  2. The left and right children of the element at index i are at indices 2 * i + 1 and 2 * i + 2.

  3. The parent of the element at index i is at index (i - 1) / 2 (integer division).
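The index rules above can be verified with a tiny snippet (a sketch; the class name `HeapIndexDemo` is made up for illustration):

```java
public class HeapIndexDemo {
    public static void main(String[] args) {
        int i = 1;                   // any node index
        int left = 2 * i + 1;        // left child of node 1 -> index 3
        int right = 2 * i + 2;       // right child of node 1 -> index 4
        int parent = (left - 1) / 2; // parent of the left child -> back to 1
        System.out.println(left + " " + right + " " + parent);
    }
}
```

Note that Java's integer division rounds toward zero, which is why `(i - 1) / 2` works as the parent formula for every index, including the children of the root.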


(1) Overall Heap Sort Flow

  1. First, build the N numbers in the array into a max-heap of size N.

  2. Then swap the number at the top of the heap with the last number in the heap.

  3. The last element of the array is now the maximum value.

  4. Remove that last element from the heap, and adjust the remaining elements back into a max-heap.

  5. Swap the top element with the new last element of the heap.

  6. The second-to-last element of the array is now the second-largest value.

  7. Repeat this process; when the heap size reaches 1, the array is sorted.


(2) Heapification

Turning an array into a max-heap is called heapification. The process is as follows:

  1. Start from the tree structure that corresponds to the original array.

  2. Traverse the elements from the first one onward; whenever an element is larger than its parent, swap it with its parent.
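The sift-up step just described can be sketched as a minimal standalone demo (the names `siftUp` and `SiftUpDemo` are illustrative; the logic matches the `heapInsert` method shown in section 3):

```java
public class SiftUpDemo {
    // Repeatedly swap a node with its parent while it is larger,
    // moving it up toward the root.
    static void siftUp(int[] a, int i) {
        while (a[i] > a[(i - 1) / 2]) {
            int p = (i - 1) / 2;
            int t = a[i]; a[i] = a[p]; a[p] = t;
            i = p;
        }
    }

    public static void main(String[] args) {
        int[] a = {3, 5, 1, 9};
        // Sift each element up as if inserting it into a growing heap.
        for (int i = 0; i < a.length; i++) siftUp(a, i);
        System.out.println(a[0]); // the maximum, 9, is now at the root
    }
}
```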


2. Demonstration

3. Java Implementation

public static void heapSort(int[] arr) {
    if (arr == null || arr.length < 2) {
        return;
    }
    // Build a max-heap by sifting each element up as it is "inserted"
    for (int i = 0; i < arr.length; i++) {
        heapInsert(arr, i);
    }
    int size = arr.length;
    // Move the current maximum to the end, then shrink the heap
    swap(arr, 0, --size);
    while (size > 0) {
        // Restore the max-heap property among the remaining elements
        heapify(arr, 0, size);
        swap(arr, 0, --size);
    }
}

/**
 * Sift the element at {@code index} up while it is larger than its parent.
 * At the root, (0 - 1) / 2 evaluates to 0 in Java, so the loop terminates.
 */
public static void heapInsert(int[] arr, int index) {
    while (arr[index] > arr[(index - 1) / 2]) {
        swap(arr, index, (index - 1) / 2);
        index = (index - 1) / 2;
    }
}

/**
 * Sift the element at {@code index} down until the subtree rooted there
 * satisfies the max-heap property, treating only the first {@code size}
 * elements of the array as part of the heap.
 */
public static void heapify(int[] arr, int index, int size) {
    int left = index * 2 + 1;
    while (left < size) {
        // Pick the larger of the two children, if a right child exists
        int largest = left + 1 < size && arr[left + 1] > arr[left] ? left + 1 : left;
        largest = arr[largest] > arr[index] ? largest : index;
        if (largest == index) {
            break;
        }
        swap(arr, largest, index);
        index = largest;
        left = index * 2 + 1;
    }
}

/** Swap the elements at indices i and j. */
public static void swap(int[] arr, int i, int j) {
    int tmp = arr[i];
    arr[i] = arr[j];
    arr[j] = tmp;
}
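The methods above can be exercised with a small driver. A self-contained version, assuming the methods live together in one class (the class name `HeapSortDemo` is made up here):

```java
import java.util.Arrays;

public class HeapSortDemo {

    public static void heapSort(int[] arr) {
        if (arr == null || arr.length < 2) {
            return;
        }
        // Build the max-heap, then repeatedly move the max to the end
        for (int i = 0; i < arr.length; i++) {
            heapInsert(arr, i);
        }
        int size = arr.length;
        swap(arr, 0, --size);
        while (size > 0) {
            heapify(arr, 0, size);
            swap(arr, 0, --size);
        }
    }

    public static void heapInsert(int[] arr, int index) {
        while (arr[index] > arr[(index - 1) / 2]) {
            swap(arr, index, (index - 1) / 2);
            index = (index - 1) / 2;
        }
    }

    public static void heapify(int[] arr, int index, int size) {
        int left = index * 2 + 1;
        while (left < size) {
            int largest = left + 1 < size && arr[left + 1] > arr[left] ? left + 1 : left;
            largest = arr[largest] > arr[index] ? largest : index;
            if (largest == index) {
                break;
            }
            swap(arr, largest, index);
            index = largest;
            left = index * 2 + 1;
        }
    }

    public static void swap(int[] arr, int i, int j) {
        int tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }

    public static void main(String[] args) {
        int[] arr = {4, 6, 8, 5, 9, 1, 2, 7, 3};
        heapSort(arr);
        System.out.println(Arrays.toString(arr));
        // prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```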

4. Complexity

  • Time complexity: O(n log n)

  • Space complexity: O(1); only one extra slot is needed for swapping elements

  • Stability: heap sort cannot guarantee that equal elements keep their relative order, so it is an unstable sorting algorithm

Original link: jianshu.com/p/3e1d4ed98565




