[Advanced C++] map and set

Original article, last modified 2025-11-13 22:36:33

License: CC 4.0 BY-SA

First published 2025-11-12 22:13:50

[Advanced C++] Understanding map and set

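1. Overview of associative containers

1.1 Sequence containers vs. associative containers

The C++ STL groups its containers into two broad categories:

Sequence containers: vector, list, deque, forward_list, and so on. They are linear sequences underneath and store the elements themselves.
Associative containers: map, set, multimap, multiset, and so on. They organize elements by key (as key-value pairs in map and multimap), which makes lookup by key more efficient than a linear scan.

1.2 Key-value pairs (pair)

A key-value pair is the basic storage unit of the map-style associative containers: first holds the key and second holds the value associated with it. A simplified version of the definition: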

template <class T1, class T2>
struct pair {
    typedef T1 first_type;
    typedef T2 second_type;
    
    T1 first;
    T2 second;
    
    pair(): first(T1()), second(T2()) {}
    pair(const T1& a, const T2& b): first(a), second(b) {}
};
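
To make the role of first and second concrete, here is a minimal sketch that uses std::pair and std::make_pair from <utility>. The helper name MakeEntry, the function DemoPair, and the sample data are illustrative assumptions, not part of the original article.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: builds a (word, count) entry as a key-value pair.
std::pair<std::string, int> MakeEntry(const std::string& word, int count) {
    return std::make_pair(word, count);
}

void DemoPair() {
    std::pair<std::string, int> entry = MakeEntry("apple", 3);
    assert(entry.first == "apple");  // first holds the key
    assert(entry.second == 3);       // second holds the associated value

    // Pairs compare lexicographically: by first, then by second.
    assert(MakeEntry("apple", 3) < MakeEntry("banana", 1));
    assert(MakeEntry("apple", 3) < MakeEntry("apple", 4));
}
```

The lexicographic comparison shown at the end is what a container falls back on when whole pairs, rather than keys alone, are used as elements.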

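2. Tree-based associative containers

The STL provides four tree-based associative containers. The standard does not mandate a particular data structure, but mainstream implementations build all four on a red-black tree:

Container	Characteristics	Element as a key-value pair	Unique keys
set	Stores values only, kept sorted	<value, value>	Yes
map	Stores key-value pairs, sorted by key	<key, value>	Yes
multiset	Stores values only, duplicates allowed	<value, value>	No
multimap	Stores key-value pairs, duplicate keys allowed	<key, value>	No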
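
The difference in the last column of the table can be checked directly. The following is a small sketch with made-up sample data; the function name DemoUniqueness is an assumption introduced for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

void DemoUniqueness() {
    const std::vector<int> values = {3, 1, 3, 2, 1};

    std::set<int> s(values.begin(), values.end());        // duplicates dropped
    std::multiset<int> ms(values.begin(), values.end());  // duplicates kept
    assert(s.size() == 3);
    assert(ms.size() == 5);

    std::map<std::string, int> m;
    m.insert({"a", 1});
    m.insert({"a", 2});  // ignored: the key "a" is already present
    assert(m.size() == 1);
    assert(m.at("a") == 1);

    std::multimap<std::string, int> mm;
    mm.insert({"a", 1});
    mm.insert({"a", 2});  // accepted: duplicate keys are allowed
    assert(mm.size() == 2);
}
```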

3. set in detail

3.1 Basic properties of set

template <class T, 
          class Compare = less<T>,
          class Alloc = allocator<T>> 
class set;

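Core properties:

The element value is itself the key and must be unique.
Elements are always const and cannot be modified in place; to change one, erase it and insert the new value.
Elements are kept sorted automatically (ascending by default).
Typically implemented as a red-black tree.
Lookup time complexity: O(log₂N).

3.2 Using set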

#include <set>
#include <iostream>
using namespace std;

void TestSet() {
    // Construct a set from an array that contains duplicates
    int array[] = {1, 3, 5, 7, 9, 2, 4, 6, 8, 0, 1, 3, 5, 7, 9, 2, 4, 6, 8, 0};
    set<int> s(array, array + sizeof(array) / sizeof(array[0]));
    
    cout << "set size: " << s.size() << endl;  // prints 10 (duplicates removed)
    
    // Forward traversal (sorted output)
    cout << "ascending: ";
    for (auto& e : s) 
        cout << e << " ";  // prints 0 1 2 3 4 5 6 7 8 9
    cout << endl;
    
    // Reverse traversal
    cout << "descending: ";
    for (auto it = s.rbegin(); it != s.rend(); ++it)
        cout << *it << " ";  // prints 9 8 7 6 5 4 3 2 1 0
    cout << endl;
    
    // Lookup
    cout << "occurrences of 3: " << s.count(3) << endl;  // prints 1
    
    // Insertion: ret.second tells whether the element was actually inserted
    auto ret = s.insert(10);
    if (ret.second) 
        cout << "inserted 10" << endl;
    
    // Erasure
    s.erase(5);
    cout << "size after erasing 5: " << s.size() << endl;  // prints 10
}

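3.3 Common set operations

Operation	Declaration	Description
Construct	`set()`	Constructs an empty set
Insert	`pair<iterator,bool> insert(const value_type& x)`	Returns an iterator to the element and whether the insertion took place
Erase	`size_type erase(const key_type& x)`	Returns the number of elements removed
Find	`iterator find(const key_type& x)`	Returns an iterator to the element, or end() if it is not found
Count	`size_type count(const key_type& x)`	Returns the number of matching elements (0 or 1)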
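
The return values listed in the table can be exercised as in the sketch below. It also uses lower_bound and a std::greater<int> comparator, which are standard set features but are not covered in the table above; ToVector and DemoSetInterfaces are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <vector>

// Hypothetical helper: copies the elements of a set in iteration order.
template <class Set>
std::vector<int> ToVector(const Set& s) {
    return std::vector<int>(s.begin(), s.end());
}

void DemoSetInterfaces() {
    std::set<int> s = {5, 1, 3};

    // insert returns {iterator, bool}; the bool is false if the key exists.
    assert(s.insert(3).second == false);
    assert(s.insert(4).second == true);

    // find returns end() when the key is absent.
    assert(s.find(2) == s.end());
    assert(*s.find(4) == 4);

    // lower_bound returns the first element not less than the argument.
    assert(*s.lower_bound(2) == 3);

    // erase by key returns the number of removed elements (0 or 1 for set).
    assert(s.erase(5) == 1);
    assert(s.erase(42) == 0);
    assert((ToVector(s) == std::vector<int>{1, 3, 4}));

    // Passing std::greater<int> as Compare yields descending order.
    std::set<int, std::greater<int>> desc(s.begin(), s.end());
    assert((ToVector(desc) == std::vector<int>{4, 3, 1}));
}
```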

4. map in detail

4.1 Basic properties of map

template <class Key, 
          class T,
          class Compare = less<Key>,
          class Alloc = allocator<pair<const Key, T>>>
class map;

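Core properties:

Stores true key-value pairs <key, value>.
Keys are unique and cannot be modified; the mapped value can.
Elements are kept sorted by key.
Supports subscript access through operator[].
Typically implemented as a red-black tree.

4.2 Using map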

#include <map>
#include <string>
#include <iostream>
using namespace std;

void TestMap() {
    // An English-to-Chinese dictionary; the mapped values are Chinese words
    map<string, string> m;
    
    // Three ways to insert
    m.insert(pair<string, string>("peach", "桃子"));     // construct the pair directly
    m.insert(make_pair("banana", "香蕉"));              // use make_pair
    m["apple"] = "苹果";                                // use operator[]
    
    cout << "map size: " << m.size() << endl;  // prints 3
    
    // Traverse the map (sorted by key)
    for (auto& e : m)
        cout << e.first << " --- " << e.second << endl;
    // prints: apple --- 苹果
    //         banana --- 香蕉  
    //         peach --- 桃子
    
    // Inserting a key that already exists fails and leaves the old value intact
    auto ret = m.insert(make_pair("peach", "桃色"));
    if (!ret.second)
        cout << "peach already exists, insertion failed" << endl;
    
    // Find and erase
    if (m.find("apple") != m.end()) {
        m.erase("apple");
        cout << "erased apple" << endl;
    }
    
    // Special behavior of operator[]: a missing key is inserted
    cout << "value for orange: " << m["orange"] << endl;  // inserts orange with an empty string
    cout << "map size now: " << m.size() << endl;         // prints 3 (apple erased, orange added)
}

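4.3 How operator[] works

operator[] behaves as if it were implemented like this:

mapped_type& operator[](const key_type& k) {
    // 1. Build the key-value pair <k, mapped_type()>
    // 2. Call insert(); if k already exists, nothing is inserted
    // 3. Return a reference to the value of the element insert() points to
    return (*((this->insert(make_pair(k, mapped_type()))).first)).second;
}

Behavior:

Key exists: returns a reference to the existing value.
Key does not exist: inserts a new key-value pair whose value is default-constructed, then returns a reference to it.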
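
Because a read through operator[] can silently grow the map, lookups that must not modify it are usually written with find() or at(). The sketch below contrasts the three; Lookup and DemoSubscript are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical helper: looks up a key without modifying the map and
// returns fallback when the key is absent.
std::string Lookup(const std::map<std::string, std::string>& m,
                   const std::string& key,
                   const std::string& fallback) {
    auto it = m.find(key);
    return it == m.end() ? fallback : it->second;
}

void DemoSubscript() {
    std::map<std::string, std::string> m;
    m["apple"] = "fruit";

    // Reading through find() leaves the map unchanged.
    assert(Lookup(m, "pear", "<none>") == "<none>");
    assert(m.size() == 1);

    // Reading through operator[] inserts a default-constructed value.
    assert(m["pear"].empty());
    assert(m.size() == 2);

    // at() throws std::out_of_range instead of inserting.
    bool threw = false;
    try {
        (void)m.at("kiwi");
    } catch (const std::out_of_range&) {
        threw = true;
    }
    assert(threw);
    assert(m.size() == 2);
}
```

Note that operator[] is not available on a const map, which is another reason read-only code tends to use find() or at().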

5. multiset and multimap

5.1 Using multiset

#include <set>
#include <iostream>
using namespace std;

void TestMultiSet() {
    int array[] = {2, 1, 3, 9, 6, 0, 5, 8, 4, 7, 2, 3, 1};
    multiset<int> ms(array, array + sizeof(array) / sizeof(array[0]));
    
    // Duplicates are kept and the elements are still sorted
    for (auto& e : ms)
        cout << e << " ";  // prints 0 1 1 2 2 3 3 4 5 6 7 8 9
    cout << endl;
    
    cout << "occurrences of 2: " << ms.count(2) << endl;  // prints 2
}

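5.2 Properties of multimap

Keys may repeat.
operator[] is not provided, because a key may map to more than one value.
The remaining interface is similar to that of map.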
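
Since operator[] is unavailable, the usual way to read all values stored under one key is equal_range. A minimal sketch follows; ValuesFor, DemoMultimap, and the sample data are assumptions made for this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects every value stored under key.
std::vector<std::string> ValuesFor(
        const std::multimap<std::string, std::string>& mm,
        const std::string& key) {
    std::vector<std::string> values;
    auto range = mm.equal_range(key);  // [first, second) spans all matches
    for (auto it = range.first; it != range.second; ++it)
        values.push_back(it->second);
    return values;
}

void DemoMultimap() {
    std::multimap<std::string, std::string> mm;
    mm.insert(std::make_pair("fruit", "apple"));
    mm.insert(std::make_pair("fruit", "banana"));
    mm.insert(std::make_pair("veg", "carrot"));

    assert(mm.count("fruit") == 2);
    // Since C++11, entries with equal keys keep their insertion order.
    assert((ValuesFor(mm, "fruit") ==
            std::vector<std::string>{"apple", "banana"}));
    assert(ValuesFor(mm, "meat").empty());
}
```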

6. Practical examples

6.1 Top K frequent words

#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> topKFrequent(vector<string>& words, int k) {
        // Count how often each word occurs
        map<string, int> freqMap;
        for (auto& word : words)
            freqMap[word]++;
        
        // Sort by frequency (multiset with a custom comparator):
        // higher frequency first, ties broken by alphabetical order
        auto cmp = [](const pair<string, int>& a, const pair<string, int>& b) {
            return a.second > b.second || 
                  (a.second == b.second && a.first < b.first);
        };
        
        multiset<pair<string, int>, decltype(cmp)> sortedWords(freqMap.begin(), freqMap.end(), cmp);
        
        // Take the first k entries
        vector<string> result;
        auto it = sortedWords.begin();
        for (int i = 0; i < k && it != sortedWords.end(); ++i, ++it)
            result.push_back(it->first);
        
        return result;
    }
};

6.2 Intersection of two arrays

#include <set>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> intersection(vector<int>& nums1, vector<int>& nums2) {
        // Building sets removes duplicates and sorts both inputs
        set<int> s1(nums1.begin(), nums1.end());
        set<int> s2(nums2.begin(), nums2.end());
        
        vector<int> result;
        auto it1 = s1.begin(), it2 = s2.begin();
        
        // Two-pointer walk over the two sorted sets
        while (it1 != s1.end() && it2 != s2.end()) {
            if (*it1 < *it2) {
                ++it1;
            } else if (*it2 < *it1) {
                ++it2;
            } else {
                // Equal elements belong to the intersection
                result.push_back(*it1);
                ++it1;
                ++it2;
            }
        }
        return result;
    }
};
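
The two-pointer walk above is essentially what std::set_intersection from <algorithm> does on sorted ranges. As an alternative sketch (not the article's own solution), the same result can be obtained as follows; IntersectionWithAlgorithm and DemoIntersection are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Alternative sketch: let std::set_intersection do the two-pointer walk.
std::vector<int> IntersectionWithAlgorithm(const std::vector<int>& nums1,
                                           const std::vector<int>& nums2) {
    // Sets give sorted, duplicate-free ranges, as set_intersection requires
    // sorted input.
    std::set<int> s1(nums1.begin(), nums1.end());
    std::set<int> s2(nums2.begin(), nums2.end());

    std::vector<int> result;
    std::set_intersection(s1.begin(), s1.end(), s2.begin(), s2.end(),
                          std::back_inserter(result));
    return result;
}

void DemoIntersection() {
    assert((IntersectionWithAlgorithm({1, 2, 2, 1}, {2, 2}) ==
            std::vector<int>{2}));
    assert((IntersectionWithAlgorithm({4, 9, 5}, {9, 4, 9, 8, 4}) ==
            std::vector<int>{4, 9}));
}
```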

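7. Underlying data structures

7.1 Why a balanced binary tree is needed

In the worst case, a plain binary search tree degenerates into a linked list:

// Worst case: inserting an already sorted sequence 1, 2, 3, 4, 5
// Resulting tree:
//     1
//      \
//       2
//        \
//         3
//          \
//           4
//            \
//             5

Lookup time then degrades from O(logN) to O(N), which is why a self-balancing binary search tree is needed.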
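
The degeneration can be observed by measuring the height of a naive, non-rebalancing binary search tree. The sketch below is an illustration only; BstNode, BstInsert, BstHeight, and HeightAfterInserting are hypothetical names, and this is not how the STL containers are implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

struct BstNode {
    int key;
    std::unique_ptr<BstNode> left;
    std::unique_ptr<BstNode> right;
    explicit BstNode(int k) : key(k) {}
};

// Plain BST insertion with no rebalancing; duplicate keys are ignored.
void BstInsert(std::unique_ptr<BstNode>& root, int key) {
    std::unique_ptr<BstNode>* cur = &root;
    while (*cur) {
        if (key == (*cur)->key) return;
        cur = key < (*cur)->key ? &(*cur)->left : &(*cur)->right;
    }
    *cur = std::unique_ptr<BstNode>(new BstNode(key));
}

// Number of nodes on the longest root-to-leaf path (empty tree = 0).
int BstHeight(const std::unique_ptr<BstNode>& node) {
    return node ? 1 + std::max(BstHeight(node->left), BstHeight(node->right))
                : 0;
}

int HeightAfterInserting(const std::vector<int>& keys) {
    std::unique_ptr<BstNode> root;
    for (int k : keys) BstInsert(root, k);
    return BstHeight(root);
}

void DemoDegeneration() {
    // Sorted input: every node hangs off the right, so height equals N.
    assert(HeightAfterInserting({1, 2, 3, 4, 5, 6, 7}) == 7);
    // The same keys in a favorable order give a perfectly balanced tree.
    assert(HeightAfterInserting({4, 2, 6, 1, 3, 5, 7}) == 3);
}
```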

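7.2 AVL trees

7.2.1 The AVL tree concept

An AVL tree is a height-balanced binary search tree:

For every node, the heights of the left and right subtrees differ by at most 1; in other words, the absolute value of the balance factor is ≤ 1.
Because the height stays proportional to logN, the search time complexity is O(logN).

7.2.2 AVL tree node definition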

template<class T>
struct AVLTreeNode {
    AVLTreeNode(const T& data)
        : _pLeft(nullptr), _pRight(nullptr), _pParent(nullptr)
        , _data(data), _bf(0) {}
    
    AVLTreeNode<T>* _pLeft;    // left child
    AVLTreeNode<T>* _pRight;   // right child
    AVLTreeNode<T>* _pParent;  // parent node
    T _data;                   // value stored in the node
    int _bf;                   // balance factor (height difference of the two subtrees)
};

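7.2.3 AVL tree rotations

When the absolute value of a node's balance factor exceeds 1, rotations restore balance:

Single right rotation (LL case)
- The new node was inserted into the left subtree of the left child
- Rotate right around the unbalanced node
Single left rotation (RR case)
- The new node was inserted into the right subtree of the right child
- Rotate left around the unbalanced node
Left-right double rotation (LR case)
- The new node was inserted into the right subtree of the left child
- Rotate left around the left child, then right around the unbalanced node
Right-left double rotation (RL case)
- The new node was inserted into the left subtree of the right child
- Rotate right around the right child, then left around the unbalanced node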
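
The rotations listed above can be sketched as pointer manipulations. For brevity this sketch uses a simplified node that stores a subtree height and has no parent pointer, which differs from the AVLTreeNode definition in 7.2.2 (parent pointer plus balance factor); AvlNode and the function names are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Simplified node: stores the subtree height instead of a balance factor.
struct AvlNode {
    int key;
    int height;  // height of the subtree rooted here (a leaf has height 1)
    AvlNode* left;
    AvlNode* right;
    explicit AvlNode(int k)
        : key(k), height(1), left(nullptr), right(nullptr) {}
};

int HeightOf(const AvlNode* n) { return n ? n->height : 0; }

void UpdateHeight(AvlNode* n) {
    n->height = 1 + std::max(HeightOf(n->left), HeightOf(n->right));
}

// LL case: the left child becomes the new root of this subtree.
AvlNode* RotateRight(AvlNode* node) {
    AvlNode* pivot = node->left;
    node->left = pivot->right;  // pivot's right subtree moves under node
    pivot->right = node;
    UpdateHeight(node);         // node is now lower, so update it first
    UpdateHeight(pivot);
    return pivot;
}

// RR case: mirror image of RotateRight.
AvlNode* RotateLeft(AvlNode* node) {
    AvlNode* pivot = node->right;
    node->right = pivot->left;
    pivot->left = node;
    UpdateHeight(node);
    UpdateHeight(pivot);
    return pivot;
}

// LR case: rotate the left child left, then rotate this node right.
AvlNode* RotateLeftRight(AvlNode* node) {
    node->left = RotateLeft(node->left);
    return RotateRight(node);
}

// RL case: rotate the right child right, then rotate this node left.
AvlNode* RotateRightLeft(AvlNode* node) {
    node->right = RotateRight(node->right);
    return RotateLeft(node);
}
```

Each function returns the new subtree root, so the caller is expected to re-attach it, for example `parent->left = RotateRight(parent->left);`.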

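7.3 Red-black trees

7.3.1 The red-black tree concept

A red-black tree is an approximately balanced binary search tree. Coloring constraints on the nodes guarantee that no root-to-leaf path is more than twice as long as any other:

Every node is either red or black.
The root is black.
A red node's children must be black (no two consecutive red nodes).
Every path from a given node to each of its descendant leaves contains the same number of black nodes.
Leaf nodes (NIL nodes) are black.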
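
These properties can be verified mechanically. The sketch below checks the color rules only (it does not check the binary-search ordering of keys) and treats null pointers as the black NIL leaves; NodeColor, CheckNode, BlackHeight, and IsValidRedBlack are hypothetical names introduced for this example.

```cpp
#include <cassert>

enum class NodeColor { Red, Black };

struct CheckNode {
    int key;
    NodeColor color;
    CheckNode* left;
    CheckNode* right;
    CheckNode(int k, NodeColor c)
        : key(k), color(c), left(nullptr), right(nullptr) {}
};

// Returns the number of black nodes on every path from node down to a NIL
// leaf (the NIL leaf itself counts as one), or -1 if a rule is violated.
int BlackHeight(const CheckNode* node) {
    if (node == nullptr) return 1;  // NIL leaves are black
    if (node->color == NodeColor::Red) {
        bool redLeft = node->left && node->left->color == NodeColor::Red;
        bool redRight = node->right && node->right->color == NodeColor::Red;
        if (redLeft || redRight) return -1;  // two consecutive red nodes
    }
    int lh = BlackHeight(node->left);
    int rh = BlackHeight(node->right);
    if (lh < 0 || rh < 0 || lh != rh) return -1;  // unequal black counts
    return lh + (node->color == NodeColor::Black ? 1 : 0);
}

bool IsValidRedBlack(const CheckNode* root) {
    if (root == nullptr) return true;                  // empty tree is valid
    if (root->color != NodeColor::Black) return false; // root must be black
    return BlackHeight(root) > 0;
}
```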

7.3.2 Red-black tree node definition

enum Color { RED, BLACK };

template<class ValueType>
struct RBTreeNode {
    RBTreeNode(const ValueType& data = ValueType(), Color color = RED)
        : _pLeft(nullptr), _pRight(nullptr), _pParent(nullptr)
        , _data(data), _color(color) {}
    
    RBTreeNode<ValueType>* _pLeft;
    RBTreeNode<ValueType>* _pRight; 
    RBTreeNode<ValueType>* _pParent;
    ValueType _data;
    Color _color;
};

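7.3.3 Rebalancing after insertion

A new node is inserted as red by default. If its parent is also red, the red-black properties are violated and the tree must be adjusted:

Case 1: the parent and the uncle are both red

Color the parent and the uncle black.
Color the grandparent red.
Continue adjusting upward with the grandparent as the current node.

Case 2: the parent is red, the uncle is black (or absent), and the current node is on the opposite side from its parent

Current node is a right child and the parent is a left child: rotate left around the parent.
Current node is a left child and the parent is a right child: rotate right around the parent.
After this rotation the configuration is that of case 3.

Case 3: the parent is red, the uncle is black (or absent), and the current node is on the same side as its parent

Recolor and rotate: color the parent black and the grandparent red, then rotate around the grandparent (right if the parent is a left child, left otherwise).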
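
The three cases can be combined into one fix-up loop. The following is a compact sketch under simplifying assumptions: keys are plain int, color is stored as a bool rather than the Color enum from 7.3.2, null pointers stand in for black NIL leaves, and nodes are never freed. FixNode, FixTree, and the function names are hypothetical and are not taken from any STL implementation.

```cpp
#include <cassert>

struct FixNode {
    int key;
    bool red;  // new nodes start out red
    FixNode* left;
    FixNode* right;
    FixNode* parent;
    explicit FixNode(int k)
        : key(k), red(true), left(nullptr), right(nullptr), parent(nullptr) {}
};

struct FixTree {
    FixNode* root = nullptr;
};

bool IsRed(const FixNode* n) { return n && n->red; }  // NIL counts as black

// Moves x down to the left; its right child takes its place.
void RotateLeft(FixTree& t, FixNode* x) {
    FixNode* y = x->right;
    x->right = y->left;
    if (y->left) y->left->parent = x;
    y->parent = x->parent;
    if (!x->parent) t.root = y;
    else if (x == x->parent->left) x->parent->left = y;
    else x->parent->right = y;
    y->left = x;
    x->parent = y;
}

// Mirror image of RotateLeft.
void RotateRight(FixTree& t, FixNode* x) {
    FixNode* y = x->left;
    x->left = y->right;
    if (y->right) y->right->parent = x;
    y->parent = x->parent;
    if (!x->parent) t.root = y;
    else if (x == x->parent->left) x->parent->left = y;
    else x->parent->right = y;
    y->right = x;
    x->parent = y;
}

void FixAfterInsert(FixTree& t, FixNode* cur) {
    while (IsRed(cur->parent)) {
        FixNode* parent = cur->parent;
        FixNode* grand = parent->parent;  // exists: a red node is never root
        bool parentIsLeft = (parent == grand->left);
        FixNode* uncle = parentIsLeft ? grand->right : grand->left;
        if (IsRed(uncle)) {               // case 1: recolor and move up
            parent->red = false;
            uncle->red = false;
            grand->red = true;
            cur = grand;
            continue;
        }
        if (parentIsLeft) {
            if (cur == parent->right) {   // case 2: straighten the zig-zag
                RotateLeft(t, parent);
                parent = cur;             // cur is now above the old parent
            }
            RotateRight(t, grand);        // case 3
        } else {
            if (cur == parent->left) {    // case 2 (mirror)
                RotateRight(t, parent);
                parent = cur;
            }
            RotateLeft(t, grand);         // case 3 (mirror)
        }
        parent->red = false;              // case 3 recoloring
        grand->red = true;
        break;
    }
    t.root->red = false;                  // the root is always black
}

// Returns false if the key already exists. Nodes are intentionally never
// freed in this sketch.
bool Insert(FixTree& t, int key) {
    FixNode* parent = nullptr;
    FixNode* cur = t.root;
    while (cur) {
        if (key == cur->key) return false;
        parent = cur;
        cur = key < cur->key ? cur->left : cur->right;
    }
    FixNode* node = new FixNode(key);
    node->parent = parent;
    if (!parent) t.root = node;
    else if (key < parent->key) parent->left = node;
    else parent->right = node;
    FixAfterInsert(t, node);
    return true;
}
```

Inserting the keys 1 through 7 in ascending order, which would turn a plain binary search tree into a list, leaves this tree with root 2 and a height of 4.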

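7.3.4 Red-black tree vs. AVL tree

Property	AVL tree	Red-black tree
Balance criterion	Strictly balanced	Approximately balanced
Lookup	O(logN)	O(logN)
Insert/erase	More rotations	Fewer rotations
Best suited for	Many lookups, few modifications	Good all-round performance
Use in the STL	None	map, set, and related containers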

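8. Summary

8.1 Choosing a container

Requirement	Recommended container	Reason
Sorted storage	set/map	Kept sorted automatically
Duplicate elements allowed	multiset/multimap	Support duplicate keys
Frequent lookups	Any tree-based container	O(logN) lookup
Frequent inserts and erases	Red-black tree containers	Relatively low rebalancing cost
Membership test only	set	Simple and efficient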