Implementing a Disjoint-Set (Union-Find) Structure (fy2462, 优快云 blog)

Article link: https://blog.youkuaiyun.com/fy2462/article/details/31765909

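Union-find is a tree-based data structure for merging and querying disjoint sets. Its basic operations are merging two disjoint sets and testing whether two elements belong to the same set. Path compression and union by rank make queries faster. The article provides a source-code implementation and notes that it requires the Boost library.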


1. Overview

A disjoint-set structure (also called a union-find set) is a tree-based data structure commonly used to merge and query disjoint sets.

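2. Basic operations

Union-find is a very simple data structure. It exists mainly to support the following two frequent operations:

A. Merge two disjoint sets

B. Determine whether two elements belong to the same set (the more frequent of the two)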

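(1) Merging two disjoint sets

Merging is simple. First set up an array Father[x] that stores the index of x's "parent". To merge two disjoint sets, find the root of one set (its most distant ancestor) and make the root of the other set point to it as its parent.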

[Figure: (a) two disjoint sets; (b) the result after merging, Father(b) := Father(g).]
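The merge step described above can be sketched as follows. This is a minimal illustration, not the article's own code: the helper names `make_sets`, `find_root`, and `merge_sets` are hypothetical, and `Father` is held in a `std::vector` instead of a global array.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: every element starts as the root of its own set.
std::vector<int> make_sets(int n) {
    std::vector<int> Father(n);
    for (int i = 0; i < n; ++i) Father[i] = i;
    return Father;
}

// Follow parent links until reaching an element that is its own parent.
int find_root(const std::vector<int>& Father, int x) {
    assert(x >= 0 && x < static_cast<int>(Father.size()));
    while (Father[x] != x) x = Father[x];
    return x;
}

// Merge the sets containing a and b by pointing one root at the other.
void merge_sets(std::vector<int>& Father, int a, int b) {
    int ra = find_root(Father, a);
    int rb = find_root(Father, b);
    if (ra != rb) Father[rb] = ra;  // rb's whole tree now hangs under ra
}
```

Only the two roots are touched by a merge; every other element keeps its parent link.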

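(2) Determining whether two elements belong to the same set

This reduces to checking whether the two elements have the same root (most distant ancestor). It can be implemented recursively.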
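A recursive version of that check might look like the sketch below. The names `find_set` and `same_set` are assumptions made for illustration, and the parent array is passed in explicitly.

```cpp
#include <cassert>
#include <vector>

// Recursive search for the most distant ancestor of x.
int find_set(const std::vector<int>& Father, int x) {
    assert(x >= 0 && x < static_cast<int>(Father.size()));
    if (Father[x] == x) return x;        // x is a root
    return find_set(Father, Father[x]);  // otherwise ask the parent
}

// Two elements are in the same set exactly when their roots coincide.
bool same_set(const std::vector<int>& Father, int a, int b) {
    return find_set(Father, a) == find_set(Father, b);
}
```

With `Father = {0, 0, 1, 3, 3}`, elements 0, 1, 2 share root 0 and elements 3, 4 share root 3.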

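3. Optimizations

(1) Path compression

When looking up an ancestor we usually search recursively, but when there are many elements, or when the whole tree degenerates into a chain, every Find_Set(x) can cost O(n). To avoid this we compress the path: after the recursive descent has found the root, the unwinding step points every node on the search path directly at the root. A later Find_Set(x) on any of those nodes then takes O(1), as long as that root has not since been merged under another one (see the figure in the original post). Path compression thus speeds up later lookups.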
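A small sketch of path compression on a worst-case chain, assuming a hypothetical function name `find_compress` and an explicit `Father` vector:

```cpp
#include <cassert>
#include <vector>

// Find the root of x; while the recursion unwinds, point every node on the
// search path directly at that root.
int find_compress(std::vector<int>& Father, int x) {
    assert(x >= 0 && x < static_cast<int>(Father.size()));
    if (Father[x] != x) {
        Father[x] = find_compress(Father, Father[x]);
    }
    return Father[x];
}
```

Starting from the chain `Father = {0, 0, 1, 2, 3}` (4 -> 3 -> 2 -> 1 -> 0), one call `find_compress(Father, 4)` leaves every entry equal to 0, so later lookups on these nodes follow a single link.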

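(2) Union by rank

When merging, attach the set with fewer elements to the set with more elements, so that the resulting tree stays relatively shallow. Strictly speaking, comparing element counts is union by size, while union by rank compares an upper bound on tree height; both keep the trees shallow.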
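The rule as stated (smaller set under larger set) is union by size, and a minimal sketch of it could look like this. The struct name `SizedSets` and its members are hypothetical; path compression is left out so the effect of the size rule is visible on its own.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct SizedSets {
    std::vector<int> parent;  // parent[x] == x marks a root
    std::vector<int> count;   // count[r] = number of elements under root r

    explicit SizedSets(int n) : parent(n), count(n, 1) {
        for (int i = 0; i < n; ++i) parent[i] = i;
    }

    int find(int x) const {
        assert(x >= 0 && x < static_cast<int>(parent.size()));
        while (parent[x] != x) x = parent[x];
        return x;
    }

    // Returns false when a and b were already in the same set.
    bool unite(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra == rb) return false;
        if (count[ra] < count[rb]) std::swap(ra, rb);  // ra is now the larger set
        parent[rb] = ra;                               // smaller tree goes under larger
        count[ra] += count[rb];
        return true;
    }
};
```

Because a node's depth grows only when its set is absorbed by one at least as large, the set it belongs to at least doubles each time, which bounds tree height by about log2(n).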

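4. Source code

The code is shown below; the original post also offers it as a download.

The program reads operations from a data file, builds the union-find structure, and writes the results to an output file. It uses the Boost library, so the corresponding Boost libraries must be available to build and run it.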

#include <cstdlib>   // atoi, abort
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

#include "boost/algorithm/string.hpp"
#include "boost/regex.hpp"

using namespace std;
using namespace boost;

// Valid IDs are 0 .. MAX_ID - 1. With a 4-byte int this static array
// occupies roughly 800 MB.
const int MAX_ID = 200000001;

int father[MAX_ID];

void init_father()
{
    for (int i = 0; i < MAX_ID; i++)
    {
        father[i] = i;
    }
}

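// Returns the root of id's set and compresses the path along the way.
// count is incremented once per node visited, so the caller gets the
// length of the search path as a rough depth estimate.
int find_ancestry(int id, int& count)
{
    count++;
    if (id != father[id])
    {
        father[id] = find_ancestry(father[id], count);
    }
    return father[id];
}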

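int process(string instr, int id_x, int id_y, fstream &out_file)
{
    // Empty instruction: nothing to do
    if (instr == "")
    {
        return -1;
    }
    // Find both roots; deep_x / deep_y record how many nodes each search visited
    int deep_x = 0;
    int deep_y = 0;
    int temp_y = find_ancestry(id_y, deep_y);
    int temp_x = find_ancestry(id_x, deep_x);

    if (temp_x != temp_y)
    {
        // Different roots
        if (instr == "Q")
        {
            out_file << "N\n";
        }
        else // "P": merge the two sets
        {
            // Heuristic: attach the root reached by the shorter search
            // under the root reached by the longer one
            if (deep_x >= deep_y)
            {
                father[temp_y] = temp_x; // y -> x
            }
            else
            {
                father[temp_x] = temp_y; // x -> y
            }
        }
    }
    else
    {
        // Same root
        if (instr == "Q")
        {
            out_file << "Y\n";
        }
    }
    return 0;
}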

// read == 1 opens the file for reading, read == 0 opens it for writing.
// Returning an fstream by value requires C++11 (movable streams).
fstream FOpen(string file_name, int read)
{
    if (file_name.empty() || (read != 0 && read != 1))
    {
        cout << "can't get the file name." << endl;
        abort(); // no valid stream to return, so stop here
    }

    fstream File;
    if (read)
    {
        File.open(file_name);
    }
    else
    {
        File.open(file_name, ios::out);
    }

    if (!File)
    {
        cout << file_name << " cannot be opened." << endl;
        abort();
    }
    return File;
}

int checkError(const string &line_temp)
{
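    // A valid line is P or Q followed by two integers, each preceded by
    // exactly one whitespace character
    regex expr("(P|Q)(\\s\\d+){2}$");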
    cmatch what;
    if(!regex_match(line_temp.c_str(), what, expr))
    {
        return -1;
    }
    return 0;
}

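int read_line(fstream &cusu_file, string &line_temp)
{
    // Test the result of getline itself. Checking eof() beforehand lets one
    // failed read through when the file ends with a newline, and the resulting
    // empty line would then be reported as invalid input.
    if (!getline(cusu_file, line_temp))
    {
        return -1;
    }
    return 0;
}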

int main()
{
    init_father();
    vector<string> pro_data;
    string temp_data = "";
    fstream in_file = FOpen("division.in", 1);
    fstream out_file = FOpen("division.out", 0);
    while (!read_line(in_file, temp_data))
    {
        if (!checkError(temp_data))
        {
            // checkError accepts any single whitespace character as a separator,
            // so split on whitespace rather than on ' ' only
            boost::algorithm::split(pro_data, temp_data, boost::algorithm::is_space());
            if (pro_data.size() == 3)
            {
                // NOTE: both IDs are assumed to be smaller than MAX_ID;
                // larger values would index past the end of father[]
                process(pro_data[0], atoi(pro_data[1].c_str()), atoi(pro_data[2].c_str()), out_file);
            }
        }
        else
        {
            // Malformed line: report it and stop
            out_file << "SORRY\n";
            break;
        }
    }
    out_file.close();
    in_file.close();
    return 0;
}
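
To try the same command loop without Boost or the large static array, something like the following sketch could be used. It is not the article's program: `SparseSets` and `run_commands` are hypothetical names, `std::regex` replaces `boost::regex`, parents are kept in a hash map, and the depth heuristic is omitted. The line format (P or Q followed by two integers) is inferred from the regular expression in `checkError`.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <regex>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical sparse variant: an ID that has never been merged is its own root.
class SparseSets {
public:
    int find(int x) {
        auto it = parent_.find(x);
        if (it == parent_.end() || it->second == x) return x;
        int root = find(it->second);
        parent_[x] = root;  // path compression
        return root;
    }
    void unite(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra != rb) parent_[rb] = ra;
    }
private:
    std::unordered_map<int, int> parent_;
};

// Reads "P x y" (merge) and "Q x y" (query) lines; writes Y or N for each
// query and SORRY followed by a stop on the first malformed line.
// std::stoi throws for numbers that do not fit in an int; this sketch does
// not handle that case.
void run_commands(std::istream& in, std::ostream& out) {
    static const std::regex line_re(R"(([PQ])\s(\d+)\s(\d+))");
    SparseSets sets;
    std::string line;
    while (std::getline(in, line)) {
        std::smatch m;
        if (!std::regex_match(line, m, line_re)) {
            out << "SORRY\n";
            break;
        }
        int x = std::stoi(m[2].str());
        int y = std::stoi(m[3].str());
        if (m[1].str() == "Q") {
            out << (sets.find(x) == sets.find(y) ? "Y\n" : "N\n");
        } else {
            sets.unite(x, y);
        }
    }
}
```

Passing a `std::istringstream` and a `std::ostringstream` makes it possible to exercise the loop on in-memory text before pointing it at real files.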

Data file: