CMU 15445 Project 0 2022

Latest recommended article published on 2025-05-29 22:11:44

Licensed under CC 4.0 BY-SA

Tags: #database #c++ #data-structures

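Preface

CMU 15445 is CMU's hardcore course on database internals. Unlike traditional database courses in China, which mostly teach SQL insert, delete, query, and update statements, it covers how a database system is built, which is a great help for understanding databases more deeply and for developing one.

Prerequisites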

C++11
Algorithms and data structures
Your patience and enthusiasm

Project overview

Official project page

https://15445.courses.cs.cmu.edu/fall2022/project0/

The project uses a trie (prefix tree), a classic data structure for key-value storage.

Three core classes are involved:

TrieNode: a single node in a Trie. A TrieNode holds one character of a key, and its is_end_ flag indicates whether it marks the end of a key string.
TrieNodeWithValue: inherits from TrieNode and represents a terminal node (a node whose key_char is the ending character of a key) that can hold a value of arbitrary type T. Its is_end_ flag is always true.
Trie: the whole trie used for storage, with insertion, removal, and lookup.
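
To make the relationship between the node classes concrete, here is a minimal sketch. The class and member names follow the descriptions above, but the bodies are simplified assumptions for illustration and are not the starter code (for instance, GetValue returning a const reference is my choice here).

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>

// Simplified stand-in for the project's TrieNode (assumed layout).
class TrieNode {
 public:
  explicit TrieNode(char key_char) : key_char_(key_char) {}
  virtual ~TrieNode() = default;

  bool IsEndNode() const { return is_end_; }
  char GetKeyChar() const { return key_char_; }

 protected:
  char key_char_;
  bool is_end_{false};
  // Each child is owned by its parent through a unique_ptr.
  std::unordered_map<char, std::unique_ptr<TrieNode>> children_;
};

// A terminal node that additionally stores a value of type T.
template <typename T>
class TrieNodeWithValue : public TrieNode {
 public:
  TrieNodeWithValue(char key_char, T value) : TrieNode(key_char), value_(std::move(value)) {
    is_end_ = true;  // a node with a value always ends a key
  }
  const T &GetValue() const { return value_; }

 private:
  T value_;
};
```

A plain TrieNode only carries structure; the value lives in the derived class, which is why lookups later need a runtime type check.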

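TrieNode

Fill in the bodies of the corresponding functions.

Constructor

What are explicit and implicit conversions in C++, and why does the code insist on explicit conversion only? A constructor that can be called with a single argument also serves as an implicit conversion from that argument's type; declaring it explicit disables the implicit use, so a char is never silently turned into a TrieNode.

C++ also has explicit cast operators such as reinterpret_cast, which reinterprets the underlying bit pattern. It can convert a pointer or reference to a pointer or reference of a different type, or an integer type to a pointer type and vice versa.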
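
A small sketch of what explicit changes. ImplicitNode and ExplicitNode are hypothetical types invented for this example; they are not part of the project.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical node whose constructor allows implicit conversion from char.
struct ImplicitNode {
  ImplicitNode(char c) : key_char(c) {}  // intentionally not explicit
  char key_char;
};

// Hypothetical node whose constructor is explicit.
struct ExplicitNode {
  explicit ExplicitNode(char c) : key_char(c) {}
  char key_char;
};

// std::is_convertible asks whether an implicit conversion exists.
static_assert(std::is_convertible<char, ImplicitNode>::value, "char converts implicitly");
static_assert(!std::is_convertible<char, ExplicitNode>::value, "explicit blocks implicit conversion");
// Direct construction is still allowed for the explicit type.
static_assert(std::is_constructible<ExplicitNode, char>::value, "direct initialization works");

char KeyOfImplicit(const ImplicitNode &node) { return node.key_char; }
char KeyOfExplicit(const ExplicitNode &node) { return node.key_char; }
```

KeyOfImplicit('a') compiles because the char is converted silently, while KeyOfExplicit('a') would be rejected; the caller has to write KeyOfExplicit(ExplicitNode('a')).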

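Move constructor

There are two ways to implement the transfer: swap and std::move. What is the difference between them?

1: In this case, this->children_ = std::move(other_trie_node.children_) and this->children_.swap(other_trie_node.children_) have the same observable effect, because inside a constructor this->children_ starts out empty. Both leave the current object's children_ holding the data that used to belong to other_trie_node.

std::move casts other_trie_node.children_ to an rvalue reference, and the move assignment operator then transfers its contents into this->children_. Afterwards this->children_ owns the data, and other_trie_node.children_ is left in a valid but unspecified state (empty in practice).

2: this->children_.swap(other_trie_node.children_), on the other hand, calls the swap member of std::unordered_map to exchange the contents of the two maps. this->children_ ends up with other_trie_node's old data, and other_trie_node.children_ ends up with whatever this->children_ held before (nothing, in a freshly constructed object).

Summary: in this particular case the two approaches give the same result. They differ in general: move assignment destroys the destination's old elements, whereas swap hands them over to the source. Choose according to whether the source should receive the old contents.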
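
The difference becomes visible once the destination is not empty. A minimal sketch, using a map of strings as a stand-in for children_ (the alias Children and both helper functions are made up for this example):

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

using Children = std::unordered_map<char, std::string>;

// Move assignment: dst takes over src's elements; dst's old elements are destroyed.
void TakeByMove(Children &dst, Children &src) { dst = std::move(src); }

// swap: the two maps exchange their contents; nothing is destroyed.
void TakeBySwap(Children &dst, Children &src) { dst.swap(src); }
```

After TakeBySwap the source still holds the destination's former elements; after TakeByMove those elements are gone and the source should simply not be relied on.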

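Understanding std::move, in contrast to ownership transfer in Rust

std::move does not move any data by itself; it only marks the object as movable by casting it to an rvalue reference. This tells the compiler that we want move semantics, not copy semantics, for that object.
The cast changes only the value category of the expression, from lvalue to rvalue; the object itself is untouched. As a result, overload resolution can pick a move operation instead of a copy.
In the code above, this->children_ = std::move(other_trie_node.children_) casts other_trie_node.children_ to an rvalue reference and assigns it to this->children_, so the map's move assignment operator runs instead of a copy, which is cheaper.
Note again that std::move performs no data movement. The actual transfer happens inside the move constructor or move assignment operator, and how it is done depends on the type and its implementation.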
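
A short sketch showing that the cast alone does nothing. The function names CastOnly and MoveConstruct are invented for this example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// std::move applied to a std::string lvalue yields std::string&&: it is only a cast.
static_assert(
    std::is_same<decltype(std::move(std::declval<std::string &>())), std::string &&>::value,
    "std::move is a cast to an rvalue reference");

// Binding the result to a reference runs no constructor and no assignment,
// so the source keeps its contents.
std::string CastOnly() {
  std::string source = "trie";
  std::string &&ref = std::move(source);
  (void)ref;      // silence the unused-variable warning
  return source;  // still holds "trie"
}

// Here the move constructor of std::string does the real transfer.
std::string MoveConstruct() {
  std::string source = "trie";
  std::string target(std::move(source));
  return target;
}
```

This is where C++ differs from Rust: the compiler does not stop us from touching source after std::move, so it is up to the programmer not to rely on a moved-from object.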

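Why is std::move still needed when the parameter is already declared with &&?
Seeing && (the marker of an rvalue reference), we instinctively assume an rvalue, but that is not the case. A function call whose return type is && is an rvalue, whereas a named variable or parameter declared with && is an lvalue. So other_trie_node is still an lvalue when it is used inside the function, and std::move is required to treat it as an rvalue again.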
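
Overload resolution makes this easy to observe. In the sketch below, Category, WithoutMove, and WithMove are hypothetical helpers written for this example:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Two overloads that report which value category they received.
const char *Category(const std::string &) { return "lvalue"; }
const char *Category(std::string &&) { return "rvalue"; }

// param has type std::string&&, but the expression `param` has a name,
// so it is an lvalue and the const& overload is chosen.
std::string WithoutMove(std::string &&param) { return Category(param); }

// std::move casts it back to an rvalue, so the && overload is chosen.
std::string WithMove(std::string &&param) { return Category(std::move(param)); }
```

The same rule applies to other_trie_node inside the move constructor: without std::move, its members would be copied (or the code would fail to compile for move-only members such as a map of unique_ptr).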

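Simple functions (nothing to highlight)

InsertChildNode function

The function returns a pointer to the unique_ptr stored in the node, so the child can be accessed through that pointer without transferring ownership.
Related: std::move and std::forward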
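
A minimal sketch of how such a function might look. The signature matches the one quoted later in the Insert code; the rejection rules (duplicate key character, mismatching key character, null child) are my reading of the handout and should be treated as assumptions, as should the simplified class around it.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>

// Simplified stand-in for the project's TrieNode.
class TrieNode {
 public:
  explicit TrieNode(char key_char) : key_char_(key_char) {}
  char GetKeyChar() const { return key_char_; }

  // Returns a pointer to the owning unique_ptr inside children_, or nullptr
  // if the insert is rejected. Ownership never leaves the map.
  std::unique_ptr<TrieNode> *InsertChildNode(char key_char, std::unique_ptr<TrieNode> &&child) {
    if (child == nullptr || child->GetKeyChar() != key_char || children_.count(key_char) > 0) {
      return nullptr;
    }
    children_[key_char] = std::move(child);  // child is a named parameter, hence std::move
    return &children_[key_char];
  }

  std::unique_ptr<TrieNode> *GetChildNode(char key_char) {
    auto it = children_.find(key_char);
    return it == children_.end() ? nullptr : &it->second;
  }

 private:
  char key_char_;
  std::unordered_map<char, std::unique_ptr<TrieNode>> children_;
};
```

Returning a pointer to the unique_ptr (instead of the unique_ptr itself) lets the caller later replace the node in place with reset, which is exactly what Insert needs when it upgrades a TrieNode to a TrieNodeWithValue.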

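TrieNodeWithValue

It inherits from TrieNode. It is a template because the value that a key maps to at its terminal node can be of any type.

TrieNodeWithValue(TrieNode &&trieNode, T value)

Since C++11, constructing an object from an rvalue selects the move constructor (if one exists) rather than the copy constructor. Inside this constructor, however, trieNode is a named parameter and therefore an lvalue, so it has to be passed to the base class as std::move(trieNode) for TrieNode's move constructor to be chosen.

Another way to write it is with std::forward (perfect forwarding). Because TrieNode is a concrete type here, std::forward<TrieNode>(trieNode) yields the same rvalue as std::move(trieNode).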
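
A sketch of this constructor under simplifying assumptions: the TrieNode below is a cut-down stand-in, and AddChild and HasChild are helpers invented only so the example can show that the children survive the conversion.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

class TrieNode {
 public:
  explicit TrieNode(char key_char) : key_char_(key_char) {}
  TrieNode(TrieNode &&other) noexcept
      : key_char_(other.key_char_), is_end_(other.is_end_), children_(std::move(other.children_)) {}
  virtual ~TrieNode() = default;

  void AddChild(char c) { children_[c] = std::make_unique<TrieNode>(c); }  // example helper
  bool HasChild(char c) const { return children_.count(c) > 0; }           // example helper
  bool IsEndNode() const { return is_end_; }

 protected:
  char key_char_;
  bool is_end_{false};
  std::unordered_map<char, std::unique_ptr<TrieNode>> children_;
};

template <typename T>
class TrieNodeWithValue : public TrieNode {
 public:
  // trieNode is an lvalue inside the constructor; std::move turns it back
  // into an rvalue so that TrieNode's move constructor is selected.
  TrieNodeWithValue(TrieNode &&trieNode, T value)
      : TrieNode(std::move(trieNode)), value_(std::move(value)) {
    is_end_ = true;
  }
  const T &GetValue() const { return value_; }

 private:
  T value_;
};
```

Since the map holds unique_ptr values, TrieNode cannot be copied at all; forgetting std::move here is a compile error rather than a silent copy.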

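How to understand move constructors and copy constructors
Since C++11, when the argument passed to a constructor is an rvalue, overload resolution selects the move constructor if one exists; otherwise it falls back to the copy constructor.

Copy constructor
A copy constructor creates a new object with the same state as a given object. Its usual form is ClassName(const ClassName &other), where other is another object of the same class. It copies other's contents into the new object and leaves other unchanged; a class that owns raw resources has to implement the deep copy itself.
Move constructor
A move constructor creates a new object by taking over ("moving", not copying) the resources of a given object, such as dynamically allocated memory or other owned resources. Its usual form is ClassName(ClassName &&other), where other is an rvalue reference to an object of the same class. Transferring ownership instead of copying is what makes it cheaper.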
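
To see which constructor is picked, we can count the calls. Buffer is a hypothetical class written for this example; the two static counters exist only for the demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class Buffer {
 public:
  explicit Buffer(std::size_t n) : data_(n, 0) {}

  // Copy constructor: duplicates the elements, other is unchanged.
  Buffer(const Buffer &other) : data_(other.data_) { ++copies; }

  // Move constructor: steals the vector's storage from other.
  Buffer(Buffer &&other) noexcept : data_(std::move(other.data_)) { ++moves; }

  std::size_t Size() const { return data_.size(); }

  static int copies;  // number of copy constructions so far
  static int moves;   // number of move constructions so far

 private:
  std::vector<int> data_;
};

int Buffer::copies = 0;
int Buffer::moves = 0;
```

Buffer b(a) copies because a is an lvalue; Buffer c(std::move(a)) moves because the argument is an rvalue.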

Trie

Trie is a concurrent key-value store. Each key is a string and its corresponding value can be any type.
Constructor
reset re-points the smart pointer: it takes ownership of the new object and destroys the one it previously owned.
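
A tiny sketch of what reset does to the old object. Counted and AliveAfterReset are hypothetical names used only here; the counter tracks how many objects are currently alive.

```cpp
#include <cassert>
#include <memory>

// Increments a counter on construction and decrements it on destruction.
struct Counted {
  explicit Counted(int *alive) : alive_(alive) { ++*alive_; }
  ~Counted() { --*alive_; }
  int *alive_;
};

int AliveAfterReset() {
  int alive = 0;
  std::unique_ptr<Counted> ptr(new Counted(&alive));  // one object alive
  ptr.reset(new Counted(&alive));  // the new object is adopted, the old one is destroyed
  return alive;                    // exactly one object is alive at this point
}
```

The same mechanism is what allows a node slot in the trie to be swapped for a different node type without leaking the old node.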

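Insert function

Requirements:
1: If the key is an empty string, return false immediately.
2: If the key already exists, the insert fails; the stored value must not be modified.
3: If the key does not exist, insert it and construct the terminal node as a TrieNodeWithValue.
4: If the path exists and the terminal node is a plain TrieNode, convert it to a TrieNodeWithValue.
5: If the path exists and the terminal node is already a TrieNodeWithValue, return false (this is case 2).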

        template <typename T>
        bool Insert(const std::string &key, T value) {
            if (key.empty()) return false;
            latch_.WLock();  // take the write lock for the whole insert
            auto node = &root_;
            // Walk (and create, if missing) the nodes for every character except the last one.
            if (key.size() > 1) {
                for (auto ch : key.substr(0, key.size() - 1)) {
                    auto child = node->get()->GetChildNode(ch);
                    if (child == nullptr)
                        child = node->get()->InsertChildNode(ch, std::make_unique<TrieNode>(ch));
                    // std::unique_ptr<TrieNode> *InsertChildNode(char key_char, std::unique_ptr<TrieNode> &&child)
                    node = child;
                }
            }
            // node now points at the parent of the terminal node
            bool ret = false;
            char ch = key[key.size() - 1];
            auto terminal_node = node->get()->GetChildNode(ch);  // does the last character already have a node?
            if (terminal_node == nullptr) {
                // No node yet: insert a TrieNodeWithValue directly.
                node->get()->InsertChildNode(ch, std::make_unique<TrieNodeWithValue<T>>(ch, value));
                ret = true;
            } else if (!terminal_node->get()->IsEndNode()) {
                // The node exists but is not an end node (checked via is_end_).
                // Build a TrieNodeWithValue<T> by moving the existing node into its constructor
                // together with value, then let the slot own the new node.
                // The children_ map is moved along, so keys that extend this one are preserved.
                auto new_node_terminal = new TrieNodeWithValue<T>(std::move(*(terminal_node->get())), value);
                terminal_node->reset(new_node_terminal);
                ret = true;
            } else {
                // The node exists and is an end node: the key is already stored,
                // and existing values must not be overwritten, so the insert fails.
                ret = false;
            }
            latch_.WUnlock();
            return ret;
        }
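
The manual WLock/WUnlock pairs work, but every early return has to remember to unlock. As an alternative sketch, the standard library's std::shared_mutex with RAII guards releases the lock automatically. This swaps in a different locking facility than the project's latch_, and GuardedStore is a hypothetical stand-in for the Trie that stores int values in a map.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>

// Hypothetical key-value store guarded by a reader-writer lock (C++17).
class GuardedStore {
 public:
  bool Insert(const std::string &key, int value) {
    if (key.empty()) {
      return false;
    }
    std::unique_lock<std::shared_mutex> guard(latch_);  // exclusive (write) lock
    // emplace does not overwrite: .second is false if the key already exists.
    return data_.emplace(key, value).second;
  }  // guard is destroyed here and releases the lock on every return path

  bool Get(const std::string &key, int *out) const {
    std::shared_lock<std::shared_mutex> guard(latch_);  // shared (read) lock
    auto it = data_.find(key);
    if (it == data_.end()) {
      return false;
    }
    *out = it->second;
    return true;
  }

 private:
  mutable std::shared_mutex latch_;
  std::unordered_map<std::string, int> data_;
};
```

With guards, adding a new early return can never leave the latch held, which is the most common mistake with manual unlocking.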

Remove()

// If a parent has no other children after its child is removed, and is not itself the end of a key, remove it too
        bool Remove(const std::string &key) {
            if (key.empty()) return false;
            latch_.WLock();
            std::unique_ptr<TrieNode> *node = &root_;
            // traversal_path[i] is the parent of the node for key[i]
            std::vector<std::unique_ptr<TrieNode> *> traversal_path;
            for (auto ch : key) {
                traversal_path.emplace_back(node);
                auto child = node->get()->GetChildNode(ch);
                if (child == nullptr) {  // the path breaks off: the key is not stored
                    latch_.WUnlock();
                    return false;
                }
                node = child;
            }
            if (!node->get()->IsEndNode()) {  // the path exists only as a prefix of other keys
                latch_.WUnlock();
                return false;
            }
            // Clear the end flag first. If the node still has children, it is a prefix of
            // another key and must stay; otherwise the loop below prunes it.
            node->get()->SetEndNode(false);
            for (int i = static_cast<int>(traversal_path.size()) - 1; i >= 0; --i) {
                // Stop at the first node that ends another key or still has children:
                // it is still needed and must not be deleted.
                if (node->get()->IsEndNode() || node->get()->HasChildren()) break;
                auto pre = traversal_path[i];
                pre->get()->RemoveChildNode(key[i]);
                node = pre;
            }
            latch_.WUnlock();
            return true;
        }
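
The pruning rule is easier to check in isolation. Below is a value-free mini trie written only to exercise the back-tracking deletion; Node, MiniTrie, Contains, and HasPath are hypothetical names, and locking is left out on purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

struct Node {
  bool is_end{false};
  std::unordered_map<char, std::unique_ptr<Node>> children;
};

class MiniTrie {
 public:
  void Insert(const std::string &key) {
    Node *node = &root_;
    for (char ch : key) {
      auto &child = node->children[ch];
      if (child == nullptr) {
        child = std::make_unique<Node>();
      }
      node = child.get();
    }
    node->is_end = true;
  }

  // True if key itself was inserted.
  bool Contains(const std::string &key) const {
    const Node *node = Find(key);
    return node != nullptr && node->is_end;
  }

  // True if the nodes spelling out prefix still exist.
  bool HasPath(const std::string &prefix) const { return Find(prefix) != nullptr; }

  bool Remove(const std::string &key) {
    std::vector<Node *> path;  // path[i] is the parent of the node for key[i]
    Node *node = &root_;
    for (char ch : key) {
      auto it = node->children.find(ch);
      if (it == node->children.end()) {
        return false;
      }
      path.push_back(node);
      node = it->second.get();
    }
    if (key.empty() || !node->is_end) {
      return false;  // nothing stored under this exact key
    }
    node->is_end = false;
    // Walk back up, erasing nodes that no longer serve any key.
    for (std::size_t i = key.size(); i-- > 0;) {
      if (node->is_end || !node->children.empty()) {
        break;
      }
      path[i]->children.erase(key[i]);
      node = path[i];
    }
    return true;
  }

 private:
  const Node *Find(const std::string &key) const {
    const Node *node = &root_;
    for (char ch : key) {
      auto it = node->children.find(ch);
      if (it == node->children.end()) {
        return nullptr;
      }
      node = it->second.get();
    }
    return node;
  }

  Node root_;
};
```

With "ab" and "abc" stored, removing "abc" deletes only the c node because b still ends a key; removing "ab" afterwards deletes b and a as well.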

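GetValue()

/* Walk down the trie. If the key does not exist, or the type of the value stored in the node
         * differs from the type T requested by the caller, set *success to false.
         * The type check is done with dynamic_cast.
         */
        template <typename T>
        T GetValue(const std::string &key, bool *success) {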
            *success = false;
            latch_.RLock();

            auto pre_child = &root_;
            auto c = key.begin();
            while (c != key.end()) {
                auto cur = c++;
                auto next_node = pre_child->get()->GetChildNode(*cur);

                if (!next_node) {
                    *success = false;
                    break;
                }

                if (next_node->get()->IsEndNode() && c == key.end()) {
                    auto flag_node = dynamic_cast<TrieNodeWithValue<T> *>(next_node->get());
                    if (!flag_node) {
                        *success = false;
                        break;
                    }
                    *success = true;
                    latch_.RUnlock();
                    return flag_node->GetValue();
                }
                pre_child = next_node;
            }
            latch_.RUnlock();
            return {};
        }
    };
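
The type check can be tried on its own. In this sketch the two structs are cut-down stand-ins for the project classes, and TryGet is a hypothetical helper; dynamic_cast on a pointer returns nullptr when the node does not actually hold a value of the requested type.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct TrieNode {
  virtual ~TrieNode() = default;  // a virtual member makes the type polymorphic
};

template <typename T>
struct TrieNodeWithValue : TrieNode {
  explicit TrieNodeWithValue(T v) : value(std::move(v)) {}
  T value;
};

// Returns true and writes *out only if node really stores a T.
template <typename T>
bool TryGet(TrieNode *node, T *out) {
  auto *typed = dynamic_cast<TrieNodeWithValue<T> *>(node);
  if (typed == nullptr) {
    return false;  // wrong value type, or a plain TrieNode
  }
  *out = typed->value;
  return true;
}
```

dynamic_cast only works on polymorphic types, which is why the base class needs at least one virtual function (here the destructor).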
