CMU15445 2023 Project 1 in Detail (Part 1): The LRU-K Replacement Algorithm

丯是幡动

Last modified: 2024-03-31 17:29:46

First published: 2024-03-29 14:22:31

Article link: https://blog.youkuaiyun.com/qq_40878302/article/details/137079257

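Task 1: LRU-K Replacement Algorithm

1. Background

1.1 LRU

When memory is full, LRU evicts the page that has gone unused the longest. The weakness shows up in an example: suppose there are five slots and the access sequence is 123123451222113123456. When page 6 arrives, one resident page must be evicted. Pages 1, 2 and 3 are clearly used often, while 4 and 5 are used only occasionally. Yet at that moment page 1 is the least recently used, so LRU evicts it; if the next access is to page 1 (which is quite likely), page 2 then has to be evicted. LRU-K is meant to prevent this.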
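
To make the example concrete, here is a minimal sketch that replays an access string through plain LRU and reports which pages get evicted. The function name SimulateLru and the single-character page ids are assumptions made for illustration; they are not part of BusTub.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Replays `accesses` (one character per page id) through plain LRU with
// `capacity` slots and returns the evicted pages in eviction order.
std::vector<char> SimulateLru(const std::string &accesses, std::size_t capacity) {
  assert(capacity > 0);
  std::list<char> pages;  // front = least recently used, back = most recently used
  std::vector<char> evicted;
  for (char page : accesses) {
    auto it = std::find(pages.begin(), pages.end(), page);
    if (it != pages.end()) {
      pages.erase(it);  // hit: the page is re-appended below as most recent
    } else if (pages.size() == capacity) {
      evicted.push_back(pages.front());  // miss on a full cache: drop the LRU page
      pages.pop_front();
    }
    pages.push_back(page);
  }
  return evicted;
}
```

Calling SimulateLru("123123451222113123456", 5) evicts only page 1, and appending one more access to page 1 makes page 2 the next victim, which is exactly the behavior described above.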

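1.2 LRU-K

The K in LRU-K is the number of recent accesses taken into account, so plain LRU can be seen as LRU-1. LRU-K mainly targets LRU's "cache pollution" problem: its core idea is to extend the criterion from "accessed once recently" to "accessed K times recently".
In detail: maintain two queues, a history queue and a cache queue. A new page first goes into the history queue; once it has been accessed K times, it is moved from the history queue to the cache queue.
If the history queue is full, the eviction rule is up to you (LRU, FIFO, and so on);
if the cache queue is full, evict the entry whose K-th most recent access lies furthest in the past.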
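
The selection rule can be sketched independently of the two queues. The snippet below is an illustration only: the names History and PickVictim are hypothetical, and it assumes the history queue is evicted in order of earliest first access (one of the options mentioned above).

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Access timestamps per frame, in ascending order (oldest first).
using History = std::unordered_map<int, std::vector<std::size_t>>;

// Picks an LRU-K victim. Frames with fewer than k accesses are preferred
// (oldest first access wins); otherwise the frame whose k-th most recent
// access is the oldest is chosen. Returns false if there is no candidate.
bool PickVictim(const History &history, std::size_t k, int *victim) {
  assert(k > 0 && victim != nullptr);
  bool found = false;
  bool best_has_k = false;
  std::size_t best_time = 0;
  for (const auto &entry : history) {
    const std::vector<std::size_t> &times = entry.second;
    if (times.empty()) {
      continue;
    }
    const bool has_k = times.size() >= k;
    // k-th most recent access if available, otherwise the first access.
    const std::size_t key = has_k ? times[times.size() - k] : times.front();
    const bool better =
        !found || (best_has_k && !has_k) || (best_has_k == has_k && key < best_time);
    if (better) {
      found = true;
      best_has_k = has_k;
      best_time = key;
      *victim = entry.first;
    }
  }
  return found;
}
```

With k = 1 this degenerates to plain LRU, which matches the remark that LRU is LRU-1.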

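2. Code Walkthrough

2.1 Header file (lru_k_replacer.h)

2.1.1 Meaning of each variable (some are my own additions)

Variable	Meaning
current_timestamp_	Current timestamp; incremented by one on every RecordAccess call
curr_size_	Number of evictable frames currently tracked
max_size_	Maximum number of evictable frames
replacer_size_	Total number of frames (used to check whether a frame id is out of range)
k_	The K in LRU-K
lock_guard_	The lock (a std::mutex)
k_time	A frame id paired with the timestamp of its K-th most recent access
timestamp	Type used to record access times
time_frame_	Map from frame id to its recorded access timestamps
recorded_cnt_	Number of recorded accesses per frame
evictable_	Whether a frame may be evicted
new_frame_	List of frame ids with fewer than K accesses (the history queue described above)
new_locate_	Hash map from frame id to its position in the history queue
cache_frame_	List of frames that have reached K accesses (the cache queue described above)
cache_locate_	Hash map from frame id to its position in the cache queue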
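
For orientation, the members in the table could be declared roughly as follows. This is a sketch under assumptions: the struct name ReplacerState is hypothetical, the container types are inferred from how the members are used in the code later in this post, and initializing max_size_ to num_frames is a guess rather than something the post states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <mutex>
#include <unordered_map>
#include <utility>

using frame_id_t = std::int32_t;  // assumption: same underlying type as in BusTub

// Hypothetical holder for the members listed in the table.
struct ReplacerState {
  using timestamp = std::size_t;
  using k_time = std::pair<frame_id_t, timestamp>;  // (frame id, K-th most recent access time)

  timestamp current_timestamp_{0};
  std::size_t curr_size_{0};
  std::size_t max_size_;
  std::size_t replacer_size_;
  std::size_t k_;
  std::mutex lock_guard_;

  std::unordered_map<frame_id_t, std::list<timestamp>> time_frame_;
  std::unordered_map<frame_id_t, std::size_t> recorded_cnt_;
  std::unordered_map<frame_id_t, bool> evictable_;

  std::list<frame_id_t> new_frame_;  // history queue
  std::unordered_map<frame_id_t, std::list<frame_id_t>::iterator> new_locate_;
  std::list<k_time> cache_frame_;  // cache queue, sorted by K-th timestamp
  std::unordered_map<frame_id_t, std::list<k_time>::iterator> cache_locate_;

  ReplacerState(std::size_t num_frames, std::size_t k)
      : max_size_(num_frames), replacer_size_(num_frames), k_(k) {
    assert(k > 0);
  }
};
```

Storing std::list iterators in the two "locate" maps works because inserting or erasing other list elements does not invalidate them, so a frame can be unlinked in constant time.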

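2.1.2 Method list

Method	Purpose
LRUKReplacer::LRUKReplacer(size_t num_frames, size_t k)	Constructor; sets the valid frame-id range and the value of K
auto LRUKReplacer::Evict(frame_id_t *frame_id) -> bool	Evicts one frame and stores its id in *frame_id
void RecordAccess(frame_id_t frame_id);	Records one access to a frame
void SetEvictable(frame_id_t frame_id, bool set_evictable);	Sets whether a frame may be evicted
void Remove(frame_id_t frame_id);	Removes the given frame (only called when BufferPoolManager deletes a page)
auto Size() -> size_t;	Returns the number of evictable frames
auto LRUKReplacer::CmpTimestamp	Comparison function that orders two entries by timestamp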

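2.2 The Evict method (evicting a frame)

2.2.1 Approach
There are two cases: the victim is in the history queue, or it is in the cache queue.

If it is in the history queue:
(1) Clear the container of access times recorded for the frame. time_frame_[frame].clear();
(2) Remove everything related to the history list (do not forget to reset the access counter to 0). recorded_cnt_[frame] = 0; new_locate_.erase(frame); new_frame_.remove(frame);
(3) Update the bookkeeping: one fewer evictable frame. curr_size_--;
(4) Write the evicted frame's id into the variable that the caller's frame_id pointer points to, so the caller can retrieve and use it. *frame_id = frame;

The cache-queue case is analogous, so I will not repeat it.

Note!! frame_id here is an output pointer. Evict does not evict a frame chosen by the caller; it scans the history list and then the cache list for a frame that can be evicted, and once it finds one it writes that frame's id through frame_id. Both lists are kept in time order, so each scan starts at the oldest entry (the least recently used candidate is checked first): the history list has new frames pushed to the front, so it is scanned from the back, while the cache list is sorted by ascending K-th timestamp, so it is scanned from the front.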
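
The oldest-first scan with an output pointer can be shown in isolation. The sketch below is a simplified stand-in rather than the replacer itself: EvictOldest is a hypothetical free function, and it assumes new entries are pushed to the front of the list so that the oldest entry sits at the back.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <unordered_map>

// Scans `history` from the back (oldest entry first), removes the first frame
// marked evictable, and writes its id through `frame_id`.
// Returns false if no frame can be evicted.
bool EvictOldest(std::list<int> *history, const std::unordered_map<int, bool> &evictable,
                 int *frame_id) {
  assert(history != nullptr && frame_id != nullptr);
  for (auto it = history->rbegin(); it != history->rend(); ++it) {
    auto found = evictable.find(*it);
    if (found != evictable.end() && found->second) {
      *frame_id = *it;  // read the id before the node is erased
      // std::next(it).base() is the forward iterator to the same element.
      history->erase(std::next(it).base());
      return true;
    }
  }
  return false;
}
```

The caller passes the address of a local variable and only reads it when the function returns true, which is the same contract Evict uses.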

2.2.2 Code

// Evict a frame
auto LRUKReplacer::Evict(frame_id_t *frame_id) -> bool {
  std::lock_guard<std::mutex> lock(lock_guard_);
  // Nothing is evictable
  if (curr_size_ == 0) {
    return false;
  }
  // First look for an evictable frame in the history list (oldest entries are at the back)
  for (auto it = new_frame_.rbegin(); it != new_frame_.rend(); it++) {
    auto frame = *it;
    // This frame may be evicted
    if (evictable_[frame]) {
      recorded_cnt_[frame] = 0;
      new_locate_.erase(frame);
      new_frame_.remove(frame);
      curr_size_--;
      time_frame_[frame].clear();
      *frame_id = frame;
      return true;
    }
  }
  // Then look for an evictable frame in the cache queue (oldest K-th timestamp is at the front)
  for (auto its = cache_frame_.begin(); its != cache_frame_.end(); its++) {
    auto frames = (*its).first;
    if (evictable_[frames]) {
      recorded_cnt_[frames] = 0;
      cache_frame_.erase(its);
      cache_locate_.erase(frames);
      curr_size_--;
      time_frame_[frames].clear();
      *frame_id = frames;
      return true;
    }
  }
  return false;
}

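2.3 The RecordAccess method (adding a page)

2.3.1 Approach
[Figure: image not available]
Note: the bottom row of the figure is drawn backwards!! It should read old --> new.

Unlike eviction, there are three cases here:
a frame seen for the first time;
a frame that has to move from the history queue to the cache queue;
a frame that is already in the cache queue.
Side note:
std::upper_bound(ForwardIt first, ForwardIt last, const T& value, Compare comp);

• first and last: iterators delimiting the search range [first, last).
• value: the value or object to search for.
• comp: a comparison function or callable used to compare value with elements of the range.

std::upper_bound returns an iterator to the first element in the range that is greater than value (with comp, the first element for which comp(value, element) is true). The range must already be sorted with respect to comp. If there is no such element, it returns last.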
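
As a small illustration of how std::upper_bound keeps a list ordered, the sketch below inserts (frame id, timestamp) pairs into a std::list sorted by timestamp. The names Entry, ByTimestamp and InsertSorted are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <utility>

using Entry = std::pair<int, std::size_t>;  // (frame id, timestamp)

// Orders entries by timestamp only; the frame id is ignored.
bool ByTimestamp(const Entry &a, const Entry &b) { return a.second < b.second; }

// Inserts `e` so the list stays sorted by ascending timestamp.
// Entries with an equal timestamp keep their insertion order.
std::list<Entry>::iterator InsertSorted(std::list<Entry> *cache, const Entry &e) {
  assert(cache != nullptr);
  auto pos = std::upper_bound(cache->begin(), cache->end(), e, ByTimestamp);
  return cache->insert(pos, e);
}
```

Because std::list only offers bidirectional iterators, std::upper_bound still has to step through the list node by node, so the search is linear in the list length even though the number of comparisons is logarithmic.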

2.3.2 Code

// Access logic: frames with fewer than k accesses go into the history list ...
void LRUKReplacer::RecordAccess(frame_id_t frame_id, [[maybe_unused]] AccessType access_typ) {
  std::lock_guard<std::mutex> lock(lock_guard_);
  // Frame id out of range
  if (frame_id > static_cast<frame_id_t>(replacer_size_)) {
    throw std::exception();
  }
  current_timestamp_++;
  recorded_cnt_[frame_id]++;
  auto cnt = recorded_cnt_[frame_id];
  // Append the new timestamp to the frame's access list: older times first, newer times last
  time_frame_[frame_id].push_back(current_timestamp_);
  // First recorded access for this frame
  if (cnt == 1) {
    if (curr_size_ == max_size_) {
      frame_id_t frame;
      // NOTE: Evict() locks lock_guard_ again. With a non-recursive std::mutex that is
      // undefined behavior (typically a deadlock) if this branch is ever reached.
      Evict(&frame);
    }
    evictable_[frame_id] = true;
    curr_size_++;
    // Add the new node to the front of the history list
    new_frame_.push_front(frame_id);
    // Remember the node's position in the list
    new_locate_[frame_id] = new_frame_.begin();
  }
  // The count has reached k: move the frame from the history (new) queue to the cache (old) queue
  if (cnt == k_) {
    new_frame_.erase(new_locate_[frame_id]);  // remove from the history queue
    new_locate_.erase(frame_id);
    auto kth_time = time_frame_[frame_id].front();  // time of this frame's k-th most recent access
    k_time new_cache(frame_id, kth_time);
    // Find the position to insert at
    auto it = std::upper_bound(cache_frame_.begin(), cache_frame_.end(), new_cache, CmpTimestamp);
    it = cache_frame_.insert(it, new_cache);
    cache_locate_[frame_id] = it;
    return;
  }
  // More than k accesses: the frame has to be moved to its new position in the cache queue
  if (cnt > k_) {
    // Drop the oldest timestamp so that exactly k timestamps remain
    time_frame_[frame_id].erase(time_frame_[frame_id].begin());
    // Remove the frame from its old position
    cache_frame_.erase(cache_locate_[frame_id]);
    // Time of this frame's k-th most recent access
    auto kth_time = time_frame_[frame_id].front();
    k_time new_cache(frame_id, kth_time);
    // Find the position to insert at
    auto it = std::upper_bound(cache_frame_.begin(), cache_frame_.end(), new_cache, CmpTimestamp);
    it = cache_frame_.insert(it, new_cache);
    cache_locate_[frame_id] = it;
    return;
  }
}
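
One thing to watch in the code above: RecordAccess calls Evict while it already holds lock_guard_, and Evict tries to lock the same std::mutex again. A common way around this is to move the eviction logic into a private helper that assumes the lock is already held. The sketch below shows only that pattern on a deliberately tiny class; TinyReplacer, EvictLocked and latch_ are hypothetical names, and its FIFO eviction is a placeholder, not LRU-K.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <mutex>

// Hypothetical miniature replacer that only demonstrates the locking pattern.
class TinyReplacer {
 public:
  explicit TinyReplacer(std::size_t capacity) : capacity_(capacity) {}

  // Public entry point: takes the lock, then delegates.
  bool Evict(int *frame_id) {
    assert(frame_id != nullptr);
    std::lock_guard<std::mutex> lock(latch_);
    return EvictLocked(frame_id);
  }

  void Record(int frame_id) {
    std::lock_guard<std::mutex> lock(latch_);
    if (frames_.size() == capacity_) {
      int victim = -1;
      EvictLocked(&victim);  // safe: does not try to lock latch_ a second time
    }
    frames_.push_front(frame_id);
  }

  std::size_t Size() {
    std::lock_guard<std::mutex> lock(latch_);
    return frames_.size();
  }

 private:
  // Precondition: the caller already holds latch_.
  bool EvictLocked(int *frame_id) {
    if (frames_.empty()) {
      return false;
    }
    *frame_id = frames_.back();  // back = oldest entry
    frames_.pop_back();
    return true;
  }

  std::size_t capacity_;
  std::list<int> frames_;  // front = newest, back = oldest
  std::mutex latch_;
};
```

Switching to std::recursive_mutex would also avoid the self-deadlock, but the helper keeps the cheaper std::mutex and makes the locking contract explicit.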

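2.4 The SetEvictable method (setting whether a frame may be evicted)

2.4.1 Approach
Only two cases need any change: a frame that was not evictable becomes evictable, and a frame that was evictable becomes non-evictable.
Side note: prefer the prefix forms of ++ and --. Prefix increment modifies the value in place and returns it, whereas postfix increment must first save a copy of the old value, increment, and then return that copy. For built-in types compilers usually optimize the difference away when the result is unused, but for iterators and other class types the extra copy can cost something.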
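
The difference between the two forms is easiest to see on a class type. The Counter type below is hypothetical and exists only to count copy constructions; it follows the conventional way of writing the two operators.

```cpp
#include <cstddef>

// Hypothetical type that counts how many times it is copy-constructed.
struct Counter {
  int value = 0;
  static std::size_t copies;

  Counter() = default;
  Counter(const Counter &other) : value(other.value) { ++copies; }
  Counter &operator=(const Counter &) = default;

  // Prefix: increment in place and return a reference, no copy needed.
  Counter &operator++() {
    ++value;
    return *this;
  }

  // Postfix: must save the old state first, so at least one copy is made.
  Counter operator++(int) {
    Counter old(*this);
    ++value;
    return old;
  }
};

std::size_t Counter::copies = 0;
```

After ++c on a Counter the copy count is unchanged, while c++ raises it by at least one (whether the returned temporary adds another copy depends on copy elision).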

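2.4.2 Code

// Mark a frame as evictable or non-evictable according to set_evictable
void LRUKReplacer::SetEvictable(frame_id_t frame_id, bool set_evictable) {
  std::lock_guard<std::mutex> lock(lock_guard_);
  // The frame has no recorded accesses, so there is nothing to update
  if (recorded_cnt_[frame_id] == 0) {
    return;
  }
  // evictable_[frame_id] == true means the frame may be evicted; false means it must be kept
  if (!evictable_[frame_id]) {
    // Was not evictable, now asked to become evictable
    if (set_evictable) {
      ++max_size_;
      ++curr_size_;
    }
  } else {
    if (!set_evictable) {
      // Was evictable, now asked to become non-evictable
      --max_size_;
      --curr_size_;
    }
  }
  evictable_[frame_id] = set_evictable;
}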

2.5 The Remove method (removing a given frame)

2.5.1 Approach
Very similar to Evict. The differences are that the caller names the frame to remove and that the method returns nothing; the cleanup logic is the same.

2.5.2 Code

// Remove a frame
void LRUKReplacer::Remove(frame_id_t frame_id) {
  std::lock_guard<std::mutex> lock(lock_guard_);
  // Frame id out of range
  if (frame_id > static_cast<frame_id_t>(replacer_size_)) {
    throw std::exception();
  }
  auto cnt = recorded_cnt_[frame_id];
  // The frame is not tracked, so there is nothing to remove
  if (cnt == 0) {
    return;
  }
  // Removing a non-evictable frame is an error
  if (!evictable_[frame_id]) {
    throw std::exception();
  }
  // The frame is in the history list
  if (cnt < k_) {
    recorded_cnt_[frame_id] = 0;
    new_frame_.erase(new_locate_[frame_id]);
    new_locate_.erase(frame_id);
    --curr_size_;
    time_frame_[frame_id].clear();
  } else {  // The frame is in the cache queue
    recorded_cnt_[frame_id] = 0;
    cache_frame_.erase(cache_locate_[frame_id]);
    cache_locate_.erase(frame_id);
    --curr_size_;
    time_frame_[frame_id].clear();
  }
}

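The Size() method just returns the number of evictable frames, so I will not write it out.

Below is the CmpTimestamp method used for the comparisons.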

auto LRUKReplacer::CmpTimestamp(const LRUKReplacer::k_time &f1, const LRUKReplacer::k_time &f2) -> bool {
  return f1.second < f2.second;
}

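A few random words: I found this code ready-made online and changed very little of it, so this is pure result-oriented programming. I spent half a month on C++ along the way and still barely know anything, lol; I have improved, but not by much. There is less and less code online for the later projects, which is pretty despairing. I must have been out of my mind when I took this on. Help!! Any experts out there? Save me!!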

References
[1] https://blog.youkuaiyun.com/zhanglong_4444/article/details/88344953 (Explanation of and differences between LRU, LFU and LRU-K)
[2] https://blog.youkuaiyun.com/AntiO2/article/details/128439155?spm=1001.2014.3001.5506 (Cache replacement policies: the LRU-K algorithm explained with a C++ implementation, CMU15-445 Project#1)
[3] https://blog.youkuaiyun.com/albertsh/article/details/106976688 (The std::lower_bound() and std::upper_bound() functions in C++)