(done) Walking through the xv6-lab-2023 Lab 8 code (kalloctest, untangling test1)

Article link: https://blog.youkuaiyun.com/shimly123456/article/details/146170425

url: https://pdos.csail.mit.edu/6.1810/2023/labs/lock.html

Start with the main function in kalloctest.c:

int
main(int argc, char *argv[])
{
  test1();
  test2();
  test3();
  exit(0);
}

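Running kalloctest gives:
[Screenshot: kalloctest output]

Only test1 fails, so test1 is the only test we need to look at.

We also need to work out what the numbers circled in red in the output mean.

Now the test1 source in kalloctest.c: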

// Test concurrent kallocs and kfrees
void test1(void)
{
  void *a, *a1;
  int n, m;
  printf("start test1\n");  
  m = ntas(0);
  for(int i = 0; i < NCHILD; i++){
    int pid = fork();
    if(pid < 0){
      printf("fork failed");
      exit(-1);
    }
    if(pid == 0){
      for(i = 0; i < N; i++) {
        a = sbrk(4096);
        *(int *)(a+4) = 1;
        a1 = sbrk(-4096);
        if (a1 != a + 4096) {
          printf("wrong sbrk\n");
          exit(-1);
        }
      }
      exit(-1);
    }
  }

  for(int i = 0; i < NCHILD; i++){
    wait(0);
  }
  printf("test1 results:\n");
  n = ntas(1);
  if(n-m < 10) 
    printf("test1 OK\n");
  else
    printf("test1 FAIL\n");
}

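From the source, test1 compares the value returned by ntas() before and after a large number of sbrk(+4096) and sbrk(-4096) calls made by the child processes.
If the difference is < 10, test1 passes; if the difference is >= 10, test1 fails.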

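That lets us split the work into tasks:
Task 1: What does ntas() return? (done)
Task 2: What does statistics() write into the buf argument passed by ntas()? (done)
Task 3: Guess that "statistics" is a device, and trace how it is registered (done)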

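From tasks 1, 2 and 3 we can conclude that ntas() reports how many times, from xv6 boot up to the moment ntas() is called, an attempt to take a kmem or bcache lock failed (a test-and-set that found the lock already held). test1 reads this counter before and after its workload, so the goal it sets is: after the locking and parallelism optimizations, the number of failed attempts on kmem and bcache locks must grow by fewer than 10 while test1 runs.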

Task 1: What does ntas() return? (done)

First, the source of ntas():

int ntas(int print)
{
  int n;
  char *c;

  if (statistics(buf, SZ) <= 0) {
    fprintf(2, "ntas: no stats\n");
  }
  c = strchr(buf, '=');
  n = atoi(c+2);
  if(print)
    printf("%s", buf);
  return n;
}

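The code first calls statistics(buf, SZ), which fills buf with text; that text is what the screenshot below shows
(it is printed when the print argument is 1).
[Screenshot: statistics text printed by ntas(1)]

n is the number that follows "tot= ", 19134 in the screenshot above.

So the answer is: ntas() returns the number after "tot= ", 19134 in the screenshot above.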
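The parsing step can be tried outside xv6 with a small sketch. The helper name parse_tot and the NULL check are assumptions added for illustration; the original ntas() does neither. The c + 2 offset skips the '=' and the space after it, which matters in xv6 because its user-level atoi, as far as I can tell, does not skip leading spaces.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of xv6): return the number that follows
 * the first '=' in buf, or -1 if buf contains no '='. */
int parse_tot(const char *buf)
{
  const char *c = strchr(buf, '=');   /* in the stats text this is the '=' of "tot= " */
  if (c == NULL)
    return -1;                        /* assumption: report a missing '=' as -1 */
  return atoi(c + 2);                 /* skip "= " and parse the digits */
}
```

Note that this only works because "tot= " is the first place an '=' appears in the statistics text; a lock name containing '=' would break it.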

Next question: what does statistics() write into the buf argument passed by ntas()?

Task 2: What does statistics() write into the buf argument passed by ntas()? (done)

Look at the source of user/statistics.c:

#include "kernel/types.h"
#include "kernel/stat.h"
#include "kernel/fcntl.h"
#include "user/user.h"

int
statistics(void *buf, int sz)
{
  int fd, i, n;
  
  fd = open("statistics", O_RDONLY);
  if(fd < 0) {
      fprintf(2, "stats: open failed\n");
      exit(1);
  }
  for (i = 0; i < sz; ) {
    if ((n = read(fd, buf+i, sz-i)) < 0) {
      break;
    }
    i += n;
  }
  close(fd);
  return i;
}

Answer: from the source, statistics() reads up to sz bytes from the file "statistics" into the buffer that buf points to, and returns the number of bytes actually read.
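The same read-until-full loop can be sketched for a POSIX system. This is a sketch under assumptions, not the xv6 code: the name read_all is hypothetical, and the loop stops on n <= 0 because POSIX read() signals end of data with 0. The xv6 version stops only on n < 0, which works because the device's read function returns -1 once its text has been consumed.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read up to sz bytes from fd into buf, looping
 * over short reads. Returns the number of bytes stored in buf. */
int read_all(int fd, void *buf, int sz)
{
  char *p = buf;
  int i = 0;

  while (i < sz) {
    ssize_t n = read(fd, p + i, (size_t)(sz - i));
    if (n <= 0)        /* error, or no more data */
      break;
    i += (int)n;
  }
  return i;
}
```

A caller would open a file descriptor, call read_all(fd, buf, sizeof buf), and treat the return value as the length of the valid data in buf.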

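Task 3: Guess that "statistics" is a device, and trace how it is registered (done)

In user/init.c, mknod is used to create the device node "statistics", with major device number STATS.
[Screenshot: the mknod call for "statistics" in user/init.c]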

In kernel/main.c we can see where the STATS device is initialized.
[Screenshot: STATS device initialization in kernel/main.c]

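Following that call, we can see that it initializes the STATS device's lock and registers its read and write functions.
[Screenshot: the STATS init function registering the lock and the read/write functions]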
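How a read() on a device file ends up in the registered function can be modeled with a table of function pointers indexed by major number. The sketch below is a user-space model under assumptions: the values of NDEV and STATS, and the names fake_statsread, fake_statsinit and dev_read are all made up for illustration, and the real kernel also checks the file type before dispatching.

```c
#include <stdint.h>

#define NDEV  10   /* assumed table size */
#define STATS 2    /* assumed major number, for illustration only */

typedef uint64_t uint64;

/* One entry per major device number: its read and write handlers. */
struct devsw {
  int (*read)(int user_dst, uint64 dst, int n);
  int (*write)(int user_src, uint64 src, int n);
};

static struct devsw devsw[NDEV];

/* Stand-in handler; the real one copies lock statistics to dst. */
static int fake_statsread(int user_dst, uint64 dst, int n)
{
  (void)user_dst;
  (void)dst;
  return n;   /* pretend n bytes were delivered */
}

/* Registration, as a device init function would do it. */
void fake_statsinit(void)
{
  devsw[STATS].read = fake_statsread;
  devsw[STATS].write = 0;
}

/* Dispatch a read on a device file by its major number. */
int dev_read(int major, uint64 dst, int n)
{
  if (major < 0 || major >= NDEV || !devsw[major].read)
    return -1;   /* no such device, or it has no read handler */
  return devsw[major].read(1, dst, n);
}
```

Before fake_statsinit() runs, dev_read(STATS, ...) returns -1; afterwards it reaches the handler. This is why the device has to be initialized at boot before any user program opens "statistics".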

We saw earlier that statistics() reads from the "statistics" device, so let's step into statsread:

int
statsread(int user_dst, uint64 dst, int n)
{
  int m;

  acquire(&stats.lock);

  if(stats.sz == 0) {
#ifdef LAB_PGTBL
    stats.sz = statscopyin(stats.buf, BUFSZ);
#endif
#ifdef LAB_LOCK
    stats.sz = statslock(stats.buf, BUFSZ);
#endif
  }
  m = stats.sz - stats.off;

  if (m > 0) {
    if(m > n)
      m  = n;
    if(either_copyout(user_dst, dst, stats.buf+stats.off, m) != -1) {
      stats.off += m;
    }
  } else {
    m = -1;
    stats.sz = 0;
    stats.off = 0;
  }
  release(&stats.lock);
  return m;
}

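This code first calls statslock to fill stats.buf (only when no snapshot is pending, i.e. stats.sz == 0), then calls either_copyout to copy up to n bytes from that kernel buffer to the destination, which is user memory when user_dst is set. stats.off tracks how much has already been handed out; once everything has been read, statsread returns -1 and resets stats.sz and stats.off so the next read generates a fresh snapshot.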
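The snapshot-and-offset bookkeeping is easier to see in a user-space model. Everything here is an assumption made for illustration: struct snap, make_snapshot and snap_read are hypothetical names, memcpy stands in for either_copyout, the fixed text "tot= 42\n" stands in for the output of statslock, and the locking is left out.

```c
#include <string.h>

#define BUFSZ 64

/* Hypothetical model of the device state kept between reads. */
struct snap {
  char buf[BUFSZ];
  int sz;    /* bytes in the current snapshot, 0 = none yet */
  int off;   /* bytes already handed to the reader */
};

/* Stand-in for statslock(): produce a fresh snapshot. */
static int make_snapshot(char *buf, int sz)
{
  const char *text = "tot= 42\n";
  int len = (int)strlen(text);
  if (len > sz)
    len = sz;
  memcpy(buf, text, (size_t)len);
  return len;
}

/* Copy up to n bytes of the snapshot to dst. Returns the number of
 * bytes copied, or -1 once the snapshot is drained; draining also
 * resets the state so the next read starts a new snapshot. */
int snap_read(struct snap *s, char *dst, int n)
{
  if (s->sz == 0)
    s->sz = make_snapshot(s->buf, BUFSZ);

  int m = s->sz - s->off;
  if (m > 0) {
    if (m > n)
      m = n;
    memcpy(dst, s->buf + s->off, (size_t)m);
    s->off += m;
  } else {
    m = -1;
    s->sz = 0;
    s->off = 0;
  }
  return m;
}
```

Reading the 8-byte text in chunks of 5 yields 5, then 3, then -1. That final -1 is what ends the loop in statistics(), and the reset is why a later call sees up-to-date counters.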
Next, step into statslock:

int
snprint_lock(char *buf, int sz, struct spinlock *lk)
{
  int n = 0;
  if(lk->n > 0) {
    n = snprintf(buf, sz, "lock: %s: #test-and-set %d #acquire() %d\n",
                 lk->name, lk->nts, lk->n);
  }
  return n;
}

int
statslock(char *buf, int sz) {
  int n;
  int tot = 0;

  acquire(&lock_locks);
  n = snprintf(buf, sz, "--- lock kmem/bcache stats\n");
  for(int i = 0; i < NLOCK; i++) {
    if(locks[i] == 0)
      break;
    if(strncmp(locks[i]->name, "bcache", strlen("bcache")) == 0 ||
       strncmp(locks[i]->name, "kmem", strlen("kmem")) == 0) {
      tot += locks[i]->nts;
      n += snprint_lock(buf +n, sz-n, locks[i]);
    }
  }
  
  n += snprintf(buf+n, sz-n, "--- top 5 contended locks:\n");
  int last = 100000000;
  // stupid way to compute top 5 contended locks
  for(int t = 0; t < 5; t++) {
    int top = 0;
    for(int i = 0; i < NLOCK; i++) {
      if(locks[i] == 0)
        break;
      if(locks[i]->nts > locks[top]->nts && locks[i]->nts < last) {
        top = i;
      }
    }
    n += snprint_lock(buf+n, sz-n, locks[top]);
    last = locks[top]->nts;
  }
  n += snprintf(buf+n, sz-n, "tot= %d\n", tot);
  release(&lock_locks);  
  return n;
}

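The format strings in this code line up closely with the output we saw earlier:
[Screenshot: the kalloctest output shown earlier, with the numbers circled in red]

Going by the source of statslock and snprint_lock, the first circled number on each line is lk->nts (printed after "#test-and-set") and the second is lk->n (printed after "#acquire()").

The number after "tot= " is the sum of lk->nts over every registered lock whose name starts with "kmem" or "bcache".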
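The prefix match is worth spelling out, because it decides which locks count toward the total. Below is a minimal sketch with a hypothetical struct lock_stat standing in for the counters kept in struct spinlock; the lock names in the usage note are made up.

```c
#include <string.h>

/* Hypothetical stand-in for the per-lock counters. */
struct lock_stat {
  const char *name;
  int nts;   /* failed test-and-set attempts */
  int n;     /* acquire() calls */
};

/* Return nonzero if s begins with prefix. */
static int has_prefix(const char *s, const char *prefix)
{
  return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Sum nts over every lock whose name starts with "kmem" or "bcache". */
int total_contention(const struct lock_stat *locks, int count)
{
  int tot = 0;
  for (int i = 0; i < count; i++) {
    if (has_prefix(locks[i].name, "bcache") ||
        has_prefix(locks[i].name, "kmem"))
      tot += locks[i].nts;
  }
  return tot;
}
```

With locks named "kmem_cpu0" (nts 5), "bcache.bucket" (nts 7) and "proc" (nts 1000), the total is 12: the "proc" lock is ignored no matter how contended it is. The flip side is that any new lock introduced while solving the lab is only counted if its name keeps the "kmem" or "bcache" prefix.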

Now look at the source of acquire to see where lk->n and lk->nts are updated:

// Acquire the lock.
// Loops (spins) until the lock is acquired.
void
acquire(struct spinlock *lk)
{
  push_off(); // disable interrupts to avoid deadlock.
  if(holding(lk))
    panic("acquire");

#ifdef LAB_LOCK
    __sync_fetch_and_add(&(lk->n), 1);
#endif      

  // On RISC-V, sync_lock_test_and_set turns into an atomic swap:
  //   a5 = 1
  //   s1 = &lk->locked
  //   amoswap.w.aq a5, a5, (s1)
  // Each failed attempt to take lk->locked increments lk->nts.
  while(__sync_lock_test_and_set(&lk->locked, 1) != 0) {
#ifdef LAB_LOCK
    __sync_fetch_and_add(&(lk->nts), 1);
#else
   ;
#endif
  }

  // Tell the C compiler and the processor to not move loads or stores
  // past this point, to ensure that the critical section's memory
  // references happen strictly after the lock is acquired.
  // On RISC-V, this emits a fence instruction.
  __sync_synchronize();

  // Record info about lock acquisition for holding() and debugging.
  lk->cpu = mycpu();
}
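To watch the two counters move, here is a user-space sketch of a counting spinlock built on the same GCC/Clang __sync builtins. The names struct cspin, cspin_acquire, cspin_release and cspin_try are hypothetical; cspin_try is an addition that makes one attempt without spinning, so the nts increment can be observed from a single thread. Interrupt handling (push_off) and the holding() check are left out.

```c
/* Hypothetical user-space lock carrying the two LAB_LOCK counters. */
struct cspin {
  int locked;  /* 0 = free, 1 = held */
  int n;       /* number of acquire attempts started */
  int nts;     /* number of test-and-set operations that failed */
};

/* Spin until the lock is taken, counting every failed attempt. */
void cspin_acquire(struct cspin *lk)
{
  __sync_fetch_and_add(&lk->n, 1);
  while (__sync_lock_test_and_set(&lk->locked, 1) != 0)
    __sync_fetch_and_add(&lk->nts, 1);
  __sync_synchronize();   /* keep the critical section after the acquire */
}

void cspin_release(struct cspin *lk)
{
  __sync_synchronize();   /* keep the critical section before the release */
  __sync_lock_release(&lk->locked);
}

/* One attempt without spinning: returns 1 on success, 0 if the lock
 * was already held. A miss is counted like one failed loop iteration. */
int cspin_try(struct cspin *lk)
{
  __sync_fetch_and_add(&lk->n, 1);
  if (__sync_lock_test_and_set(&lk->locked, 1) != 0) {
    __sync_fetch_and_add(&lk->nts, 1);
    return 0;
  }
  __sync_synchronize();
  return 1;
}
```

An uncontended cspin_acquire leaves nts at 0 and raises only n. A cspin_try on a held lock raises both n and nts. So nts grows only when some other holder is in the way, which is exactly the contention that test1 wants kept below 10.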