How Ruby's Benchmark Is Implemented

iteye_12339

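This article looks closely at how Ruby's Benchmark module works: how to use the bm method for performance testing, and how it is implemented internally, including how the timings are computed and how it relies on Process::times.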

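Let's take bm(), the benchmark method we use most often, as the example.

Consider the following call, followed by the source of bm itself: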

    Benchmark.bm(20) do |x|
      x.report("getBWIL"){1000.times{result.getBindedWebIdList(0, '22000005')}}
      x.report("getBWIL"){5000.times{result.getBindedWebIdList(0, '22000005')}}
      x.report("getBWIL"){10000.times{result.getBindedWebIdList(0, '22000005')}}
    end
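The call above depends on a `result` object that is not shown here. A minimal self-contained sketch of the same usage, assuming a hypothetical stand-in workload named `work` (not part of the original code), might look like this:

```ruby
require 'benchmark'

# Hypothetical stand-in for result.getBindedWebIdList; any CPU-bound work will do.
def work
  (1..100).reduce(:+)
end

# 20 is the width of the label column, as in the example above.
Benchmark.bm(20) do |x|
  x.report("1000 calls")  { 1000.times  { work } }
  x.report("5000 calls")  { 5000.times  { work } }
  x.report("10000 calls") { 10000.times { work } }
end
```

Each report line prints the label followed by the user, system, total and real times for its block.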

  def bm(label_width = 0, *labels, &blk) # :yield: report
    benchmark(" "*label_width + CAPTION, label_width, FMTSTR, *labels, &blk)
  end

As the source shows, bm simply delegates to the benchmark method, so let's jump to benchmark:

  def benchmark(caption = "", label_width = nil, fmtstr = nil, *labels) # :yield: report
    sync = STDOUT.sync
    STDOUT.sync = true
    label_width ||= 0
    fmtstr ||= FMTSTR
    raise ArgumentError, "no block" unless iterator?
    print caption
    results = yield(Report.new(label_width, fmtstr))
    Array === results and results.grep(Tms).each {|t|
      print((labels.shift || t.label || "").ljust(label_width), 
            t.format(fmtstr))
    }
    STDOUT.sync = sync
  end

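In this code we only care about the yield. The object it yields is the x received by the block passed to bm in the example, and it is a Report object.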

  class Report # :nodoc:
    #
    # Returns an initialized Report instance.
    # Usually, one doesn't call this method directly, as new
    # Report objects are created by the #benchmark and #bm methods. 
    # _width_ and _fmtstr_ are the label offset and 
    # format string used by Tms#format. 
    # 
    def initialize(width = 0, fmtstr = nil)
      @width, @fmtstr = width, fmtstr
    end

    #
    # Prints the _label_ and measured time for the block,
    # formatted by _fmt_. See Tms#format for the
    # formatting rules.
    #
    def item(label = "", *fmt, &blk) # :yield:
      print label.ljust(@width)
      res = Benchmark::measure(&blk)
      print res.format(@fmtstr, *fmt)
      res
    end

    alias report item
  end
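The pattern used by benchmark and Report (yield a reporter object, then let the block call report on it) can be sketched in a few lines. This is a simplified illustration, not the library's code: `TinyReport` and `tiny_bm` are hypothetical names, and only wall-clock time is measured.

```ruby
# TinyReport is a hypothetical, simplified stand-in for Benchmark::Report.
class TinyReport
  def initialize(width = 0)
    @width = width
  end

  # Prints the label, runs the block, prints the elapsed wall-clock seconds.
  def item(label = "")
    print label.ljust(@width)
    started = Time.now
    yield
    elapsed = Time.now - started
    puts format("%10.6f", elapsed)
    elapsed
  end

  # Same trick as in benchmark.rb: report is just another name for item.
  alias report item
end

# Hypothetical counterpart of bm: builds the reporter and yields it to the block.
def tiny_bm(width = 0)
  raise ArgumentError, "no block" unless block_given?
  yield TinyReport.new(width)
end

elapsed = tiny_bm(10) { |x| x.report("sleep") { sleep 0.01 } }
```

The value of the block's last expression is returned by `tiny_bm`, so `elapsed` holds the measured seconds.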

Inside Report we find that x.report is just an alias for Report#item, and the key step in item is the call to Benchmark::measure. Let's keep following:

  def measure(label = "") # :yield:
    t0, r0 = Benchmark.times, Time.now
    yield
    t1, r1 = Benchmark.times, Time.now
    Benchmark::Tms.new(t1.utime  - t0.utime, 
                       t1.stime  - t0.stime, 
                       t1.cutime - t0.cutime, 
                       t1.cstime - t0.cstime, 
                       r1.to_f - r0.to_f,
                       label)
  end

This is the code in benchmark that actually computes the execution time. It is easy to follow, but it relies on one key piece, Benchmark.times, so let's step into that next.

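A side note: the measure method above is not defined with the self keyword, so why is it a module method that can be called directly as Benchmark::measure? The reason is on line 353 of benchmark.rb: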

module_function :benchmark, :measure, :realtime, :bm, :bmbm

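module_function turns the listed methods into module functions (comparable to static methods in Java), which can be called directly through the module name.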
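A small sketch of what module_function does, using a hypothetical module named `Timing` (not part of benchmark.rb):

```ruby
# Timing is a hypothetical module used only for illustration.
module Timing
  # Defined without self, exactly like measure in benchmark.rb.
  def elapsed
    started = Time.now
    yield
    Time.now - started
  end

  # Makes elapsed callable as Timing.elapsed.
  module_function :elapsed
end

seconds = Timing.elapsed { 1000.times { |i| i * i } }
puts seconds.class

# module_function also leaves a private instance-method copy behind,
# which is what classes get if they include the module.
puts Timing.private_instance_methods(false).include?(:elapsed)
```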

  def Benchmark::times() # :nodoc:
      Process::times()
  end

Benchmark.times calls Process::times(), so we step in once more:

  #     Process.times   => aStructTms
  #
  #
  # Returns a <code>Tms</code> structure (see <code>Struct::Tms</code>
  # on page 388) that contains user and system CPU times for this
  # process.
  #
  #    t = Process.times
  #    [ t.utime, t.stime ]   #=> [0.0, 0.02]
  #
  #
  def self.times
    # This is just a stub for a builtin Ruby method.
    # See the top of this file for more info.
  end

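We have now landed in another file, stub_process.rb, and its English comment tells us this is a builtin method. It is implemented in native code, so the implementation cannot be read without downloading the Ruby source. From the comment we learn that the method returns a Tms struct.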
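Even without reading the native implementation, the return value is easy to inspect from Ruby. A minimal sketch (the printed numbers depend on the machine and on how long the process has been running):

```ruby
t = Process.times

# The four fields are CPU times in seconds, returned as Floats.
puts "user:            #{t.utime}"
puts "system:          #{t.stime}"
puts "children user:   #{t.cutime}"
puts "children system: #{t.cstime}"
```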

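The stub gives no further description of the tms struct, but its definition is not Ruby-specific: it comes from the C library header <sys/times.h>, available on Linux and other POSIX systems: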

struct tms {
     clock_t  tms_utime;  /* user CPU time */
     clock_t  tms_stime;  /* system CPU time */
     clock_t  tms_cutime; /* user CPU time, terminated children */
     clock_t  tms_cstime; /* system CPU time, terminated children */
   };

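The struct reports CPU time for the current process.
tms_utime is user-mode CPU time; the time spent executing our Ruby code is mostly counted here.
tms_stime is system-mode CPU time: time the kernel spends working on behalf of the process, for example while serving system calls such as file or network I/O.
The two children fields, tms_cutime and tms_cstime, are the user and system CPU time of child processes that have terminated and been waited for by this process.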
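The children fields can be observed with a small experiment. The sketch below assumes a POSIX system; the one-line child script is made up for illustration, and on platforms that do not account for child CPU time the differences may simply stay at zero.

```ruby
require 'rbconfig'

before = Process.times

# Run a short CPU-bound child process. system waits for the child to exit,
# so its CPU time is afterwards attributed to cutime/cstime of this process.
system(RbConfig.ruby, "-e", "300_000.times { |i| i * i }")

after = Process.times

puts "parent user:     #{after.utime  - before.utime}"
puts "parent system:   #{after.stime  - before.stime}"
puts "children user:   #{after.cutime - before.cutime}"
puts "children system: #{after.cstime - before.cstime}"
```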

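Benchmark's timing algorithm reads these four values before and after the block and subtracts each pair to get the CPU time used by the block, then subtracts two Time.now readings to get the real (wall-clock) time.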
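The same before/after subtraction can be written by hand. This is a sketch that mirrors the measure method shown earlier; `simple_measure` is a hypothetical helper that returns a plain Hash instead of a Benchmark::Tms.

```ruby
# Hypothetical helper mirroring the logic of Benchmark.measure.
def simple_measure
  t0, r0 = Process.times, Time.now
  yield
  t1, r1 = Process.times, Time.now
  {
    utime:  t1.utime  - t0.utime,   # user CPU time of this process
    stime:  t1.stime  - t0.stime,   # system CPU time of this process
    cutime: t1.cutime - t0.cutime,  # user CPU time of waited-for children
    cstime: t1.cstime - t0.cstime,  # system CPU time of waited-for children
    real:   r1.to_f - r0.to_f       # wall-clock time
  }
end

timing = simple_measure { 200_000.times { |i| Math.sqrt(i) } }
timing.each { |name, seconds| puts format("%-7s %.6f", name, seconds) }
```

Time.now follows the system clock, so the real value can be distorted if the clock is adjusted during the measurement. Newer versions of benchmark.rb read a monotonic clock through Process.clock_gettime instead.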

In Ruby, the fields of the tms struct are accessed through the following methods:

utime           # user time
stime           # system time
cutime          # user time of children
cstime          # system time of children
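The Benchmark::Tms object returned by Benchmark.measure exposes the same four accessors, plus real and total. A short sketch with an arbitrary workload:

```ruby
require 'benchmark'

tms = Benchmark.measure { 100_000.times { |i| i.to_s } }

puts tms.utime   # user CPU time
puts tms.stime   # system CPU time
puts tms.cutime  # user CPU time of children
puts tms.cstime  # system CPU time of children
puts tms.real    # elapsed wall-clock time
puts tms.total   # utime + stime + cutime + cstime
```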