memoize in Ruby

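This post introduces the technique of caching function return values described in the book On Lisp, and shows how to implement it in both Common Lisp and Ruby. Concrete examples show how the technique can speed up a program.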

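Section 5.3 of On Lisp, the classic Common Lisp book, is titled Memoizing. It describes a technique for caching the return values of function calls. This is a very common technique, but On Lisp showed me how concise a dynamic language can be: the technique is abstracted into one general-purpose function. Pass any function to memoize and you get back a wrapped function that already has caching built in: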

(defun memoize (fn)
  (let ((cache (make-hash-table :test #'equal)))
    #'(lambda (&rest args)
        (multiple-value-bind (val win) (gethash args cache)
          (if win
              val
              (setf (gethash args cache)
                    (apply fn args)))))))

Here is an example:

CL-USER> (setf (symbol-function 'memoized)
               (memoize #'(lambda (x)
                            (sleep 5)
                            x)))
#<CLOSURE (LAMBDA (&REST ARGS)) {B7B80F5}>
CL-USER> (time (memoized 1))
Evaluation took:
  5.0 seconds of real time
  0.004 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
1
CL-USER> (time (memoized 1))
Evaluation took:
  0.0 seconds of real time
  0.0 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
1
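
For comparison, the same closure-based wrapper can be sketched in Ruby with a lambda and a Hash. This is only an illustration: memoize_fn is a hypothetical name, not something taken from the book. Hash#fetch with a block plays the role of the win flag in the Lisp version, so a cached nil or false is still treated as a hit.

```ruby
# memoize_fn is a hypothetical helper, named here only for illustration.
# It takes any callable and returns a lambda that caches results by argument list.
def memoize_fn(fn)
  cache = {}
  lambda do |*args|
    # The block runs only when args is not yet a key in the cache.
    cache.fetch(args) { cache[args] = fn.call(*args) }
  end
end

slow_identity = lambda do |x|
  sleep 0.2 # stand-in for expensive work
  x
end

memoized = memoize_fn(slow_identity)
memoized.call(1) # slow: computed and stored
memoized.call(1) # fast: served from the cache
```

As in the Lisp version, the cache lives in the closure, so each call to memoize_fn produces a wrapper with its own private cache.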

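On Lisp is a true classic. If you don't see why C macros are the frog and Lisp macros the prince, I strongly recommend reading it. It covers many practical techniques and is as thrilling as hunting for treasure in uncharted territory!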

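Today, in Nextlib's public library, I came across the article Ruby Monitor-Functions, which covers wrap_method in Ruby, and it immediately reminded me of the memoize function I had read about earlier.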

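The implementation itself is very simple because, just as in Common Lisp, hash tables in Ruby are convenient to use. The Lisp memoize follows the functional style: it has no side effects and returns a wrapped function. Ruby's syntax is closer to the imperative style, and I would like to use memoize like this: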

class Foo
  def foo
    # do some time-consuming work
  end
  memoize :foo
end

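In other words, after memoize is called, foo gains caching as a side effect, and the return value of memoize itself doesn't matter. I implemented it as an instance method of Module, which makes it callable on every class and module, so it can be used directly inside a class definition.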

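class Module
  # Replace method_name with a version that caches results by argument list.
  def memoize(method_name)
    memo = {} # one cache per memoized method, captured by the closure below
    orig_method = instance_method(method_name)
    define_method(method_name) do |*args|
      # fetch runs the block only on a cache miss, so nil/false results are cached too
      memo.fetch(args) do |key|
        memo[key] = orig_method.bind(self).call(*args)
      end
    end
  end
end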

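This is a very simple implementation and does not use the wrap_method from that article. It also ignores methods that take a block, since memoize is no cure-all and shouldn't be overused. Note too that the memo variable is created once per memoize call, that is, once per memoized method, so different objects of the same class that call the method with the same arguments share the same cache entry. Telling objects apart is easy: when storing into memo, prepend self to the hash key (the args array), like this: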

    define_method(method_name) do |*args|
      # Prepend the receiver so each object gets its own cache entries.
      memo.fetch([self, *args]) do |key|
        memo[key] = orig_method.bind(self).call(*args)
      end
    end
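
To make the difference concrete, here is a small self-contained sketch of the per-object variant together with a demonstration class. Scaler and its scale method are hypothetical names invented for this example. The result depends on instance state, which is exactly the case where a cache shared across objects would return a wrong answer.

```ruby
class Module
  # Variant of memoize whose cache key includes the receiver.
  def memoize(method_name)
    memo = {}
    orig_method = instance_method(method_name)
    define_method(method_name) do |*args|
      memo.fetch([self, *args]) do |key|
        memo[key] = orig_method.bind(self).call(*args)
      end
    end
  end
end

# Hypothetical class, used only for this demonstration.
class Scaler
  attr_reader :calls

  def initialize(factor)
    @factor = factor
    @calls = 0
  end

  def scale(x)
    @calls += 1 # counts how often the real method body runs
    x * @factor
  end
  memoize :scale
end

double = Scaler.new(2)
triple = Scaler.new(3)
double.scale(10) # computed by double
double.scale(10) # served from the cache
triple.scale(10) # computed separately, because the key starts with triple
```

One trade-off of this approach is that the cache keeps a reference to every receiver it has seen, so memoized objects are never garbage collected while the class is alive.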

Now let's test the effect:

require 'benchmark'
 
array = (1..100000).map { rand 100 }
computer = Computer.new
 
Benchmark.bm(20) do |x|
  x.report("normal factor:") { array.each { |i| computer.factor(i) } }
  Computer.instance_eval { memoize :factor }
  x.report("memoize installed:") { array.each { |i| computer.factor(i) } }
  x.report("all cached:") { array.each { |i| computer.factor(i) } }
end

On my machine (Debian GNU/Linux 2.6.22-1-686) the results were:

                          user     system      total        real
normal factor:       17.060000   1.350000  18.410000 ( 18.884428)
memoize installed:    0.750000   0.120000   0.870000 (  0.895277)
all cached:           0.740000   0.120000   0.860000 (  0.904223)

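The effect of memoize is obvious: the first run is far slower than the other two. Of those two, the second fills the cache as it goes, while the third is served entirely from the cache and is essentially just hash lookups, yet the gap between them is small. That is expected: rand 100 yields only 100 distinct inputs, so even in the second run at most 100 of the 100,000 calls are actually computed.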
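
The Computer class used in the benchmark is not shown in this post. To try the idea end to end, the following sketch substitutes a hypothetical Computer whose factor method does prime factorization by trial division. What the original factor computes is an assumption here, so the timings will not match the table above, and for inputs this small the gap may be much narrower.

```ruby
require 'benchmark'

# The shared-cache Module#memoize from this post, repeated so the sketch runs alone.
class Module
  def memoize(method_name)
    memo = {}
    orig_method = instance_method(method_name)
    define_method(method_name) do |*args|
      memo.fetch(args) { |key| memo[key] = orig_method.bind(self).call(*args) }
    end
  end
end

# Hypothetical stand-in for the Computer class, which the post does not show.
class Computer
  # Prime factors of n by trial division (an assumed meaning of "factor").
  def factor(n)
    factors = []
    d = 2
    while n > 1 && d * d <= n
      while (n % d).zero?
        factors << d
        n /= d
      end
      d += 1
    end
    factors << n if n > 1
    factors
  end
end

computer = Computer.new
inputs = Array.new(10_000) { rand(100) }

plain = Benchmark.realtime { inputs.each { |i| computer.factor(i) } }
Computer.memoize :factor
cached = Benchmark.realtime { inputs.each { |i| computer.factor(i) } }
puts format("plain: %.4fs, memoized: %.4fs", plain, cached)
```

Because the cache stores the result object itself, every hit returns the same array, so callers should treat the result as read-only.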

The complete code can be downloaded here.
