[Article Translation] Software Controls Cache Memory to Speed CPUs

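A new process for managing the fast-access memory inside the CPU has made notable progress, delivering speedups of up to 2x and cutting energy use by as much as 72 percent. According to its designers, results like these require a major change in which part of the computer manages the cache: control has traditionally been built into CPU hardware, but handing it to the operating system yields much larger speedups.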

The original report is here: http://spectrum.ieee.org/semiconductors/memory/software-controls-cache-memory-to-speed-cpus

That page also links to the paper, which was published at PACT '13.

Subtitle:

Letting the operating system control cache memory management saves power too

Letting the operating system manage the cache also saves energy.

Main text:

A new process for managing the fast-access memory inside a CPU has led to as much as a twofold speedup and to energy-use reductions of up to 72 percent. According to its designers, realizing such stunning gains requires a big shift in what part of the computer controls this crucial memory: Right now that control is hard-wired into the CPU’s circuitry, but the substantial speedup came when the designers let the operating system handle things instead.

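A new approach to cache management delivers up to a twofold speedup while reducing energy use by as much as 72 percent. According to its designers, gains of this size require a major rethink of how caches are managed: cache management has traditionally been done in CPU hardware, but letting the OS take control produces a substantial speedup.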

(The original article includes a figure here showing an AMD CPU:)

Multicore memory: More processor cores—this AMD Opteron has six—means the computer has a harder time managing how memory moves into and out of the processor’s cache.

The CPU uses high-speed internal memory caches as a kind of digital staging area. Caches are a CPU’s workbench, whether they’re holding onto instructions a CPU may need soon or data it may need to crunch. And from smartphones to servers, nearly every CPU today manages the flow of bits in and out of its caches using algorithms built into its own circuits.

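The CPU uses its high-speed internal caches as a kind of staging area for data. The cache is the CPU's workbench, whether it is holding instructions the CPU is about to use or data the CPU may need to process. From smartphones to servers, nearly every CPU today has its cache-management logic built into its circuitry.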

But, say two MIT researchers, as computers and portable devices accumulate more and more memory and CPU cores, it makes less and less sense to leave cache management entirely up to the CPU. Instead, they say, it might be better to let the operating system share the burden.

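However, two MIT researchers argue that as the number of CPU cores and the amount of memory keep growing in everything from computers to portable devices, it makes less and less sense to leave all cache management to the CPU hardware. Instead, they suggest that the OS should share the job of managing the cache.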

In itself, this idea is not completely new. Some of IBM’s Cell processors, as well as Sony’s PlayStation 3—which runs on Cell technology—allow their applications and OS kernels to fiddle with low-level CPU memory management. What’s new about the MIT technology, called Jigsaw, is its middle-ground approach, which enables software to configure some on-chip memory caches but without requiring so much control that programming becomes a memory-management nightmare.

 

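The idea is not entirely new. Some of IBM's Cell processors, as well as Sony's PlayStation 3, which is built on Cell technology, allow applications and the OS kernel to manipulate low-level CPU memory management. MIT's approach, called Jigsaw, takes a middle path: software can configure some of the on-chip caches, but it is not handed so much control that programming turns into a memory-management nightmare.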

“If you go back six or seven years, you’ll see that everybody was complaining that they launched the PlayStation 3 and nobody could program it well,” says Daniel Sanchez, the assistant professor at MIT’s Computer Science and Artificial Intelligence Laboratory and one of the inventors of Jigsaw.

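“If you go back six or seven years, you'll find everyone complaining that the PlayStation 3 had launched and nobody could program it well,” says Daniel Sanchez, an assistant professor at MIT's Computer Science and Artificial Intelligence Laboratory and one of the creators of Jigsaw.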

Today, CPU hardware typically controls all the on-chip caches. So those caches must be designed to handle every conceivable job, from pure floating-point number crunching (which places a small burden on caches) to intensive searches and queries of a computer’s memory banks (which can stretch their limits). Moreover, CPUs have no higher-level knowledge of the kinds of jobs they’re doing. This means a self-contained numerical simulation with complex equations but little need for memory access would run with exactly the same cache resources as would a graph search, a memory-hogging hunt for relationships between stored data.

 

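Today, CPU hardware controls all the on-chip caches. Those caches therefore have to be designed to handle every possible job, from pure floating-point computation (which puts little load on the cache) to intensive search and query operations over memory (which can push it to its limits). The CPU also has no higher-level knowledge of the jobs it is running. As a result, a self-contained numerical simulation with complex equations but very few memory accesses runs with exactly the same cache resources as a graph search, a memory-hungry hunt for relationships between stored data.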

So Sanchez and his graduate student Nathan Beckmann thought, Why not let the OS trim the cache size for pure computation and swell its ranks for graph search?

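With this in mind, Sanchez and his graduate student Nathan Beckmann asked: why not let the OS shrink the cache allotted to pure computation and give the space it frees up to memory-hungry graph search?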
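The idea of shrinking one job's cache share and growing another's can be illustrated with a small sketch. This is not Jigsaw's algorithm. It is a generic greedy allocator, and the function name `partition_cache`, the notion of dividing the cache into equal "ways", and the miss-curve numbers are all assumptions made up for illustration. Greedy allocation of this kind is only guaranteed to be optimal when each miss curve has diminishing returns.

```python
def partition_cache(miss_curves, total_ways):
    """Greedily split `total_ways` cache ways among jobs.

    miss_curves[job][w] is the (hypothetical) number of misses the job
    incurs when it is given w ways, for w = 0..total_ways.
    """
    alloc = {job: 0 for job in miss_curves}
    for _ in range(total_ways):
        # Give the next way to whichever job saves the most misses with it.
        best = max(
            alloc,
            key=lambda j: miss_curves[j][alloc[j]] - miss_curves[j][alloc[j] + 1],
        )
        alloc[best] += 1
    return alloc


# Made-up miss curves: index = ways granted, value = misses.
curves = {
    "simulation": [100, 10, 8, 7, 7, 7, 7, 7, 7],           # barely needs cache
    "graph_search": [900, 700, 520, 380, 270, 190, 130, 90, 60],  # keeps benefiting
}
allocation = partition_cache(curves, total_ways=8)
print(allocation)
```

With these invented numbers the compute-bound job ends up with a single way and the graph search receives the other seven, which is the kind of trim-and-swell outcome the researchers describe.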

The first step, they say, would be to give perhaps 1 percent of the CPU’s footprint to a simple piece of hardware that could monitor in real time the cache activity in each core. Hardware cache monitors would give Jigsaw the independent oversight it would need to play air traffic controller with the CPU’s caches.

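The first step, according to the two authors, is to devote roughly 1 percent of the CPU's chip area to a simple piece of hardware that monitors cache activity in each core in real time. These hardware cache monitors would give Jigsaw the independent oversight it needs to act as an air traffic controller for the CPU's caches.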
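The article does not say what the hardware monitors record. One common way to summarize cache activity is a miss curve: the number of misses a workload would suffer at each possible cache size. The sketch below computes such a curve in software from an access trace using LRU stack distances. It assumes a fully associative LRU cache, and the function name `lru_miss_curve` and the trace are hypothetical; it is meant only to show the kind of information a monitor could provide, not how Jigsaw's monitors work.

```python
def lru_miss_curve(trace, max_lines):
    """Return misses[c] for a fully associative LRU cache of c lines, c = 0..max_lines."""
    stack = []                              # least recently used first, most recent last
    hits_at_depth = [0] * (max_lines + 1)   # hits_at_depth[d]: hits that need exactly d lines
    for addr in trace:
        if addr in stack:
            # Stack distance: 1 means the address was the most recently used.
            depth = len(stack) - stack.index(addr)
            if depth <= max_lines:
                hits_at_depth[depth] += 1
            stack.remove(addr)
        stack.append(addr)

    misses = []
    hits = 0
    for c in range(max_lines + 1):
        hits += hits_at_depth[c]            # a cache of c lines captures all hits of depth <= c
        misses.append(len(trace) - hits)
    return misses


# A tiny made-up trace of memory line addresses.
curve = lru_miss_curve(["a", "b", "a", "c", "b", "a"], max_lines=3)
print(curve)  # [6, 6, 5, 3]
```

Reading the result: with zero or one line every access misses, two lines save one miss, and three lines save three.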

Second, Sanchez and Beckmann say, the OS’s kernel needs at most a few thousand more lines of code. That’s not much of an addition, considering that Linux’s kernel in 2012 weighed in with 15 million lines and Apple’s and Microsoft’s kernels unofficially contained tens of millions more than that.

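Second, Sanchez and Beckmann say, the OS kernel needs at most a few thousand additional lines of code. That is a small addition, considering that the Linux kernel had about 15 million lines of code in 2012, and that the Apple and Microsoft kernels are unofficially estimated to contain tens of millions more than that.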

One of Jigsaw’s more prominent features is a software module, to be folded in with the OS, that the researchers call Peekahead. This module was adapted from the Lookahead Cache, developed more than a decade ago by Beijing computer scientists. Peekahead computes the best configuration of CPU caches based on the upcoming jobs it expects the cores to do in the coming clock cycles.

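One of Jigsaw's more prominent features is a software module, built into the OS, called Peekahead. The module was adapted from the Lookahead Cache, a technique developed more than a decade ago by computer scientists in Beijing. Peekahead computes the best configuration of the CPU's caches based on the jobs it expects the cores to run in the coming clock cycles.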

 

“When you let software be in charge, you have to be careful of your overhead,” Sanchez says. A poorly designed cache management system, he says, might trim the cache to its optimum size and do it again every fraction of a second. But doing so taxes the CPU. And what’s the point of a CPU efficiency algorithm that requires extraordinary amounts of CPU time? “The exact solution is really expensive. So we have to come up with a quick way of getting the job done so that the overhead doesn’t negate the gains you get,” he says.

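“When you put software in charge, you have to watch the overhead you introduce,” Sanchez says. A poorly designed cache management system might trim the cache to its optimal size over and over, every fraction of a second, but doing so burdens the CPU. An algorithm meant to improve CPU efficiency cannot itself consume large amounts of CPU time. “The exact solution is really expensive, so we have to find a quick way to get the job done so that the overhead does not cancel out the gains,” he says.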
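The overhead concern comes down to simple arithmetic: reconfiguring is only worthwhile when the cycles it saves exceed the cycles it costs. A minimal sketch of that check follows. The function name `should_reconfigure` and every number in it are invented for illustration and are not taken from the Jigsaw paper.

```python
def should_reconfigure(misses_now, misses_after, miss_penalty_cycles, reconfig_cost_cycles):
    """Return True only if the cycles saved exceed the cycles spent reconfiguring."""
    cycles_saved = (misses_now - misses_after) * miss_penalty_cycles
    return cycles_saved > reconfig_cost_cycles


# Made-up numbers for one scheduling interval.
# 800 fewer misses * 100 cycles each = 80,000 cycles saved vs. 50,000 spent.
worth_it = should_reconfigure(5000, 4200, 100, 50_000)
# 100 fewer misses * 100 cycles each = 10,000 cycles saved vs. 50,000 spent.
not_worth_it = should_reconfigure(5000, 4900, 100, 50_000)
print(worth_it, not_worth_it)
```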

Linley Gwennap of the Linley Group, a semiconductor consulting firm based in Mountain View, Calif., says he’s impressed with Jigsaw but cautions that it’s not quite ready for chip-fab prime time. “The problem is generally that a scheme that’s effective on one processor may not be effective on another processor with a different hardware design,” he says. “Every time the processor changes, you have to redo your software, which customers generally don’t like.”

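Linley Gwennap of the Linley Group, a semiconductor consulting firm in Mountain View, Calif., says he is impressed by Jigsaw but cautions that it is not yet ready for production chips. “The problem is generally that a scheme that works on one processor may not work on another processor with a different hardware design,” he says. “Every time the processor changes, you have to redo your software, which customers generally don't like.”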

Sanchez counters that software applications and utilities would remain unaffected by Jigsaw. “Only the operating system code needs to be aware of that intimate knowledge of the hardware, like the topology of the different portions of the cache,” he says.

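In response, Sanchez says that applications and utilities would be unaffected by Jigsaw. “Only the operating system code needs that intimate knowledge of the hardware, such as the topology of the different parts of the cache,” he says.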

Jason Mars, an assistant professor of computer science at the University of Michigan, says Jigsaw works well as a proof of concept, which he says chipmakers might adapt as they see fit.

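Jason Mars, an assistant professor of computer science at the University of Michigan (not the singer of “I'm Yours”?), says Jigsaw works well as a proof of concept, and that chipmakers can adapt it as they see fit.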

“The crisp novelty in this work has to do with the codesign between hardware and software,” Mars says. “Much of the prior work was biased in one direction. More was expected to be done in hardware, and there was a little bit less flexibility. Jigsaw really...builds a holistic system that spans both the hardware and the software.”

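“The real novelty of this work lies in the codesign of hardware and software,” Mars says. “Much of the prior work leaned in one direction. More was expected to be done in hardware, and that left less flexibility. Jigsaw, by contrast, builds a holistic system that spans both the hardware and the software.”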

 

=======================

 

 

 

Reposted from: https://www.cnblogs.com/superniaoren/p/3402134.html
