Memory – Part 1: Memory Types

Introduction

At Intersec we chose the C programming language because it gives us full control over what we're doing and achieves a high level of performance. For many people, performance is just about using as few CPU instructions as possible. However, on modern hardware it's much more complicated than just the CPU. Algorithms have to deal with memory, CPU, disk and network I/O… Each of these adds to the cost of the algorithm, and each must be properly understood in order to guarantee both the performance and the reliability of the algorithm.

The impact of the CPU (and, as a consequence, of algorithmic complexity) on performance is well understood, as are disk and network latencies. However, memory seems much less understood. As our experience with our customers shows, even the output of widely used tools, such as top, is cryptic to most system administrators.

This post is the first in a series of five about memory. We will deal with topics such as the definition of memory, how it is managed, and how to read the output of tools… This series will address subjects of interest to both developers and system administrators. While most of the rules should apply to most modern operating systems, we'll talk more specifically about Linux and the C programming language.

We’re not the first ones to write about memory. In particular, we’d like to highlight the high quality paper by Ulrich Drepper: What every programmer should know about memory.

This first post will provide a definition of memory. It assumes at least basic knowledge of notions such as addresses and processes. It also frequently touches on subjects such as system calls and the difference between user-land and kernel mode; what you need to know is that your process (user-land) runs on top of the kernel, which itself talks to the hardware, and that system calls let your process talk to the kernel in order to request more resources. You can get details about the system calls by reading their respective manual pages.

Virtual Memory

On modern operating systems, each process lives in its own memory allocation space. Instead of mapping memory addresses directly to hardware addresses, the operating system serves as a hardware abstraction layer and creates a virtual memory space for each process. The mapping between the physical memory address and the virtual address is done by the CPU using a per-process translation table maintained by the kernel (each time the kernel changes the running process on a specific CPU core, it changes the translation table of that CPU).

Virtual memory has several purposes. First, it allows process isolation. A process in userland can only express memory accesses as addresses in the virtual memory. As a consequence it can only access data that has been previously mapped in its own virtual space and thus cannot access the memory of other processes (unless explicitly shared).

The second purpose is the abstraction of the hardware. The kernel is free to change the physical address to which a virtual address is mapped. It can also choose not to provide any physical memory for a specific virtual address until it becomes actually needed. Moreover, it can swap the memory out to disk when it has not been used for a long time and the system is getting short of physical memory. This gives a lot of freedom to the kernel; its only constraint is that when the program reads the memory, it actually finds what it previously wrote there.

The third purpose is the possibility to give addresses to things that are not actually in RAM. This is the principle behind mmap and mapping files. You can give a virtual memory address to a file so that it can be accessed as if it was a memory buffer. This is a very useful abstraction that helps keep the code quite simple and, since on 64-bit systems you have a huge virtual space¹, if you want, you can map your whole hard drive to the virtual memory.

The fourth purpose is sharing. Since the kernel knows what is mapped in the virtual space of the various running processes, it can avoid loading things twice in memory and make the virtual addresses of processes that use the same resources point to the same physical memory (even if the actual virtual address is specific to each process). A consequence of sharing is the use of copy-on-write (COW) by the kernel: when two processes use the same data but one of them modifies it while the other one is not allowed to see the change, the kernel makes the copy only when the data gets modified. More recently, operating systems have also gained the ability to detect identical memory in several address spaces and automatically make them map to the same physical memory (marking them as subject to COW)²; on Linux this is called KSM (Kernel SamePage Merging).

fork()

The best-known use case of COW is fork(). On Unix-like systems, fork() is the system call that creates a new process by duplicating the current one. When fork() returns, both processes continue at exactly the same point, with the same open files and the same memory. Thanks to COW, fork() does not duplicate the memory of the process: only the data modified by either the parent or the child gets duplicated in RAM. Since most uses of fork() are immediately followed by a call to exec(), which invalidates the whole virtual memory address space, the COW mechanism avoids a useless full copy of the parent process's memory.

Another side effect is that fork() creates a snapshot of the (private) memory of a process at little cost. If you want to perform some operation on the memory of a process without the risk of it being modified under your feet, and don't want to add a costly and error-prone locking mechanism, just fork, do your work, and communicate the result of your computation back to the parent process (by return code, file, shared memory, pipe, …).

This works extremely well as long as your computation is fast enough that a large part of the memory remains shared between the parent and the child processes. It also helps keep your code simple: the complexity is hidden in the virtual-memory code of the kernel, not in yours.

Pages

Virtual memory is divided into pages. The page size is imposed by the CPU and is usually 4 KiB³. This means that memory management in the kernel is done with a granularity of a page: when you request new memory, the kernel gives you one or more pages; when you release memory, you release one or more pages… Every finer-grained API (e.g. malloc) is implemented in userland.

For each allocated page, the kernel keeps a set of permissions: the page can be readable, writable and/or executable (note that not all combinations are possible). These permissions are set either while mapping the memory or afterward using the mprotect() system call. Pages that have not been allocated yet are not accessible. When you try to perform a forbidden action on a page (for example, reading data from a page without the read permission), you trigger (on Linux) a segmentation fault. As a side note, since segmentation faults have the granularity of a page, you may perform out-of-buffer accesses that do not lead to a segfault.

Memory Types

Not all memory allocated in the virtual memory space is the same. We can classify it along two axes: the first axis is whether the memory is private (specific to that process) or shared, the second axis is whether the memory is file-backed or not (in which case it is said to be anonymous). This creates a classification with four memory classes:

|             | PRIVATE | SHARED |
|-------------|---------|--------|
| ANONYMOUS   | 1       | 2      |
| FILE-BACKED | 3       | 4      |

Private Memory

Private memory is, as its name says, memory that is specific to the process. Most of the memory you deal with in a program is actually private memory.

Since changes made in private memory are not visible to other processes, it is subject to copy-on-write. As a side effect, this means that even if the memory is private, several processes might share the same physical memory to store the data. In particular, this is the case for binary files and shared libraries. A common misconception is that KDE takes a lot of RAM because every single process loads Qt and the KDE libs; however, thanks to the COW mechanism, all the processes use the exact same physical memory for the read-only parts of those libs.

In the case of file-backed private memory, the changes made by the process are not written back to the underlying file; however, changes made to the file may or may not be made available to the process.

Shared Memory

Shared memory is designed for inter-process communication. It can only be created by explicitly requesting it, using the right mmap() flags or a dedicated call (shm*). When a process writes to shared memory, the modification is seen by all the processes that map the same memory.

If the memory is file-backed, any process mapping the file will see the changes, since those changes are propagated through the file itself.

Anonymous Memory

Anonymous memory is purely in RAM. However, the kernel does not actually map that memory to a physical address before it gets written. As a consequence, anonymous memory adds no pressure on the kernel before it is actually used. This allows a process to “reserve” a lot of memory in its virtual memory address space without really using RAM. As a consequence, the kernel lets you reserve more memory than is actually available. This behavior is often referred to as overcommit (or memory overcommitment).

File-backed and Swap

When a memory map is file-backed, the data is loaded from disk. Most of the time it is loaded on demand; however, you can give hints to the kernel so that it prefetches memory ahead of reads. This helps keep your program snappy when you know you have a particular access pattern (mostly sequential accesses). In order to avoid using too much RAM, you can also tell the kernel that you no longer need certain pages in RAM, without unmapping the memory. All this is done using the madvise() system call.

When the system falls short of physical memory, the kernel will try to move some data from RAM to the disk. If the memory is file-backed and shared, this is quite easy. Since the file is the source of the data, it is just removed from RAM, then the next time it will be read, it will be loaded from the file.

The kernel can also choose to remove anonymous/private memory from RAM. In that case, the data is written to a specific place on disk: it is said to be swapped out. On Linux, the swap is usually stored in a dedicated partition; on other systems it can be dedicated files. It then works the same way as file-backed memory: when the memory gets accessed, it is read from disk and reloaded into RAM.

Thanks to the use of a virtual addressing space, swapping pages in and out is totally transparent for the process… what is not, though, is the latency induced by the disk I/O.

Next: Resident Memory and Tools

We’ve covered here some important notions about memory. While we talked a few times about physical memory and the difference with reserved address spaces, we avoided dealing with the actual memory pressure of the process. We will address that topic and describe some tools that let you understand the memory consumption of a process in the next article.


  1. Nowadays, consumer CPUs have 48 bits of addressing, which means 2⁴⁸ addressable bytes, that is 256 TiB. 

  2. This is useful when the same content is present several times in memory, for example when you run a virtualization server with several virtual machines running the same OS. 

  3. On Intel CPUs, there is an alternative page size: 2 MiB. However, on Linux the number of 2 MiB pages available is limited, and using them requires mapping a pseudo-file, which makes them awkward to use. 

About Florent Bruneau

"When you do things right, people won't be sure you've done anything at all." - Futurama, 3x20 Godfellas