Computer Systems: A Programmer's Perspective


I'd heard this was a classic; a roommate used to pore over it. I once skimmed the table of contents and found the coverage comprehensive but shallow, good only for a rough overview, so I didn't feel like reading it. Later I saw that Yu Jiazi, author of <程序员的修养:编译,链接和库>, also recommends it, and that gave me the urge to take a look. I already know much of this material to some degree, so I'll treat the read as a chance to organize what I know.

2012.7.13

Files such as hello.c that consist exclusively of ASCII characters are known as text files. All other files are known as binary files. So everything that is not a text file is a binary file; is that really so?

C is the language of choice for system-level programming. Newer languages such as C++ and Java address these issues for application-level programs.
Layout of virtual memory:

+--------------------------------------+
|  Kernel virtual memory               |
+--------------------------------------+
|  User stack                          |
|  (created at run time)               |
|              |                       |
|              v                       |
+--------------------------------------+
|  Memory-mapped region for            |
|  shared libraries                    |
+--------------------------------------+
|              ^                       |
|              |                       |
|  Heap                                |
|  (created at run time by malloc)     |
+--------------------------------------+
|  Data                                |
+--------------------------------------+
|  Text                                |
+--------------------------------------+

2012.7.13

Why is it called "Bool"? It turns out the name commemorates George Boole, the inventor of Boolean algebra.
Chapter 2 covers numbers: binary representation, two's complement, integers, floating point, and so on.
2012.7.24
Section 3.1: As mentioned earlier, the program memory is addressed using virtual addresses.
VM and LM refer to a program's execution address and load address; they are not necessarily virtual addresses and may be physical addresses.
3.9.3 The IA32 hardware will work correctly regardless of the alignment of data. However, Intel recommends that data be aligned to improve memory system performance. Linux follows an alignment policy where 2-byte data types (e.g., short) must have an address that is a multiple of 2, while any larger data types (e.g., int, int *, float, and double) must have an address that is a multiple of 4. This section has a lot of material on alignment.
3.11 GNU DDD is a visual, graphical front end for command-line debuggers such as GDB, DBX, WDB, Ladebug, JDB, XDB, the Perl debugger, and the Python debugger.
3.12.1 Stack randomization and stack corruption detection: used by gcc on Linux to defend against buffer overflow attacks.
Chapters 4, 5, and 6 skipped.
7. Linking is the process of collecting and combining various pieces of code and
data into a single file that can be loaded (copied) into memory and executed.
Linking can be performed at compile time, when the source code is translated
into machine code; at load time, when the program is loaded into memory and
executed by the loader; and even at run time, by application programs. On early
computer systems, linking was performed manually. On modern systems, linking
is performed automatically by programs called linkers.
This chapter provides a thorough discussion of all aspects of linking, from
traditional static linking, to dynamic linking of shared libraries at load time,
to dynamic linking of shared libraries at run time.

7.1 Most compilation systems provide a compiler driver that invokes the language preprocessor, compiler, assembler, and linker, as needed on behalf of the user.
7.3 Object files come in three forms:
. Relocatable object file. Contains binary code and data in a form that can be
combined with other relocatable object files at compile time to create an
executable object file.
. Executable object file. Contains binary code and data in a form that can be
copied directly into memory and executed.
. Shared object file. A special type of relocatable object file that can be loaded
into memory and linked dynamically, at either load time or run time.
Compilers and assemblers generate relocatable object files (including shared
object files). Linkers generate executable object files.
7.4 How do loaders really work?
Our description of loading is conceptually correct, but intentionally not entirely accurate. To understand
how loading really works, you must understand the concepts of processes, virtual memory, and memory
mapping, which we haven’t discussed yet. As we encounter these concepts later in Chapters 8 and 9,
we will revisit loading and gradually reveal the mystery to you.
For the impatient reader, here is a preview of how loading really works: Each program in a Unix
system runs in the context of a process with its own virtual address space. When the shell runs a program,
the parent shell process forks a child process that is a duplicate of the parent. The child process invokes
the loader via the execve system call. The loader deletes the child’s existing virtual memory segments,
and creates a new set of code, data, heap, and stack segments. The new stack and heap segments are
initialized to zero. The new code and data segments are initialized to the contents of the executable
file by mapping pages in the virtual address space to page-sized chunks of the executable file. Finally,
the loader jumps to the _start address, which eventually calls the application’s main routine. Aside
from some header information, there is no copying of data from disk to memory during loading. The
copying is deferred until the CPU references a mapped virtual page, at which point the operating system
automatically transfers the page from disk to memory using its paging mechanism.
Note: the loader does not load the entire program into memory up front; a page is brought in only when a page fault occurs!
7.10 Shared libraries are modern innovations that address the disadvantages of
static libraries. A shared library is an object module that, at run time, can be
loaded at an arbitrary memory address and linked with a program in memory.
This process is known as dynamic linking and is performed by a program called a
dynamic linker.
Shared libraries are also referred to as shared objects.
Shared libraries are “shared” in two different ways. First, in any given file
system, there is exactly one .so file for a particular library. The code and data in
this .so file are shared by all of the executable object files that reference the library,
as opposed to the contents of static libraries, which are copied and embedded in
the executables that reference them. Second, a single copy of the .text section of
a shared library in memory can be shared by different running processes
7.11 Up to this point, we have discussed the scenario in which the dynamic linker loads
and links shared libraries when an application is loaded, just before it executes.
However, it is also possible for an application to request the dynamic linker to
load and link arbitrary shared libraries while the application is running, without
having to link in the applications against those libraries at compile time.
Dynamic linking is a powerful and useful technique. Here are some examples
in the real world:
. Distributing software. Developers of Microsoft Windows applications frequently
use shared libraries to distribute software updates. They generate
a new copy of a shared library, which users can then download and use as a
replacement for the current version. The next time they run their application,
it will automatically link and load the new shared library.
. Building high-performance Web servers. Many Web servers generate dynamic
content, such as personalized Web pages, account balances, and banner ads.
Early Web servers generated dynamic content by using fork and execve
to create a child process and run a “CGI program” in the context of the
child. However, modern high-performance Web servers can generate dynamic
content using a more efficient and sophisticated approach based on dynamic
linking.
The idea is to package each function that generates dynamic content in
a shared library. When a request arrives from a Web browser, the server
dynamically loads and links the appropriate function and then calls it directly,
as opposed to using fork and execve to run the function in the context of a
child process. The function remains cached in the server’s address space, so
subsequent requests can be handled at the cost of a simple function call. This
can have a significant impact on the throughput of a busy site. Further, existing
functions can be updated and new functions can be added at run time, without
stopping the server.
dlopen, dlsym, dlclose, and dlerror perform dynamic linking at run time.
7.12 A better approach is to compile library code so that it can be loaded and
executed at any address without being modified by the linker. Such code is known
as position-independent code (PIC). Users direct GNU compilation systems to
generate PIC code with the -fPIC option to gcc.
7.13 Tools for Manipulating Object Files
ar: Creates static libraries, and inserts, deletes, lists, and extracts members.
strings: Lists all of the printable strings contained in an object file.
strip: Deletes symbol table information from an object file.
nm: Lists the symbols defined in the symbol table of an object file.
size: Lists the names and sizes of the sections in an object file.
readelf: Displays the complete structure of an object file, including all of the
information encoded in the ELF header; subsumes the functionality of
size and nm.
objdump: The mother of all binary tools. Can display all of the information in an
object file. Its most useful function is disassembling the binary instructions
in the .text section.
Unix systems also provide the ldd program for manipulating shared libraries:
ldd: Lists the shared libraries that an executable needs at run time.
8.2 The classic definition of a process is an instance of a program in execution.
Each program in the system runs in the context of some process. The context
consists of the state that the program needs to run correctly. This state includes the
program’s code and data stored in memory, its stack, the contents of its general-purpose
registers, its program counter, environment variables, and the set of open file descriptors.

If two flows overlap in time, then they are concurrent, even if they are running on the same processor.
However, we will sometimes find it useful to identify a proper subset of concurrent
flows known as parallel flows. If two flows are running concurrently on different
processor cores or computers, then we say that they are parallel flows, that they
are running in parallel, and have parallel execution.

Linux provides a clever mechanism, called the /proc filesystem, that allows
user mode processes to access the contents of kernel data structures. The /proc
filesystem exports the contents of many kernel data structures as a hierarchy of text
files that can be read by user programs. For example, you can use the /proc filesystem
to find out general system attributes such as CPU type (/proc/cpuinfo), or
the memory segments used by a particular process (/proc/<process id>/maps).

8.2.5 Context Switches
The operating system kernel implements multitasking using a higher-level form
of exceptional control flow known as a context switch. The context switch mechanism
is built on top of the lower-level exception mechanism that we discussed in
Section 8.1.
The kernel maintains a context for each process. The context is the state
that the kernel needs to restart a preempted process. It consists of the values
of objects such as the general purpose registers, the floating-point registers, the
program counter, user’s stack, status registers, kernel’s stack, and various kernel
data structures such as a page table that characterizes the address space, a process
table that contains information about the current process, and a file table that
contains information about the files that the process has opened.
At certain points during the execution of a process, the kernel can decide
to preempt the current process and restart a previously preempted process. This
decision is known as scheduling, and is handled by code in the kernel called the
scheduler. When the kernel selects a new process to run, we say that the kernel
has scheduled that process. After the kernel has scheduled a new process to run,
it preempts the current process and transfers control to the new process using
a mechanism called a context switch that (1) saves the context of the current
process, (2) restores the saved context of some previously preempted process, and
(3) passes control to this newly restored process.
A context switch can occur while the kernel is executing a system call on behalf
of the user. If the system call blocks because it is waiting for some event to occur,
then the kernel can put the current process to sleep and switch to another process.
For example, if a read system call requires a disk access, the kernel can opt to
perform a context switch and run another process instead of waiting for the data
to arrive from the disk. Another example is the sleep system call, which is an
explicit request to put the calling process to sleep. In general, even if a system
call does not block, the kernel can decide to perform a context switch rather than
return control to the calling process.
A context switch can also occur as a result of an interrupt. For example, all
systems have some mechanism for generating periodic timer interrupts, typically
every 1ms or 10ms. Each time a timer interrupt occurs, the kernel can decide that
the current process has run long enough and switch to a new process.

Next up: Chapter 9.
2012.7.31

Virtual memory: Virtual memory is an elegant interaction of hardware exceptions, hardware address
translation, main memory, disk files, and kernel software that provides each
process with a large, uniform, and private address space.

9.3.2 As with any cache, the VM system must have some way to determine if a virtual
page is cached somewhere in DRAM. If so, the system must determine which
physical page it is cached in. If there is a miss, the system must determine where
the virtual page is stored on disk, select a victim page in physical memory, and
copy the virtual page from disk to DRAM, replacing the victim page.
These capabilities are provided by a combination of operating system software,
address translation hardware in the MMU (memory management unit), and
a data structure stored in physical memory known as a page table that maps virtual
pages to physical pages. The address translation hardware reads the page table
each time it converts a virtual address to a physical address. The operating system
is responsible for maintaining the contents of the page table and transferring pages
back and forth between disk and DRAM.
Chapters 10-12 skipped!
2012.8.2

A classic computer systems textbook, now in its third edition; details at https://www.amazon.com/Computer-Systems-Programmers-Perspective-Engineering/dp/0134123832/ref=sr_1_2?ie=UTF8&qid=1541476471&sr=8-2&keywords=computer+systems+a+programmer's+perspective

Computer Systems: A Programmer's Perspective explains the underlying elements common among all computer systems and how they affect general application performance. Written from the programmer's perspective, it strives to teach readers how understanding the basic elements of computer systems, combined with real practice, leads to better programs. Spanning themes such as hardware architecture, the operating system, and systems software, the Third Edition serves as a comprehensive introduction to programming, laying the groundwork for more intensive topics such as computer architecture, embedded systems, and security. The book focuses on systems that execute x86-64 machine code, recommends access to a Linux system, and assumes basic familiarity with C or C++.