Chapter 4: Processor Architecture. This chapter covers basic combinational and
sequential logic elements, and then shows how these elements can be combined in a datapath that executes a simplified subset of the x86-64 instruction
set called “Y86-64.” We begin with the design of a single-cycle datapath.
This design is conceptually very simple, but it would not be very fast. We
then introduce pipelining, where the different steps required to process an
instruction are implemented as separate stages. At any given time, each
stage can work on a different instruction. Our five-stage processor pipeline is
much more realistic. The control logic for the processor designs is described
using a simple hardware description language called HCL. Hardware designs written in HCL can be compiled and linked into simulators provided
with the textbook, and they can be used to generate Verilog descriptions
suitable for synthesis into working hardware.
Chapter 5: Optimizing Program Performance. This chapter introduces a number
of techniques for improving code performance. The idea is that programmers learn to write their C code in such a way that a compiler can then
generate efficient machine code. We start with transformations that reduce
the work to be done by a program and hence should be standard practice
when writing any program for any machine. We then progress to transformations that enhance the degree of instruction-level parallelism in the
generated machine code, thereby improving a program's performance on modern
“superscalar” processors. To motivate these transformations, we introduce
a simple operational model of how modern out-of-order processors work,
and show how to measure the potential performance of a program in terms
of the critical paths through a graphical representation of the program. You
will be surprised by how much you can speed up a program through simple
transformations of the C code.
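To give a flavor of these transformations, here is a minimal sketch (our illustration, not code from the chapter) of one of them: summing an array with a single accumulator creates one long chain of dependent additions, while unrolling the loop with two independent accumulators lets an out-of-order, superscalar processor overlap the two chains.

    #include <stddef.h>

    /* Baseline: each addition must wait for the previous one to finish. */
    double sum_slow(const double *a, size_t n) {
        double s = 0.0;
        for (size_t i = 0; i < n; i++)
            s += a[i];
        return s;
    }

    /* 2x2 unrolling: two independent accumulators, two parallel chains. */
    double sum_2x2(const double *a, size_t n) {
        double s0 = 0.0, s1 = 0.0;
        size_t i;
        for (i = 0; i + 1 < n; i += 2) {
            s0 += a[i];
            s1 += a[i + 1];
        }
        for (; i < n; i++)          /* pick up a leftover element, if any */
            s0 += a[i];
        return s0 + s1;
    }

Note that the unrolled version reassociates the floating-point additions, which can change rounding slightly; this is why a compiler will not make such a transformation for you by default.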
Chapter 6: The Memory Hierarchy. The memory system is one of the most visible
parts of a computer system to application programmers. To this point, you
have relied on a conceptual model of the memory system as a linear array
with uniform access times. In practice, a memory system is a hierarchy of
storage devices with different capacities, costs, and access times. We cover
the different types of RAM and ROM memories and the geometry and
organization of magnetic-disk and solid-state drives. We describe how these
storage devices are arranged in a hierarchy. We show how this hierarchy is
made possible by locality of reference. We make these ideas concrete by
introducing a unique view of a memory system as a “memory mountain”
with ridges of temporal locality and slopes of spatial locality. Finally, we
show you how to improve the performance of application programs by
improving their temporal and spatial locality.
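As a taste of what improving locality means in practice, consider this small sketch (ours): C stores two-dimensional arrays in row-major order, so the traversal order of a simple summation determines its stride through memory.

    #define N 1024

    /* Row-wise traversal visits memory with stride 1: each cache block is
       fully used before moving on, giving good spatial locality. */
    long sum_rows(int a[N][N]) {
        long sum = 0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += a[i][j];
        return sum;
    }

    /* Column-wise traversal has stride N: for large N, nearly every access
       touches a new cache block, so this runs many times slower. */
    long sum_cols(int a[N][N]) {
        long sum = 0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                sum += a[i][j];
        return sum;
    }

The two functions compute exactly the same result; only their memory access patterns differ.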
Chapter 7: Linking. This chapter covers both static and dynamic linking, including
the ideas of relocatable and executable object files, symbol resolution, relocation, static libraries, shared object libraries, position-independent code,
and library interpositioning. Linking is not covered in most systems texts,
but we cover it for two reasons. First, some of the most confusing errors that
programmers can encounter are related to glitches during linking, especially
for large software packages. Second, the object files produced by linkers are
tied to concepts such as loading, virtual memory, and memory mapping.
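As one example of such a glitch (a classic one, sketched here in hypothetical form), consider two modules that both define a global named x. Under traditional Unix linker rules, the uninitialized (weak) definition is silently resolved to the initialized (strong) one, even though the types disagree:

    /* main.c */
    int x = 15213;              /* strong symbol: initialized global */
    void f(void);
    int main(void) { f(); return 0; }

    /* other.c */
    double x;                   /* weak symbol: uninitialized global */
    void f(void) { x = -3.14; } /* silently writes 8 bytes over the
                                   4-byte int x and its neighbor */

Modern GCC and Clang default to -fno-common, which turns this case into a multiple-definition error at link time, but the underlying symbol-resolution rules are still worth understanding.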
Chapter 8: Exceptional Control Flow. In this part of the presentation, we step
beyond the single-program model by introducing the general concept of
exceptional control flow (i.e., changes in control flow that are outside the
normal branches and procedure calls). We cover examples of exceptional
control flow that exist at all levels of the system, from low-level hardware exceptions and interrupts, to context switches between concurrent processes,
to abrupt changes in control flow caused by the receipt of Linux signals, to
the nonlocal jumps in C that break the stack discipline.
This is the part of the book where we introduce the fundamental idea
of a process, an abstraction of an executing program. You will learn how
processes work and how they can be created and manipulated from application programs. We show how application programmers can make use of
multiple processes via Linux system calls. When you finish this chapter, you
will be able to write a simple Linux shell with job control. It is also your first
introduction to the nondeterministic behavior that arises with concurrent
program execution.
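A minimal sketch of the kind of code involved (ours, not the shell lab itself): fork() creates a new process, and because parent and child then run concurrently, the order of their output is nondeterministic.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();            /* one call, two returns */
        if (pid < 0) {
            perror("fork");
            exit(1);
        }
        if (pid == 0) {                /* child sees return value 0 */
            printf("child:  pid=%d\n", (int)getpid());
            exit(0);
        }
        /* parent sees the child's process ID */
        printf("parent: child pid=%d\n", (int)pid);
        waitpid(pid, NULL, 0);         /* reap the child: no zombie */
        return 0;
    }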
Chapter 9: Virtual Memory. Our presentation of the virtual memory system seeks
to give some understanding of how it works and its characteristics. We want
you to know how it is that the different simultaneous processes can each use
an identical range of addresses, sharing some pages but having individual
copies of others. We also cover issues involved in managing and manipulating virtual memory. In particular, we cover the operation of storage
allocators such as the standard-library malloc and free operations. Covering this material serves several purposes. It reinforces the concept that
the virtual memory space is just an array of bytes that the program can
subdivide into different storage units. It helps you understand the effects
of programs containing memory referencing errors such as storage leaks
and invalid pointer references. Finally, many application programmers write
their own storage allocators optimized toward the needs and characteristics of the application. This chapter, more than any other, demonstrates the
benefit of covering both the hardware and the software aspects of computer
systems in a unified way. Traditional computer architecture and operating
systems texts present only part of the virtual memory story.
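The two classes of errors just mentioned look deceptively innocent in code. A small sketch (our own examples; both compile cleanly, and neither is reliably caught at run time):

    #include <stdlib.h>

    void leak(void) {
        int *p = malloc(100 * sizeof(int));
        if (p == NULL) return;
        p = malloc(200 * sizeof(int));  /* bug: the first block is now
                                           unreachable, a storage leak */
        free(p);                        /* frees only the second block */
    }

    int dangle(void) {
        int *p = malloc(sizeof(int));
        if (p == NULL) return -1;
        *p = 42;
        free(p);
        return *p;                      /* bug: reads freed memory,
                                           undefined behavior */
    }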
Chapter 10: System-Level I/O. We cover the basic concepts of Unix I/O such as
files and descriptors. We describe how files are shared, how I/O redirection
works, and how to access file metadata. We also develop a robust buffered
I/O package that deals correctly with a curious behavior known as short
counts, where the library function reads only part of the input data. We
cover the C standard I/O library and its relationship to Linux I/O, focusing
on limitations of standard I/O that make it unsuitable for network programming. In general, the topics covered in this chapter are building blocks for
the next two chapters on network and concurrent programming.
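The heart of such a package is a loop that keeps calling read() until it has everything it asked for. The sketch below is our simplification of the idea; the package developed in the chapter is more elaborate.

    #include <errno.h>
    #include <unistd.h>

    /* Read exactly n bytes from fd into buf, unless EOF or an error
       intervenes. Returns the number of bytes read, or -1 on error. */
    ssize_t readn(int fd, void *buf, size_t n) {
        size_t left = n;
        char *p = buf;
        while (left > 0) {
            ssize_t nread = read(fd, p, left);
            if (nread < 0) {
                if (errno == EINTR)    /* interrupted by a signal: retry */
                    continue;
                return -1;             /* real error */
            }
            if (nread == 0)            /* EOF */
                break;
            left -= (size_t)nread;     /* short count: loop for the rest */
            p += nread;
        }
        return (ssize_t)(n - left);
    }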
Chapter 11: Network Programming. Networks are interesting I/O devices to program, tying together many of the ideas that we study earlier in the text, such
as processes, signals, byte ordering, memory mapping, and dynamic storage
allocation. Network programs also provide a compelling context for concurrency, which is the topic of the next chapter. This chapter is a thin slice
through network programming that gets you to the point where you can
write a simple Web server. We cover the client-server model that underlies
all network applications. We present a programmer’s view of the Internet
and show how to write Internet clients and servers using the sockets interface. Finally, we introduce HTTP and develop a simple iterative Web server.
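To preview the sockets interface, here is a bare-bones client-side sketch (ours; the book develops friendlier wrapper functions): resolve a host and port with getaddrinfo(), then walk the result list trying socket() and connect() until one succeeds.

    #include <netdb.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Return a connected descriptor, or -1 on failure. */
    int open_client(const char *host, const char *port) {
        struct addrinfo hints, *list, *p;
        int fd = -1;

        memset(&hints, 0, sizeof hints);
        hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
        hints.ai_socktype = SOCK_STREAM;  /* TCP */
        if (getaddrinfo(host, port, &hints, &list) != 0)
            return -1;

        for (p = list; p; p = p->ai_next) {  /* try each address in turn */
            fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
            if (fd < 0) continue;
            if (connect(fd, p->ai_addr, p->ai_addrlen) == 0)
                break;                       /* success */
            close(fd);
            fd = -1;
        }
        freeaddrinfo(list);
        return fd;
    }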
Chapter 12: Concurrent Programming. This chapter introduces concurrent programming using Internet server design as the running motivational example.
We compare and contrast the three basic mechanisms for writing concurrent programs—processes, I/O multiplexing, and threads—and show how
to use them to build concurrent Internet servers. We cover basic principles
of synchronization using P and V semaphore operations, thread safety and
reentrancy, race conditions, and deadlocks. Writing concurrent code is essential for most server applications. We also describe the use of thread-level
programming to express parallelism in an application program, enabling
faster execution on multi-core processors. Getting all of the cores working
on a single computational problem requires careful coordination of the
concurrent threads, both for correctness and to achieve high performance.
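As a preview, here is a minimal sketch (ours) of P and V in action using POSIX semaphores: two threads increment a shared counter, and the semaphore, initialized to 1 so that it acts as a mutex, eliminates the race. Remove the sem_wait/sem_post pair and the final count becomes unpredictable.

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    #define NITERS 1000000

    static long cnt = 0;     /* shared variable */
    static sem_t mutex;      /* binary semaphore protecting cnt */

    static void *worker(void *arg) {
        (void)arg;
        for (long i = 0; i < NITERS; i++) {
            sem_wait(&mutex);   /* P: enter the critical section */
            cnt++;
            sem_post(&mutex);   /* V: leave the critical section */
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        sem_init(&mutex, 0, 1);   /* initial value 1 makes it a mutex */
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("cnt = %ld (expected %d)\n", cnt, 2 * NITERS);
        return 0;
    }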