Chapter 3 Lists, Stacks, and Queues
Topics:
*Introduce the concept of Abstract Data Types (ADTs).
*Show how to efficiently perform operations on lists.
*Introduce the stack ADT and its use in implementing recursion.
*Introduce the queue ADT and its use in operating systems and algorithm design.
3.1 Abstract Data Types (ADTs)
An abstract data type (ADT) is a set of objects together with a set of operations. Abstract data types are mathematical abstractions; nowhere in an ADT's definition is there any mention of how the set of operations is implemented. Objects such as lists, sets, and graphs, along with their operations, can be viewed as ADTs, just as integers, reals, and booleans are data types. Integers, reals, and booleans have operations associated with them, and so do ADTs. For the set ADT, we might have such operations as add, remove, size, and contains. Alternatively, we might only want the two operations union and find, which would define a different ADT on the set.
The C++ class allows for the implementation of ADTs, with appropriate hiding of implementation details.
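As a minimal sketch of this idea, the class below exposes the set operations named above (add, remove, size, contains) while keeping the storage private. The name IntSet and the choice of a std::vector as the hidden representation are assumptions made for illustration; they do not come from the textbook.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical IntSet: the public methods are the ADT, the vector is a hidden detail.
class IntSet
{
  public:
    // Add x if it is not already present; return true if the set changed.
    bool add(int x)
    {
        if (contains(x))
            return false;
        items.push_back(x);
        return true;
    }

    // Remove x if it is present; return true if the set changed.
    bool remove(int x)
    {
        auto itr = std::find(items.begin(), items.end(), x);
        if (itr == items.end())
            return false;
        items.erase(itr);
        return true;
    }

    // Linear search; callers cannot tell how membership is tested.
    bool contains(int x) const
    {
        return std::find(items.begin(), items.end(), x) != items.end();
    }

    std::size_t size() const { return items.size(); }

  private:
    std::vector<int> items;  // could be swapped for another structure without changing callers
};
```

Because callers only see the four public methods, the private vector could later be replaced by a more efficient structure without touching any code that uses IntSet.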
3.2 The List ADT
3.2.1 Simple Array Implementation of Lists
The vector class internally stores an array and allows it to grow by doubling its capacity when needed.
An array implementation allows printList to be carried out in linear time, and the findKth operation takes constant time.
However, insertion and deletion are potentially expensive, depending on where they occur. In the worst case, inserting at the first position or deleting the first element is O(N), because every other element must be shifted. On average, half of the list needs to be moved for either operation, so linear time is still required.
If all of the operations occur at the end of the array, no elements need to be shifted and each operation takes O(1) time.
So if insertions and deletions occur only at the end, an array is a suitable choice. If they must occur throughout the list, choose a linked list instead.
3.2.2 Simple Linked Lists
To avoid the linear cost of insertion and deletion, we need to ensure that the list is not stored contiguously, since otherwise entire parts of the list will need to be moved. Refer to Figure 3.1.
The linked list consists of a series of nodes, which are not necessarily adjacent in memory. Each node contains the element and a link to a node containing its successor (the next link). The last node's next link points to nullptr.
To execute printList() or find(x), start at the first node and traverse the list by following the next links. This takes linear time. The findKth operation is no longer constant time: findKth(i) takes O(i) time, where i is the position of the node.
The remove method can be executed in one next pointer change. Refer to Figure 3.2.
The insert method requires obtaining a new node from the system by using a new call and then executing two next pointer maneuvers. Refer to Figure 3.3.
Assuming we know where a change is to be made, inserting or removing an item from a linked list involves only a constant number of changes to node links.
The special case of adding to the front or removing the first item is thus a constant-time operation, provided a link to the front of the list is maintained.
The special case of adding at the end can be constant-time, as long as a link to the last node is maintained. Removing the last item is trickier: we have to find the next-to-last item, change its next link to nullptr, and then update the link that maintains the last node. In a singly linked list, the link to the last node gives no information about the next-to-last node, so each node can also store a link to its previous node. This is known as a doubly linked list. Refer to Figure 3.4.
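The pointer changes described above can be sketched as follows. This is only an illustration for a singly linked list of ints; the names SNode, insertAfter, removeAfter, and findKth are hypothetical and are not taken from the textbook's List class.

```cpp
#include <cstddef>

// Hypothetical node type for a singly linked list of ints.
struct SNode
{
    int    element;
    SNode *next;
};

// Insert x right after node p: one new call and two next pointer assignments.
SNode *insertAfter(SNode *p, int x)
{
    SNode *newNode = new SNode{x, p->next};  // new node points at p's old successor
    p->next = newNode;                       // p now points at the new node
    return newNode;
}

// Remove the node right after p: one next pointer change, then free the node.
void removeAfter(SNode *p)
{
    SNode *victim = p->next;
    if (victim == nullptr)
        return;                              // nothing follows p
    p->next = victim->next;
    delete victim;
}

// Return the node at position k (0-based), or nullptr if the list is too short.
// Following k links costs O(k) time.
SNode *findKth(SNode *first, std::size_t k)
{
    while (first != nullptr && k > 0)
    {
        first = first->next;
        --k;
    }
    return first;
}
```

Both insertAfter and removeAfter do a constant amount of work once the position p is known, whereas findKth has to walk the list.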
3.3 vector and list in the STL
C++ includes the Standard Template Library (STL), which provides implementations of common ADTs. These data structures are called collections or containers.
vector provides a growable array implementation of the List ADT. It is indexable in constant time. The disadvantage is that insertion and removal are expensive unless they occur at the end of the vector.
list provides a doubly linked list implementation of the List ADT. The advantage is that insertion and removal take constant time, provided the position of the change is known. The disadvantage is that the list is not easily indexable.
Three methods are common to all STL containers:
int size() const; // return the number of elements in the container.
void clear(); // remove all elements from the container.
bool empty() const; // return true if the container contains no elements, and false otherwise.
Both vector and list support adding and removing from the end of the list in constant time.
Both also support accessing the front item in the list in constant time.
void push_back(const Object&x); // add x to the end of the list.
void pop_back(); // removes the object at the end of the list.
const Object & back() const; // return the object at the end of the list (a mutator that returns a reference is also provided).
const Object & front() const; // return the object at the front of the list (a mutator that returns a reference is also provided).
Because a doubly linked list allows efficient changes at the front, but a vector does not, the following two methods are available only for list:
void push_front(const Object & x); // add x to the front of the list.
void pop_front(); // removes the object at the front of the list.
The vector has its own set of methods that are not part of list, such as efficient indexing and the ability to view and change the internal capacity.
Object & operator[] (int idx); // return the object at index idx in the vector, without bounds-checking.
Object & at(int idx); // return the object at index idx, with bounds-checking.
int capacity() const; // return the internal capacity of the vector.
void reserve(int newCapacity); // set the new capacity. If a good estimate is available, it can be used to avoid expansion of the vector.
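A short sketch of how these methods are typically used with the real std::vector and std::list follows. The helper names makeVector and makeReversedList are hypothetical, and both assume n is not negative.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Build {0, 1, ..., n-1} in a vector, reserving capacity up front (assumes n >= 0).
std::vector<int> makeVector(int n)
{
    std::vector<int> v;
    v.reserve(static_cast<std::size_t>(n));  // avoids repeated expansion while filling
    for (int i = 0; i < n; ++i)
        v.push_back(i);                      // adding at the end of a vector
    return v;
}

// Build {n-1, ..., 1, 0} in a list by always adding at the front (assumes n >= 0).
std::list<int> makeReversedList(int n)
{
    std::list<int> lst;
    for (int i = 0; i < n; ++i)
        lst.push_front(i);                   // constant time; vector has no push_front
    return lst;
}
```

After makeVector(5), indexing with v[2] or v.at(3) takes constant time, while the list returned by makeReversedList can only be reached from its ends with front() and back() or walked with iterators.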
3.3.1 Iterators
Some operations on lists, most critically those that insert and remove from the middle of the list, require the notion of a position. In the STL, a position is represented by a nested type, iterator. For a list<string>, the position is represented by the type list<string>::iterator; for a list<int>, the position is list<int>::iterator.
The main issues concerning iterators are listed below.
Getting an iterator
iterator begin(); // return an appropriate iterator representing the first item in the container.
iterator end(); // return an appropriate iterator representing the endmarker in the container (i.e., the position after the last item in the container).
Iterator Methods
==, !=: compare two iterators, returning true or false.
=: iterators support the copy constructor and the assignment operator.
++: advances the iterator itr to the next location. Both the prefix and postfix forms are allowable.
*itr: returns a reference to the object stored at iterator itr's location (the reference returned may or may not be modifiable).
Container Operations That Require Iterators
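In the real STL, the container operations that take a position are insert(pos, x), erase(pos), and erase(start, end). A minimal sketch of their behavior on a std::list is shown here; the function name iteratorDemo and the sample values are made up for illustration.

```cpp
#include <iterator>
#include <list>

// Demonstrates insert and erase on a list, using iterators as positions.
std::list<int> iteratorDemo()
{
    std::list<int> lst{10, 20, 30, 40, 50};

    auto pos = std::next(lst.begin());    // position of 20
    pos = lst.insert(pos, 15);            // 10 15 20 30 40 50; pos refers to 15
    pos = lst.erase(pos);                 // 10 20 30 40 50;    pos refers to 20

    // Erase the half-open range [pos, pos + 2), which removes 20 and 30.
    lst.erase(pos, std::next(pos, 2));
    return lst;                           // 10 40 50
}
```

Both insert and erase return an iterator, which is the safe way to keep a valid position after the container has changed.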
3.3.2 Example: Using erase on a List
/**
 * Remove every other item from a list.
 */
template <typename Container>
void removeEveryOtherItem(Container & lst)
{
    auto itr = lst.begin(); // itr is a Container::iterator
    while(itr != lst.end())
    {
        itr = lst.erase(itr);
        if(itr != lst.end())
            ++itr;
    }
}
For a std::list, this routine takes linear time, but for a std::vector it takes quadratic time, because each erase from a vector is itself a linear-time operation.
Running it on a list<int> with 800,000 items takes 0.039 sec, and 0.073 sec for a 1,600,000-item list.
For a vector<int> with 800,000 items it takes five minutes, and twenty minutes for 1,600,000 items.
3.3.3 const_iterators
*itr returns a reference to the element in the container, so the element can be changed through the iterator. Refer to the routine below:
template <typename Container, typename Object>
void change(Container & c, const Object & newValue)
{
    typename Container::iterator itr = c.begin();
    while(itr != c.end())
        *itr++ = newValue;
}
If the container is passed by constant reference, the normal iterator cannot be used, because it would allow the container's elements to be changed.
For this case there is a nested type const_iterator. operator* for const_iterator returns a constant reference, so the value cannot be changed through it.
The C++ compiler will force you to use a const_iterator to traverse a constant collection. Each container provides two versions of begin and two versions of end:
iterator begin()
const_iterator begin() const
iterator end()
const_iterator end() const
The two versions of begin can be in the same class only because the const-ness of a method is considered to be part of the signature.
If begin or end is invoked on a nonconstant container, the "mutator" version that returns an iterator is invoked.
Otherwise, the version that returns a const_iterator is invoked (*itr = 0 is illegal for a const_iterator).
Both std::vector and std::list can be traversed with the C++11 range for loop.
C++11 also defines non-member free functions begin and end, which allow one to use begin(c) in any place where c.begin() is allowed. This allows generic code to work on containers that have begin/end as members, as well as those that do not have begin/end but which can later be augmented with appropriate non-member functions.
The addition of begin and end as free functions in C++11 is made possible by the language features auto and decltype.
The generic routine below prints any container, using the non-member functions begin and end and declaring the iterator with auto.
template <typename Container>
void print(const Container & c, std::ostream & out = std::cout)
{
    if(c.empty())
        out << "(empty)";
    else
    {
        auto itr = begin(c); // itr is a Container::const_iterator
        out << "[" << *itr++; // print first item
        while(itr != end(c))
            out << ", " << *itr++;
        out << " ]" << std::endl;
    }
}
3.4 Implementation of vector
Some important features of C++ primitive arrays:
To avoid ambiguities with the library class, we will name our class template Vector. Before examining the Vector code, we outline the main details:
We do not perform error checks; the iterator and const_iterator are implemented with plain C/C++ pointer semantics.
For a more detailed commentary on the code, refer to the textbook.
For the final implementation of the Vector class, refer to:
https://github.com/sesiria/Algs/blob/master/Lib/Vector.h
3.5 Implementation of list
Implementation of the class template List.
It is often convenient to create extra nodes at the front and the end of the linked list, called sentinel nodes. The node at the front is called the header node, and the node at the end is called the tail node.
Figure 3.17 illustrates how a new node containing x is spliced in between a node pointed at by p and p->prev. The assignments to the node pointers can be described as follows:
Node *newNode = new Node{x, p->prev, p}; // steps 1 and 2
p->prev->next = newNode; // step 3
p->prev = newNode; // step 4
Steps 3 and 4 can be combined to obtain:
Node *newNode = new Node{x, p->prev, p}; // steps 1 and 2
p->prev = p->prev->next = newNode; // steps 3 and 4
These two lines can also be combined into one:
p->prev = p->prev->next = new Node{x, p->prev, p};
The code for removing the node pointed at by p from the linked list:
p->prev->next = p->next;
p->next->prev = p->prev;
delete p;
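To see the splice and unlink steps in context, here is a minimal sketch of a doubly linked list of ints with header and tail sentinels. The names DNode and IntList are hypothetical, and the sketch exposes raw node pointers instead of the iterator classes used by the full List implementation.

```cpp
#include <cstddef>

// Hypothetical node type: data plus links to both neighbors.
struct DNode
{
    int    data;
    DNode *prev;
    DNode *next;
};

// Hypothetical minimal list; the sentinels mean insert and erase need no special cases.
class IntList
{
  public:
    IntList()
    {
        head = new DNode{0, nullptr, nullptr};
        tail = new DNode{0, head, nullptr};
        head->next = tail;
    }

    ~IntList()
    {
        while (head->next != tail)
            erase(head->next);
        delete head;
        delete tail;
    }

    IntList(const IntList &) = delete;             // copying is left out of this sketch
    IntList &operator=(const IntList &) = delete;

    // Insert x before node p (p may be the tail sentinel); returns the new node.
    DNode *insert(DNode *p, int x)
    {
        DNode *newNode = new DNode{x, p->prev, p}; // steps 1 and 2
        p->prev->next = newNode;                   // step 3
        p->prev = newNode;                         // step 4
        ++theSize;
        return newNode;
    }

    // Unlink and delete node p (must not be a sentinel); returns the node after it.
    DNode *erase(DNode *p)
    {
        DNode *after = p->next;
        p->prev->next = p->next;
        p->next->prev = p->prev;
        delete p;
        --theSize;
        return after;
    }

    DNode *begin() const { return head->next; }    // first real node, or tail if empty
    DNode *end() const { return tail; }            // the tail sentinel
    std::size_t size() const { return theSize; }

  private:
    std::size_t theSize = 0;
    DNode *head;
    DNode *tail;
};
```

Because every real node always has a valid prev and next, inserting at the front, in the middle, or at the end all go through the same four steps.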
For the final implementation of the List class, refer to:
https://github.com/sesiria/Algs/blob/master/Lib/List.h
3.6 The Stack ADT
A stack is a list with the restriction that insertions and deletions can be performed in only one position, namely, the end of the list, called the top.
3.6.1 Stack Model
The fundamental operations on a stack are push, which is equivalent to an insert at the top,
and pop, which removes the most recently inserted element (the element at the top).
A pop on an empty stack is an error in the stack ADT.
Running out of space when performing a push is an implementation limit but not an ADT error.
Stacks are sometimes known as LIFO (last in, first out) lists. Push is the only input operation and pop is the only output operation.
Other operations, such as making the stack empty and testing for emptiness, are part of the repertoire, but essentially all that you can do to a stack is push and pop.
3.6.2 Implementation of Stacks
Since a stack is a list, any list implementation will do. Clearly list and vector support stack operations; 99% of the time they are the most reasonable choice.
Linked List Implementation of Stacks
We perform a push by inserting at the front of the list. We perform a pop by deleting the element at the front of the list.
A top operation merely examines the element at the front of the list, returning its value.
Sometimes the pop and top operations are combined into one.
Array Implementation of Stacks
An array implementation can use back, push_back, and pop_back from vector. Alternatively, the stack can keep an explicit array theArray and an index topOfStack, which is -1 for an empty stack. To push some element x onto the stack, we increment topOfStack and then set theArray[topOfStack] = x.
To pop, we set the return value to theArray[topOfStack] and then decrement topOfStack.
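The push and pop steps just described can be sketched as below. The class name IntStack is hypothetical, the combined pop-and-return behavior is one of the two designs mentioned above, and throwing std::underflow_error on an empty stack is an assumption about how the ADT error should be reported.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical array-backed stack of ints; topOfStack is -1 when the stack is empty.
class IntStack
{
  public:
    bool empty() const { return topOfStack == -1; }

    void push(int x)
    {
        ++topOfStack;
        if (topOfStack == static_cast<int>(theArray.size()))
            theArray.push_back(x);                              // grow the underlying array
        else
            theArray[static_cast<std::size_t>(topOfStack)] = x; // reuse a slot freed by a pop
    }

    // Examine the top element without removing it.
    const int &top() const
    {
        if (empty())
            throw std::underflow_error("top on empty stack");
        return theArray[static_cast<std::size_t>(topOfStack)];
    }

    // Return the top element, then decrement topOfStack.
    int pop()
    {
        if (empty())
            throw std::underflow_error("pop on empty stack");
        int x = theArray[static_cast<std::size_t>(topOfStack)];
        --topOfStack;
        return x;
    }

  private:
    std::vector<int> theArray;
    int topOfStack = -1;
};
```

Both operations touch only the top slot, so each runs in constant time apart from the occasional array growth during a push.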
3.6.3 Applications
Some applications of the stack ADT.
Balancing Symbols
Compilers check programs for syntax errors, such as unbalanced symbols: every (, [, and { must be matched by the corresponding ), ], and }.
Refer to the code:
https://github.com/sesiria/Algs/blob/master/cp3/ex3_21.cpp
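Independently of the linked solution, the idea can be sketched in a few lines: push each opening symbol, and on each closing symbol check that the top of the stack is its partner. The function name isBalanced is hypothetical, and this sketch ignores comments and string literals, which a real compiler would have to skip.

```cpp
#include <stack>
#include <string>

// Returns true if every (, [, { in text is closed by the matching symbol in the right order.
bool isBalanced(const std::string &text)
{
    std::stack<char> open;
    for (char ch : text)
    {
        if (ch == '(' || ch == '[' || ch == '{')
            open.push(ch);
        else if (ch == ')' || ch == ']' || ch == '}')
        {
            if (open.empty())
                return false;                 // closing symbol with nothing open
            char expected = (ch == ')') ? '(' : (ch == ']') ? '[' : '{';
            if (open.top() != expected)
                return false;                 // mismatched pair
            open.pop();
        }
    }
    return open.empty();                      // leftover opening symbols are an error
}
```

Each character causes at most one push and one pop, so the check runs in linear time.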
Postfix Expressions
Postfix, or reverse Polish notation, places each operator after its operands. A postfix expression is evaluated with a stack: when a number is seen, it is pushed onto the stack; when an operator is seen, the operator is applied to the two numbers (symbols) that are popped from the stack, and the result is pushed onto the stack.
For instance, to evaluate 2 3 + 4 *, we push 2 and 3, replace them with their sum 5, push 4, and replace 5 and 4 with their product 20.
The time to evaluate a postfix expression is O(N), because processing each element in the input consists of stack operations and takes constant time.
Refer to:
https://github.com/sesiria/Algs/blob/master/cp3/ex3_27.cpp
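A minimal sketch of the evaluation loop is shown here, separate from the linked solution. It assumes tokens are separated by whitespace, operands are integers, and the operators are + - * /; the name evalPostfix and the use of std::invalid_argument for malformed input are choices made for illustration.

```cpp
#include <sstream>
#include <stack>
#include <stdexcept>
#include <string>

// Evaluates a whitespace-separated postfix expression over integers with + - * /.
long evalPostfix(const std::string &expr)
{
    std::stack<long> operands;
    std::istringstream in(expr);
    std::string token;
    while (in >> token)
    {
        if (token == "+" || token == "-" || token == "*" || token == "/")
        {
            if (operands.size() < 2)
                throw std::invalid_argument("too few operands");
            long rhs = operands.top(); operands.pop();   // second operand is on top
            long lhs = operands.top(); operands.pop();
            switch (token[0])
            {
                case '+': operands.push(lhs + rhs); break;
                case '-': operands.push(lhs - rhs); break;
                case '*': operands.push(lhs * rhs); break;
                default:
                    if (rhs == 0)
                        throw std::invalid_argument("division by zero");
                    operands.push(lhs / rhs);
                    break;
            }
        }
        else
            operands.push(std::stol(token));             // throws if token is not a number
    }
    if (operands.size() != 1)
        throw std::invalid_argument("malformed expression");
    return operands.top();
}
```

For example, evalPostfix("2 3 + 4 *") returns 20. Note that the first value popped is the right-hand operand, which matters for - and /.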
Infix to Postfix Conversion
We can use a stack to convert an expression in standard form (otherwise known as infix) into postfix.
We will concentrate on a small version of the general problem by allowing only the operators +, *, (, and ), and insisting on the usual precedence rules.
Input expression:
a + b * c + (d * e + f ) * g
The output should be:
a b c * + d e * f + g * +
Algorithm sketch:
1) When an operand is read, it is immediately placed onto the output. Operators are not immediately output, so they must be saved on the stack. We also stack left parentheses when they are encountered.
2) If we see a right parenthesis, then we pop the stack, writing symbols until we encounter a left parenthesis, which is popped but not output.
3) If we see any other symbol (+, *, or a left parenthesis), then we pop entries from the stack until we find an entry of lower priority (entries of equal priority are also popped). One exception is that we never remove a ( from the stack except when processing a ). When the popping is done, we push the current operator onto the stack.
4) Finally, if we read the end of input, we pop the stack until it is empty, writing symbols onto the output.
5) Put differently, when an operator (not a parenthesis) is seen, we check the top of the stack. If the top has higher or equal priority than the current operator, we pop as in step 3); otherwise, we just push the current operator onto the stack.
"Prior to placing the operator on the stack, operators that are on the stack, and which are to be completed prior to the current operator, are popped."
This is illustrated in the following table:
6) We can view a left parenthesis as a high-precedence operator when it is an input symbol and a low-precedence operator when it is on the stack. Right parentheses are treated as the special case.
The whole process for the input string mentioned above:
First, the symbol a is read, so it is passed through to the output.
This conversion requires O(N) time and works in one pass through the input.
Refer to the code:
https://github.com/sesiria/Algs/blob/master/cp3/ex3_23.cpp
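The rules above can be sketched as follows, separately from the linked solution. The sketch assumes well-formed input, single-character operands (letters or digits), and only the operators + and * plus parentheses; the names precedence, emit, and infixToPostfix are hypothetical.

```cpp
#include <cctype>
#include <stack>
#include <string>

// Precedence of an operator while it sits on the stack.
int precedence(char op)
{
    switch (op)
    {
        case '*': return 2;
        case '+': return 1;
        default:  return 0;   // '(' is lowest, so operators never pop it
    }
}

// Append one symbol to the output, separated by a space.
void emit(std::string &out, char symbol)
{
    if (!out.empty())
        out += ' ';
    out += symbol;
}

std::string infixToPostfix(const std::string &infix)
{
    std::string out;
    std::stack<char> ops;
    for (char ch : infix)
    {
        if (std::isspace(static_cast<unsigned char>(ch)))
            continue;
        if (std::isalnum(static_cast<unsigned char>(ch)))
            emit(out, ch);                    // operand: straight to the output
        else if (ch == '(')
            ops.push(ch);                     // always stacked
        else if (ch == ')')
        {
            while (!ops.empty() && ops.top() != '(')
            {
                emit(out, ops.top());
                ops.pop();
            }
            if (!ops.empty())
                ops.pop();                    // discard the '(' without output
        }
        else if (ch == '+' || ch == '*')
        {
            // Pop operators of higher or equal precedence, stopping at '('.
            while (!ops.empty() && ops.top() != '(' &&
                   precedence(ops.top()) >= precedence(ch))
            {
                emit(out, ops.top());
                ops.pop();
            }
            ops.push(ch);
        }
    }
    while (!ops.empty())                      // end of input: flush the stack
    {
        emit(out, ops.top());
        ops.pop();
    }
    return out;
}
```

Calling infixToPostfix on the input string above yields the expected postfix output. Each symbol is pushed and popped at most once, which is why the conversion is linear.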
Function Calls
When there is a function call, all the important information that needs to be saved, such as register values and the return address (CS:IP), is saved "on a piece of paper" in an abstract way and put at the top of a pile. Then control is transferred to the new function, which is free to replace the registers with its own values. If it makes other function calls, it follows the same procedure. When the function wants to return, it looks at the "paper" at the top of the pile and restores all the registers. It then makes the return jump.
This pile is a stack, so function calls are implemented with a stack.
A bad use of recursion: printing a container
/**
 * Print container from start up to but not including end.
 */
template <typename Iterator>
void print(Iterator start, Iterator end, std::ostream & out = std::cout)
{
    if(start == end)
        return;
    out << *start++ << std::endl; // Print and advance start
    print(start, end, out);
}
This program is likely to run out of stack space when the container is very large. It is an example of an extremely bad use of recursion known as tail recursion.
Tail recursion refers to a recursive call at the last line, after which the current stack frame no longer needs to be restored. Tail recursion can be mechanically eliminated by enclosing the body in a while loop and replacing the recursive call with one assignment per function argument.
A mechanically improved version generated by this algorithm:
/**
 * Print container from start up to but not including end.
 */
template <typename Iterator>
void print(Iterator start, Iterator end, std::ostream & out = std::cout)
{
    while(true)
    {
        if(start == end)
            return;
        out << *start++ << std::endl; // Print and advance start.
    }
}
Recursion can always be completely removed(compilers do so in converting to assembly language), but doing so can be quite tedious.
The general strategy requires using a stack and is worthwhile only if you can manage to put the bare minimum on the stack. We will not dwell on this further, except to point out that although nonrecursive programs are certainly generally faster than equivalent recursive programs, the speed advantage rarely justifies the lack of clarity that results from removing the recursion.
3.7 The Queue ADT
Like stacks, queues are lists. With a queue, insertion is done at one end whereas deletion is performed at the other end.
3.7.1 Queue Model
The basic operations on a queue are enqueue, which inserts an element at the end of the list (called the rear), and dequeue, which deletes (and returns) the element at the start of the list (known as the front). Refer to Figure 3.27.
3.7.2 Array Implementation of Queues
Both the linked list (list) and array (vector) implementations of a queue are legitimate, and both give fast O(1) running times for every operation.
We will discuss the array implementation of queues.
Queue implementation with a circular array:
https://github.com/sesiria/Algs/blob/master/cp3/ex3_33.cpp
Queue implementation with a singly linked list:
https://github.com/sesiria/Algs/blob/master/cp3/ex3_32.cpp
Queue implementation with a circular linked list:
https://github.com/sesiria/Algs/blob/master/cp3/ex3_35.cpp
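For the circular array variant, a minimal sketch (separate from the linked solutions) might look like this. The class name IntQueue, the fixed capacity, and the exception types are assumptions made for illustration; a production queue would more likely grow the array when it is full.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical fixed-capacity queue of ints stored in a circular array.
class IntQueue
{
  public:
    explicit IntQueue(std::size_t capacity) : theArray(capacity) {}

    bool empty() const { return currentSize == 0; }
    bool full() const { return currentSize == theArray.size(); }

    // Store x at the rear, wrapping around to index 0 past the end of the array.
    void enqueue(int x)
    {
        if (full())
            throw std::overflow_error("enqueue on full queue");
        theArray[back] = x;
        back = (back + 1) % theArray.size();
        ++currentSize;
    }

    // Remove and return the element at the front, wrapping around as needed.
    int dequeue()
    {
        if (empty())
            throw std::underflow_error("dequeue on empty queue");
        int x = theArray[front];
        front = (front + 1) % theArray.size();
        --currentSize;
        return x;
    }

  private:
    std::vector<int> theArray;
    std::size_t front = 0;        // index of the oldest element
    std::size_t back = 0;         // index where the next element will be stored
    std::size_t currentSize = 0;  // distinguishes a full queue from an empty one
};
```

Keeping currentSize alongside front and back is one way to tell a full queue from an empty one, since in both cases the two indices are equal.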
3.7.3 Applications of Queues
Some simple examples of queue usage:
When jobs are submitted to a printer, they are arranged in order of arrival. Thus, essentially, jobs sent to a printer are placed on a queue.
Virtually every real-life line is a queue. Lines at ticket counters are queues, because service is first-come first-served.
Another example concerns computer networks. There are many network setups of personal computers in which the disk is attached to one machine, known as the file server. Users on other machines are given access to files on a first-come first-served basis, so the data structure is a queue.
A whole branch of mathematics known as queueing theory deals with computing, probabilistically, how long users expect to wait on a line, how long the line gets, and other such questions. The answer depends on how frequently users arrive to the line and how long it takes to process a user once the user is served.
Some complicated cases cannot be handled analytically, so we use a queue to simulate the problem. Refer to Chapter 6.
Summary