2.2 Data Abstraction
As we consider the wide set of things in the world that we would like to represent in our programs, we find that most of them have compound structure.
2.2.1 Example: Rational Numbers
A rational number is a ratio of integers: <numerator>/<denominator>
Actually dividing integers produces a float approximation, losing the exact precision of integers.
>>> 1/3
0.3333333333333333
>>> 1/3 == 0.33333333333333330000
True
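The later notes use rational, numer, denom, and mul_rational without defining them. The following is a minimal sketch of what they could look like, assuming a two-element list representation reduced to lowest terms with math.gcd. The representation and the helper rationals_are_equal are assumptions made for illustration, not something these notes specify.

```python
from math import gcd

def rational(n, d):
    """Construct n/d in lowest terms (assumed representation: a two-element list)."""
    g = gcd(n, d)
    return [n // g, d // g]

def numer(x):
    """Select the numerator of rational x."""
    return x[0]

def denom(x):
    """Select the denominator of rational x."""
    return x[1]

def mul_rational(x, y):
    """Multiply two rationals using only the constructor and selectors."""
    return rational(numer(x) * numer(y), denom(x) * denom(y))

def rationals_are_equal(x, y):
    """Compare by cross-multiplication, so no float division is involved."""
    return numer(x) * denom(y) == numer(y) * denom(x)

third = rational(1, 3)
product = mul_rational(third, rational(3, 1))
print(rationals_are_equal(product, rational(1, 1)))  # True
```

Because the values stay as integer pairs, 1/3 times 3 is exactly 1 rather than a float approximation.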
2.2.2 Pairs
Python provides a compound structure called a list, which can be constructed by placing expressions within square brackets separated by commas.
>>> [10, 20]
[10, 20]
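The elements of a list can be accessed in two ways: with the element selection operator (square brackets and a 0-based index), or with multiple assignment, which unpacks the list and binds each element to a name.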
>>> pair = [10, 20]
>>> pair[0]
10
>>> x, y = pair
>>> y
20
The equivalent function for the element selection operator is called getitem.
>>> from operator import getitem
>>> getitem(pair, 0)
10
HW04 Mobile
math.inf is equivalent to float('inf'): a floating-point value that represents positive infinity.
>>> import math
>>> print(math.inf)
inf
>>> print(-math.inf)
-inf
>>> print(float('inf')) # float('inf') is Python's version of infinity.
inf
>>> print(float('-inf')) # float('-inf') is Python's version of negative infinity.
-inf
2.2.3 Abstraction Barriers
An abstraction barrier violation occurs whenever a part of the program that can use a higher level function instead uses a function in a lower level.
Abstraction barriers make programs easier to maintain and to modify. The square_rational function would not require updating even if we altered the representation of rational numbers.
>>> def square_rational(x):
        return mul_rational(x, x)
Referring directly to numerators and denominators would violate one abstraction barrier. The square_rational_violating_once function would need to be changed whenever the selector or constructor signatures changed.
def square_rational_violating_once(x):
    return rational(numer(x) * numer(x), denom(x) * denom(x))
Assuming that rationals are represented as two-element lists would violate two abstraction barriers. It would require updating whenever the implementation of rational numbers changed.
def square_rational_violating_twice(x):
    return [x[0] * x[0], x[1] * x[1]]
example: HW04 Q2
2.2.4 The Properties of Data
Stated as a behavior condition, if a pair p was constructed from values x and y, then select(p, 0) returns x, and select(p, 1) returns y. We don't actually need the list type to create pairs. Instead, we can implement two functions pair and select that fulfill this description just as well as a two-element list. This is the functional representation of a pair:
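>>> def pair(x, y):
        """Return a function that represents a pair."""
        def get(index):
            if index == 0:
                return x
            elif index == 1:
                return y
        return get
>>> def select(p, i):
        """Return the element at index i of pair p."""
        return p(i)
>>> p = pair(10, 20)
>>> select(p, 0)
10
>>> select(p, 1)
20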
2.3 Sequences
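A sequence is an ordered collection of values. Sequences are not instances of a particular built-in type or abstract data representation, but instead a collection of behaviors that are shared among several different types of data. In particular:
Length. A sequence has a finite length. An empty sequence has length 0.
Element selection. A sequence has an element corresponding to any non-negative integer index less than its length, starting at 0 for the first element.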
2.3.1 List
A list value is a sequence that can have arbitrary length.
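The built-in len function returns the length of a sequence: len(list_name)
Element selection with square brackets returns the element at a given index, counting from 0. For a list named digits: digits[index]
Lists can be added together and multiplied by integers. For sequences, addition and multiplication do not add or multiply elements; they combine and replicate the sequences themselves.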
Any values can be included in a list, including another list. Element selection can be applied multiple times in order to select a deeply nested element in a list containing lists. list[x][y]
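A short sketch of these list operations, reusing the digits list that appears later in these notes. The nested list named pairs is made up for illustration.

```python
digits = [1, 8, 2, 8]

length = len(digits)          # number of elements
last = digits[3]              # element selection, indices start at 0
combined = [2, 7] + digits * 2  # * replicates the list, + concatenates

pairs = [[10, 20], [30, 40]]  # a list whose elements are lists
inner = pairs[1]              # first selection picks a nested list
nested = pairs[1][0]          # second selection picks an element inside it

print(length)    # 4
print(last)      # 8
print(combined)  # [2, 7, 1, 8, 2, 8, 1, 8, 2, 8]
print(inner)     # [30, 40]
print(nested)    # 30
```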
2.3.2 Sequence Iteration
A for statement consists of a single clause with the form:
for <name> in <expression>:
    <suite>
Some common sequence patterns:
Sequence unpacking. A common pattern in programs is to have a sequence of elements that are themselves sequences, but all of a fixed length.
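As an illustration of sequence unpacking, the sketch below counts the two-element lists whose elements are equal. The data in pairs and the name same_count are chosen for the example.

```python
pairs = [[1, 2], [2, 2], [2, 3], [4, 4]]

same_count = 0
for x, y in pairs:  # each two-element list is unpacked into x and y
    if x == y:
        same_count = same_count + 1

print(same_count)  # 2
```

Binding two names in the for header only works because every element of pairs has exactly two elements.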
Ranges. A range is another built-in type of sequence in Python, which represents a range of integers. Ranges are created with range, which takes two integer arguments: the first number and one beyond the last number in the desired range.
>>> range(1, 10)
range(1, 10)
>>> list(range(5, 8))
[5, 6, 7]
>>> list(range(4))
[0, 1, 2, 3]
Ranges commonly appear as the expression in a for header to specify the number of times that the suite should be executed. A common convention is to use a single underscore character for the name in the for header if the name is unused in the suite:
>>> for _ in range(2):
        print('Go Bears!')
Go Bears!
Go Bears!
2.3.3 Sequence Processing
The Python for statement can simplify a function by iterating over the element values directly, whereas a while loop must introduce an extra name, such as index, to track the current position.
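To make the comparison concrete, here is a sketch of one counting function written both ways. The names count_while and count_for are chosen for this example.

```python
def count_while(s, value):
    """Count occurrences of value in s, tracking position with an explicit index."""
    total, index = 0, 0
    while index < len(s):
        if s[index] == value:
            total = total + 1
        index = index + 1
    return total

def count_for(s, value):
    """Count occurrences of value in s by iterating over the elements directly."""
    total = 0
    for elem in s:
        if elem == value:
            total = total + 1
    return total

digits = [1, 8, 2, 8]
print(count_while(digits, 8))  # 2
print(count_for(digits, 8))    # 2
```

The for version needs neither the index name nor the element selection expression.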
List Comprehensions.
The general form of a list comprehension is:
[<map expression> for <name> in <sequence expression> if <filter expression>]
The for keyword above is not part of a for statement, but instead part of a list comprehension because it is contained within square brackets.
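Two small comprehensions following the general form, one with only a map expression and one with a filter expression as well. The list odds and the result names are illustrative.

```python
odds = [1, 3, 5, 7, 9]

# Map only: evaluate x + 1 for every element.
incremented = [x + 1 for x in odds]

# Map and filter: keep only the elements that divide 25 evenly.
divisors_of_25 = [x for x in odds if 25 % x == 0]

print(incremented)     # [2, 4, 6, 8, 10]
print(divisors_of_25)  # [1, 5]
```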
Aggregation. Another common pattern in sequence processing is to aggregate all values in a sequence into a single value.
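The built-in functions sum, min, and max are examples of aggregation. The sketch below also combines sum with a list comprehension to find perfect numbers (numbers equal to the sum of their proper divisors). The helper proper_divisors is a name assumed for this example.

```python
digits = [1, 8, 2, 8]
total = sum(digits)     # 19
smallest = min(digits)  # 1
largest = max(digits)   # 8

def proper_divisors(n):
    """Return the positive divisors of n that are smaller than n."""
    return [x for x in range(1, n) if n % x == 0]

# Aggregate each list of divisors into one value and compare it with n.
perfect = [n for n in range(1, 100) if sum(proper_divisors(n)) == n]

print(total, smallest, largest)  # 19 1 8
print(perfect)                   # [6, 28]
```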
Higher-Order Functions.
Many forms of aggregation can be expressed as repeatedly applying a two-argument function to the reduced value so far and each element in turn.
>>> def reduce(reduce_fn, s, initial):
        reduced = initial
        for x in s:
            reduced = reduce_fn(reduced, x)  # combine the running value (not initial) with x
        return reduced
>>> from operator import mul
>>> reduce(mul, [2, 4, 8], 1)
64
Conventional Names. In the computer science community, the more common name for apply_to_all is map and the more common name for keep_if is filter.
In Python programs, it is more common to use list comprehensions directly rather than higher-order functions, but both approaches to sequence processing are widely used.
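These notes name apply_to_all and keep_if without showing them. The sketch below assumes they are defined with list comprehensions and compares them with the built-in map and filter. The helpers square and is_even are made up for the example.

```python
def apply_to_all(map_fn, s):
    """Assumed definition: apply map_fn to every element of s."""
    return [map_fn(x) for x in s]

def keep_if(filter_fn, s):
    """Assumed definition: keep the elements of s for which filter_fn is true."""
    return [x for x in s if filter_fn(x)]

def square(x):
    return x * x

def is_even(x):
    return x % 2 == 0

digits = [1, 8, 2, 8]

# The built-in map and filter return lazy iterators, so list() is used to view the results.
squares_builtin = list(map(square, digits))
evens_builtin = list(filter(is_even, digits))

print(apply_to_all(square, digits), squares_builtin)  # [1, 64, 4, 64] [1, 64, 4, 64]
print(keep_if(is_even, digits), evens_builtin)        # [8, 2, 8] [8, 2, 8]
```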
2.3.4 Sequence Abstraction
Membership. A value can be tested for membership in a sequence.
>>> digits
[1, 8, 2, 8]
>>> 2 in digits
True
>>> 1828 not in digits
True
Slicing. A slice of a sequence is any contiguous span of the original sequence, designated by a pair of integers.
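A few slices of the digits list as a sketch. The two integers are a starting index and one beyond the ending index; omitting either one defaults to the corresponding end of the sequence.

```python
digits = [1, 8, 2, 8]

first_two = digits[0:2]    # indices 0 and 1
from_second = digits[1:]   # index 1 through the end
all_but_last = digits[:-1]  # a negative index counts from the end

print(first_two)     # [1, 8]
print(from_second)   # [8, 2, 8]
print(all_but_last)  # [1, 8, 2]
```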
2.3.5 Strings
String literals can express arbitrary text, surrounded by either single or double quotation marks.
Membership. Unlike other sequences, the in operator on strings matches substrings rather than single elements.
>>> 'here' in "Where's Waldo?"
True
Multiline Literals. Triple quotes delimit string literals that span multiple lines.
>>> """The Zen of Python
claims, "Readability counts."
Read more: import this."""
'The Zen of Python\nclaims, "Readability counts."\nRead more: import this.'
The \n (pronounced “backslash en”) is a single element that represents a new line. Although it appears as two characters (backslash and “n”), it is considered a single character for the purposes of length and element selection.
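A small check of this claim, using a made-up three-character string:

```python
text = 'a\nb'  # three characters: 'a', a newline, and 'b'

length = len(text)
middle = text[1]

print(length)          # 3
print(middle == '\n')  # True
```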
String Coercion. A string can be created from any object in Python by calling the str constructor function with an object value as its argument.
>>> str(2) + ' is an element of ' + str(digits)
'2 is an element of [1, 8, 2, 8]'
2.3.6 Tree
The data abstraction for a tree consists of the constructor tree and the selectors label and branches.
Recursive description (wooden tree): A tree has a root label and a sequence of branches. Each branch of a tree is a tree. A tree with no branches is called a leaf. The root of each sub-tree of a tree is called a node in that tree.
Relative description (family trees): Each location in a tree is called a node. Each node has a label that can be any value. One node can be the parent/child of another.
People often refer to labels by their locations: "each parent is the sum of its children".
def tree(root_label, branches=[]):
    for branch in branches:  # check that every branch is itself a tree
        assert is_tree(branch), 'branches must be trees'
    return [root_label] + list(branches)

def label(tree):
    return tree[0]

def branches(tree):
    return tree[1:]
A tree is well-formed only if it has a root label and all branches are also trees.
def is_tree(tree):
    if type(tree) != list or len(tree) < 1:
        return False
    for branch in branches(tree):
        if not is_tree(branch):
            return False
    return True
The is_leaf function checks whether or not a tree has branches.
def is_leaf(tree):
    return not branches(tree)
>>> t = tree(3, [tree(1), tree(2, [tree(1), tree(1)])])
>>> t
[3, [1], [2, [1], [1]]]
>>> label(t)
3
>>> branches(t)
[[1], [2, [1], [1]]]
>>> label(branches(t)[1])
2
>>> is_leaf(t)
False
>>> is_leaf(branches(t)[0])
True
Functions that take trees as input or return trees as output are often tree recursive themselves. In practice, one does not often create trees using the tree constructor and a set of explicit labels, but instead generates the tree programmatically. Tree-recursive functions can be used to construct trees.
>>> def fib_tree(n):
        if n <= 1:
            return tree(n)  # Base case: create a leaf with the tree constructor.
        else:
            # Both branches are built by recursive calls to fib_tree. Why n-2 and n-1?
            # 1) Each Fibonacci number is the sum of the preceding two.
            # 2) People often refer to labels by their locations:
            #    "each parent is the sum of its children".
            left, right = fib_tree(n-2), fib_tree(n-1)
            return tree(label(left) + label(right), [left, right])
Tree-recursive functions are also used to process trees.
def count_leaves(t):
    if is_leaf(t):
        return 1
    else:
        # Count the leaves in each branch of the tree, then add the counts.
        branch_counts = [count_leaves(b) for b in branches(t)]
        return sum(branch_counts)
Exercise: implement leaves, which returns a list of the leaf labels of a tree.
Hint: If you sum a list of lists, starting from an empty list, you get a list containing the elements of those lists.
>>> sum([ [1], [2, 3], [4] ], []) # Providing a starting value of an empty list
[1, 2, 3, 4]
>>> sum([ [[1]], [2] ], [])
[[1], 2] # Sum doesn't remove all of the nested structure, it just gets rid of one level
def leaves(tree):
    """Return a list containing the leaf labels of tree.

    >>> leaves(fib_tree(5))
    [1, 0, 1, 0, 1, 1, 0, 1]
    """
    if is_leaf(tree):
        return [label(tree)]
    else:
        # Get a list of leaf labels for each branch, then flatten one level.
        return sum([leaves(b) for b in branches(tree)], [])
A function that creates a tree from another tree is typically also recursive.
def increment_leaves(t):
    """Return a tree like t but with leaf labels incremented."""
    if is_leaf(t):
        return tree(label(t) + 1)
    else:
        bs = [increment_leaves(b) for b in branches(t)]
        return tree(label(t), bs)

def increment(t):
    """Return a tree like t but with all labels incremented."""
    # Increment the root label and, recursively, every label in every branch.
    return tree(label(t) + 1, [increment(b) for b in branches(t)])
Example: Printing Trees
# First version: prints one label per line, so we don't have to read nested lists.
def print_tree(t):
    print(label(t))
    for b in branches(t):
        print_tree(b)  # Without indentation, the structure is not visible.

# Second version: indents each label according to its depth.
def print_tree(t, indent=0):  # The root label isn't indented at all.
    # Two spaces per indentation level, followed by the label as a string.
    print('  ' * indent + str(label(t)))
    for b in branches(t):
        print_tree(b, indent+1)  # A branch of a branch is indented twice.
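To see the pieces working together, the sketch below repeats compact versions of the definitions from this section so that it runs on its own, then builds fib_tree(3) and prints it with indentation. Two spaces per level is an assumption taken from the comment on the indented print_tree.

```python
def tree(root_label, branches=[]):
    for branch in branches:
        assert is_tree(branch), 'branches must be trees'
    return [root_label] + list(branches)

def label(tree):
    return tree[0]

def branches(tree):
    return tree[1:]

def is_tree(tree):
    if type(tree) != list or len(tree) < 1:
        return False
    return all(is_tree(b) for b in branches(tree))

def is_leaf(tree):
    return not branches(tree)

def fib_tree(n):
    if n <= 1:
        return tree(n)
    left, right = fib_tree(n - 2), fib_tree(n - 1)
    return tree(label(left) + label(right), [left, right])

def count_leaves(t):
    if is_leaf(t):
        return 1
    return sum([count_leaves(b) for b in branches(t)])

def print_tree(t, indent=0):
    print('  ' * indent + str(label(t)))  # two spaces per level (assumed)
    for b in branches(t):
        print_tree(b, indent + 1)

t = fib_tree(3)
print(t)                # [2, [1], [1, [0], [1]]]
print(count_leaves(t))  # 3
print_tree(t)
# 2
#   1
#   1
#     0
#     1
```

The indented output shows which labels are children of which, something the flat version of print_tree cannot show.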