以前匆匆看过,现在回头来补一补基础知识。还是以视频或者知乎上的课程笔记翻译 为基础。最好也做下作业吧。
以及吴恩达的机器学习点击打开链接
今天先储备下基础,点击打开链接 python我也用了大半年了,但还是用到哪查到哪
Note:
Python中没有 x--,x++ 操作
布尔型:Python实现了所有的布尔逻辑,但用的是英语,而不是我们习惯的操作符(比如&&和||等)。
t = True
f = False
print type(t) # Prints "<type 'bool'>"
print t and f # Logical AND; prints "False"
print t or f # Logical OR; prints "True"
print not t # Logical NOT; prints "False"
print t != f # Logical XOR; prints "True"
hw12 = '%s %s %d' % (hello, world, 12) # sprintf style string formatting
字符串对象:
s = "hello"
print s.capitalize() # Capitalize a string; prints "Hello" 首字母大写
print s.upper() # Convert a string to uppercase; prints "HELLO" 全部大写
print s.rjust(7) # Right-justify a string, padding with spaces; prints " hello" 右对齐补空格
print s.center(7) # Center a string, padding with spaces; prints " hello " 中间对齐 补空格
print s.replace('l', '(ell)') # Replace all instances of one substring with another;
# prints "he(ell)(ell)o" 字符替换
print ' world '.strip() # Strip leading and trailing whitespace; prints "world"
如果想要在循环体内访问每个元素的指针,可以使用内置的enumerate函数
animals = ['cat', 'dog', 'monkey']
for idx, animal in enumerate(animals):
print '#%d: %s' % (idx + 1, animal)
# Prints "#1: cat", "#2: dog", "#3: monkey", each on its own line
使用 列表推导 简化代码
nums = [0, 1, 2, 3, 4]
squares = [x ** 2 for x in nums]
print squares # Prints [0, 1, 4, 9, 16]
包含条件的列表推导,条件在后边
nums = [0, 1, 2, 3, 4]
even_squares = [x ** 2 for x in nums if x % 2 == 0]
print even_squares # Prints "[0, 4, 16]"
循环Loops:在字典中,用键来迭代更加容易。
d = {'person': 2, 'cat': 4, 'spider': 8}
for animal in d:
legs = d[animal]
print 'A %s has %d legs' % (animal, legs)
# Prints "A person has 2 legs", "A spider has 8 legs", "A cat has 4 legs"
如果你想要访问键和对应的值,那就使用iteritems方法:
d = {'person': 2, 'cat': 4, 'spider': 8}
for animal, legs in d.iteritems():
print 'A %s has %d legs' % (animal, legs)
# Prints "A person has 2 legs", "A spider has 8 legs", "A cat has 4 legs"
字典推导Dictionary comprehensions:和列表推导类似,但是允许你方便地构建字典。
nums = [0, 1, 2, 3, 4]
even_num_to_square = {x: x ** 2 for x in nums if x % 2 == 0}
print even_num_to_square # Prints "{0: 0, 2: 4, 4: 16}"
集合:集合是独立不同个体的无序集合
循环Loops:在集合中循环的语法和在列表中一样,但是集合是无序的,所以你在访问集合的元素的时候,不能做关于顺序的假设。
animals = {'cat', 'dog', 'fish'}
for idx, animal in enumerate(animals):
print '#%d: %s' % (idx + 1, animal)
# Prints "#1: fish", "#2: dog", "#3: cat" 无序的
集合推导Set comprehensions:和字典推导一样,可以很方便地构建集合:
from math import sqrt
nums = {int(sqrt(x)) for x in range(30)}
print nums # Prints "set([0, 1, 2, 3, 4, 5])"
元组Tuples
元组是一个值的 有序列表(不可改变)。从很多方面来说,元组和列表都很相似。和列表最重要的不同在于,元 组可以在字典中用作键,还可以作为集合的元素,而列表不行。例子如下:d = {(x, x + 1): x for x in range(10)} # Create a dictionary with tuple keys
print d
t = (5, 6) # Create a tuple
print type(t) # Prints "<type 'tuple'>"
print d[t] # Prints "5"
print d[(1, 2)] # Prints "1"
Numpy
创建数组 array
b = np.array([[1,2,3],[4,5,6]]) # Create a rank 2 array
整型数组访问:当我们使用切片语法访问数组时,得到的总是原数组的一个子集。整型数组访问允许我们利用其它数组的数据构建一个新的数组: 构建索引矩阵,前边的是行号,后边的是列号
import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
# An example of integer array indexing.
# The returned array will have shape (3,) and
print a[[0, 1, 2], [0, 1, 0]] # Prints "[1 4 5]"
# The above example of integer array indexing is equivalent to this:
print np.array([a[0, 0], a[1, 1], a[2, 0]]) # Prints "[1 4 5]"
# When using integer array indexing, you can reuse the same
# element from the source array:
print a[[0, 0], [1, 1]] # Prints "[2 2]"
# Equivalent to the previous integer array indexing example
print np.array([a[0, 1], a[0, 1]]) # Prints "[2 2]"
可以用来选择或者更改矩阵中每行中的一个元素:
import numpy as np
# Create a new array from which we will select elements
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
print a # prints "array([[ 1, 2, 3],
# [ 4, 5, 6],
# [ 7, 8, 9],
# [10, 11, 12]])"
# Create an array of indices
b = np.array([0, 2, 0, 1])
# Select one element from each row of a using the indices in b
print a[np.arange(4), b] # Prints "[ 1 6 7 11]"
# Mutate one element from each row of a using the indices in b
a[np.arange(4), b] += 10
print a # prints "array([[11, 2, 3],
# [ 4, 5, 16],
# [17, 8, 9],
# [10, 21, 12]])
布尔型数组访问:布尔型数组访问可以让你选择数组中任意元素。通常,这种访问方式用于选取数组中满足某些条件的元素,举例如下:数据筛选,bool Mask矩阵,元矩阵[Mask]
import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]],dtype =float32)
bool_idx = (a > 2) # Find the elements of a that are bigger than 2;
# this returns a numpy array of Booleans of the same
# shape as a, where each slot of bool_idx tells
# whether that element of a is > 2.
print bool_idx # Prints "[[False False]
# [ True True]
# [ True True]]"
# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print a[bool_idx] # Prints "[3 4 5 6]"
# We can do all of the above in a single concise statement:
print a[a > 2] # Prints "[3 4 5 6]"
数组运算
print x / y
print np.divide(x, y)
和MATLAB不同,*是元素逐个相乘,而不是矩阵乘法。在Numpy中使用dot来进行矩阵乘法
广播Broadcasting
广播是一种强有力的机制,它让Numpy可以让不同大小的矩阵在一起进行数学计算。我们常常会有一个小的矩阵和一个大的矩阵,然后我们会需要用小的矩阵对大的矩阵做一些计算。
举个例子,如果我们想要把一个向量加到矩阵的每一行,我们可以这样做:
import numpy as np
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x) # Create an empty matrix with the same shape as x
# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
y[i, :] = x[i, :] + v
# Now y is the following
# [[ 2 2 4]
# [ 5 5 7]
# [ 8 8 10]
# [11 11 13]]
print y
这样是行得通的,但是当x矩阵非常大,利用循环来计算就会变得很慢很慢。我们可以换一种思路:
import numpy as np
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
vv = np.tile(v, (4, 1)) # Stack 4 copies of v on top of each other 把V 复制成[4,1]份,四行一列都是v
print vv # Prints "[[1 0 1]
# [1 0 1]
# [1 0 1]
# [1 0 1]]"
y = x + vv # Add x and vv elementwise
print y # Prints "[[ 2 2 4
# [ 5 5 7]
# [ 8 8 10]
# [11 11 13]]"
Numpy广播机制可以让我们不用创建vv,就能直接运算,看看下面例子:
import numpy as np
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = x + v # Add v to each row of x using broadcasting
print y # Prints "[[ 2 2 4]
# [ 5 5 7]
# [ 8 8 10]
# [11 11 13]]"
对两个数组使用广播机制要遵守下列规则:
- 如果数组的秩不同,使用1来将秩较小的数组进行扩展,直到两个数组的尺寸的长度都一样。
- 如果两个数组在某个维度上的长度是一样的,或者其中一个数组在该维度上长度为1,那么我们就说这两个数组在该维度上是相容的。
- 如果两个数组在所有维度上都是相容的,他们就能使用广播。
- 如果两个输入数组的尺寸不同,那么注意其中较大的那个尺寸。因为广播之后,两个数组的尺寸将和那个较大的尺寸一样。
- 在任何一个维度上,如果一个数组的长度为1,另一个数组长度大于1,那么在该维度上,就好像是对第一个数组进行了复制。
如果上述解释看不明白,可以读一读文档和这个解释。译者注:强烈推荐阅读文档中的例子。
支持广播机制的函数是全局函数。哪些是全局函数可以在文档中查找。
下面是一些广播机制的使用:
import numpy as np
# Compute outer product of vectors
v = np.array([1,2,3]) # v has shape (3,)
w = np.array([4,5]) # w has shape (2,)
# To compute an outer product, we first reshape v to be a column
# vector of shape (3, 1); we can then broadcast it against w to yield
# an output of shape (3, 2), which is the outer product of v and w:
# [[ 4 5]
# [ 8 10]
# [12 15]]
print np.reshape(v, (3, 1)) * w
# Add a vector to each row of a matrix
x = np.array([[1,2,3], [4,5,6]])
# x has shape (2, 3) and v has shape (3,) so they broadcast to (2, 3),
# giving the following matrix:
# [[2 4 6]
# [5 7 9]]
print x + v
# Add a vector to each column of a matrix
# x has shape (2, 3) and w has shape (2,).
# If we transpose x then it has shape (3, 2) and can be broadcast
# against w to yield a result of shape (3, 2); transposing this result
# yields the final result of shape (2, 3) which is the matrix x with
# the vector w added to each column. Gives the following matrix:
# [[ 5 6 7]
# [ 9 10 11]]
print (x.T + w).T
# Another solution is to reshape w to be a row vector of shape (2, 1);
# we can then broadcast it directly against x to produce the same
# output.
print x + np.reshape(w, (2, 1))
# Multiply a matrix by a constant:
# x has shape (2, 3). Numpy treats scalars as arrays of shape ();
# these can be broadcast together to shape (2, 3), producing the
# following array:
# [[ 2 4 6]
# [ 8 10 12]]
print x * 2
SciPy
图像操作
SciPy提供了一些操作图像的基本函数。比如,它提供了将图像从硬盘读入到数组的函数,也提供了将数组中数据写入的硬盘成为图像的函数。
from scipy.misc import imread, imsave, imresize
# Read an JPEG image into a numpy array
img = imread('assets/cat.jpg')
print img.dtype, img.shape # Prints "uint8 (400, 248, 3)"
# We can tint the image by scaling each of the color channels
# by a different scalar constant. The image has shape (400, 248, 3);
# we multiply it by the array [1, 0.95, 0.9] of shape (3,);
# numpy broadcasting means that this leaves the red channel unchanged,
# and multiplies the green and blue channels by 0.95 and 0.9
# respectively.
img_tinted = img * [1, 0.95, 0.9]
# Resize the tinted image to be 300 by 300 pixels.
img_tinted = imresize(img_tinted, (300, 300))
# Write the tinted image back to disk
imsave('assets/cat_tinted.jpg', img_tinted)
MATLAB文件
函数scipy.io.loadmat和scipy.io.savemat能够让你读和写MATLAB文件。具体请查看文档。
Matplotlib
Matplotlib是一个作图库。这里简要介绍matplotlib.pyplot模块,功能和MATLAB的作图功能类似。
绘图
matplotlib库中最重要的函数是Plot。该函数允许你做出2D图形,如下:
import numpy as np
import matplotlib.pyplot as plt
# Compute the x and y coordinates for points on a sine curve
x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)
# Plot the points using matplotlib
plt.plot(x, y)
plt.show() # You must call plt.show() to make graphics appear.
运行上面代码会产生下面的作图:
—————————————————————————————————————————

只需要少量工作,就可以一次画不同的线,加上标签,坐标轴标志等。
import numpy as np
import matplotlib.pyplot as plt
# Compute the x and y coordinates for points on sine and cosine curves
x = np.arange(0, 3 * np.pi, 0.1)
y_sin = np.sin(x)
y_cos = np.cos(x)
# Plot the points using matplotlib
plt.plot(x, y_sin)
plt.plot(x, y_cos)
plt.xlabel('x axis label')
plt.ylabel('y axis label')
plt.title('Sine and Cosine')
plt.legend(['Sine', 'Cosine'])
plt.show()
—————————————————————————————————————————

可以在文档中阅读更多关于plot的内容。
绘制多个图像
可以使用subplot函数来在一幅图中画不同的东西:
import numpy as np
import matplotlib.pyplot as plt
# Compute the x and y coordinates for points on sine and cosine curves
x = np.arange(0, 3 * np.pi, 0.1)
y_sin = np.sin(x)
y_cos = np.cos(x)
# Set up a subplot grid that has height 2 and width 1,
# and set the first such subplot as active.
plt.subplot(2, 1, 1)
# Make the first plot
plt.plot(x, y_sin)
plt.title('Sine')
# Set the second subplot as active, and make the second plot.
plt.subplot(2, 1, 2)
plt.plot(x, y_cos)
plt.title('Cosine')
# Show the figure.
plt.show()
—————————————————————————————————————————

关于subplot的更多细节,可以阅读文档。