施密特正交化
施密特正交化(Gram-Schmidt Orthogonality)是常用的求欧式空间正交基的方法。给定一个线性无关向量组a1,a2,...,ama_1,a_2,...,a_ma1,a2,...,am,经过施密特正交化可以得到一组正交向量组b1,b2,...,bmb_1,b_2,...,b_mb1,b2,...,bm,正交向量组中的每个向量正交化之后,就得到一个标准正交向量组。
下面我们来看施密特正交化的具体步骤。因为向量组a1,a2,...,ama_1,a_2,...,a_ma1,a2,...,am是线性无关的,所以可以用这些向量来构造b1,b2,...,bmb_1,b_2,...,b_mb1,b2,...,bm,具体如下:
b1=a1b_1 = a_1b1=a1
b2=a2+ha1b_2 = a_2 + ha_1b2=a2+ha1
b3=a3+h1a1+h2a2b_3 = a_3 + h_1a_1 + h_2a_2b3=a3+h1a1+h2a2
.........
反过来,我们也可以从b1,b2,...,bmb_1,b_2,...,b_mb1,b2,...,bm退回到a1,a2,...,ama_1,a_2,...,a_ma1,a2,...,am
a1=b1a_1 = b_1a1=b1
a2=b2+kb1a_2 = b_2 + k b_1a2=b2+kb1
a3=b3+k1b1+k2b2a_3 = b_3 + k_1b_1 + k_2b_2a3=b3+k1b1+k2b2
.........
==>==>==>
b1=a1b_1 = a_1b1=a1
b2=a2−kb1b_2 = a_2 - kb_1b2=a2−kb1
b3=a3−k1b1−k2b2b_3 = a_3 - k_1b_1 - k_2b_2b3=a3−k1b1−k2b2
.........
只需要求出k,k1,k2k,k_1,k_2k,k1,k2这些系数,就可以得到a1,a2,...,ama_1,a_2,...,a_ma1,a2,...,am到b1,b2,...,bmb_1,b_2,...,b_mb1,b2,...,bm的转化方式。由于b1,b2,...,bmb_1,b_2,...,b_mb1,b2,...,bm是正交的,所以我们有:
b1Tb2=0⇒b1T(a2−kb1)=0⇒k=b1Ta2b1Tb1b_1^Tb_2=0 ⇒ b_1^T(a_2 - kb_1)=0 ⇒ k=\frac{b_1^Ta_2}{b_1^Tb_1}b1Tb2=0⇒b1T(a2−kb1)=0⇒k=b1Tb1b1Ta2
b1Tb3=0⇒b1T(a3−k1b1−k2b2)=0⇒k1=b1Ta3b1Tb1b_1^Tb_3=0 ⇒ b_1^T(a_3 - k_1b_1 - k_2b_2)=0 ⇒ k_1=\frac{b_1^Ta_3}{b_1^Tb_1}b1Tb3=0⇒b1T(a3−k1b1−k2b2)=0⇒k1=b1Tb1b1Ta3
b2Tb3=0⇒b2T(a3−k1b1−k2b2)=0⇒k2=b2Ta3b2Tb2b_2^Tb_3=0 ⇒ b_2^T(a_3 - k_1b_1 - k_2b_2)=0 ⇒ k_2=\frac{b_2^Ta_3}{b_2^Tb_2}b2Tb3=0⇒b2T(a3−k1b1−k2b2)=0⇒k2=b2Tb2b2Ta3
.........
根据这个公式很容易写出施密特正交化代码:
def schmidt_orthogonality(matrix_org, debug=False):
"""
b1 = a1, b2 = a2 - kb1, b3 = a3 - k1b1 - k2b2
:param matrix_org: m x n matrix, m >= n 且满秩
:return:
"""
m, n = matrix_org.shape
matrix_ortho = matrix_org.copy()
matrix_ortho = np.asarray(matrix_ortho, dtype=np.float)
coefficient = np.zeros(shape=(m, n)) # 系数矩阵k、k1、k2
coefficient[0, 0] = 1 # b1 = a1
for i in range(1, n): # 开始处理下一列
coefficient[i, i] = 1
for j in range(i):
b_j = matrix_ortho[:, j]
k_j = np.dot(b_j, matrix_org[:, i]) / np.dot(b_j, b_j)
coefficient[j, i] = k_j
matrix_ortho[:, i] -= k_j * b_j
# 正交向量b1,b2...做正交化处理,系数也做相应的改变
for i in range(n):
devider = np.dot(matrix_ortho[:, i], matrix_ortho[:, i])
if abs(devider) < 1e-16: # 避免除以0
matrix_ortho[:, i] *= 0
else:
devider = np.sqrt(devider)
matrix_ortho[:, i] /= devider
coefficient[i, :] *= devider
print('result:', matrix_ortho)
return matrix_ortho, coefficient
例如输入矩阵
[[1, 2, 2],
[1, 0, 2],
[0, 1, 1]]
得到:
matrix_ortho=
[[ 0.70710678 0.57735027 -0.40824829]
[ 0.70710678 -0.57735027 0.40824829]
[ 0. 0.57735027 0.81649658]]
coefficient=
[[1.41421356 1.41421356 2.82842712]
[0. 1.73205081 0.57735027]
[0. 0. 0.81649658]]
matrix_ortho * coefficient =
[[ 1.00000000e+00 2.00000000e+00 2.00000000e+00]
[ 1.00000000e+00 -7.64210971e-17 2.00000000e+00]
[ 0.00000000e+00 1.00000000e+00 1.00000000e+00]]
由于数值计算有微小的误差,所以可以看到结果里面有-7.64210971e-17
而不是正好等于0。
QR分解
施密特正交化得到的向量组正交,系数矩阵是上三角,满足A=QR。所以施密特正交化的结果就是矩阵的QR分解结果。
其他
对于非满秩矩阵可以先用单位向量扩展到满秩矩阵,然后再做施密特正交化。