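A brute-force way to solve Exercise 1.2 of Machine Learning (by Zhihua Zhou). The basic idea is to map each choice of k conjunctions out of the 48 possible ones (a redundant representation) onto a 2*3*3 array of 0/1 entries (a non-redundant representation) that marks which of the 18 fully specified feature combinations the disjunction covers. The underlying principle is the uniqueness of the canonical disjunctive normal form: every hypothesis expands into exactly one set of fully specified conjunctions.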
Where the redundancy comes from: within a single combination, a conjunction that generalizes some feature (a wildcard) may already cover another conjunction in the same combination; and different combinations may reduce to the same disjunctive normal form. Because that normal form is unique, collecting the normal forms of all combinations into a set counts every hypothesis exactly once, with nothing missed and nothing duplicated.
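Both kinds of redundancy can be checked on a small example. The following is a minimal sketch, not part of the original script: the helper names `cells` and `coverage` are hypothetical, and `None` is used here as an assumed stand-in for the wildcard value.

```python
import itertools

SIZES = (2, 3, 3)  # number of concrete values of the three features

def cells(conj):
    """Set of fully specified (f1, f2, f3) combinations covered by one conjunction.
    Each entry of conj is a 1-based value, or None for the wildcard (an assumption of this sketch)."""
    ranges = [range(1, n + 1) if v is None else [v] for v, n in zip(conj, SIZES)]
    return frozenset(itertools.product(*ranges))

def coverage(conjs):
    """Union of the cells covered by several conjunctions."""
    out = frozenset()
    for c in conjs:
        out |= cells(c)
    return out

# Redundancy inside one combination: (1, *, *) already covers (1, 1, 1).
inside = coverage([(1, None, None), (1, 1, 1)]) == cells((1, None, None))

# Redundancy across combinations: two different pairs both cover all 18 cells.
pair_a = coverage([(None, None, None), (1, 1, 1)])
pair_b = coverage([(1, None, None), (2, None, None)])
across = pair_a == pair_b and len(pair_a) == 18

print(inside, across)
```

Comparing the covered cells rather than the conjunctions themselves is what removes both kinds of duplicates.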
The limitation is that the method simply enumerates every combination and then counts the distinct results. The number of combinations, C(48, k), grows so quickly that the computation takes a long time for k in the middle of the range. A mathematical derivation, if one exists, should be much faster.
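To get a feel for the cost, the number of combinations the enumeration has to visit can be computed directly. This is a small sketch using only the standard library; the function name `combinations_to_visit` is made up for illustration.

```python
import math

def combinations_to_visit(k, total=48):
    """Number of k-element subsets of the 48 conjunctions."""
    return math.comb(total, k)

for k in (1, 2, 4, 8, 24):
    print(k, combinations_to_visit(k))
```

For k = 4 this is already 194580 combinations, while the number of distinct results can never exceed 2^18 - 1 = 262143, because each result is a non-empty subset of the 18 cells.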
import numpy as np
import itertools as it
# Input: triad -> the values of the three features; the last code of each
#        feature (3, 4, 4) stands for the wildcard "*"
# Output: a 2x3x3 array of 0/1 entries (18 cells) marking the fully specified
#         feature combinations covered by the conjunction
def triad_to_vector(triad):
    dimen1 = triad[0]
    dimen2 = triad[1]
    dimen3 = triad[2]
    # Expand a wildcard into every concrete value of that feature
    if dimen1 == 3:
        dimen1 = [1, 2]
    else:
        dimen1 = [dimen1]
    if dimen2 == 4:
        dimen2 = [1, 2, 3]
    else:
        dimen2 = [dimen2]
    if dimen3 == 4:
        dimen3 = [1, 2, 3]
    else:
        dimen3 = [dimen3]
    vector = np.zeros([2, 3, 3])
    for s1 in dimen1:
        for s2 in dimen2:
            for s3 in dimen3:
                vector[s1-1][s2-1][s3-1] = 1
    return vector
# Input: num -> an integer from 0 to 47 identifying one conjunction
# Output: a triad of feature values (1..3, 1..4, 1..4)
def index_to_triad(num):
    # Decode num as a mixed-radix number with digits (i, j, k) in bases (3, 4, 4)
    k = num % 4
    j = (num // 4) % 4
    i = (num // 16) % 3
    return [i+1, j+1, k+1]
# Input: num -> an integer from 0 to 47 identifying one conjunction
# Output: the 2x3x3 array (18 cells) of the conjunction's disjunctive normal form (DNF)
def index_to_vector(num):
    temp_triad = index_to_triad(num)
    temp_vector = triad_to_vector(temp_triad)
    return temp_vector
# Input: k -> the number of conjunctions in the disjunction
# Output: the number of distinct hypotheses that a disjunction of k conjunctions can express
def get_amounts_of_hypotheses(k):
    final_s = []
    for combo in it.combinations(range(48), k):
        temp_s = []
        for j in range(k):
            temp_vector = index_to_vector(combo[j])
            temp_s.append(temp_vector)
        temp_s = np.array(temp_s)
        # Element-wise OR over the k arrays, yielding one 2x3x3 (18-cell) boolean array
        temp_s = temp_s.any(axis=0)
        # Encode the 18 cells as an integer between 1 and 2^18 - 1
        map_number = 0
        for a in range(2):
            for b in range(3):
                for m in range(3):
                    map_number += (2**(9*a+3*b+m)) * temp_s[a][b][m]
        final_s.append(map_number)
    # Keep each distinct encoding once
    final_s = list(set(final_s))
    print("%d: %d examples" % (k, len(final_s)))
    return len(final_s)

get_amounts_of_hypotheses(4)
# prints "4: 40911 examples"
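The same enumeration can be made cheaper per combination by precomputing one 18-bit integer per conjunction and combining them with bitwise OR, instead of building NumPy arrays inside the loop. The following is a sketch under assumptions, not the original method: the names `conjunction_mask`, `all_masks` and `count_hypotheses` are hypothetical, `None` is an assumed wildcard marker, and the bit layout 9*a + 3*b + c simply mirrors the encoding used above.

```python
import itertools

SIZES = (2, 3, 3)  # number of concrete values of the three features

def conjunction_mask(conj):
    """18-bit integer whose set bits are the cells covered by conj.
    Each entry of conj is a 1-based value, or None for the wildcard (an assumption of this sketch)."""
    ranges = [range(n) if v is None else [v - 1] for v, n in zip(conj, SIZES)]
    mask = 0
    for a, b, c in itertools.product(*ranges):
        mask |= 1 << (9 * a + 3 * b + c)
    return mask

def all_masks():
    """Masks of all 3 * 4 * 4 = 48 conjunctions (each feature: a concrete value or the wildcard)."""
    choices = [list(range(1, n + 1)) + [None] for n in SIZES]
    return [conjunction_mask(c) for c in itertools.product(*choices)]

def count_hypotheses(k):
    """Number of distinct unions obtained from k distinct conjunctions."""
    masks = all_masks()
    seen = set()
    for combo in itertools.combinations(masks, k):
        union = 0
        for m in combo:
            union |= m  # bitwise OR plays the role of the element-wise OR on arrays
        seen.add(union)
    return len(seen)

print(count_hypotheses(1))
```

This still visits all C(48, k) combinations, so it only lowers the constant factor; it does not remove the limitation described at the top.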