Learning Data Mining with Python (Chinese edition: 《Python数据挖掘入门与实践》), Robert Layton, Posts & Telecom Press (人民邮电出版社)
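# Load the dataset
import numpy as np  # import NumPy under the alias np (the alias is a convention; any name works)
dataset_filename = "affinity_dataset.txt"  # path and file name of the dataset, e.g. C:/Dataset/name.txt
X = np.loadtxt(dataset_filename)  # loadtxt reads the dataset into an array
n_samples, n_features = X.shape  # shape gives the array dimensions; for a 2-D array, (rows, columns)
print("This dataset has {0} samples and {1} features".format(n_samples, n_features))  # format fills in the placeholders
# Print a few rows to see what the dataset looks like
print(X[:5])  # [:5] selects the first five rows, indices 0 to 4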
# The names of the features, for your reference.
features = ["bread", "milk", "cheese", "apples", "bananas"]  # list of feature names
In our first example, we will compute the Support and Confidence of the rule “If a person buys Apples, they also buy Bananas”.
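Before walking through the counts step by step, here is a minimal sketch of what the two measures mean. It assumes support is the number of rows in which the rule holds (some texts divide this by the total number of rows instead) and confidence is the fraction of premise rows that also satisfy the conclusion. The array `sample_X` and the helper `rule_stats` are hypothetical names made up for illustration; they are not part of the dataset or the original code.

```python
import numpy as np

# Hypothetical 5-row sample in the same layout as the dataset:
# columns are bread, milk, cheese, apples, bananas (1 = bought, 0 = not bought).
sample_X = np.array([
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1],
])


def rule_stats(X, premise, conclusion):
    """Return (support, confidence) for the rule premise -> conclusion.

    premise and conclusion are column indices into X.
    """
    premise_rows = X[:, premise] == 1
    # Support (as a count): rows where both premise and conclusion hold.
    support = int(np.sum(premise_rows & (X[:, conclusion] == 1)))
    n_premise = int(np.sum(premise_rows))
    # Confidence: share of premise rows that also satisfy the conclusion.
    confidence = support / n_premise if n_premise else 0.0
    return support, confidence


# Rule: "If a person buys apples (column 3), they also buy bananas (column 4)".
support, confidence = rule_stats(sample_X, premise=3, conclusion=4)
print("Support: {0}, Confidence: {1:.3f}".format(support, confidence))
# prints: Support: 2, Confidence: 0.500
```

In this toy sample four rows contain apples and two of those also contain bananas, so the rule has a support of 2 and a confidence of 0.5.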
# First, how many rows contain our premise: that a person is buying apples
num_apple_purchases = 0  # number of people who bought apples, starting at 0
for sample in X:  # iterate over each row of X
    if sample[3] == 1:  # the fourth feature (index 3) equal to 1 means this person bought apples
        num_apple_purchases += 1
print("{0} people bought Apples".format(num_apple_purchases))
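The same count can also be obtained without an explicit loop. The sketch below is an alternative, not the approach used above: it compares the loop against a NumPy boolean-mask sum on a small hypothetical array, `demo_X`, which stands in for the real dataset.

```python
import numpy as np

# Hypothetical stand-in for the dataset; column 3 is "apples".
demo_X = np.array([
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1],
])

# Loop version, mirroring the code above.
loop_count = 0
for sample in demo_X:
    if sample[3] == 1:
        loop_count += 1

# Vectorized version: compare the whole column at once, then count the True values.
vectorized_count = int(np.sum(demo_X[:, 3] == 1))

print("{0} people bought Apples (loop)".format(loop_count))
print("{0} people bought Apples (vectorized)".format(vectorized_count))
```

Both versions give the same number; the vectorized form is usually shorter and faster on large arrays, while the loop makes each step explicit.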
# How many of the cases that