[Data Analysis Project in Practice] A/B Testing: Removing Uncontrolled Data, shapiro, mannwhitneyu, ttest_ind

Published 2024-06-13 17:45:16

License: CC 4.0 BY-SA

Tags: #data-analysis #ab-testing #data-mining #python

Dataset overview: 10,000 records, split into a control group and a treatment group. The question is whether variant changes revenue.

1. Data Cleaning

Add a reusable inspection function.
It uses f-strings for formatting.
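
As a quick aside, here is a minimal sketch of the f-string format spec that the function relies on: a fill character, then an alignment symbol, then a width. The variable names and the width of 20 are arbitrary choices for illustration and do not come from the original project.

```python
# Illustrative only: the string and the width are arbitrary examples.
title = "Head"

centered = f"{title:-^20}"  # fill '-', '^' centers, total width 20
left = f"{title:-<20}"      # '<' left-aligns, padding goes on the right
right = f"{title:->20}"     # '>' right-aligns, padding goes on the left

print(centered)  # --------Head--------
print(left)      # Head----------------
print(right)     # ----------------Head

# When the padding splits evenly, str.center produces the same banner.
assert centered == title.center(20, "-")
```

The function that follows uses this same kind of spec, with a width of 70, to print a centered banner above each section of its output.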

def jiben_xinxi(data, head=5):  # reusable "basic info" helper (jiben xinxi = basic information)
    # Equivalent to print('Shape:{}'.format(data.shape))
    print(f'Shape:{data.shape}')
    # Equivalent to print('Head'.center(70, '-'))
    # Format spec is fill character + alignment + width: > right-aligns, < left-aligns, ^ centers
    print(f'{"Head":-^70}')
    print(data.head(head))
    print(f'{"Dtypes":-^70}')  # same as print("Dtypes".center(70, '-'))
    print(data.dtypes)
    print(f'{"NULL":-^70}')
    print(data.isnull().sum())
    print(f'{"Describe":-^70}')
    print(data.describe().