aone322 - 优快云 Blog

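Original: pandas groupby by row count

To process an n-row DataFrame in groups of ten rows, divide each index value by 10 and round down, so that every ten consecutive rows fall into the same group: df.groupby(lambda x: math.floor(x/10)). This relies on the DataFrame having the default integer index (0, 1, 2, and so on).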

2020-09-24 10:02:55 1961 1
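
As a rough illustration of the ten-row grouping described in the entry above, the sketch below builds a small made-up DataFrame (the `value` column and its 25 rows are assumptions for demonstration only). It also shows a positional variant based on `numpy.arange`, which is not from the original post but may help when the index is not the default 0..n-1 range.

```python
import numpy as np
import pandas as pd

# Hypothetical demo data: 25 rows with the default RangeIndex 0..24.
df = pd.DataFrame({"value": range(25)})

# Integer-divide each index label by 10: labels 0-9 -> group 0, 10-19 -> group 1, ...
# (equivalent to math.floor(x / 10) for non-negative integer labels)
group_sizes = df.groupby(lambda idx: idx // 10).size()

# Positional variant: group by row position instead of index label,
# so it still works after filtering or with a non-integer index.
position_groups = np.arange(len(df)) // 10
group_sums = df.groupby(position_groups)["value"].sum()

print(group_sizes.tolist())  # sizes of each ten-row group
print(group_sums.tolist())   # per-group sums of the demo column
```

The last group simply holds the leftover rows when n is not a multiple of ten.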

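Original: Understanding Facebook's CTR prediction model

The model uses GBDT for feature selection and feature combination. For each sample, the position of the leaf node it falls into in each tree is converted into a feature vector, the per-tree vectors are concatenated into a single feature vector, and that vector is fed into LR as input to predict CTR. There are two sampling methods: one subsamples all samples uniformly, the other keeps all positive samples and downsamples the negatives. The problem with negative downsampling is that it shifts the predicted CTR values...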

2020-09-17 18:09:19 194
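
The shift caused by negative downsampling, mentioned in the entry above, can be undone with the re-calibration formula from the Facebook paper, q = p / (p + (1 - p) / w), where p is the prediction in the downsampled space and w is the fraction of negatives kept. The excerpt above is truncated before any correction is discussed, so the sketch below is only an illustration; the function name `recalibrate_ctr` is hypothetical.

```python
def recalibrate_ctr(p: float, w: float) -> float:
    """Map a prediction made on negative-downsampled data back to the original scale.

    p: predicted click probability in the downsampled space (0 <= p <= 1)
    w: negative downsampling rate, i.e. the fraction of negatives kept (0 < w <= 1)
    """
    if not 0.0 < w <= 1.0:
        raise ValueError("w must be in (0, 1]")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    return p / (p + (1.0 - p) / w)


# Worked example: a true CTR of 0.1 with only 10% of negatives kept.
true_ctr = 0.1
w = 0.1
# After downsampling, the observed positive rate becomes q / (q + w * (1 - q)).
shifted = true_ctr / (true_ctr + w * (1.0 - true_ctr))
recovered = recalibrate_ctr(shifted, w)
print(shifted, recovered)
```

With w = 1 (no downsampling) the function returns p unchanged.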

Original: pandas error ValueError: At based indexing on an integer index can only have integer indexers

I wanted to use an index to pull out the single value at one position of a DataFrame: ans = df.loc[df[df['proba'] == max(df['proba'])].index, 'item_kind']. The result came back as a pandas object (a Series) rather than a single value. After switching to at, ans = df.at[df[df['proba'] == max(df['proba'])].index, 'item_kind'], it raised: ValueError: At based indexing on an integer index ca

2020-09-17 17:42:28 4189
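
The excerpt above is cut off before the author's fix, so the following is only one possible way around the error, not necessarily the one used in the post. `DataFrame.at` expects a single scalar label, whereas `.index` on a filtered frame returns an Index object; `Series.idxmax()` returns the scalar label of the first maximum directly. The demo data is made up.

```python
import pandas as pd

# Hypothetical demo data using the column names from the post.
df = pd.DataFrame(
    {"proba": [0.2, 0.9, 0.5], "item_kind": ["a", "b", "c"]}
)

# idxmax() yields one scalar index label (the first row holding the maximum),
# which is what DataFrame.at requires.
best_label = df["proba"].idxmax()
ans = df.at[best_label, "item_kind"]

# Alternative: keep the original filter but take the first label explicitly.
first_label = df[df["proba"] == df["proba"].max()].index[0]
ans_alt = df.at[first_label, "item_kind"]

print(ans, ans_alt)
```

If several rows tie for the maximum, both variants return the value from the first of them.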

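Original: How to plot decision trees with lightgbm

There are two ways to plot trees with lightgbm. Before plotting, install the graphviz package:
pip install graphviz
After installing it, running the plotting code still failed with an error that graphviz's dot could not be found, because graphviz has to be installed on the system as well as in the Python environment. On Linux:
$ sudo yum install graphviz
1. Using plot_tree
fig = plt.figure(figsize=(200, 200))
ax = fig.subplots()
lgb.plot_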

2020-09-17 11:43:26 4069

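Original: LGB cross-validation with KFold

# Import the required packages
import pandas as pd
import lightgbm as lgb
from sklearn.model_selection import StratifiedKFold
# Assume the training data train_data is already prepared; it is a pandas DataFrame containing the feature columns and a score column
train_label = train_data['score']
# Initialize a k-fold generator
NFOLDS = 5
kfold = StratifiedKFold(n_spli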

2020-09-15 17:45:34 1522

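Original: matplotlib.lines.Line2D object at 0x0 and no figure is shown

When plotting in Jupyter Notebook, no figure appeared, and adding plt.show() did not help either. Searching turned up no answer. Running the code printed only a message of the form matplotlib.lines.Line2D object at followed by a memory address. Solution: try restarting...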

2020-02-19 00:54:20 9355 3

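Original: Anaconda terminal on Mac shows "Process completed"

An earlier edit to the bash_profile file caused the Anaconda terminal to show "Process completed" on startup. Searching online showed that this message usually appears when bash_profile is misconfigured, so I located the bash_profile file and corrected the environment variables. In the terminal, go to the home directory:
$ cd ~
Open the bash_profile file:
$ open .bash_profile
The bash_profile text file opens...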

2019-11-01 17:00:39 1955

weixin_42391768's blog