Debug: TypeError: Feature names are only supported if all input features have string names, but your

Article link: https://blog.youkuaiyun.com/qq_52501948/article/details/144156823

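One bug had me stuck for several days, so I am writing down how I dealt with it. I have not fully thought it through myself; this is only a record, and I hope to resolve it more thoroughly later.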

How the problem arose:

It came up while training algorithms such as random forest.

df.head()

Printed output:

track_id track_name popularity duration_ms explicit artists artists_id release_date danceability energy ... r&b mandopop japanese rap hong indie kong house singer songwriter
0 5KpWHEh32vzxkttIK3KHKI 國際孤獨等級 51 193747 False ['Gareth.T'] 6R57JlNKlnNrYaji0vw8xx 2023-03-03 0.692 0.189 ... 0 0 0 0 1 0 1 0 0 0
1 1sb71AvysPMJlsx4qYtTpG 緊急聯絡人 58 222668 False ['Gareth.T'] 6R57JlNKlnNrYaji0vw8xx 2023-11-30 0.513 0.373 ... 0 0 0 0 1 0 1 0 0 0
2 2mMgDVazhRjNoOweYMP1pz 青春告別式 50 256967 False ['Hins Cheung'] 2MVfNjocvNrE03cQuxpsWK 2023-12-31 0.433 0.380 ... 0 0 0 0 0 0 0 0 0 0
3 6UuJk5rvrxSnOAwv6uSr5b 給你幸福所以幸福 51 244693 False ['Jay Fung'] 4EXI1ieJe2VDbvNsKOaNQL 2023-10-24 0.414 0.456 ... 0 0 0 0 0 0 0 0 0 0
4 1mUhvuqX0ScGodDTdnRtuL 永久損毀 49 232002 False ['MC 張天賦', 'Panther Chan'] 5tRk0bqMQubKAVowp35XtC 2023-12-19 0.514 0.405 ... 0 0 0 0 0 0 0 0 0 0

5 rows × 45 columns

The above is the data.

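from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4, random_state=0)
print(x_train.columns.tolist())
print(x_train.columns.dtype)
# Workaround described later in this post: convert every column label to a plain str.
# Without this line, the fit() call below raises the TypeError.
x_train = x_train.rename(str, axis="columns")
# Assumption: x_test needs the same conversion so that predict() sees matching names.
x_test = x_test.rename(str, axis="columns")

# Decision tree model
dt_model = DecisionTreeRegressor(random_state=0)
dt_model.fit(x_train, y_train)
y_pred_dt = dt_model.predict(x_test)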

The error is raised at dt_model.fit(x_train, y_train):

TypeError: Feature names are only supported if all input features have string names, but your input has ['str', 'str_'] as column name types
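The ['str', 'str_'] in the message suggests that the column labels have more than one exact type. A minimal sketch for finding out which labels belong to which type is shown here. The helper name group_labels_by_type and the class StrSubclass are hypothetical; StrSubclass only stands in for a str subclass such as numpy.str_, and the labels are made up.

```python
from collections import defaultdict


class StrSubclass(str):
    """Hypothetical stand-in for a str subclass such as numpy.str_."""


def group_labels_by_type(labels):
    """Map each exact type name to the labels that have that type."""
    groups = defaultdict(list)
    for label in labels:
        groups[type(label).__qualname__].append(label)
    return dict(groups)


# Made-up labels: two plain str and one str subclass.
labels = ["popularity", "energy", StrSubclass("indie")]
groups = group_labels_by_type(labels)

print(sorted(groups))          # ['StrSubclass', 'str']
print(groups["StrSubclass"])   # ['indie']
```

The function accepts any iterable, so it should also work when called as group_labels_by_type(x_train.columns) on the real data, where the second type name would presumably be str_.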

Approach 1

Convert the type of the column names:

x.columns = x.columns.astype(str)
print(x.columns.dtype)
print(x.columns.tolist())
for col_name in x.columns:
    assert isinstance(col_name, str)

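The core of this is really x.columns = x.columns.astype(str); that one line is enough.

After running the code above, the dtype of the columns is object, i.e. string, yet dt_model.fit(x_train, y_train) still raises the same error. I do not yet know why this approach fails.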
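One possible explanation, offered as an assumption rather than something verified on this dataset: the 'str_' in the message looks like numpy.str_, which is a subclass of str. An isinstance check accepts subclass instances, so the assert above cannot detect such labels, while the error message lists exact type names. The sketch below shows the difference between the two checks; StrSubclass is a hypothetical stand-in for numpy.str_.

```python
class StrSubclass(str):
    """Hypothetical stand-in for a str subclass such as numpy.str_."""


label = StrSubclass("indie")

# isinstance() is satisfied by instances of subclasses as well.
passes_isinstance = isinstance(label, str)

# An exact type comparison is not.
is_exact_str = type(label) is str

print(passes_isinstance, is_exact_str)  # True False
```

If this is what happens here, a stricter assertion such as assert type(col_name) is str would have failed in the loop and pointed at the offending labels.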

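Approach 2

x_train = x_train.rename(str, axis="columns")

I switched to another fix that I found on StackOverflow, and to my surprise it worked right away. The principle is similar to approach 1, and I have not yet figured out why this one works.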
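A plausible reason why this works, stated as an assumption: rename(str, axis="columns") calls str() on every label, and str() applied to an instance of a str subclass returns a plain str. The sketch below demonstrates only that Python behavior on made-up labels; to_plain_str and StrSubclass are hypothetical names, with StrSubclass standing in for numpy.str_.

```python
class StrSubclass(str):
    """Hypothetical stand-in for a str subclass such as numpy.str_."""


def to_plain_str(labels):
    """Return the labels converted with str(), so each one is exactly a str."""
    return [str(label) for label in labels]


# Made-up labels mixing a plain str and a str subclass.
mixed = ["popularity", StrSubclass("indie")]
plain = to_plain_str(mixed)

print(sorted({type(label).__qualname__ for label in mixed}))  # ['StrSubclass', 'str']
print(sorted({type(label).__qualname__ for label in plain}))  # ['str']
```

After the conversion only one exact type remains, which matches what the error message asks for. On the real data it is probably safest to apply the same conversion to x_test, or to x before the split, so that training and prediction see identical column names.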

track_id	track_name	popularity	duration_ms	explicit	artists	artists_id	release_date	danceability	energy	...	r&b	indie	house
0	5KpWHEh32vzxkttIK3KHKI	國際孤獨等級	51	193747	False	['Gareth.T']	6R57JlNKlnNrYaji0vw8xx	2023-03-03	0.692	0.189	...	1	1
1	1sb71AvysPMJlsx4qYtTpG	緊急聯絡人	58	222668	False	['Gareth.T']	6R57JlNKlnNrYaji0vw8xx	2023-11-30	0.513	0.373	...	1	1
2	2mMgDVazhRjNoOweYMP1pz	青春告別式	50	256967	False	['Hins Cheung']	2MVfNjocvNrE03cQuxpsWK	2023-12-31	0.433	0.380	...	0	0
3	6UuJk5rvrxSnOAwv6uSr5b	給你幸福所以幸福	51	244693	False	['Jay Fung']	4EXI1ieJe2VDbvNsKOaNQL	2023-10-24	0.414	0.456	...	0	0
4	1mUhvuqX0ScGodDTdnRtuL	永久損毀	49	232002	False	['MC 張天賦', 'Panther Chan']	5tRk0bqMQubKAVowp35XtC	2023-12-19	0.514	0.405	...	0	0