In scikit-learn, every object that provides fit_transform is an instance of a class, so you need to create that instance first. Here you are calling fit_transform on the class itself, as if it were a static method.
So either create the instance with vectorizer = TfidfVectorizer() and then call vectorizer.fit_transform(data.status), or call TfidfVectorizer().fit_transform(data.status) directly.
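Summary:
(1) TfidfVectorizer()
TfidfVectorizer is a class, so it must be instantiated before use:
vectorizer = TfidfVectorizer()
Then call the method on the instance:
vectorizer.fit_transform(data.status)
Or instantiate and call the method in a single expression:
TfidfVectorizer().fit_transform(data.status)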
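A minimal runnable sketch of the two forms above. The list statuses is a hypothetical stand-in for data.status, which is not shown in these notes; any iterable of strings should behave the same way.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for data.status: a small collection of short texts.
statuses = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]

# Form 1: keep the instance, so the fitted vocabulary can be reused later.
vectorizer = TfidfVectorizer()
X_named = vectorizer.fit_transform(statuses)

# Form 2: instantiate and call in one expression.
# The fitted vectorizer is discarded, only the matrix is kept.
X_inline = TfidfVectorizer().fit_transform(statuses)

# Both forms return a sparse matrix with one row per document.
print(X_named.shape, X_inline.shape)
```

Form 1 is usually the more practical choice when the same vocabulary has to be applied to other data afterwards, because form 2 leaves no fitted object to call transform on.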
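(2) fit, transform, and fit_transform
fit_transform combines fit and transform: it learns the vocabulary and IDF weights from the data (fit) and then returns that same data converted to a TF-IDF matrix (transform).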
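The relationship between the three methods can be checked with a short sketch. The document lists train_docs and new_docs are made up for illustration and are not part of the original notes.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical example data.
train_docs = ["the cat sat on the mat", "the dog sat on the log"]
new_docs = ["the cat chased the dog"]

# Two separate steps: fit learns the vocabulary and IDF weights,
# transform applies them to produce the TF-IDF matrix.
two_step = TfidfVectorizer()
two_step.fit(train_docs)
X_two_step = two_step.transform(train_docs)

# One combined step on the same data.
one_step = TfidfVectorizer()
X_one_step = one_step.fit_transform(train_docs)

# For unseen data, call transform only, so the vocabulary learned
# from train_docs is reused and the column layout stays the same.
X_new = one_step.transform(new_docs)

print(abs(X_two_step - X_one_step).max())  # largest element-wise difference
print(X_one_step.shape[1], X_new.shape[1])  # same number of columns
```

Words in new_docs that were not seen during fit are simply ignored by transform, which is why the number of columns does not change.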
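(3) CountVectorizer and TfidfVectorizer:
There are two ways to do TF-IDF preprocessing with sklearn:
The first is to vectorize the text with the CountVectorizer class and then apply the TfidfTransformer class to the resulting counts;
The second is to use TfidfVectorizer directly, which performs the vectorization and the TF-IDF preprocessing in one step.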
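A small sketch comparing the two ways, assuming default parameters for all three classes. The docs list is invented for illustration.

```python
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)

# Hypothetical example data.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]

# Way 1: raw term counts first, then TF-IDF weighting of those counts.
counts = CountVectorizer().fit_transform(docs)
X_two_stage = TfidfTransformer().fit_transform(counts)

# Way 2: vectorization and TF-IDF weighting in a single class.
X_direct = TfidfVectorizer().fit_transform(docs)

# With default settings the two matrices should agree.
print(abs(X_two_stage - X_direct).max())
```

The first way can be useful when the raw counts are needed for something else as well; the second is shorter when only the TF-IDF matrix is wanted.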
https://blog.youkuaiyun.com/m0_37324740/article/details/79411651