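Python CSV file open error: _csv.Error: line contains NULL byte

This post shows how to read CSV files in different encodings with Python, and gives a way to handle CSV files saved as UCS-2 LE. It covers reading specific rows and columns, converting types, and computing simple statistics.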


A normal CSV file can be read like this (Python 2 syntax):

#coding:utf-8
import csv
csvfilename = 'demo.csv'


print u'################ get specific rows'
with open(csvfilename, 'rb') as csvfile:
    reader = csv.reader(csvfile)
    rows = [row for row in reader]
print rows[0], rows[1], rows[2], rows[3]

print u'################ get specific columns'
with open(csvfilename,'rb') as csvfile:
    reader = csv.reader(csvfile)
    column0 = [row[0] for row in reader]
with open(csvfilename, 'rb') as csvfile:
    reader = csv.reader(csvfile)
    column2 = [row[2] for row in reader]  # third column (index 2)

print column0, column2
for i in column0:
    print type(i)  # every value read by csv is a str
# print u'sum:', sum(column0)  # fails: the values are strings, not numbers
removed = column0.pop(0)  # remove and return the first element (the header cell, if the file has a header row)
print u'removed element:', removed
print u'list after removal:', column0
print type(column0)
for i in column0:
    print type(i)
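The listing above opens the file once per column. As a minimal sketch, assuming Python 3 and a made-up three-column sample (the `id,name,score` data is hypothetical, not from the original file), the same rows and columns can be pulled out of a single read, with an explicit conversion before summing:

```python
import csv
import io

# Hypothetical sample data standing in for demo.csv.
sample = io.StringIO("id,name,score\n1,a,10\n2,b,20\n")

rows = list(csv.reader(sample))      # read every row once
header, data = rows[0], rows[1:]     # split off the header row

column0 = [row[0] for row in data]   # first column, still strings
column2 = [row[2] for row in data]   # third column, still strings

# csv always yields strings, so convert before doing arithmetic.
scores = [int(v) for v in column2]
total = sum(scores)

print(header, column0, column2, total)
```

Keeping the rows in a list trades memory for convenience; for very large files, iterating the reader directly is usually the better choice.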

 

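Reading a CSV file saved in UCS-2 LE format (the encoding Notepad++ reports when the file is opened) raises an error: Python CSV error: line contains NULL byte. I followed the answers in this post: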

https://stackoverflow.com/questions/4166070/python-csv-error-line-contains-null-byte
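Why a NULL byte shows up at all: in UTF-16 (which is what UCS-2 LE amounts to for ordinary characters), every ASCII character takes two bytes, and one of them is `\x00`. The short sketch below, assuming Python 3 and a made-up sample string, compares the two encodings:

```python
# Illustration only: the sample text is made up.
sample = "id,value\n1,3.5\n"

utf8_bytes = sample.encode("utf-8")
utf16_bytes = sample.encode("utf-16-le")  # little-endian UTF-16, no BOM

has_null_utf8 = b"\x00" in utf8_bytes
has_null_utf16 = b"\x00" in utf16_bytes

print(has_null_utf8, has_null_utf16)
print(utf16_bytes[:8])  # every other byte is \x00 for ASCII text
```

So when such a file is opened in byte mode and handed to `csv.reader`, the parser sees those zero bytes in every line. Decoding the file as UTF-16 first removes them.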

 

The code is as follows:

#coding:utf-8
from __future__ import division
import csv
import codecs
# twsfilename = "mem.csv"  # 134 columns
twsfilename = "tws.csv"  # 123 columns

# read rows
print u'################ get a specific row'
with codecs.open(twsfilename, 'rb', "utf-16") as csvfile:
    reader = csv.reader(csvfile)
    rows = [row for row in reader]
    # the fields are tab-separated, so split the single parsed cell on tabs
    print rows[1][0].split("\t")
    print type(rows[1][0].split("\t"))

print u'################ get a specific column'
with codecs.open(twsfilename, 'rb', "utf-16") as csvfile:
    reader = csv.reader(csvfile, delimiter='\t')
    reader.next()  # skip one row to drop the header; comment this out if there is no header
    column1 = [row[1] for row in reader]
    print column1
    values = [float(i) for i in column1]  # convert first: max() on strings compares lexicographically
    print "max:", max(values)
    s = sum(values)
    print "sum:", s, "count:", len(values)
    print "avg", round(s / len(values), 3)
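For readers on Python 3, here is a rough equivalent, offered as a sketch rather than the original author's code. It assumes the file is UTF-16 and tab-delimited with one header row, as above. The helper name `read_float_column` and the tiny demo file are made up for illustration.

```python
import csv
import os
import tempfile

def read_float_column(path, index, encoding="utf-16", delimiter="\t"):
    """Read one column (after the header row) and convert it to floats."""
    # Text mode with an explicit encoding decodes UTF-16 before csv sees it.
    with open(path, "r", encoding=encoding, newline="") as f:
        reader = csv.reader(f, delimiter=delimiter)
        next(reader, None)  # skip the header row
        return [float(row[index]) for row in reader if row]

# Build a tiny UTF-16, tab-delimited file so the sketch is self-contained.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_utf16.csv")
with open(demo_path, "w", encoding="utf-16", newline="") as f:
    f.write("name\tvalue\na\t1.5\nb\t2.5\n")

values = read_float_column(demo_path, 1)
total = sum(values)
stats = (max(values), total, len(values), round(total / len(values), 3))
print(stats)
```

In Python 3 the built-in `open` accepts `encoding` directly, so `codecs.open` is not needed.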

 

 

 

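Finally, thanks to the expert who posted that answer. I had tried many other suggestions and none of them worked, for example .replace('\0', '') or re-saving the file in another encoding. If you are handed several hundred of these CSV files, you are not going to re-save them one by one!

For how to save the resulting list as a CSV file, see http://www.cnblogs.com/hanxing/p/6905094.html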
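If converted copies are still wanted, the re-saving can at least be scripted. Below is a minimal sketch, assuming Python 3 and that every `*.csv` file in the source directory really is UTF-16. The function names and the temporary demo directories are hypothetical.

```python
import glob
import os
import tempfile

def convert_to_utf8(src_path, dst_path, src_encoding="utf-16"):
    """Decode one file from src_encoding and write it back out as UTF-8."""
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()  # reads the whole file into memory
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)

def convert_directory(src_dir, dst_dir, pattern="*.csv"):
    """Convert every matching file in src_dir, writing results to dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    converted = []
    for src_path in sorted(glob.glob(os.path.join(src_dir, pattern))):
        dst_path = os.path.join(dst_dir, os.path.basename(src_path))
        convert_to_utf8(src_path, dst_path)
        converted.append(dst_path)
    return converted

# Demo with two made-up UTF-16 files in temporary directories.
src_dir = tempfile.mkdtemp()
dst_dir = tempfile.mkdtemp()
for name in ("a.csv", "b.csv"):
    with open(os.path.join(src_dir, name), "w", encoding="utf-16", newline="") as f:
        f.write("id\tvalue\n1\t3.5\n")

converted = convert_directory(src_dir, dst_dir)
print(converted)
```

Writing to a separate directory keeps the originals untouched in case a file turns out not to be UTF-16.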
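As a quick sketch of that last step (assuming Python 3; the rows and the output path are made up), a list of rows can be written with `csv.writer` and read back to check the result:

```python
import csv
import os
import tempfile

# Hypothetical rows: a header followed by data.
rows = [["name", "value"], ["a", 1.5], ["b", 2.5]]

out_path = os.path.join(tempfile.mkdtemp(), "out.csv")

# newline="" lets the csv module control line endings itself.
with open(out_path, "w", encoding="utf-8", newline="") as f:
    csv.writer(f).writerows(rows)

# Read the file back; note that numbers come back as strings.
with open(out_path, "r", encoding="utf-8", newline="") as f:
    read_back = list(csv.reader(f))

print(read_back)
```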

 

Reposted from: https://www.cnblogs.com/hanxing/p/6906268.html
