Error when reading a CSV file with pandas after saving it with numpy.savetxt()
The numpy.savetxt() method:
numpy.savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ', encoding=None)
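delimiter is the string that separates columns; for a CSV file use ','
header is a string written at the beginning of the file (prefixed with the comments string, '# ' by default); here it serves as the column name
encoding is the encoding used to write the output file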
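As a small sketch of how these parameters shape the file that gets written (the file name `params_demo.csv` and the header text `a,b` are made up for this illustration and do not come from the original script):

```python
import os
import tempfile

import numpy as np

# Hypothetical file name and header text, chosen only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "params_demo.csv")
data = np.array([[1, 2], [3, 4]])

# delimiter=',' separates the columns; header is written as the first line,
# prefixed with the default comments string '# '.
np.savetxt(path, data, fmt="%d", delimiter=",", header="a,b", encoding="utf-8")

with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()

print(lines)  # ['# a,b', '1,2', '3,4']
```

Note that the header line starts with '# ', which matters later when pandas uses that line as the column names.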
The pandas method for reading CSV files:
pandas.read_csv(filepath_or_buffer, *, sep=_NoDefault.no_default, delimiter=None, header='infer', names=_NoDefault.no_default, index_col=None, usecols=None, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=None, infer_datetime_format=_NoDefault.no_default, keep_date_col=False, date_parser=_NoDefault.no_default, date_format=None, dayfirst=False, cache_dates=True, iterator=False, chunksize=None, compression='infer', thousands=None, decimal='.', lineterminator=None, quotechar='"', quoting=0, doublequote=True, escapechar=None, comment=None, encoding=None, encoding_errors='strict', dialect=None, on_bad_lines='error', delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None, storage_options=None, dtype_backend=_NoDefault.no_default)
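Here we only need pandas.read_csv(path, header)
path is the location of the CSV file: str, path object or file-like object
header defaults to 'infer', which behaves like header=0 when names is not passed: the first row of the file is used as the column names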
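A minimal sketch of what the header argument changes, using an in-memory string instead of a file (the column name `score` and the values are invented for the example):

```python
import io

import pandas as pd

# Invented sample data: one header row followed by three values.
csv_text = "score\n6\n3\n8\n"

# header=0: the first row supplies the column names.
with_header = pd.read_csv(io.StringIO(csv_text), header=0)

# header=None: every row is treated as data and columns are numbered 0, 1, ...
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(list(with_header.columns), len(with_header))
print(list(no_header.columns), len(no_header))
```

With header=0 the frame has three data rows under the column `score`; with header=None the same text yields four rows, because the header line is kept as data.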
Code that raises the error:
import pandas as pd
import numpy as np
data = np.random.randint(1, 10, 100)
np.savetxt('demo_csv.csv', data, delimiter=',', fmt='%.3e', header="数据")
indata = pd.read_csv('./demo_csv.csv', header=0)
print(indata)
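This generates 100 random integers from 1 to 9 (the upper bound of np.random.randint is exclusive) and saves them to demo_csv.csv with np.savetxt(),
then reads the data back with pandas.
Error message: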
E:\python\python_sduty_project\venv\Scripts\python.exe E:\python\python_sduty_project\csv文件pandas用法.py
Traceback (most recent call last):
File "E:\python\python_sduty_project\csv文件pandas用法.py", line 9, in <module>
indata = pd.read_csv('./demo_csv.csv', header=0)
File "E:\python\python_sduty_project\venv\lib\site-packages\pandas\io\parsers\readers.py", line 912, in read_csv
return _read(filepath_or_buffer, kwds)
File "E:\python\python_sduty_project\venv\lib\site-packages\pandas\io\parsers\readers.py", line 577, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "E:\python\python_sduty_project\venv\lib\site-packages\pandas\io\parsers\readers.py", line 1407, in __init__
self._engine = self._make_engine(f, self.engine)
File "E:\python\python_sduty_project\venv\lib\site-packages\pandas\io\parsers\readers.py", line 1679, in _make_engine
return mapping[engine](f, **self.options)
File "E:\python\python_sduty_project\venv\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 93, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas\_libs\parsers.pyx", line 548, in pandas._libs.parsers.TextReader.__cinit__
File "pandas\_libs\parsers.pyx", line 637, in pandas._libs.parsers.TextReader._get_header
File "pandas\_libs\parsers.pyx", line 848, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas\_libs\parsers.pyx", line 859, in pandas._libs.parsers.TextReader._check_tokenize_status
File "pandas\_libs\parsers.pyx", line 2017, in pandas._libs.parsers.raise_parser_error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xca in position 2: invalid continuation byte
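Cause: an encoding mismatch. The first script does not pass encoding to np.savetxt(), so the header 数据 is written in the platform's default encoding; the byte 0xca in the error message is consistent with GBK, the usual default on Chinese-language Windows. pandas.read_csv() decodes the file as UTF-8 by default and cannot decode those bytes. The corrected version passes encoding='utf-8' to both calls: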
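The mismatch can be reproduced without writing any file. The sketch below assumes the header line was written as GBK, which is an inference from the 0xca byte rather than something the traceback states, and shows that decoding those bytes as UTF-8 fails at the same position:

```python
# The header line that np.savetxt() writes for header="数据".
header_line = "# 数据"

# Assumption: the file was written as GBK (typical default on Chinese Windows).
raw = header_line.encode("gbk")
print(hex(raw[2]))  # 0xca

error_message = None
try:
    # pandas.read_csv() decodes as UTF-8 unless told otherwise.
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    error_message = str(exc)
    print(error_message)

# Decoding with the encoding the bytes were written in succeeds.
print(raw.decode("gbk"))
```

Under that assumption, reading the original file with encoding='gbk' would be another way to avoid the error, although writing and reading UTF-8 explicitly is less dependent on the machine's locale.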
import pandas as pd
import numpy as np
data = np.random.randint(1, 10, 100)
np.savetxt('demo_csv.csv', data, delimiter=',', fmt='%.3e', header="数据", encoding='utf-8')
indata = pd.read_csv('./demo_csv.csv', header=0, encoding='utf-8')
print(indata)
Changing the code to the version above resolves the error.
The output is:
E:\python\python_sduty_project\venv\Scripts\python.exe E:\python\python_sduty_project\csv文件pandas用法.py
# 数据
0 6.0
1 3.0
2 3.0
3 8.0
4 7.0
.. ...
95 3.0
96 8.0
97 4.0
98 8.0
99 8.0
[100 rows x 1 columns]
Process finished with exit code 0
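In the output above the column is named "# 数据" rather than "数据", because np.savetxt() prefixes the header with its comments string. One possible way to get a clean column name, shown here as a sketch rather than as part of the original fix, is to pass comments='' (the file name `clean_header.csv` is made up for the example):

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical file name, written to a temporary directory for this example.
path = os.path.join(tempfile.mkdtemp(), "clean_header.csv")
data = np.random.randint(1, 10, 100)

# comments='' removes the '# ' prefix that savetxt puts in front of the header.
np.savetxt(path, data, delimiter=",", fmt="%.3e", header="数据",
           comments="", encoding="utf-8")

indata = pd.read_csv(path, header=0, encoding="utf-8")
print(indata.columns.tolist())  # ['数据']
```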