JSON functionality in pandas

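This article shows how to use the pandas library to convert between DataFrame and JSON, including the effect of the different orient values, and how to use json_normalize to turn semi-structured JSON data into a flat table. It also covers read_json parameters such as date parsing and dtype inference.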
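import numpy as np
import pandas as pd
# Serialize a DataFrame of random floats; the default orient for a DataFrame is "columns"
dfj = pd.DataFrame(np.random.randn(5, 2), columns=list('AB'))
dfj.to_json()
# Example output (values differ between runs because the data is random):
# '{"A":{"0":-1.2945235903,"1":0.2766617129,"2":-0.0139597524,"3":-0.0061535699,"4":0.8957173022},"B":{"0":0.4137381054,"1":-0.472034511,"2":-0.3625429925,"3":-0.923060654,"4":0.8052440254}}'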



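dfjo = pd.DataFrame(dict(A=range(1, 4), B=range(4, 7), C=range(7, 10)), columns=list('ABC'), index=list('xyz'))
sjo = pd.Series(dict(x=15, y=16, z=17), name='D')
dfjo.to_json(orient="columns")   # {column -> {index -> value}}
dfjo.to_json(orient="index")     # {index -> {column -> value}}
sjo.to_json(orient="index")      # {index -> value}
dfjo.to_json(orient="records")   # list of {column -> value}; index labels are dropped
sjo.to_json(orient="records")    # list of values; index labels are dropped
sjo.to_json(orient="split")      # separate "name", "index" and "data" entries
# dfd is not defined in the original; assumed here to be a frame with a datetime column
dfd = pd.DataFrame({'A': [1.0, 2.0], 'date': pd.Timestamp('2013-01-01')})
dfd.to_json(date_format='iso')                   # ISO 8601 strings
dfd.to_json(date_format='iso', date_unit='us')   # ISO 8601 with microsecond precision
dfd.to_json(date_format='epoch', date_unit='s')  # seconds since the epoch
# default_handler is called for values that cannot otherwise be serialized (here, a complex number)
pd.DataFrame([1.0, 2.0, complex(1.0, 2.0)]).to_json(default_handler=str)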
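To make the difference between the orient values easier to see, here is a minimal sketch on a tiny frame. The frame `df` and the dictionary `layouts` are names chosen for this illustration only; each JSON string is parsed back with the standard-library `json` module so the layouts can be compared as plain Python objects.

```python
import json

import pandas as pd

# Small frame with a labelled index so the effect of each orient is visible.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["x", "y"])

# Serialize with each orient and parse the result back into Python objects.
layouts = {
    orient: json.loads(df.to_json(orient=orient))
    for orient in ("columns", "index", "records", "split", "values")
}

for orient, parsed in layouts.items():
    print(orient, parsed)
```

Note that "records" and "values" discard the index labels, so they are a poor choice when the index has to survive a round trip.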
read_json() parameters
dtype : if True, infer dtypes, if a dict of column to dtype, then use those, if False, then don’t infer dtypes at all, default is True, apply only to the data.
convert_axes : boolean, try to convert the axes to the proper dtypes, default is True
convert_dates : a list of columns to parse for dates; If True, then try to parse date-like columns, default is True.
keep_default_dates : boolean, default True. If parsing dates, then parse the default date-like columns.
numpy : direct decoding to NumPy arrays. default is False; Supports numeric data only, although labels may be non-numeric. Also note that the JSON ordering MUST be the same for each term if numpy=True. Note that this parameter was deprecated in pandas 1.0 and removed in pandas 2.0.
precise_float : boolean, default False. Set to enable usage of higher precision (strtod) function when decoding string to double values. Default (False) is to use fast but less precise builtin functionality.
date_unit : string, the timestamp unit to detect if converting dates. Default None. By default the timestamp precision will be detected, if this is not desired then pass one of ‘s’, ‘ms’, ‘us’ or ‘ns’ to force timestamp precision to seconds, milliseconds, microseconds or nanoseconds respectively.
lines : reads file as one json object per line.
encoding : The encoding to use to decode py3 bytes.
chunksize : when used in combination with lines=True, return a JsonReader which reads in chunksize lines per iteration.
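The lines and chunksize parameters described above can be sketched as follows. The JSON Lines payload `jsonl` is made up for this illustration, and an in-memory `StringIO` buffer stands in for a real file.

```python
from io import StringIO

import pandas as pd

# Three records in JSON Lines form: one JSON object per line (illustrative data).
jsonl = '{"a": 1, "b": "x"}\n{"a": 2, "b": "y"}\n{"a": 3, "b": "z"}\n'

# lines=True reads the whole buffer into a single DataFrame.
df_all = pd.read_json(StringIO(jsonl), lines=True)

# Adding chunksize returns a JsonReader instead of a DataFrame;
# iterating over it yields DataFrames of at most `chunksize` rows.
reader = pd.read_json(StringIO(jsonl), lines=True, chunksize=2)
chunk_sizes = [len(chunk) for chunk in reader]

print(df_all)
print(chunk_sizes)
```

Reading in chunks keeps memory use bounded when the input has many lines, at the cost of handling several smaller frames.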
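from io import StringIO  # recent pandas versions deprecate passing a literal JSON string to read_json
# 'test.json' is assumed to exist and to contain the columns 'A' and 'bools'
pd.read_json('test.json', dtype=object).dtypes
pd.read_json('test.json', dtype={'A': 'float32', 'bools': 'int8'}).dtypes
# dfj2 is not defined in the original; it is assumed to be a DataFrame containing datetime data
json = dfj2.to_json(date_unit='ns')
pd.read_json(StringIO(json), convert_axes=False)
# Force millisecond precision when parsing timestamps that were written in nanoseconds
dfju = pd.read_json(StringIO(json), date_unit='ms')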
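Since test.json is not included with the article, here is a self-contained sketch of the dtype options using an in-memory JSON string. The payload `raw` and the variable names are assumptions made for this illustration.

```python
from io import StringIO

import pandas as pd

# Illustrative payload with one float column and one boolean column.
raw = '{"A": {"0": 1.5, "1": 2.5}, "bools": {"0": true, "1": false}}'

# Default (dtype=True): pandas infers the column dtypes.
inferred = pd.read_json(StringIO(raw))

# dtype=object: every column is kept as generic Python objects.
as_object = pd.read_json(StringIO(raw), dtype=object)

# A dict maps individual columns to the requested dtypes.
explicit = pd.read_json(StringIO(raw), dtype={"A": "float32", "bools": "int8"})

print(inferred.dtypes)
print(as_object.dtypes)
print(explicit.dtypes)
```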
pandas provides a utility function that takes a dict or a list of dicts and normalizes this semi-structured data into a flat table.
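# Since pandas 1.0, json_normalize is available at the top level; importing it from pandas.io.json is deprecated
from pandas import json_normalize
data = [{'id': 1, 'name': {'first': 'Coleen', 'last': 'Volk'}}, {'name': {'given': 'Mose', 'family': 'Regner'}}, {'id': 2, 'name': 'Faye Raker'}]
# Nested keys become dotted column names such as 'name.first'; missing values become NaN
json_normalize(data)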
data = [{'state': 'Florida',
         'shortname': 'FL',
         'info': {'governor': 'Rick Scott'},
         'counties': [{'name': 'Dade', 'population': 12345},
                      {'name': 'Broward', 'population': 40000},
                      {'name': 'Palm Beach', 'population': 60000}]},
        {'state': 'Ohio',
         'shortname': 'OH',
         'info': {'governor': 'John Kasich'},
         'counties': [{'name': 'Summit', 'population': 1234},
                      {'name': 'Cuyahoga', 'population': 1337}]}]
json_normalize(data, 'counties', ['state', 'shortname', ['info', 'governor']])
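The call above passes the record path and the metadata fields positionally. A minimal sketch with the keyword names `record_path` and `meta` spelled out, on a shortened copy of the same data, shows what the result looks like; the variable names `states` and `flat` are chosen for this illustration.

```python
import pandas as pd

# Shortened copy of the data above: each state holds a nested list of counties.
states = [
    {"state": "Florida",
     "info": {"governor": "Rick Scott"},
     "counties": [{"name": "Dade", "population": 12345},
                  {"name": "Broward", "population": 40000}]},
    {"state": "Ohio",
     "info": {"governor": "John Kasich"},
     "counties": [{"name": "Summit", "population": 1234}]},
]

# record_path names the nested list that becomes the rows;
# meta lists parent fields to repeat on every row (a list entry is a nested path).
flat = pd.json_normalize(
    states,
    record_path="counties",
    meta=["state", ["info", "governor"]],
)

print(flat)
```

Each county becomes one row, and the parent values (state and governor) are repeated on every row that came from the same state.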
data = [{'CreatedBy': {'Name': 'User001'},
         'Lookup': {'TextField': 'Some text',
                    'UserField': {'Id': 'ID001',
                                  'Name': 'Name001'}},
         'Image': {'a': 'b'}
         }]
json_normalize(data, max_level=1)
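To see what max_level changes, this sketch normalizes the same record twice, once with max_level=1 and once with no limit, and compares the resulting columns. The variable names `record`, `level1` and `full` are chosen for this illustration.

```python
import pandas as pd

record = [{"CreatedBy": {"Name": "User001"},
           "Lookup": {"TextField": "Some text",
                      "UserField": {"Id": "ID001", "Name": "Name001"}},
           "Image": {"a": "b"}}]

# max_level=1 flattens one level of nesting; deeper dicts are kept as dict values.
level1 = pd.json_normalize(record, max_level=1)

# Without max_level every nested dict is flattened into dotted column names.
full = pd.json_normalize(record)

print(sorted(level1.columns))
print(sorted(full.columns))
```

With max_level=1 the column Lookup.UserField still holds the inner dict, while the unrestricted call splits it into Lookup.UserField.Id and Lookup.UserField.Name.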
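# read_html examples; url is not defined in the original and is assumed to point to a page containing HTML tables
dfs = pd.read_html(url, header=0)            # use the first row as the column header
dfs = pd.read_html(url, index_col=0)         # use the first column as the index
dfs = pd.read_html(url, skiprows=0)          # skip the first row
dfs = pd.read_html(url, skiprows=range(2))   # skip the first two rows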

 
