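At work I often use MongoDB to store and retrieve data. Reading documents one at a time for processing feels slow, whereas loading them with pandas is very fast. So how do you read MongoDB data into pandas?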
# Import the required packages
# coding:utf-8
import pandas
from pymongo import MongoClient
Create a connection to MongoDB
"""
Create a connection to MongoDB
"""
client = MongoClient('localhost', 27017)
# Select the database
db = client.taobao
# Select the collection
collection = db.products
# list(collection.find()) runs the query and converts the result to a list
df = pandas.DataFrame(list(collection.find()))
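The conversion step above can be sketched as a small helper that accepts any iterable of dict documents, so it can be tried without a running server. This is only a sketch: the name `cursor_to_frame`, the `drop_id` flag, and the sample documents are made up for illustration and are not part of the original post.

```python
import pandas as pd


def cursor_to_frame(docs, drop_id=True):
    """Build a DataFrame from an iterable of MongoDB-style documents (dicts)."""
    df = pd.DataFrame(list(docs))
    # Optionally drop the _id column that MongoDB adds to every document.
    if drop_id and "_id" in df.columns:
        df = df.drop(columns=["_id"])
    return df


# Hypothetical sample documents standing in for collection.find()
sample_docs = [
    {"_id": 1, "price": "¥2448.00", "shop": "shop-a", "deal": "161人付款", "location": "浙江 杭州"},
    {"_id": 2, "price": "¥99.00", "shop": "shop-b", "deal": "12人付款", "location": "上海"},
]

frame = cursor_to_frame(sample_docs)
print(frame.columns.tolist())

# With a live server (assumption: the collection object from this post exists),
# a projection can leave _id out on the server side instead:
# frame = cursor_to_frame(collection.find({}, {"_id": 0}))
```

Passing a projection such as `{"_id": 0}` to `find()` means the `_id` values are never sent to the client, which may help on large collections.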
MongoDB document structure. Each document looks like this:
"""
{
"_id" : ObjectId("598ad92976766c3c4ced600a"),
"price" : "¥2448.00",
"title" : "【未拆封】Apple/苹果 iPad 9.7英寸WLAN 平板电脑2017版 正品",
"shop" : "绿森数码官方旗舰店",
"image" : "//g-search1.alicdn.com/img/bao/uploaded/i4/imgextra/i1/1258106069108947116/TB225t7yb4npuFjSZFmXXXl4FXa_!!0-saturn_solar.jpg",
"deal" : "161人付款",
"location" : "浙江 杭州"
}
"""
# Drop the _id field that MongoDB adds automatically
del df['_id']
# Select the fields to display
data = df[['price', 'shop', 'deal', 'location']]
print(data)
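In the document structure shown earlier, `price` and `deal` are stored as strings such as "¥2448.00" and "161人付款", so they cannot be summed or sorted numerically as they are. A minimal sketch of one way to convert them, assuming every row follows those two patterns; the column names `price_value` and `deal_count` and the second sample row are made up for illustration.

```python
import pandas as pd

# First row mirrors the sample document; the second row is hypothetical.
data = pd.DataFrame({
    "price": ["¥2448.00", "¥99.00"],
    "deal": ["161人付款", "12人付款"],
})

# Strip the currency sign and parse the remainder as a float.
data["price_value"] = data["price"].str.replace("¥", "", regex=False).astype(float)

# Pull the leading digits out of strings like "161人付款".
data["deal_count"] = data["deal"].str.extract(r"(\d+)", expand=False).astype(int)

print(data[["price_value", "deal_count"]])
```

If some rows do not match the pattern, `str.extract` yields missing values and the `astype(int)` call will fail, so real data may need a cleanup step first.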
The printed result is:





