Learning the pymongo Library

License: CC 4.0 BY-SA

Tags: #python #mongodb #pymongo

Reference: the official PyMongo Tutorial

Installing pymongo

pip install pymongo

Making a Connection

There are three ways to connect:

# Method 1: use the defaults (localhost, port 27017)
>>> from pymongo import MongoClient
>>> client = MongoClient()

# Method 2: specify the host and port
>>> client = MongoClient('localhost', 27017)

# Method 3: use the MongoDB URI format
>>> client = MongoClient('mongodb://localhost:27017/')

A Uniform Resource Identifier (URI) is a string that identifies a resource on the Internet.
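
To make the parts of such a URI concrete, here is a minimal sketch that splits a connection string like the one in method three using Python's standard urllib.parse. It only illustrates the URI structure and is not how pymongo parses URIs internally; the variable names are illustrative.

```python
from urllib.parse import urlsplit

# A MongoDB connection string in URI form.
uri = "mongodb://localhost:27017/"
parts = urlsplit(uri)

print(parts.scheme)    # the protocol part before "://"
print(parts.hostname)  # the server host
print(parts.port)      # the server port, parsed as an int
```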

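Getting a Database

# Option 1: attribute style, i.e. dot access

>>> db = client.test_database

Attribute style cannot be used when the database name is not a valid Python identifier (for example, "test-database").

# Option 2: dictionary style
>>> db = client['test-database']

Dictionary style handles database names that contain special characters.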
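
The choice between the two styles comes down to whether the name is a valid Python identifier. The sketch below checks that with the standard library; access_style is a hypothetical helper written for this illustration, not part of pymongo, and it does not cover names that collide with attributes the client object already defines.

```python
import keyword

def access_style(name: str) -> str:
    """Report which access styles a database name allows (illustrative helper)."""
    # Dot access needs a valid identifier that is not a reserved word.
    if name.isidentifier() and not keyword.iskeyword(name):
        return "attribute or dictionary"
    return "dictionary only"

print(access_style("test_database"))
print(access_style("test-database"))
```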

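Getting a Collection

A collection is accessed the same way as a database.

MongoDB's concepts differ somewhat from those of relational databases such as MySQL or MS SQL Server. A collection corresponds to a table in a relational database, so this article uses the MongoDB terms as they are.

A collection can be thought of as a set of documents.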

>>> collection = db.test_collection

>>> collection = db['test-collection']

Documents

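A document corresponds to a row, that is, a single record.

Preparing the Data

>>> import datetime
>>> post = {
        "author": "Mike",
        "text": "My first blog post!",
        "tags": [
            "mongodb",
            "python",
            "pymongo"
            ],
        # utcnow() is deprecated since Python 3.12; use a timezone-aware UTC time instead
        "date": datetime.datetime.now(tz=datetime.timezone.utc)
        }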

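Inserting a Document

>>> posts = db.posts
>>> post_id = posts.insert_one(post).inserted_id
>>> post_id
ObjectId('...')

Step 1 gets a reference to the collection named posts; MongoDB creates it lazily, on the first insert.
Step 2 does two things: it inserts the prepared document (post) into posts and reads inserted_id from the result.
Step 3 displays post_id.

After these steps the database and collection exist on the MongoDB server and contain the inserted document.

>>> db.list_collection_names()
>>> help(db.list_collection_names)
# Get a list of all the collection names in this database.
# collection_names() was deprecated in PyMongo 3.7 and removed in 4.0; list_collection_names() replaces it.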

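Getting a Single Document

>>> posts.find_one()
# get the first document from the posts collection

>>> posts.find_one({'author': 'Mike'})
... # returns the matching document
>>> posts.find_one({'author': 'Eliot'})
>>>
# a query that matches nothing returns None

The query is a dictionary; each entry can be read as the condition key == value.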
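
The key == value reading can be modelled in a few lines of plain Python. This is a simplified sketch that runs without a server: matches is a hypothetical helper covering only plain equality on top-level fields, and the real MongoDB matching rules (arrays, operators, null handling) are richer than this.

```python
def matches(document: dict, query: dict) -> bool:
    """Return True if every key in query equals the same key in document."""
    missing = object()  # sentinel so an absent key never equals a real value
    return all(document.get(key, missing) == value for key, value in query.items())

posts = [
    {"author": "Mike", "text": "My first blog post!"},
    {"author": "Eliot", "text": "and pretty easy too!"},
]

# Mimic find_one(): the first matching document, or None when nothing matches.
first = next((p for p in posts if matches(p, {"author": "Mike"})), None)
nobody = next((p for p in posts if matches(p, {"author": "Nobody"})), None)
print(first)
print(nobody)
```

An empty query dictionary matches every document, which mirrors the behaviour of calling find_one() with no filter.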

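Querying by ObjectId

>>> post_id
ObjectId('...')
>>> posts.find_one({'_id': post_id})
... # returns the matching document

>>> post_id_as_str = str(post_id)
>>> posts.find_one({'_id': post_id_as_str})
# No result
# An ObjectId is not equal to its string representation

An ObjectId is not a string, so querying _id with a string finds no match. Convert the string back to an ObjectId first: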

from bson.objectid import ObjectId

# The web framework gets post_id from the URL and passes it as a string.
def get(post_id):
    # Convert from string to ObjectId:
    document = client.db.collection.find_one({'_id': ObjectId(post_id)})
    return document

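Getting Every Document in a Collection

>>> for i in db.get_collection('posts').find({}):
        print(i)

get_collection() takes the collection name and returns the Collection object.
find() returns a Cursor, which is iterable.
find() accepts an empty dictionary or no argument at all; with an empty filter it returns every document.

Bulk Inserts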

>>> new_posts = [{ 'author': 'Mike',
                   'text': 'Another post!',
                   'tags': ['bulk', 'insert'],
                   'date': datetime.datetime(2009, 11, 12, 11, 14)},
                 { 'author': 'Eliot',
                   'title': 'MongoDB is fun',
                   'text': 'and pretty easy too!',
                   'date': datetime.datetime(2009, 11, 10, 10, 45)}                 
]
>>> result = posts.insert_many(new_posts)
>>> result.inserted_ids
[ObjectId('...'), ObjectId('...')]

insert_many() returns a result whose inserted_ids attribute lists the two new ObjectId instances.
new_posts[1] has a different “shape” than the other posts - there is no “tags” field and we’ve added a new field, “title”. This is what we mean when we say that MongoDB is schema-free.
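
The difference in shape can be checked directly by comparing the field names of the two documents. This small sketch uses only plain Python sets on the same data; the variable names shared, only_first and only_second are illustrative.

```python
import datetime

new_posts = [
    {"author": "Mike", "text": "Another post!", "tags": ["bulk", "insert"],
     "date": datetime.datetime(2009, 11, 12, 11, 14)},
    {"author": "Eliot", "title": "MongoDB is fun", "text": "and pretty easy too!",
     "date": datetime.datetime(2009, 11, 10, 10, 45)},
]

# Iterating a dict yields its keys, so set(post) is the set of field names.
fields = [set(post) for post in new_posts]
shared = fields[0] & fields[1]       # fields both documents have
only_first = fields[0] - fields[1]   # fields missing from the second document
only_second = fields[1] - fields[0]  # fields missing from the first document
print(sorted(shared), sorted(only_first), sorted(only_second))
```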

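Querying for More Than One Document

>>> for post in posts.find():
        print(post)
... # prints every document in the collection

>>> for post in posts.find({'author': 'Mike'}):
        print(post)

Counting

>>> posts.count_documents({})
3
>>> posts.count_documents({'author': 'Mike'})
2

Pass the query filter to count_documents(). The older Collection.count() and Cursor.count() methods were removed in PyMongo 4.0.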

Range Queries

>>> d = datetime.datetime(2017, 6, 28, 14)
>>> for post in posts.find({"date": {"$lt": d}}).sort("author"):
        print(post)

Testing confirms that "$lt" selects only the documents dated before this point in time.
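
The strict "before" behaviour of "$lt" can be reproduced offline. The sketch below maps four of MongoDB's comparison operators onto Python's operator module; in_range and COMPARATORS are hypothetical names for this illustration, and the real filtering is done by the MongoDB server, not by code like this.

```python
import datetime
import operator

# MongoDB comparison operators mapped to the equivalent Python functions.
COMPARATORS = {
    "$lt": operator.lt,
    "$lte": operator.le,
    "$gt": operator.gt,
    "$gte": operator.ge,
}

def in_range(value, condition: dict) -> bool:
    """Return True if value satisfies every operator in condition."""
    return all(COMPARATORS[op](value, bound) for op, bound in condition.items())

d = datetime.datetime(2017, 6, 28, 14)
dates = [
    datetime.datetime(2009, 11, 12, 11, 14),
    d,
    datetime.datetime(2020, 1, 1),
]

# Only dates strictly earlier than d pass the "$lt" condition.
earlier = [x for x in dates if in_range(x, {"$lt": d})]
print(earlier)
```

A date equal to d is excluded by "$lt" but would be included by "$lte".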

Indexing

# create a unique index on user_id
>>> import pymongo
>>> result = db.profiles.create_index([('user_id', pymongo.ASCENDING)], unique=True)
>>> sorted(list(db.profiles.index_information()))

# insert documents
>>> user_profiles = [
    {'user_id': 211, 'name': 'Luke'},
    {'user_id': 212, 'name': 'Ziltoid'}
]
>>> result = db.profiles.insert_many(user_profiles)

# insert a document whose user_id is already in the collection
>>> new_profile = {'user_id': 213, 'name': 'Drew'}
>>> duplicate_profile = {'user_id': 213,  'name': 'Tommy'}
>>> db.profiles.insert_one(new_profile)  # succeeds: 213 is not in the collection yet
>>> db.profiles.insert_one(duplicate_profile)  # raises pymongo.errors.DuplicateKeyError: 213 now exists
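
What the unique index enforces can be pictured with a toy in-memory model. This is only a sketch that runs without a server: UniqueFieldModel is a hypothetical class, and ValueError stands in for the pymongo.errors.DuplicateKeyError that pymongo raises in the real case.

```python
class UniqueFieldModel:
    """Toy in-memory model of a collection with a unique index on one field."""

    def __init__(self, field: str):
        self.field = field
        self.documents = {}  # maps each unique field value to its document

    def insert_one(self, document: dict) -> None:
        key = document[self.field]
        if key in self.documents:
            # Stand-in for the duplicate key error a real unique index produces.
            raise ValueError(f"duplicate key: {self.field}={key!r}")
        self.documents[key] = document


profiles = UniqueFieldModel("user_id")
for doc in [
    {"user_id": 211, "name": "Luke"},
    {"user_id": 212, "name": "Ziltoid"},
    {"user_id": 213, "name": "Drew"},
]:
    profiles.insert_one(doc)

try:
    profiles.insert_one({"user_id": 213, "name": "Tommy"})
    rejected = False
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
```

The rejected insert leaves the stored document for user_id 213 unchanged.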

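MongoDB Concepts

SQL term/concept	MongoDB term/concept	Explanation
database	database	Database
table	collection	Database table / collection
row	document	Data record (row) / document
column	field	Data field
index	index	Index
table joins		Table joins; MongoDB has no SQL-style joins (the $lookup aggregation stage provides a left outer join)
primary key	primary key	Primary key; MongoDB automatically uses the _id field as the primary key

Reference: Runoob tutorial (菜鸟教程)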

Summary

1. Create a connection

# there are three ways to connect
>>> client = MongoClient()
# or
>>> client = MongoClient(IP, Port)
# or
>>> client = MongoClient('mongodb://ip:port/')

2. Get a database

>>> db = client.database_name
# or
>>> db = client['database_name']

3. Get a collection

>>> collection = db.collection_name
# or
>>> collection = db['collection_name']

4. Insert data

# insert one document
>>> single_data = {'name': 'Tom', 'weight': 100}
>>> db.test.insert_one(single_data)
# insert several documents
>>> many_data = [{'name': 'Jerry', 'weight': 150}, {'name': 'Blue', 'weight': 200}]
>>> db.test.insert_many(many_data)

5. Query data

# query all
>>> for row in db.collection_name.find():
        print(row)
# query by ObjectId
# with no argument, find_one() returns the first document
>>> print(posts.find_one({'_id': ObjectId('...')}))
...
# query a set of documents
>>> for row in db.collection_name.find({'key': 'value'}):
        print(row)
# sort() can be chained onto the query result
# db.collection_name.find().sort('key')