A Year with MongoDB

This post reviews a year of running MongoDB in production and explains why and how the data was migrated from MongoDB to Riak and PostgreSQL. It covers MongoDB's strengths, such as its schemaless data model, simple replica set setup, and rich query language, as well as the problems encountered with memory management, non-counting B-trees, and more.
Original post: http://blog.engineering.kiip.me/post/20988881092/a-year-with-mongodb
This week marks the one year anniversary of Kiip running MongoDB in production. As of this week, we’ve also moved over 95% of our data off of MongoDB onto systems such as Riak and PostgreSQL, depending on which solution made sense for the way we use our data. This post highlights our experience with MongoDB over the past year. A future post will elaborate on the migration process: how we evaluated the proper solutions to migrate to and how we migrated the data from MongoDB.

First, some numbers about our data to give context to the scale being discussed. The figures below represent the peak usage when we were completely on MongoDB — the numbers are actually much higher now but are spread across different data stores.

Data size: 240 GB
Total documents: 85,000,000
Operations per second: 520 (creates, reads, updates, etc.)

The Good

We were initially attracted to MongoDB due to the features highlighted on the website as well as word of mouth from those who had used it successfully. MongoDB delivered on some of its promises, and our early experiences were positive.

Schemaless - Being a document data store, the schemaless nature of MongoDB helps a lot. It is easy to add new fields, and even to completely change the structure of a model. We changed the structure of our most heavily used models a couple of times in the past year, and instead of going back and updating millions of old documents, we simply added a “version” field to each document and let the application handle the logic of reading both the old and new versions. This flexibility was useful for both application developers and operations engineers.
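As a sketch of how this version-field pattern can look in application code, assuming a Python/pymongo stack and hypothetical database, collection, and field names (not Kiip's actual schema):

```python
from pymongo import MongoClient

# Hypothetical database, collection, and field layouts; this is only a sketch
# of the version-field pattern, not Kiip's actual schema.
coll = MongoClient()["kiipdb"]["rewards"]

def read_reward(doc_id):
    doc = coll.find_one({"_id": doc_id})
    if doc is None:
        return None
    if doc.get("version", 1) == 1:
        # Old layout stored a flat "points" field.
        return {"points": doc["points"]}
    # Newer layout nests the value under "reward".
    return {"points": doc["reward"]["points"]}
```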

Simple replication - Replica Sets are easy to set up and work well enough. There are some issues that I’ll talk about later, but for the most part, as an early-stage startup, this feature was easy to incorporate and appeared to work as advertised.
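For illustration only, pointing a client at a replica set with the Python driver looks roughly like this; the hostnames and the set name rs0 are placeholders:

```python
from pymongo import MongoClient

# Hostnames and the replica set name "rs0" are placeholders.
client = MongoClient(
    "mongodb://db1.example.com:27017,db2.example.com:27017,db3.example.com:27017"
    "/?replicaSet=rs0"
)

# The driver discovers the current primary and retargets operations when a
# failover promotes a secondary.
status = client.admin.command("replSetGetStatus")
print(status["set"], status["myState"])
```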

Query Language - Querying into documents and being able to perform atomic operations on your data is pretty cool, and we used both features heavily. Early on, advanced queries let us build features into our application quickly. Unfortunately, these queries didn’t scale due to underlying architectural problems.
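A rough Python-driver sketch of the kind of queries we mean, with hypothetical database, collection, and field names:

```python
from pymongo import MongoClient, ReturnDocument

coll = MongoClient()["kiipdb"]["players"]  # hypothetical collection

# Query into a nested document.
ios_high_scores = coll.find({"device.platform": "ios", "score": {"$gt": 1000}})

# Atomic update: bump a counter and append to an array in one server-side step.
doc = coll.find_one_and_update(
    {"_id": "player-42"},
    {"$inc": {"score": 50}, "$push": {"badges": "streak-7"}},
    return_document=ReturnDocument.AFTER,
)
```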

Full-featured Drivers for Many Languages - 10gen curates official MongoDB drivers for many languages, and in our experience the driver for each language we’ve tried has been top-notch. Drivers were never an issue when working with MongoDB.

The Bad

Although MongoDB has a lot of nice features on the surface, most of them are marred by underlying architectural issues. These issues are certainly fixable, but currently limit the practical usage we were able to achieve with MongoDB. This list highlights some of the major issues we ran into.

Non-counting B-Trees - MongoDB uses non-counting B-trees as the underlying data structure to index data. This impacts a lot of what you’re able to do with MongoDB. It means that a simple count of a collection on an indexed field requires Mongo to traverse the entire matching subset of the B-tree. To support limit/offset queries, MongoDB needs to traverse the leaves of the B-tree to that point. This unnecessary traversal causes data you don’t need to be faulted into memory, potentially purging out warm or hot data, hurting your overall throughput. There has been an open ticket for this issue since September, 2010.
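To make this concrete, here is a hedged Python-driver sketch (hypothetical names) of two query shapes that look cheap but force exactly this traversal:

```python
from pymongo import MongoClient

coll = MongoClient()["kiipdb"]["events"]  # hypothetical collection
coll.create_index("app_id")

# Looks like it should be O(1) against the index, but with non-counting
# B-trees the server walks every matching index entry to produce the number.
n = coll.count_documents({"app_id": "abc123"})

# Offset paging likewise walks the index leaves up to the offset, faulting
# pages in along the way.
page = list(coll.find({"app_id": "abc123"}).skip(100000).limit(20))
```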

Poor Memory Management - MongoDB manages memory by memory mapping your entire data set, leaving page cache management and faulting up to the kernel. A more intelligent scheme would be able to do things like fault in your indexes before use as well as handle faulting in of cold/hot data more effectively. The result is that memory usage can’t be effectively reasoned about, and performance is non-optimal.
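You can observe the symptom, though not control it, from the server's own statistics; a minimal sketch, assuming the MMAPv1-era memory-mapped engine described above:

```python
from pymongo import MongoClient

db = MongoClient()["kiipdb"]

# serverStatus reports what the memory-mapped engine has resident versus
# mapped; the "mapped" figure only appears on the MMAPv1-era engine this
# post describes, hence the defensive .get() calls.
mem = db.command("serverStatus")["mem"]
print(mem.get("mapped"), mem.get("resident"), mem.get("virtual"))
```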

Uncompressed field names - If you store 1,000 documents with the key “foo”, then “foo” is stored 1,000 times in your data set. Although MongoDB supports any arbitrary document, in practice most of your field names are similar. It is considered good practice to shorten field names for space optimization. A ticket for this issue has been open since April 2010, yet this problem still exists today. At Kiip, we built field aliasing into our model layer, so a field with name “username” may actually map to “u” in the database. The database should handle this transparently by keeping a logical mapping between field names and a compressed form, instead of requiring clients to handle it explicitly.
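A minimal sketch of the aliasing idea in Python; the alias table and collection name are hypothetical, not Kiip's actual model layer:

```python
from pymongo import MongoClient

coll = MongoClient()["kiipdb"]["users"]  # hypothetical collection

# Hypothetical alias table; a sketch of the idea rather than Kiip's actual mapping.
ALIASES = {"username": "u", "created_at": "c", "campaign_id": "cp"}
REVERSE = {v: k for k, v in ALIASES.items()}

def to_db(doc):
    return {ALIASES.get(k, k): v for k, v in doc.items()}

def from_db(doc):
    return {REVERSE.get(k, k): v for k, v in doc.items()}

coll.insert_one(to_db({"username": "alice", "created_at": "2012-04-01"}))
print(from_db(coll.find_one({"u": "alice"})))
```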

Global write lock - MongoDB (as of the current version at the time of writing: 2.0), has a process-wide write lock. Conceptually this makes no sense. A write on collection X blocks a write on collection Y, despite MongoDB having no concept of transactions or join semantics. We reached practical limitations of MongoDB when pushing a mere 200 updates per second to a single server. At this point, all other operations including reads are blocked because of the write lock. When reaching out to 10gen for assistance, they recommended we look into sharding, since that is their general scaling solution. With other RDBMS solutions, we would at least be able to continue vertically scaling for some time before investigating sharding as a solution.

Safe off by default - This is a crazy default, although useful for benchmarks. As a general analogy: it’s like a car manufacturer shipping a car with air bags off, then shrugging and saying “you could’ve turned it on” when something goes wrong. We lost a sizable amount of data at Kiip for some time before realizing what was happening and using safe saves where they made sense (user accounts, billing, etc.).
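With the Python driver, opting back into acknowledged writes for the collections that matter looks roughly like this; the collection and field names are illustrative, and j=True assumes journaling is enabled:

```python
from pymongo import MongoClient, WriteConcern

db = MongoClient()["kiipdb"]

# Ask the server to acknowledge the write (and here also to journal it)
# instead of firing and forgetting.
billing = db.get_collection("billing", write_concern=WriteConcern(w=1, j=True))
billing.insert_one({"account": "acct-1", "amount_cents": 499})
```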

Offline table compaction - The on-disk data size with MongoDB grows unbounded until you compact the database. Compaction is extremely time consuming and blocks all other DB operations, so it must be done offline or on a secondary/slave server. Traditional RDBMSs such as PostgreSQL handle this with auto-vacuum processes that clean up the database over time.
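For reference, a hedged sketch of issuing the compaction step from the Python driver, with placeholder database and collection names:

```python
from pymongo import MongoClient

db = MongoClient()["kiipdb"]

# compact is issued per collection ("events" is a placeholder); on the engine
# described here it blocks the database, so it is typically run against a
# secondary that has been taken out of rotation rather than the primary.
db.command("compact", "events")
```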

Secondaries do not keep hot data in RAM - The primary doesn’t relay queries to secondary servers, preventing secondaries from maintaining hot data in memory. This severely hinders the “hot-standby” feature of replica sets, since the moment the primary fails and switches to a secondary, all the hot data must be once again faulted into memory. Faulting in gigabytes of data can be painfully slow, especially when your data is backed by something like EBS. Distributing reads to secondaries helps with this, but if you’re only using secondaries as a means of backup or failover, the effect on throughput when a primary switch happens can be crippling until your hot data is faulted in.
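One way to spread reads, sketched here with the Python driver and placeholder hostnames, is a secondary-preferred read preference on collections where slightly stale reads are acceptable:

```python
from pymongo import MongoClient, ReadPreference

client = MongoClient("mongodb://db1.example.com,db2.example.com/?replicaSet=rs0")
events = client["kiipdb"].get_collection(
    "events", read_preference=ReadPreference.SECONDARY_PREFERRED
)

# Routing some reads to secondaries keeps their page cache warm, so a
# failover does not start completely cold.
recent = events.find({"app_id": "abc123"}).limit(100)
```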

What We’re Doing Now

Initially, we felt MongoDB gave us the flexibility and power we needed in a database. Unfortunately, underlying architectural issues forced us to investigate other solutions rather quickly. We never attempted to horizontally scale MongoDB since our confidence in the product was hurt by the time that was offered as a solution, and because we believe horizontally scaling shouldn’t be necessary for the relatively small amount of ops per second we were sending to MongoDB.

Over the past 6 months, we’ve “scaled” MongoDB by moving data off of it. This process deserves an entire blog post of its own, but the gist is that we looked at our data access patterns and chose the right tool for the job. For key-value data, we switched to Riak, which provides predictable read/write latencies and is completely horizontally scalable. For smaller sets of relational data where we wanted a rich query layer, we moved to PostgreSQL. A small fraction of our data, where persistence and later querying weren’t important, has been moved to non-durable, purely in-memory solutions.

In retrospect, MongoDB was not the right solution for Kiip. Although it may take a bit more upfront effort, we recommend using PostgreSQL (or some traditional RDBMS) first, then investigating other solutions if and when you find them necessary. In future blog posts, we’ll talk about how we chose our data stores and the steps we took to migrate data while minimizing downtime.