GridFS

License: CC 4.0 BY-SA

When to use GridFS

Lots of files. GridFS tends to handle large numbers (many thousands) of files better than many file systems.
User-uploaded files. When users upload files, you tend to accumulate a lot of them and want them replicated and backed up. GridFS is a good place to store them, because you can then manage them the same way you manage the rest of your data. You can also query by user, upload date, and so on directly in the file store, without a layer of indirection.
Files that change often. If certain files change a lot, it makes sense to store them in GridFS so that you can modify them in one place and all clients get the updates. This can also be better than keeping them in the source tree, because you don't have to redeploy the application to update a file.

When not to use GridFS

A few small static files. If you just have a few small files for a website (JS, CSS, images), it's probably easier to use the file system.
Note that if you need to update a binary object atomically, and the object is under the document size limit for your version of MongoDB (16MB for 1.8), you might consider storing the object manually within a single document. This can be accomplished using the BSON BinData type. Check your driver's docs for details on using this type.

File Tools

mongofiles is a tool for manipulating GridFS from the command line.

Introduction

GridFS works by splitting a large object into small chunks, usually 256k in size. In other words, a file is cut into small pieces that are stored in a MongoDB collection.
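
The chunking idea can be sketched without a database. The following is a minimal illustration, assuming Node.js and its built-in Buffer type. The name splitIntoChunks is a hypothetical helper, not part of any MongoDB driver, and 256 * 1024 bytes is taken as the default chunk size mentioned above.

```javascript
// Hypothetical helper (not a driver API): cut a Buffer into fixed-size pieces.
const DEFAULT_CHUNK_SIZE = 256 * 1024; // 256k, the usual default mentioned above

function splitIntoChunks(data, chunkSize = DEFAULT_CHUNK_SIZE) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray returns a view over the same memory; no bytes are copied
    chunks.push(data.subarray(offset, offset + chunkSize));
  }
  return chunks;
}

// A 600 KiB payload needs three chunks: two full ones and a shorter final one.
const payload = Buffer.alloc(600 * 1024, 1);
const pieces = splitIntoChunks(payload);
console.log(pieces.map((p) => p.length));
```

Only the last chunk can be shorter than the chunk size, so the number of chunks is the file length divided by the chunk size, rounded up.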

Specification

Storage Collections

GridFS uses two collections to store data:

files contains the object metadata
chunks contains the binary chunks with some additional accounting information

The files and chunks collections are named with a prefix, which acts like a logical file system. By default the prefix is fs, so a default GridFS store consists of the collections fs.files and fs.chunks.

Here's an example of the standard GridFS interface in Java:

/*
 * default root collection usage - must be supported
 */
GridFS myFS = new GridFS(myDatabase);              // returns a default GridFS (e.g. "fs" root collection)
myFS.storeFile(new File("/tmp/largething.mpg"));   // saves the file into the "fs" GridFS store

/*
 * specified root collection usage - optional
 */

GridFS myContracts = new GridFS(myDatabase, "contracts");                    // returns a GridFS where "contracts" is root
myContracts.retrieveFile("smithco", new File("/tmp/smithco_20090105.pdf"));  // retrieves object whose filename is "smithco"
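
The relationship between a root prefix and its two collections can be sketched as a small function. This assumes the prefix.files / prefix.chunks naming pattern that the shell examples on this page use (for example db.fs.chunks). The name gridfsCollectionNames is a hypothetical helper, not a driver API.

```javascript
// Hypothetical helper: derive the two collection names for a GridFS root prefix.
function gridfsCollectionNames(prefix = "fs") {
  return { files: `${prefix}.files`, chunks: `${prefix}.chunks` };
}

const defaults = gridfsCollectionNames();             // default "fs" root
const contracts = gridfsCollectionNames("contracts"); // custom "contracts" root
console.log(defaults, contracts);
```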

files

Documents in the files collection hold the metadata for one file and require the following fields:

{
  "_id" : <unspecified>,                  // unique ID for this file
  "length" : data_number,                 // size of the file in bytes
  "chunkSize" : data_number,              // size of each of the chunks.  Default is 256k
  "uploadDate" : data_date,               // date when object first stored
  "md5" : data_string                     // result of running the "filemd5" command on this file's chunks
}
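
As a rough illustration of how these fields relate to a payload, the sketch below assembles such a document in Node.js. The name buildFilesDocument and the id value are hypothetical. The sketch also assumes that an MD5 digest of the whole payload, computed locally, matches what the "filemd5" command reports for the stored chunks; treat that as an assumption to verify against your server.

```javascript
const { createHash } = require("crypto");

// Hypothetical helper: assemble the required fields of a files document.
// `id` is supplied by the caller because the _id type is left unspecified.
function buildFilesDocument(id, data, chunkSize = 256 * 1024) {
  return {
    _id: id,
    length: data.length,     // size of the file in bytes
    chunkSize: chunkSize,    // size of each chunk
    uploadDate: new Date(),  // when the object was first stored
    // Assumption: hashing the full payload locally gives the same hex digest
    // that "filemd5" would compute over the file's chunks.
    md5: createHash("md5").update(data).digest("hex"),
  };
}

const doc = buildFilesDocument("example-id", Buffer.from("abc"));
console.log(doc);
```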

chunks

The structure of documents from the chunks collection is as follows:

{
  "_id" : <unspecified>,         // object id of the chunk in the chunks collection
  "files_id" : <unspecified>,    // _id of the corresponding files collection entry
  "n" : chunk_number,            // chunks are numbered in order, starting with 0
  "data" : data_binary           // the chunk's payload as a BSON binary type
}
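
The structure above can be exercised in memory. The following sketch, assuming Node.js, builds chunk documents for a payload and reassembles the payload by sorting on n. The names buildChunkDocuments and reassemble and the string ids are hypothetical, and a plain Buffer stands in for the BSON binary type.

```javascript
// Hypothetical helpers: turn a payload into chunk documents and back again.
function buildChunkDocuments(filesId, data, chunkSize = 256 * 1024) {
  const docs = [];
  for (let n = 0; n * chunkSize < data.length; n++) {
    docs.push({
      _id: `${filesId}-${n}`, // placeholder id; a real store would generate its own
      files_id: filesId,      // links the chunk back to its files entry
      n: n,                   // chunk numbers start at 0
      data: data.subarray(n * chunkSize, (n + 1) * chunkSize),
    });
  }
  return docs;
}

function reassemble(chunkDocs) {
  // Sort a copy by n so the result does not depend on retrieval order.
  const ordered = [...chunkDocs].sort((a, b) => a.n - b.n);
  return Buffer.concat(ordered.map((doc) => doc.data));
}

const original = Buffer.from("0123456789");
const chunkDocs = buildChunkDocuments("file-1", original, 4); // tiny chunk size for the demo
const restored = reassemble(chunkDocs.slice().reverse());     // deliberately out of order
console.log(chunkDocs.map((d) => d.n), restored.toString());
```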

Indexes

GridFS implementations should create a unique, compound index in the chunks collection for files_id and n. Here's how you'd do that from the shell:

db.fs.chunks.ensureIndex({files_id:1, n:1}, {unique: true});

This way, a chunk can be retrieved efficiently using its files_id and n values. Note that GridFS implementations should use findOne operations to get chunks individually, and should not leave a cursor open to query for all chunks. So to get the first chunk, we could do:

db.fs.chunks.findOne({files_id: myFileID, n: 0});
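
Because chunks have a fixed size and are numbered from 0, the n value for such a lookup can be computed from a byte offset. The sketch below shows the arithmetic only; locateByte is a hypothetical helper name, and the default chunk size of 256 * 1024 bytes is an assumption that should be replaced by the chunkSize stored in the file's metadata.

```javascript
// Hypothetical helper: find which chunk holds a given byte offset of a file.
function locateByte(offset, chunkSize = 256 * 1024) {
  if (!Number.isInteger(offset) || offset < 0) {
    throw new RangeError("offset must be a non-negative integer");
  }
  return {
    n: Math.floor(offset / chunkSize), // chunk number to use in the { files_id, n } lookup
    offsetInChunk: offset % chunkSize, // position of the byte inside that chunk
  };
}

const position = locateByte(300000);
console.log(position);
```

With this, reading a byte range needs only the chunks whose numbers fall between the n of the first byte and the n of the last byte, each fetched with its own findOne.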
