LevelDB Official Documentation

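This document describes in detail how to use the LevelDB library, including creating a database, reads and writes, batch updates, concurrency control, and iteration, and offers performance tuning advice.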


The leveldb library provides a persistent key value store. Keys and values are arbitrary byte arrays. The keys are ordered within the key value store according to a user-specified comparator function.

 

Opening A Database

A leveldb database has a name which corresponds to a file system directory. All of the contents of the database are stored in this directory. The following example shows how to open a database, creating it if necessary:

 

  #include <cassert>
  #include "leveldb/include/db.h"

  leveldb::DB* db;
  leveldb::Options options;
  options.create_if_missing = true;
  leveldb::Status status = leveldb::DB::Open(options, "/tmp/testdb", &db);
  assert(status.ok());
  ...

If you want to raise an error if the database already exists, add the following line before the leveldb::DB::Open call:

  options.error_if_exists = true;

Status

You may have noticed the leveldb::Status type above. Values of this type are returned by most functions in leveldb that may encounter an error. You can check if such a result is ok, and also print an associated error message:

 

   leveldb::Status s = ...;
   if (!s.ok()) cerr << s.ToString() << endl;

Closing A Database

When you are done with a database, just delete the database object. Example:

 

  ... open the db as described above ...
  ... do something with db ...
  delete db;

Reads And Writes

The database provides Put, Delete, and Get methods to modify/query the database. For example, the following code moves the value stored under key1 to key2.

  std::string value;
  leveldb::Status s = db->Get(leveldb::ReadOptions(), key1, &value);
  if (s.ok()) s = db->Put(leveldb::WriteOptions(), key2, value);
  if (s.ok()) s = db->Delete(leveldb::WriteOptions(), key1);

Atomic Updates

Note that if the process dies after the Put of key2 but before the delete of key1, the same value may be left stored under multiple keys. Such problems can be avoided by using the WriteBatch class to atomically apply a set of updates:

 

  #include "leveldb/include/write_batch.h"
  ...
  std::string value;
  leveldb::Status s = db->Get(leveldb::ReadOptions(), key1, &value);
  if (s.ok()) {
    leveldb::WriteBatch batch;
    batch.Delete(key1);
    batch.Put(key2, value);
    s = db->Write(leveldb::WriteOptions(), &batch);
  }

The WriteBatch holds a sequence of edits to be made to the database, and these edits within the batch are applied in order. Note that we called Delete before Put so that if key1 is identical to key2, we do not end up erroneously dropping the value entirely.

Apart from its atomicity benefits, WriteBatch may also be used to speed up bulk updates by placing lots of individual mutations into the same batch.
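The ordering argument above can be checked with a small stand-alone model. The sketch below does not use leveldb at all: Edit, Store, ApplyInOrder, and Move are hypothetical names, and a std::map stands in for the database. It only illustrates why Delete-then-Put keeps the value when both keys are the same.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one edit inside a batch; not a leveldb type.
struct Edit {
  enum Kind { kPut, kDelete } kind;
  std::string key;
  std::string value;  // unused for kDelete
};

using Store = std::map<std::string, std::string>;

// Applies the edits strictly in sequence, mirroring the in-order
// application the text describes for WriteBatch.
void ApplyInOrder(const std::vector<Edit>& batch, Store* store) {
  for (const Edit& e : batch) {
    if (e.kind == Edit::kPut) {
      (*store)[e.key] = e.value;
    } else {
      store->erase(e.key);
    }
  }
}

// Moves the value stored under `from` to `to` using Delete-then-Put.
// Returns false if `from` is absent.
bool Move(Store* store, const std::string& from, const std::string& to) {
  auto it = store->find(from);
  if (it == store->end()) return false;
  const std::string value = it->second;  // copy before the entry is erased
  ApplyInOrder({{Edit::kDelete, from, ""}, {Edit::kPut, to, value}}, store);
  return true;
}
```

With this ordering, Move(&store, "k", "k") erases "k" and then writes it back, so the value survives. Reversing the two edits would leave the key deleted.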

Synchronous Writes

By default, each write to leveldb is asynchronous: it returns after pushing the write from the process into the operating system. The transfer from operating system memory to the underlying persistent storage happens asynchronously. The sync flag can be turned on for a particular write to make the write operation not return until the data being written has been pushed all the way to persistent storage. (On Posix systems, this is implemented by calling either fsync(...) or fdatasync(...) or msync(..., MS_SYNC) before the write operation returns.)

  leveldb::WriteOptions write_options;
  write_options.sync = true;
  db->Put(write_options, ...);

Asynchronous writes are often more than a thousand times as fast as synchronous writes. The downside of asynchronous writes is that a crash of the machine may cause the last few updates to be lost. Note that a crash of just the writing process (i.e., not a reboot) will not cause any loss since even when sync is false, an update is pushed from the process memory into the operating system before it is considered done.

Asynchronous writes can often be used safely. For example, when loading a large amount of data into the database you can handle lost updates by restarting the bulk load after a crash. A hybrid scheme is also possible where every Nth write is synchronous, and in the event of a crash, the bulk load is restarted just after the last synchronous write finished by the previous run. (The synchronous write can update a marker that describes where to restart on a crash.)
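The hybrid scheme can be sketched with a toy model that does not depend on leveldb. FakeDb and BulkLoad are hypothetical names. The model pessimistically assumes that a machine crash loses every write issued since the last synchronous one, and it also makes the final write synchronous so the tail of the load is durable (a choice made for this sketch, not something the text prescribes).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model, not leveldb: `pending` plays the role of operating system
// buffers that a machine crash discards; `durable` is what reached storage.
struct FakeDb {
  std::vector<int> durable;
  std::vector<int> pending;
  std::size_t marker = 0;  // restart position saved by the last sync write

  void Write(int value, bool sync, std::size_t next_index) {
    pending.push_back(value);
    if (sync) {
      // A synchronous write pushes everything buffered so far to storage
      // and records where a restarted load should resume.
      durable.insert(durable.end(), pending.begin(), pending.end());
      pending.clear();
      marker = next_index;
    }
  }
  void Crash() { pending.clear(); }
};

// Loads items[db->marker ..), making every n-th write synchronous (n > 0).
// Simulates a machine crash once `crash_after` writes were issued in this
// run; pass a value >= items.size() for an uninterrupted run.
// Returns true if the load finished.
bool BulkLoad(FakeDb* db, const std::vector<int>& items, std::size_t n,
              std::size_t crash_after) {
  std::size_t written = 0;
  for (std::size_t i = db->marker; i < items.size(); ++i) {
    if (written == crash_after) {
      db->Crash();
      return false;
    }
    const bool last = (i + 1 == items.size());
    db->Write(items[i], (i + 1) % n == 0 || last, i + 1);
    ++written;
  }
  return true;
}
```

Calling BulkLoad again after a simulated crash resumes at db->marker, so only the writes issued after the last synchronous write are repeated.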

WriteBatch provides an alternative to asynchronous writes. Multiple updates may be placed in the same WriteBatch and applied together using a synchronous write (i.e., write_options.sync is set to true). The extra cost of the synchronous write will be amortized across all of the writes in the batch.

 

Concurrency

A database may only be opened by one process at a time. The leveldb implementation acquires a lock from the operating system to prevent misuse. Within a single process, the same leveldb::DB object may be safely used by multiple concurrent threads.

 

Iteration

The following example demonstrates how to print all key,value pairs in a database.

 

  leveldb::Iterator* it = db->NewIterator(leveldb::ReadOptions());
  for (it->SeekToFirst(); it->Valid(); it->Next()) {
    cout << it->key().ToString() << ": "  << it->value().ToString() << endl;
  }
  assert(it->status().ok());  // Check for any errors found during the scan
  delete it;

The following variation shows how to process just the keys in the range [start,limit):

 

  for (it->Seek(start);
       it->Valid() && it->key().ToString() < limit;
       it->Next()) {
    ...
  }
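The half-open range semantics of that loop can be tried without a database. In the sketch below a std::map stands in for the ordered key space and lower_bound plays the role of Seek(start), positioning at the first key that is greater than or equal to start. KeysInRange is a hypothetical helper, not a leveldb function.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Collects the keys in [start, limit) from an ordered map.
// `start` is included if present; `limit` is always excluded.
std::vector<std::string> KeysInRange(
    const std::map<std::string, std::string>& store,
    const std::string& start, const std::string& limit) {
  std::vector<std::string> keys;
  for (auto it = store.lower_bound(start);
       it != store.end() && it->first < limit; ++it) {
    keys.push_back(it->first);
  }
  return keys;
}
```

If start does not exist as a key, the scan simply begins at the next larger key, which matches how the loop above behaves after Seek.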

You can also process entries in reverse order. (Caveat: reverse iteration may be somewhat slower than forward iteration.)

 

  for (it->SeekToLast(); it->Valid(); it->Prev()) {
    ...
  }

Snapshots

Snapshots provide consistent read-only views over the entire state of the key-value store. ReadOptions::snapshot may be non-NULL to indicate that a read should operate on a particular version of the DB state. If ReadOptions::snapshot is NULL, the read will operate on an implicit snapshot of the current state.

Snapshots typically are created by the DB::GetSnapshot() method:

 

  leveldb::ReadOptions options;
  options.snapshot = db->GetSnapshot();
  ... apply some updates to db ...
  leveldb::Iterator* iter = db->NewIterator(options);
  ... read using iter to view the state when the snapshot was created ...
  delete iter;
  db->ReleaseSnapshot(options.snapshot);

Note that when a snapshot is no longer needed, it should be released using the DB::ReleaseSnapshot interface. This allows the implementation to get rid of state that was being maintained just to support reading as of that snapshot.

A Write operation can also return a snapshot that represents the state of the database just after applying a particular set of updates:

 

  leveldb::Snapshot* snapshot;
  leveldb::WriteOptions write_options;
  write_options.post_write_snapshot = &snapshot;
  leveldb::Status status = db->Write(write_options, ...);
  ... perform other mutations to db ...

  leveldb::ReadOptions read_options;
  read_options.snapshot = snapshot;
  leveldb::Iterator* iter = db->NewIterator(read_options);
  ... read as of the state just after the Write call returned ...
  delete iter;

  db->ReleaseSnapshot(snapshot);

Slice

The return value of the it->key() and it->value() calls above are instances of the leveldb::Slice type. Slice is a simple structure that contains a length and a pointer to an external byte array. Returning a Slice is a cheaper alternative to returning a std::string since we do not need to copy potentially large keys and values. In addition, leveldb methods do not return null-terminated C-style strings since leveldb keys and values are allowed to contain '\0' bytes.

C++ strings and null-terminated C-style strings can be easily converted to a Slice:

 

   leveldb::Slice s1 = "hello";

   std::string str("world");
   leveldb::Slice s2 = str;

A Slice can be easily converted back to a C++ string:

   std::string str = s1.ToString();
   assert(str == std::string("hello"));

Be careful when using Slices since it is up to the caller to ensure that the external byte array into which the Slice points remains live while the Slice is in use. For example, the following is buggy:

 

   leveldb::Slice slice;
   if (...) {
     std::string str = ...;
     slice = str;
   }
   Use(slice);

When the if statement goes out of scope, str will be destroyed and the backing storage for slice will disappear.
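One way to repair the buggy snippet is to declare the owning string in the same scope as the slice. The sketch below uses a minimal stand-in type, View, in place of leveldb::Slice so that it compiles on its own. View and UseFixed are hypothetical names; the real Slice has a richer interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Minimal stand-in for leveldb::Slice (an assumption for illustration):
// a pointer plus a length, with no ownership of the bytes it points at.
struct View {
  const char* data = "";
  std::size_t size = 0;
  View() = default;
  View(const std::string& s) : data(s.data()), size(s.size()) {}
  std::string ToString() const { return std::string(data, size); }
};

// Fixed variant of the buggy snippet: the owning string is declared next
// to the view, so it outlives every use of the view.
std::string UseFixed(bool condition) {
  std::string str;  // backing storage lives until the function returns
  View view;
  if (condition) {
    str = "payload";
    view = str;  // points into str, which is still alive below
  }
  return view.ToString();  // copies the bytes out while str is alive
}
```

The same rule applies to the real Slice: whatever owns the bytes must stay alive, and unmodified, for as long as the slice is read.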

 

Comparators

The preceding examples used the default ordering function for key, which orders bytes lexicographically. You can however supply a custom comparator when opening a database. For example, suppose each database key consists of two numbers and we should sort by the first number, breaking ties by the second number. First, define a proper subclass of leveldb::Comparator that expresses these rules:

 

  class TwoPartComparator : public leveldb::Comparator {
   public:
    // Three-way comparison function:
    //   if a < b: negative result
    //   if a > b: positive result
    //   else: zero result
    int Compare(const leveldb::Slice& a, const leveldb::Slice& b) const {
      int a1, a2, b1, b2;
      ParseKey(a, &a1, &a2);
      ParseKey(b, &b1, &b2);
      if (a1 < b1) return -1;
      if (a1 > b1) return +1;
      if (a2 < b2) return -1;
      if (a2 > b2) return +1;
      return 0;
    }

    // Ignore the following methods for now:
    const char* Name() const { return "TwoPartComparator"; }
    void FindShortestSeparator(std::string*, const leveldb::Slice&) const { }
    void FindShortSuccessor(std::string*) const { }
  };

Now create a database using this custom comparator:

 

  TwoPartComparator cmp;
  leveldb::DB* db;
  leveldb::Options options;
  options.create_if_missing = true;
  options.comparator = &cmp;
  leveldb::Status status = leveldb::DB::Open(options, "/tmp/testdb", &db);
  ...
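The comparator above calls a ParseKey helper that the text does not define, and the key encoding is not specified. The sketch below assumes, purely for illustration, that keys are text of the form "first:second" with decimal integers. It uses std::string in place of leveldb::Slice so it stands alone. ParseKey as written here and CompareTwoPart are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical ParseKey: assumes keys look like "<int>:<int>".
// Returns false if two integers could not be read.
bool ParseKey(const std::string& key, int* first, int* second) {
  return std::sscanf(key.c_str(), "%d:%d", first, second) == 2;
}

// Same three-way logic as TwoPartComparator::Compare: order by the first
// number, then break ties with the second. Unparseable parts count as 0.
int CompareTwoPart(const std::string& a, const std::string& b) {
  int a1 = 0, a2 = 0, b1 = 0, b2 = 0;
  ParseKey(a, &a1, &a2);
  ParseKey(b, &b1, &b2);
  if (a1 != b1) return a1 < b1 ? -1 : +1;
  if (a2 != b2) return a2 < b2 ? -1 : +1;
  return 0;
}
```

Under this assumed encoding the custom order differs from the default bytewise order: "10:1" sorts before "9:2" as bytes, but after it numerically, which is the reason a custom comparator is needed for such keys.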

Backwards compatibility

The result of the comparator's Name method is attached to the database when it is created, and is checked on every subsequent database open. If the name changes, the leveldb::DB::Open call will fail. Therefore, change the name if and only if the new key format and comparison function are incompatible with existing databases, and it is ok to discard the contents of all existing databases.

You can however still gradually evolve your key format over time with a little bit of pre-planning. For example, you could store a version number at the end of each key (one byte should suffice for most uses). When you wish to switch to a new key format (e.g., adding an optional third part to the keys processed by TwoPartComparator), (a) keep the same comparator name (b) increment the version number for new keys (c) change the comparator function so it uses the version numbers found in the keys to decide how to interpret them.
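Steps (b) and (c) might look like the following sketch. The encoding is an assumption made up for this example: version 1 keys are "a:b" followed by the byte 0x01, version 2 keys are "a:b:c" followed by 0x02, and a missing third part is treated as 0. ParsedKey, ParseVersionedKey, and CompareVersioned are hypothetical names, and std::string stands in for leveldb::Slice.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

struct ParsedKey {
  int a = 0;
  int b = 0;
  int c = 0;  // optional third part; stays 0 for version 1 keys
};

// Reads the trailing version byte and interprets the rest accordingly.
ParsedKey ParseVersionedKey(const std::string& key) {
  ParsedKey p;
  if (key.empty()) return p;
  const char version = key.back();
  const std::string body = key.substr(0, key.size() - 1);
  if (version == '\x02') {
    std::sscanf(body.c_str(), "%d:%d:%d", &p.a, &p.b, &p.c);
  } else {
    std::sscanf(body.c_str(), "%d:%d", &p.a, &p.b);  // version 1 layout
  }
  return p;
}

// One comparison function for both versions, so the comparator name can
// stay the same and existing databases remain readable.
int CompareVersioned(const std::string& x, const std::string& y) {
  const ParsedKey p = ParseVersionedKey(x);
  const ParsedKey q = ParseVersionedKey(y);
  if (p.a != q.a) return p.a < q.a ? -1 : +1;
  if (p.b != q.b) return p.b < q.b ? -1 : +1;
  if (p.c != q.c) return p.c < q.c ? -1 : +1;
  return 0;
}
```

Old keys keep their relative order under the new function, which is what makes it safe to retain the comparator name.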

 

Performance

Performance can be tuned by changing the default values of the types defined in leveldb/include/options.h.

 

Block size

leveldb groups adjacent keys together into the same block and such a block is the unit of transfer to and from persistent storage. The default block size is approximately 4096 uncompressed bytes. Applications that mostly do bulk scans over the contents of the database may wish to increase this size. Applications that do a lot of point reads of small values may wish to switch to a smaller block size if performance measurements indicate an improvement. There isn't much benefit in using blocks smaller than one kilobyte, or larger than a few megabytes. Also note that compression will be more effective with larger block sizes.

 

Compression

Each block is individually compressed before being written to persistent storage. Compression is on by default since the default compression method is very fast, and is automatically disabled for uncompressible data. In rare cases, applications may want to disable compression entirely, but should only do so if benchmarks show a performance improvement:

 

  leveldb::Options options;
  options.compression = leveldb::kNoCompression;
  ... leveldb::DB::Open(options, name, ...) ....

Cache

The contents of the database are stored in a set of files in the filesystem and each file stores a sequence of compressed blocks. If options.cache is non-NULL, it is used to cache frequently used uncompressed block contents.

 

  #include "leveldb/include/cache.h"

  leveldb::Options options;
  options.cache = leveldb::NewLRUCache(100 * 1048576);  // 100MB cache
  leveldb::DB* db;
  leveldb::DB::Open(options, name, &db);
  ... use the db ...
  delete db;
  delete options.cache;

Note that the cache holds uncompressed data, and therefore it should be sized according to application level data sizes, without any reduction from compression. (Caching of compressed blocks is left to the operating system buffer cache, or any custom Env implementation provided by the client.)

When performing a bulk read, the application may wish to disable caching so that the data processed by the bulk read does not end up displacing most of the cached contents. A per-iterator option can be used to achieve this:

 

  leveldb::ReadOptions options;
  options.fill_cache = false;
  leveldb::Iterator* it = db->NewIterator(options);
  for (it->SeekToFirst(); it->Valid(); it->Next()) {
    ...
  }

Key Layout

Note that the unit of disk transfer and caching is a block. Adjacent keys (according to the database sort order) will usually be placed in the same block. Therefore the application can improve its performance by placing keys that are accessed together near each other and placing infrequently used keys in a separate region of the key space.

For example, suppose we are implementing a simple file system on top of leveldb. The types of entries we might wish to store are:

 

   filename -> permission-bits, length, list of file_block_ids
   file_block_id -> data

We might want to prefix filename keys with one letter (say '/') and the file_block_id keys with a different letter (say '0') so that scans over just the metadata do not force us to fetch and cache bulky file contents.
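A small sketch of this prefixing scheme follows. It uses a std::map as a stand-in for the sorted key space, and MetadataKey, BlockKey, and CountMetadata are hypothetical helpers. The fixed-width, zero-padded block id is an extra assumption of this sketch, chosen so that bytewise order matches numeric order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>

// Metadata keys start with '/', which sorts just before '0' in ASCII.
std::string MetadataKey(const std::string& filename) {
  return "/" + filename;
}

// Block keys start with '0' followed by a 20-digit zero-padded id.
std::string BlockKey(std::uint64_t block_id) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "0%020llu",
                static_cast<unsigned long long>(block_id));
  return buf;
}

// Counts metadata entries by scanning only the range ["/", "0"), so the
// scan stops before it reaches any block key.
std::size_t CountMetadata(const std::map<std::string, std::string>& store) {
  std::size_t n = 0;
  for (auto it = store.lower_bound("/");
       it != store.end() && it->first < "0"; ++it) {
    ++n;
  }
  return n;
}
```

Because every metadata key sorts before every block key, a metadata scan covers one contiguous region of the key space and never visits the bulky block entries.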

 

Checksums

leveldb associates checksums with all data it stores in the file system. There are two separate controls provided over how aggressively these checksums are verified:

 

  • ReadOptions::verify_checksums may be set to true to force checksum verification of all data that is read from the file system on behalf of a particular read. By default, no such verification is done.

     

  • Options::paranoid_checks may be set to true before opening a database to make the database implementation raise an error as soon as it detects an internal corruption. Depending on which portion of the database has been corrupted, the error may be raised when the database is opened, or later by another database operation. By default, paranoid checking is off so that the database can be used even if parts of its persistent storage have been corrupted.

    If a database is corrupted (perhaps it cannot be opened when paranoid checking is turned on), the leveldb::RepairDB function may be used to recover as much of the data as possible.

     

Approximate Sizes

The GetApproximateSizes method can be used to get the approximate number of bytes of file system space used by one or more key ranges.

 

   leveldb::Range ranges[2];
   ranges[0] = leveldb::Range("a", "c");
   ranges[1] = leveldb::Range("x", "z");
   uint64_t sizes[2];
   db->GetApproximateSizes(ranges, 2, sizes);

The preceding call will set sizes[0] to the approximate number of bytes of file system space used by the key range [a..c) and sizes[1] to the approximate number of bytes used by the key range [x..z).

 

Environment

All file operations (and other operating system calls) issued by the leveldb implementation are routed through a leveldb::Env object. Sophisticated clients may wish to provide their own Env implementation to get better control. For example, an application may introduce artificial delays in the file IO paths to limit the impact of leveldb on other activities in the system.

 

  class SlowEnv : public leveldb::Env {
    .. implementation of the Env interface ...
  };

  SlowEnv env;
  leveldb::Options options;
  options.env = &env;
  leveldb::Status s = leveldb::DB::Open(options, ...);

Porting

leveldb may be ported to a new platform by providing platform specific implementations of the types/methods/functions exported by leveldb/port/port.h. See leveldb/port/port_example.h for more details.

In addition, the new platform may need a new default leveldb::Env implementation. See leveldb/util/env_posix.h for an example.

Other Information

Details about the leveldb implementation may be found in the following documents:
