ZooKeeper Overview

最新推荐文章于 2025-10-21 22:09:11 发布

原创最新推荐文章于 2025-10-21 22:09:11 发布 · 909 阅读

0 ·

CC 4.0 BY-SA版权

文章标签：

#each #database #components #structure #session #numbers

ZooKeeper是一款高性能的分布式协调服务，通过内存中的复制数据库实现数据一致性。它支持简单的API，包括创建、删除节点等操作，并能为分布式应用提供有序、快速的数据访问。其数据模型基于层次化的命名空间，能够支持临时节点和条件更新。

Design Goals

ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchal namespace

ZooKeeperdata is kept in-memory, which means ZooKeeper can achieve high throughput and low latency numbers.

ZooKeeper is replicated. Like the distributed processes it coordinates, ZooKeeper itself is intended to be replicated overa sets of hosts called an ensemble.

ZooKeeper Service

Clients connect(TCP) to a single ZooKeeper server. If the TCP connection to the server breaks, the client will connect to a different server.

ZooKeeper is ordered. ZooKeeper stamps each update with a number that reflects the order of all ZooKeeper transactions.

ZooKeeper is fast. It is especially fast in "read-dominant" workloads. ZooKeeper applications run on thousands of machines, and it performs best where reads are more common than writes, at ratios of around 10:1.

Data model and the hierarchical namespace

ZooKeeper's Hierarchical Namespace

Nodes and ephemeral nodes

Unlike is standard file systems, each node in a ZooKeeper namespace can have data associated with it as well as children.

Znodes maintain a stat structure that includes

version numbers for data changes,

ACL changes,

and timestamps, to allow cache validations and coordinated updates. Each time a znode's data changes, the version number increases.

For instance, whenever a client retrieves data it also receives the version of the data.

The data stored at each znode in a namespace is read and written atomically.

ZooKeeper also has the notion of ephemeral(临时） nodes. These znodes exists as long as the session that created the znode is active. When the session ends the znode is deleted.

Conditional updates and watches

Clients can set a watch on a znodes. A watch will be triggered and removed when the znode changes. When a watch is triggered the client receives a packet saying that the znode has changed. And if the connection between the client and one of the Zoo Keeper servers is broken, the client will receive a local notification. These can be used to [tbd].

Simple API

it supports only these operations:

create

creates a node at a location in the tree

delete

deletes a node

exists

tests if a node exists at a location

get data

reads the data from a node

set data

writes data to a node

get children

retrieves a list of children of a node

sync

waits for data to be propagated //

Implementation

ZooKeeper Components shows the high-level components of the ZooKeeper service.

ZooKeeper Components

The replicated database is an in-memory database containing the entire data tree. Updates are logged to disk for recoverability, andwrites are serialized to disk before they are applied to the in-memory database.

As part of the agreement protocol all write requests from clients are forwarded to a single server, called the leader. The rest of the ZooKeeper servers, called followers, receive message proposals from the leader and agree upon message delivery. The messaging layer takes care of replacing leaders on failures and syncing followers with leaders.