Cassandra backup

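This article describes how to take online data snapshots with Cassandra's nodetool snapshot command, and how to back up and restore those snapshots. It also discusses a memory allocation error that can occur with some operating system/JVM combinations and offers workarounds.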

Backing up data

Cassandra can snapshot data while online using nodetool snapshot. You can then back up those snapshots using any desired system, although leaving them where they are is probably the option that makes the most sense on large clusters. nodetool snapshot triggers a node-wide flush, so all data written before the execution of the snapshot command is contained within the snapshot.
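As a minimal sketch of the snapshot step, the helpers below tag a snapshot with a timestamp so it is easy to find later. The `-t` tag option and the trailing keyspace argument follow the nodetool syntax of recent Cassandra releases (very old releases used a different form); the function names and the keyspace name "demo" are assumptions made for illustration.

```shell
#!/bin/sh
# Build a timestamped snapshot tag such as backup-20240131-235959.
snapshot_tag() {
    printf 'backup-%s\n' "$(date +%Y%m%d-%H%M%S)"
}

# Snapshot a single keyspace under the given tag.
# Usage: take_snapshot KEYSPACE TAG
take_snapshot() {
    nodetool snapshot -t "$2" "$1"
}

# Example (the keyspace name "demo" is an assumption):
#   take_snapshot demo "$(snapshot_tag)"
```

Using a tag rather than the default timestamp-only name makes it simpler to locate, copy, or clear one specific snapshot afterwards.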

With some combinations of operating system and JVM, you may receive an error during snapshotting about being unable to create a process, such as this one on Linux:

Exception in thread "main" java.io.IOException: Cannot run program "ln": java.io.IOException: error=12, Cannot allocate memory

This is caused by the operating system trying to allocate the child "ln" process a memory space as large as the parent process (the Cassandra server), even though the child is not going to use it. So if you have a machine with 8 GB of RAM and no swap, and you gave 6 GB to the Cassandra server, the snapshot will fail because the operating system wants 12 GB of virtual memory before allowing the process to be created.
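The arithmetic above can be sketched as a small check. This is a deliberately simplified model of the rule described in the text (the fork needs about twice the parent's memory to fit in RAM plus swap); the real kernel heuristic is more nuanced, and the function name `fork_fits` is hypothetical.

```shell
#!/bin/sh
# Simplified model from the text: forking needs roughly twice the parent's
# memory to be available as RAM + swap when the kernel will not overcommit.
# Usage: fork_fits PARENT_GB RAM_GB SWAP_GB
fork_fits() {
    parent_gb=$1
    ram_gb=$2
    swap_gb=$3
    [ $(( parent_gb * 2 )) -le $(( ram_gb + swap_gb )) ]
}

# The scenario from the text: 6 GB Cassandra process, 8 GB RAM, no swap.
if fork_fits 6 8 0; then
    echo "fork should succeed"
else
    echo "fork may fail with error=12"
fi

# On Linux, show the current overcommit policy
# (0 = heuristic, 1 = always overcommit, 2 = strict accounting).
if [ -r /proc/sys/vm/overcommit_memory ]; then
    echo "vm.overcommit_memory = $(cat /proc/sys/vm/overcommit_memory)"
fi
```

In this model, adding 4 GB of swap to the same machine is enough for the check to pass, which is why a temporary swap file works as a workaround.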

This error can be worked around in any of the following ways:

  • dropping the jna.jar file into Cassandra's lib folder (requires at least Cassandra 0.6.6)

OR

  • creating a swap file, snapshotting, then removing the swap file

OR

  • turning on "memory overcommit"
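The swap file and overcommit workarounds listed above can be sketched as follows. This is an illustration under stated assumptions rather than a tested procedure: the swap file path, its size, and the helper names are hypothetical, the commands need root, and `bs=1M` assumes GNU dd. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
#!/bin/sh
# Print the command when DRY_RUN is set; otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Snapshot with a temporary swap file (requires root).
# The default path and size are assumptions; adjust them to the machine.
# Usage: snapshot_with_temp_swap [SWAPFILE] [SIZE_MB]
snapshot_with_temp_swap() {
    swapfile=${1:-/var/tmp/cassandra-snapshot.swap}
    size_mb=${2:-8192}
    run dd if=/dev/zero of="$swapfile" bs=1M count="$size_mb" &&
        run chmod 600 "$swapfile" &&
        run mkswap "$swapfile" &&
        run swapon "$swapfile" &&
        run nodetool snapshot
    status=$?
    # Always try to remove the swap file, even if an earlier step failed.
    run swapoff "$swapfile"
    run rm -f "$swapfile"
    return "$status"
}

# Alternative: tell the kernel to always overcommit memory (requires root).
enable_overcommit() {
    run sysctl -w vm.overcommit_memory=1
}
```

The sysctl change made this way does not survive a reboot; whether to make it permanent is a separate decision, since overcommit affects every process on the machine.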

To restore a snapshot:

  1. shut down the node
  2. clear out the old commitlog and sstables
  3. move the sstables from the snapshot location to the live data directory.

Consistent backups

You can get an eventually consistent backup by snapshotting all nodes. No individual node's backup is guaranteed to be consistent, but if you restore from those snapshots, clients will get eventually consistent behavior as usual.
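Snapshotting every node under one shared tag can be sketched as a simple loop. It assumes that nodetool on the machine running the script can reach each node's JMX port through the `-h` option; the host addresses, the tag, and the function names are hypothetical. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
#!/bin/sh
# Print the command when DRY_RUN is set; otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Take a snapshot with the same tag on every listed node.
# Usage: snapshot_all_nodes TAG HOST...
snapshot_all_nodes() {
    tag=$1
    shift
    failed=0
    for host in "$@"; do
        if ! run nodetool -h "$host" snapshot -t "$tag"; then
            echo "snapshot failed on $host" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Example (addresses are placeholders):
#   snapshot_all_nodes nightly 10.0.0.1 10.0.0.2 10.0.0.3
```

The loop keeps going after a failure so that one unreachable node does not prevent the others from being snapshotted; the non-zero return value signals that the set is incomplete.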

There is no such thing as a consistent view of the data in the strict sense, except in the trivial case of writes with consistency level = ALL.

-------------------------------test---------------------------------

restore:
1. stop the node
pgrep -u `whoami` -f cassandra | xargs kill -9
2. clean the commitlog and sstables
rm -f /var/lib/cassandra/commitlog/*
rm -f /dir/*
3. copy the snapshot (and any incremental backups) into the live data directory
cp /var/lib/cassandra/data/demo/snapshots/oxoxooxx/* /var/lib/cassandra/data/demo/
cp /var/lib/cassandra/data/demo/backups/* /var/lib/cassandra/data/demo/
4. start the node and connect with the CLI
./cassandra
./cassandra-cli -h 192.168.100.110 -p 9160


From the "ITPUB blog", link: http://blog.itpub.net/23937368/viewspace-1051061/. If you repost this article, please credit the source; otherwise legal action may be pursued.
