MongoDB Replica Sets Architecture (Automatic Failover / Read-Write Separation in Practice)

This article walks through deploying a MongoDB replica set and testing failover, covering environment setup, configuration and initialization, and the failover process, showing how a replica set provides data consistency and high availability.


Reposted from:

http://www.itnose.net/detail/6193518.html


Note: parts of this article come from the MongoDB in Action series written by Hongwan (红丸).

1. Introduction

MongoDB achieves failover and redundancy through asynchronous replication across multiple machines. At any given moment only one machine accepts write operations, which is what gives MongoDB its data-consistency guarantee; the server holding the primary role can offload read operations to the secondaries (see the previous two articles on Replica Set membership for details).

MongoDB offers two high-availability approaches:

  • Master-Slave replication: start one server with the -master parameter and the other with the -slave and -source parameters to enable synchronization. Recent MongoDB releases no longer recommend this scheme; the official documentation carries the following warning:

    IMPORTANT

    Replica sets replace master-slave replication for most use cases. If possible, use replica sets rather than master-slave replication for all new production deployments. This documentation remains to support legacy deployments and for archival purposes only.

         In other words, for most use cases Replica Sets have replaced master-slave replication.

  • Replica Set: introduced in MongoDB 1.6, Replica Sets are more powerful than the earlier Replication feature, adding automatic failover and automatic recovery of member nodes. Data is kept identical across members, which greatly reduces maintenance effort. Auto-sharding has explicitly stated that it does not support replica pairs, so Replica Sets, with fully automatic failover, are the recommended choice.
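As a side note on how clients address a replica set: the standard MongoDB connection URI lists several members plus the set name, so the driver can discover the live primary and retarget writes after a failover. A minimal sketch in Python, just building the URI string (the hosts match this article's deployment):

```python
def replica_set_uri(hosts, set_name, db="test"):
    """Build a MongoDB connection URI listing every member; a driver
    given this URI discovers the current primary on its own."""
    return "mongodb://%s/%s?replicaSet=%s" % (",".join(hosts), db, set_name)

uri = replica_set_uri(["localhost:28010", "localhost:28011", "localhost:28012"], "rs1")
print(uri)
```

With the three members used later in this article, this prints a single URI a client would use instead of pointing at one fixed server.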

    2. Practice Architecture
    A MongoDB Replica Set behaves very much like a cluster, and for practical purposes you can treat it as one: when one node fails, another node takes over the workload immediately with no downtime. This walkthrough uses the most common three-member architecture.
    (Figure: three-member replica set architecture)

    3. Deploying the Replica Set
    The following steps implement this architecture step by step.
  • Environment
         OS: CentOS 6.4 64-bit (a single virtual machine)
         MongoDB version: 2.6
  • Steps
    Create the data directories:
    [root@localhost mongodb]# mkdir -p r0
    [root@localhost mongodb]# mkdir -p r1
    [root@localhost mongodb]# mkdir -p r2
    Create the log directory:
    [root@localhost mongodb]# mkdir -p log
    Create the key files that hold the cluster's shared secret. If the keyfile contents differ between instances, mongod will refuse to start.
    [root@localhost mongodb]# mkdir -p key
    [root@localhost mongodb]# echo "this is rs1 super secret key">key/r0
    [root@localhost mongodb]# echo "this is rs1 super secret key">key/r1
    [root@localhost mongodb]# echo "this is rs1 super secret key">key/r2
    [root@localhost mongodb]# chmod 600 key/r*
    [root@localhost mongodb]# 
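The "identical content" requirement above can be checked mechanically. A small Python sketch that recreates the article's three keyfiles in a temporary directory and verifies they match (illustrative only; in practice you would compare the real files under /usr/local/mongodb/key):

```python
import os
import tempfile

def keyfiles_consistent(paths):
    """True when every keyFile has byte-identical content; mongod aborts
    startup for a member whose keyfile differs from the others."""
    contents = {open(p, "rb").read() for p in paths}
    return len(contents) == 1

# Recreate the article's three keyfiles and check them.
d = tempfile.mkdtemp()
paths = [os.path.join(d, n) for n in ("r0", "r1", "r2")]
for p in paths:
    with open(p, "w") as f:
        f.write("this is rs1 super secret key\n")
print(keyfiles_consistent(paths))
```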
    Start the three instances:
    [root@localhost bin]# ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r0 --fork --port 28010 --dbpath=/usr/local/mongodb/r0 --logpath=/usr/local/mongodb/log/r0.log --logappend
    about to fork child process, waiting until server is ready for connections.
    forked process: 2545
    [root@localhost bin]# ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r1 --fork --port 28011 --dbpath=/usr/local/mongodb/r1 --logpath=/usr/local/mongodb/log/r1.log --logappend
    about to fork child process, waiting until server is ready for connections.
    forked process: 2596
    [root@localhost bin]# ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r2 --fork --port 28012 --dbpath=/usr/local/mongodb/r2 --logpath=/usr/local/mongodb/log/r2.log --logappend
    about to fork child process, waiting until server is ready for connections.
    forked process: 2602
    Note: the three instances listen on ports 28010, 28011, and 28012 and store their data in r0, r1, and r2 respectively.
    Configure and initialize the Replica Set:
    [root@localhost bin]# ./mongo --port 28010
    MongoDB shell version: 2.6.6
    connecting to: 127.0.0.1:28010/test
    > config_rs1={_id:"rs1",members:[{_id:0,host:'localhost:28010',priority:1},{_id:1,host:'localhost:28011'},{_id:2,host:'localhost:28012'}]}
    {
            "_id" : "rs1",
            "members" : [
                    {
                            "_id" : 0,
                            "host" : "localhost:28010",
                            "priority" : 1
                    },
                    {
                            "_id" : 1,
                            "host" : "localhost:28011"
                    },
                    {
                            "_id" : 2,
                            "host" : "localhost:28012"
                    }
            ]
    }
    > 
    Note: this specifies the host and port of each node. The priority field influences primary election; note that every member defaults to priority 1, so setting priority:1 on 28010 does not by itself guarantee it becomes primary. To reliably favor 28010 you would give it a priority higher than the other members.
    > rs.initiate(config_rs1);
    {
            "info" : "Config now saved locally.  Should come online in about a minute.",
            "ok" : 1
    }
    > 
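The config document passed to rs.initiate can also be assembled programmatically before being handed to the shell or a driver. A sketch in Python (the favored_priority value of 2 is my own illustrative choice, since all members default to priority 1):

```python
def make_rs_config(set_name, hosts, favored=0, favored_priority=2):
    """Build a replSetInitiate-style config document. To actually favor
    one member in elections, its priority must be raised above the
    default of 1 shared by all other members."""
    members = [{"_id": i, "host": h} for i, h in enumerate(hosts)]
    members[favored]["priority"] = favored_priority
    return {"_id": set_name, "members": members}

cfg = make_rs_config("rs1", ["localhost:28010", "localhost:28011", "localhost:28012"])
print(cfg["members"][0])
```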
    
    Check the replica set status:
    rs1:OTHER>  rs.status();
    {
            "set" : "rs1",
            "date" : ISODate("2015-01-16T03:10:41Z"),
            "myState" : 2,
            "members" : [
                    {
                            "_id" : 0,
                            "name" : "localhost:28010",
                            "health" : 1,
                            "state" : 2,
                            "stateStr" : "SECONDARY",
                            "uptime" : 260,
                            "optime" : Timestamp(1421377833, 1),
                            "optimeDate" : ISODate("2015-01-16T03:10:33Z"),
                            "self" : true
                    },
                    {
                            "_id" : 1,
                            "name" : "localhost:28011",
                            "health" : 1,
                            "state" : 5,
                            "stateStr" : "STARTUP2",
                            "uptime" : 8,
                            "optime" : Timestamp(0, 0),
                            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                            "lastHeartbeat" : ISODate("2015-01-16T03:10:39Z"),
                            "lastHeartbeatRecv" : ISODate("2015-01-16T03:10:39Z"),
                            "pingMs" : 0,
                            "lastHeartbeatMessage" : "initial sync need a member to be primary or secondary to do our initial sync"
                    },
                    {
                            "_id" : 2,
                            "name" : "localhost:28012",
                            "health" : 1,
                            "state" : 5,
                            "stateStr" : "STARTUP2",
                            "uptime" : 8,
                            "optime" : Timestamp(0, 0),
                            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                            "lastHeartbeat" : ISODate("2015-01-16T03:10:39Z"),
                            "lastHeartbeatRecv" : ISODate("2015-01-16T03:10:40Z"),
                            "pingMs" : 0,
                            "lastHeartbeatMessage" : "initial sync need a member to be primary or secondary to do our initial sync"
                    }
            ],
            "ok" : 1
    }

    Note: why does the node named localhost:28010 show stateStr SECONDARY? Look at the state and lastHeartbeatMessage fields above together with the message returned by rs.initiate: "Should come online in about a minute." Initialization takes roughly a minute, but the command returns immediately, so calling rs.status() right away catches the set mid-initialization: the primary and secondaries are not yet settled, and the members are in state 2 or 5. For the members in state 5, lastHeartbeatMessage shows that the initial sync is still in progress.
    Run rs.status() again a short while later:
    rs1:PRIMARY> rs.status();
    {
            "set" : "rs1",
            "date" : ISODate("2015-01-16T03:13:09Z"),
            "myState" : 1,
            "members" : [
                    {
                            "_id" : 0,
                            "name" : "localhost:28010",
                            "health" : 1,
                            "state" : 1,
                            "stateStr" : "PRIMARY",
                            "uptime" : 408,
                            "optime" : Timestamp(1421377833, 1),
                            "optimeDate" : ISODate("2015-01-16T03:10:33Z"),
                            "electionTime" : Timestamp(1421377841, 1),
                            "electionDate" : ISODate("2015-01-16T03:10:41Z"),
                            "self" : true
                    },
                    {
                            "_id" : 1,
                            "name" : "localhost:28011",
                            "health" : 1,
                            "state" : 2,
                            "stateStr" : "SECONDARY",
                            "uptime" : 156,
                            "optime" : Timestamp(1421377833, 1),
                            "optimeDate" : ISODate("2015-01-16T03:10:33Z"),
                            "lastHeartbeat" : ISODate("2015-01-16T03:13:09Z"),
                            "lastHeartbeatRecv" : ISODate("2015-01-16T03:13:07Z"),
                            "pingMs" : 1,
                            "syncingTo" : "localhost:28010"
                    },
                    {
                            "_id" : 2,
                            "name" : "localhost:28012",
                            "health" : 1,
                            "state" : 2,
                            "stateStr" : "SECONDARY",
                            "uptime" : 156,
                            "optime" : Timestamp(1421377833, 1),
                            "optimeDate" : ISODate("2015-01-16T03:10:33Z"),
                            "lastHeartbeat" : ISODate("2015-01-16T03:13:08Z"),
                            "lastHeartbeatRecv" : ISODate("2015-01-16T03:13:09Z"),
                            "pingMs" : 0,
                            "syncingTo" : "localhost:28010"
                    }
            ],
            "ok" : 1
    }
    The Replica Set is now fully initialized and all nodes are healthy: state=1 is the PRIMARY and state=2 marks the SECONDARY nodes. The syncingTo field shows that both secondaries replicate from port 28010.
    Field reference:
    _id: unique member id
    name: hostname and port
    health: health status; 1 means healthy
    state: member state; 1 = PRIMARY, 2 = SECONDARY, others (e.g. 5 = performing initial sync) are covered later
    stateStr: human-readable state
    optime: time of the last applied operation
    optimeDate: date of the last applied operation
    electionTime: time the member was elected
    electionDate: date the member was elected
    lastHeartbeat: time of the last heartbeat sent
    lastHeartbeatRecv: time of the last heartbeat received
    pingMs: heartbeat round-trip time in milliseconds
    syncingTo: the member this node replicates from
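Monitoring scripts often boil an rs.status() document down to a per-member state map using exactly these fields. A minimal sketch (the status dict below is hand-built to mirror the output above, not fetched from a live server):

```python
# Map of the numeric state codes seen in this article to their names.
STATE_NAMES = {1: "PRIMARY", 2: "SECONDARY", 5: "STARTUP2",
               8: "(not reachable/healthy)"}

def summarize(status):
    """Condense an rs.status()-style document into {member name: state}."""
    return {m["name"]: STATE_NAMES.get(m["state"], str(m["state"]))
            for m in status["members"]}

status = {"set": "rs1", "members": [
    {"name": "localhost:28010", "state": 1},
    {"name": "localhost:28011", "state": 2},
    {"name": "localhost:28012", "state": 2},
]}
print(summarize(status))
```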

    You can also inspect the Replica Set with isMaster:
    rs1:PRIMARY> rs.isMaster();
    {
            "setName" : "rs1",
            "setVersion" : 1,
            "ismaster" : true,
            "secondary" : false,
            "hosts" : [
                    "localhost:28010",
                    "localhost:28012",
                    "localhost:28011"
            ],
            "primary" : "localhost:28010",
            "me" : "localhost:28010",
            "maxBsonObjectSize" : 16777216,
            "maxMessageSizeBytes" : 48000000,
            "maxWriteBatchSize" : 1000,
            "localTime" : ISODate("2015-01-16T02:38:58.479Z"),
            "maxWireVersion" : 2,
            "minWireVersion" : 0,
            "ok" : 1
    }
    rs1:PRIMARY> 

    3.1 The replication oplog
    A MongoDB Replica Set propagates write operations through a log called the oplog, covered in an earlier chapter. oplog.rs is a fixed-size capped collection in the local database that records the Replica Set's write operations. On 64-bit MongoDB the default oplog is fairly large, up to 5% of free disk space; its size can be changed with mongod's --oplogSize parameter.
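The "5% of free disk" rule can be sketched numerically. The approximation below uses the floor and cap documented for 64-bit MongoDB of this era (990 MB floor, 50 GB cap); the exact rules vary by platform and version, so treat it as an estimate:

```python
def default_oplog_size_mb(free_disk_bytes):
    """Approximate default oplog size: 5% of free disk space,
    clamped to a 990 MB floor and a 50 GB cap."""
    size_mb = free_disk_bytes * 0.05 / (1024 * 1024)
    return min(max(size_mb, 990), 50 * 1024)

# A small VM like the article's lands on the 990 MB floor, which is why
# printReplicationInfo reports "configured oplog size: 990MB" below.
print(default_oplog_size_mb(15 * 1024**3))   # 15 GB free disk
```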
    rs1:PRIMARY> use local
    switched to db local
    rs1:PRIMARY> show collections
    me
    oplog.rs
    startup_log
    system.indexes
    system.replset
    rs1:PRIMARY> db.oplog.rs.find();
    { "ts" : Timestamp(1421375729, 1), "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } }
    rs1:PRIMARY> 

    Field reference:
    ts: timestamp of the operation
    op: operation type:
    i: insert
    d: delete
    u: update
    n: no-op (e.g. the "initiating set" entry above)
    ns: namespace, i.e. the collection the operation targets
    o: the document content
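Secondaries stay in sync by replaying oplog entries in order. The sketch below replays simplified entries of the types listed above onto an in-memory dict standing in for a collection; real oplog entries carry more fields (ts, h, v) and richer update semantics, so this is illustrative only:

```python
def replay(entry, coll):
    """Replay one simplified oplog-style entry onto a dict keyed by _id."""
    op = entry["op"]
    if op == "i":                                   # insert
        coll[entry["o"]["_id"]] = dict(entry["o"])
    elif op == "u":                                 # update; o2 identifies the doc
        coll[entry["o2"]["_id"]].update(entry["o"].get("$set", {}))
    elif op == "d":                                 # delete
        coll.pop(entry["o"]["_id"], None)
    # op == "n" is a no-op, like the {"msg": "initiating set"} entry above

coll = {}
for e in [
    {"op": "n", "ns": "", "o": {"msg": "initiating set"}},
    {"op": "i", "ns": "test.student", "o": {"_id": 1, "name": "zhangsan", "age": 20}},
    {"op": "u", "ns": "test.student", "o2": {"_id": 1}, "o": {"$set": {"age": 21}}},
]:
    replay(e, coll)
print(coll[1])
```

Because replaying the same ordered log yields the same end state, every secondary that has applied the full oplog holds data identical to the primary's.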
    Inspect the master's oplog metadata:
    rs1:PRIMARY> db.printReplicationInfo();
    configured oplog size:   990MB
    log length start to end: 0secs (0hrs)
    oplog first event time:  Fri Jan 16 2015 10:35:29 GMT+0800 (CST)
    oplog last event time:   Fri Jan 16 2015 10:35:29 GMT+0800 (CST)
    now:                     Fri Jan 16 2015 10:45:18 GMT+0800 (CST)
    rs1:PRIMARY> 
    Field reference:
    configured oplog size: the configured size of the oplog file.
    log length start to end: the time span the oplog covers.
    oplog first event time: when the first logged operation occurred.
    oplog last event time: when the last logged operation occurred.
    now: the current time.
    Check the secondaries' sync status:
    rs1:PRIMARY> db.printSlaveReplicationInfo();
    source: localhost:28011
            syncedTo: Thu Jan 01 1970 08:00:00 GMT+0800 (CST)
            1421375729 secs (394826.59 hrs) behind the primary 
    source: localhost:28012
            syncedTo: Thu Jan 01 1970 08:00:00 GMT+0800 (CST)
            1421375729 secs (394826.59 hrs) behind the primary 
    rs1:PRIMARY> 
    Field reference:
    source: host and port of the secondary.
    syncedTo: how far the secondary has synced and how far it lags behind the primary. (The epoch dates above simply mean these members had not yet applied any operation.)
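The "behind the primary" figure is just the gap between the primary's last optime and the secondary's syncedTo position. A sketch using epoch seconds, reproducing the large number shown above (the secondaries report syncedTo at the epoch, i.e. second 0, because they had not applied anything yet):

```python
def lag_hours(primary_optime_secs, synced_to_secs):
    """Replication lag in hours, as printed by
    db.printSlaveReplicationInfo(): primary optime minus syncedTo."""
    return (primary_optime_secs - synced_to_secs) / 3600.0

# Primary optime from the article's oplog entry, secondary at the epoch:
print(round(lag_hours(1421375729, 0), 2))
```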

    3.2 Replica set configuration
    Besides the oplog collection, the local database contains a collection that stores the replica set configuration: system.replset.
    rs1:PRIMARY> db.system.replset.find();
    { "_id" : "rs1", "version" : 1, "members" : [ { "_id" : 0, "host" : "localhost:28010" }, { "_id" : 1, "host" : "localhost:28011" }, { "_id" : 2, "host" : "localhost:28012" } ] }
    rs1:PRIMARY> 
    
    This collection exposes the Replica Set configuration; you can also run rs.conf() on any member to view it.


    3.3 Replica Set tests
    Write and read tests
    Try inserting data through ports 28010, 28011, and 28012 in turn.
    On port 28010:
    [root@localhost bin]# ./mongo --port 28010
    MongoDB shell version: 2.6.6
    connecting to: 127.0.0.1:28010/test
    rs1:PRIMARY> db.student.insert({name:"zhangsan",age:20});
    WriteResult({ "nInserted" : 1 })
    rs1:PRIMARY> db.student.find();
    { "_id" : ObjectId("54b87ca7f663c819d621d590"), "name" : "zhangsan", "age" : 20 }
    rs1:PRIMARY> 
    On port 28011:
    [root@localhost bin]# ./mongo --port 28011
    MongoDB shell version: 2.6.6
    connecting to: 127.0.0.1:28011/test
    rs1:SECONDARY> show collections
    2015-01-16T11:22:54.517+0800 error: { "$err" : "not master and slaveOk=false", "code" : 13435 } at src/mongo/shell/query.js:131
    The query fails: by default a secondary cannot serve reads.
    Make the secondary readable with setSlaveOk():
    rs1:SECONDARY> db.getMongo().setSlaveOk();
    rs1:SECONDARY> show collections;
    student
    system.indexes
    rs1:SECONDARY> 
    Queries now work.
    Note that after connecting to a mongod the prompt changes to rs1:PRIMARY or rs1:SECONDARY, telling you whether you are on the primary or a secondary of the rs1 replica set.
    Query the student collection:
    rs1:SECONDARY> db.student.find();
    { "_id" : ObjectId("54b883fb7bd891605d9c300f"), "name" : "zhangsan", "age" : 20 }
    rs1:SECONDARY> 
    On port 28012:
    [root@localhost bin]# ./mongo --port 28012
    MongoDB shell version: 2.6.6
    connecting to: 127.0.0.1:28012/test
    rs1:SECONDARY> show collections
    2015-01-16T11:27:04.747+0800 error: { "$err" : "not master and slaveOk=false", "code" : 13435 } at src/mongo/shell/query.js:131
    rs1:SECONDARY> db.getMongo().setSlaveOk();
    rs1:SECONDARY> show collections
    student
    system.indexes
    rs1:SECONDARY> db.student.find();
    { "_id" : ObjectId("54b883fb7bd891605d9c300f"), "name" : "zhangsan", "age" : 20 }
    rs1:SECONDARY> 
    Attempt a write on port 28011:
    [root@localhost bin]# ./mongo --port 28011
    MongoDB shell version: 2.6.6
    connecting to: 127.0.0.1:28011/test
    rs1:SECONDARY> db.student.insert({name:"lisi",age:20});
    WriteResult({ "writeError" : { "code" : undefined, "errmsg" : "not master" } })
    The "not master" error shows that writes are rejected on a secondary, consistent with the Replica Set architecture explained in detail in the previous two chapters.
    Port 28012 behaves the same way. This confirms that in a Replica Set only the PRIMARY accepts writes; a SECONDARY can at most serve reads, and even that requires enabling db.getMongo().setSlaveOk() first.
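The routing rule demonstrated above can be summarized in a toy dispatcher: writes must reach the PRIMARY, and reads may hit a SECONDARY only after it has been marked readable (setSlaveOk() in the 2.6 shell; drivers express this as a read preference). The member list and the slave_ok flag below are illustrative:

```python
def route(operation, members):
    """Toy request router for a replica set."""
    if operation == "write":
        # Writes have exactly one legal target: the primary.
        return next(m["name"] for m in members if m["state"] == "PRIMARY")
    # Reads may also go to secondaries that were explicitly made readable.
    readable = [m for m in members
                if m["state"] == "PRIMARY" or m.get("slave_ok")]
    return readable[-1]["name"]   # naive pick; real drivers weigh latency

members = [
    {"name": "localhost:28010", "state": "PRIMARY"},
    {"name": "localhost:28011", "state": "SECONDARY", "slave_ok": True},
    {"name": "localhost:28012", "state": "SECONDARY", "slave_ok": True},
]
print(route("write", members), route("read", members))
```

This is the essence of read-write separation: the write path is fixed to one node while the read path fans out, at the cost of possibly reading slightly stale data from a lagging secondary.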

    Failover
    The key improvement of replica sets over traditional Master-Slave is automatic failover: if one member of the set goes down, the remaining members elect a new PRIMARY on their own. To demonstrate, stop the current PRIMARY on port 28010 with kill -2 <PID> (SIGINT, which shuts mongod down cleanly):
    bye
    [root@localhost bin]# ps aux|grep mongod
    root      6658  0.8  3.7 3175956 37508 ?       Sl   11:06   0:12 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r0 --fork --port 28010 --dbpath=/usr/local/mongodb/r0 --logpath=/usr/local/mongodb/log/r0.log --logappend
    root      7461  0.7  3.7 3144172 37764 ?       Sl   11:06   0:11 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r1 --fork --port 28011 --dbpath=/usr/local/mongodb/r1 --logpath=/usr/local/mongodb/log/r1.log --logappend
    root     28166  0.6  3.8 3144152 38520 ?       Sl   11:10   0:08 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r2 --fork --port 28012 --dbpath=/usr/local/mongodb/r2 --logpath=/usr/local/mongodb/log/r2.log --logappend
    root     30833  0.0  0.0 103244   832 pts/1    S+   11:31   0:00 grep mongod
    [root@localhost bin]# kill -2 6658
    [root@localhost bin]# ps aux|grep mongod
    root      7461  0.7  3.7 3158520 37960 ?       Sl   11:06   0:11 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r1 --fork --port 28011 --dbpath=/usr/local/mongodb/r1 --logpath=/usr/local/mongodb/log/r1.log --logappend
    root     28166  0.6  3.8 3154396 38616 ?       Sl   11:10   0:08 ./mongod --replSet rs1 --keyFile=/usr/local/mongodb/key/r2 --fork --port 28012 --dbpath=/usr/local/mongodb/r2 --logpath=/usr/local/mongodb/log/r2.log --logappend
    root     30869  0.0  0.0 103244   832 pts/1    S+   11:31   0:00 grep mongod
    [root@localhost bin]# 
    Now connect to the mongod on port 28011 and check the replica set status:
    rs1:PRIMARY> rs.status()
    {
            "set" : "rs1",
            "date" : ISODate("2015-01-16T03:33:15Z"),
            "myState" : 1,
            "members" : [
                    {
                            "_id" : 0,
                            "name" : "localhost:28010",
                            "health" : 0,
                            "state" : 8,
                            "stateStr" : "(not reachable/healthy)",
                            "uptime" : 0,
                            "optime" : Timestamp(1421378555, 1),
                            "optimeDate" : ISODate("2015-01-16T03:22:35Z"),
                            "lastHeartbeat" : ISODate("2015-01-16T03:33:14Z"),
                            "lastHeartbeatRecv" : ISODate("2015-01-16T03:31:50Z"),
                            "pingMs" : 0
                    },
                    {
                            "_id" : 1,
                            "name" : "localhost:28011",
                            "health" : 1,
                            "state" : 1,
                            "stateStr" : "PRIMARY",
                            "uptime" : 1608,
                            "optime" : Timestamp(1421378555, 1),
                            "optimeDate" : ISODate("2015-01-16T03:22:35Z"),
                            "electionTime" : Timestamp(1421379114, 1),
                            "electionDate" : ISODate("2015-01-16T03:31:54Z"),
                            "self" : true
                    },
                    {
                            "_id" : 2,
                            "name" : "localhost:28012",
                            "health" : 1,
                            "state" : 2,
                            "stateStr" : "SECONDARY",
                            "uptime" : 1358,
                            "optime" : Timestamp(1421378555, 1),
                            "optimeDate" : ISODate("2015-01-16T03:22:35Z"),
                            "lastHeartbeat" : ISODate("2015-01-16T03:33:15Z"),
                            "lastHeartbeatRecv" : ISODate("2015-01-16T03:33:13Z"),
                            "pingMs" : 0,
                            "lastHeartbeatMessage" : "syncing to: localhost:28011",
                            "syncingTo" : "localhost:28011"
                    }
            ],
            "ok" : 1
    }
    rs1:PRIMARY> 
    Port 28010 is now in state 8, described as not reachable, with health 0, while 28011 has moved to state 1, PRIMARY. The architecture now looks like this:
    (Figure: replica set after failover, with 28011 as the new primary)
    The test shows that when the 28010 service died, the set automatically elected 28011 as the new PRIMARY; this failover mechanism greatly improves system stability.
    Writes can now be issued on 28011, while 28012 still serves only reads (not shown here).
    rs1:PRIMARY> use test
    switched to db test
    rs1:PRIMARY> db.student.insert({name:"lisi",age:20});
    WriteResult({ "nInserted" : 1 })
    rs1:PRIMARY> db.student.find();
    { "_id" : ObjectId("54b883fb7bd891605d9c300f"), "name" : "zhangsan", "age" : 20 }
    { "_id" : ObjectId("54b8876aad5e04c1fe460154"), "name" : "lisi", "age" : 20 }
    rs1:PRIMARY> 
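The election that promoted 28011 can be sketched as follows. This toy version captures two real requirements, a majority of healthy members and preference by priority then optime freshness, while glossing over votes and heartbeat timing; the member data mirrors the rs.status() output above:

```python
def elect_primary(members):
    """Toy election: require a healthy majority, then pick the member
    with the highest priority, ties broken by the freshest optime."""
    healthy = [m for m in members if m["health"] == 1]
    if len(healthy) * 2 <= len(members):
        return None   # no majority: the set has no primary, reads only
    return max(healthy,
               key=lambda m: (m.get("priority", 1), m["optime"]))["name"]

members = [
    {"name": "localhost:28010", "health": 0, "optime": 1421378555},
    {"name": "localhost:28011", "health": 1, "optime": 1421378555},
    {"name": "localhost:28012", "health": 1, "optime": 1421378555},
]
print(elect_primary(members))
```

The majority rule is why three members is the smallest comfortable deployment: with two members, losing either one leaves no majority and therefore no primary.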
    
    This concludes the chapter on Replica Set deployment and testing, which covered the deployment process plus automatic failover and read-write separation, with the relevant tests and principles. Later chapters cover dynamically adding and removing Replica Set nodes.