mongodb - Endless recovering state of secondary


I built a replica set with 1 primary, 1 secondary, and 1 arbiter on MongoDB 3.0.2. The primary and the arbiter are on the same host, and the secondary is on another host.

As the write load grew, the secondary could not keep up with the primary and stepped into the RECOVERING state. The primary can still connect to the secondary, and I can log into the secondary's mongo shell from the primary's host.

I stopped the write operations, watched the secondary's state with the rs.status() command, and ran rs.syncFrom("primary's ip:port") on the secondary.

The result of rs.status() then shows that the optimeDate of the secondary is far behind that of the primary, and one message appears intermittently (shown below):

"set" : "shard01", "date" : isodate("2015-05-15t02:10:55.382z"), "mystate" : 3, "members" : [ { "_id" : 0, "name" : "xxx.xxx.xxx.xxx:xxx", "health" : 1, "state" : 1, "statestr" : "primary", "uptime" : 135364, "optime" : timestamp(1431655856, 6), "optimedate" : isodate("2015-05-15t02:10:56z"), "lastheartbeat" : isodate("2015-05-15t02:10:54.306z"), "lastheartbeatrecv" : isodate("2015-05-15t02:10:53.634z"), "pingms" : 0, "electiontime" : timestamp(1431520398, 2), "electiondate" : isodate("2015-05-13t12:33:18z"), "configversion" : 3 }, { "_id" : 1, "name" : "xxx.xxx.xxx.xxx:xxx", "health" : 1, "state" : 7, "statestr" : "arbiter", "uptime" : 135364, "lastheartbeat" : isodate("2015-05-15t02:10:53.919z"), "lastheartbeatrecv" : isodate("2015-05-15t02:10:54.076z"), "pingms" : 0, "configversion" : 3 }, { "_id" : 2, "name" : "xxx.xxx.xxx.xxx:xxx", "health" : 1, "state" : 3, "statestr" : "recovering", "uptime" : 135510, "optime" : timestamp(1431602631, 134), "optimedate" : isodate("2015-05-14t11:23:51z"), "infomessage" : "could not find member sync from", "configversion" : 3, "self" : true } ], "ok" : 1

"infomessage" : "could not find member sync from"

The primary and the arbiter are both OK. I want to know the reason for this message and how to change the secondary's state from RECOVERING back to SECONDARY.

The problem (most likely)

The last operation on the primary is from "2015-05-15T02:10:56Z", whereas the last operation the secondary has applied is from "2015-05-14T11:23:51Z", a difference of roughly 15 hours. That window may well exceed your replication oplog window (the difference between the time of the first and the last operation entry in your oplog). Put simply, there were too many operations on the primary for the secondary to catch up.
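You can check this yourself by comparing the oplog window with the replication lag. A minimal check, assuming a mongo shell connected to the primary (both are standard shell helpers):

    // On the primary: prints the configured oplog size and the
    // "log length start to end", i.e. the oplog window
    rs.printReplicationInfo()

    // Prints how far each secondary is behind the primary
    rs.printSlaveReplicationInfo()

If the reported log length is smaller than the roughly 15 hours your secondary is behind, the secondary cannot catch up any more.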

A bit more elaborated (though simplified): during an initial sync, the data the secondary syncs is the data of a given point in time. When the data of that point in time is synced over, the secondary connects to the oplog and applies the changes that were made between said point in time and now, according to the oplog entries. This works fine as long as the oplog holds all operations since the mentioned point in time. But the oplog has a limited size (it is a so-called capped collection). So if more operations happen on the primary than the oplog can hold during the initial sync, the oldest operations "fade out". The secondary recognises that not all operations needed to "construct" the same data as the primary are available, refuses to complete the sync, and stays in RECOVERING mode.
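To see the "fading out" in action, here is a tiny experiment you can run in any mongo shell against a scratch database (the collection name and the limits are made up for the demo):

    // A capped collection keeps only the newest documents once its limit is reached
    db.createCollection("demoCapped", { capped: true, size: 4096, max: 5 })

    // Insert more documents than the collection may hold
    for (var i = 1; i <= 8; i++) { db.demoCapped.insert({ op: i }) }

    // Only the last 5 inserts survive; the oldest ones have "faded out",
    // exactly like old oplog entries on a busy primary
    db.demoCapped.find()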

The solution(s)

The problem is a known one and not a bug, but a result of the inner workings of MongoDB and of several fail-safe assumptions made by the development team. Hence, there are several ways to deal with the situation. Sadly, since you only have two data-bearing nodes, all of them involve downtime.

Option 1: increase the oplog size

This is the preferred method, since it deals with the problem once and (kind of) for all. It's a bit more complicated than the other solutions, though. From a high-level perspective, these are the steps to take:

  1. Shut down the primary
  2. Create a backup of the oplog using direct access to the data files
  3. Restart the mongod in standalone mode
  4. Copy the current oplog to a temporary collection
  5. Delete the current oplog
  6. Recreate the oplog with the desired size
  7. Copy the oplog entries from the temporary collection into the shiny new oplog
  8. Restart the mongod as part of the replica set

Do not forget to increase the oplog of the secondary before doing the initial sync, since it may become primary at some time in the future!

For the details, please read "Change the Size of the Oplog" in the tutorials regarding replica set maintenance; a rough sketch of the shell part follows below.
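For illustration only, here is a hedged mongo shell sketch of steps 3 to 8, loosely following that tutorial. The 2 GB size is an assumption for the example; pick a size that matches your write load. Note that the official tutorial preserves only the newest oplog entry, which is sufficient for the set to find its sync point:

    // Step 3: restart the mongod without --replSet (standalone),
    //         then connect a mongo shell to it.

    // Steps 4 and 5: preserve the newest oplog entry in a temporary
    //                collection, then drop the old oplog.
    use local
    db.temp.drop()
    db.temp.save(db.oplog.rs.find({}, { ts: 1, h: 1 }).sort({ $natural: -1 }).limit(1).next())
    db.oplog.rs.drop()

    // Step 6: recreate the oplog with the desired size (2 GB here, an assumption)
    db.runCommand({ create: "oplog.rs", capped: true, size: 2 * 1024 * 1024 * 1024 })

    // Step 7: copy the preserved entry into the shiny new oplog
    db.oplog.rs.save(db.temp.findOne())

    // Step 8: shut down this mongod and restart it with its usual --replSet options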

Option 2: shut down the application during the sync

If option 1 is not viable, the only real other solution is to shut down the application causing the load on the replica set, restart the sync, and wait for it to complete. Depending on the amount of data to be transferred, calculate with several hours.
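"Restart the sync" usually means forcing a fresh initial sync of the lagging member. A sketch, assuming the dbpath below; it wipes the secondary's data files, so triple-check you are on the right host:

    // On the SECONDARY only, after shutting down your application:
    // 1. Stop the mongod, e.g. from its mongo shell:
    use admin
    db.shutdownServer()

    // 2. On the OS level, move the old data files out of the way
    //    (the path is an assumption):
    //      mv /var/lib/mongodb /var/lib/mongodb.old && mkdir /var/lib/mongodb
    // 3. Start the mongod again with its usual replica set options;
    //    it will perform a fresh initial sync from the primary.
    // 4. Watch the progress with rs.status() until stateStr becomes "SECONDARY".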

A personal note

The oplog window problem is a well-known one. While replica sets and sharded clusters are easy to set up with MongoDB, quite some knowledge and a bit of experience are needed to maintain them properly. Do not run an important database with a complex setup without knowing the basics - in case something Bad (tm) happens, it might well lead to a situation that is FUBAR.

