MongoDB works as a Basically Available, Strongly Consistent, Partition Tolerant DB!

It's very important to understand that, in order to gracefully manage network partitions and guarantee strong consistency of data, MongoDB will not be 100% available when a DC is down and quorum is not satisfied.

Fault Tolerance Table :

Number of Members | Majority Required to Elect a New Primary | Fault Tolerance
------------------|------------------------------------------|----------------
        3         |                    2                     |        1
        4         |                    3                     |        1
        5         |                    3                     |        2
        6         |                    4                     |        2
        7         |                    4                     |        3
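The numbers in the table follow directly from the quorum rule. A small illustrative sketch (not MongoDB code) that computes both columns for a given replica-set size:

```python
def majority(members: int) -> int:
    """Minimum number of voting members required to elect a primary."""
    return members // 2 + 1

def fault_tolerance(members: int) -> int:
    """How many members can fail while the set can still elect a primary."""
    return members - majority(members)

for n in range(3, 8):
    print(n, majority(n), fault_tolerance(n))
```

Note that going from 3 to 4 members raises the required majority without improving fault tolerance, which is why odd-sized replica sets are preferred.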

StackOverflow Discussion :

Let's assume DC1 has 2 nodes and DC2 has 3 nodes:

if DC1 is down, DC2 will work fine (3 out of 5 members are still pingable),

but if DC2 is down (only 2 out of 5 members are pingable), DC1 will not elect a primary automatically, as the quorum rule breaks; we need to manually force a primary and activate an arbiter.
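The asymmetry between the two outages comes down to whether the surviving DC still holds a strict majority. A tiny illustrative check (hypothetical helper name):

```python
def quorum_survives(total_members: int, surviving_members: int) -> bool:
    """A primary can be elected only if a strict majority of the
    replica set's voting members is still reachable."""
    return surviving_members > total_members // 2

# 5-member set split as DC1 = 2 nodes, DC2 = 3 nodes
print(quorum_survives(5, 3))  # DC1 down: DC2's 3 members hold a majority
print(quorum_survives(5, 2))  # DC2 down: DC1's 2 members do not
```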

Let's dive deeper into the CAP theorem to analyze the above use case!

1. Consistency : each client always has the same view of the data, so you can read from or write to any node and get the same data across the cluster.

How does MongoDB ensure Consistency ?

(i) Mongo will always write to the primary first

(ii) Mongo enables journaling (on by default) so that writes survive a power failure

(iii) we can configure the mongo client so that a write operation only completes successfully after it has been acknowledged by a given number of replica-set members

 (set a write concern, e.g. w: 2 or w: "majority")
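The write-concern rule can be sketched as a simple acknowledgment check. This simulates the semantics only; it is not the actual driver code:

```python
def write_succeeds(acks_received: int, w, total_members: int) -> bool:
    """Return True if a write satisfying the given write concern has
    been acknowledged. `w` is an integer (number of members) or the
    string "majority"."""
    required = total_members // 2 + 1 if w == "majority" else w
    return acks_received >= required

# In a 5-member set, w=2 is satisfied once 2 members acknowledge,
# but w="majority" needs 3 acknowledgments.
print(write_succeeds(2, 2, 5))           # True
print(write_succeeds(2, "majority", 5))  # False
```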


2. Availability : the ability to access the cluster even if a node in the cluster goes down. All mongo clients can always read and write.

How does MongoDB ensure Availability ?

Replica sets provide high availability. If the unavailable mongod is a primary, the replica set will elect a new primary, subject to the fault-tolerance table mentioned earlier.

set read_preference : primaryPreferred (try reading from the primary; if it is not available, read from a secondary)
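The primaryPreferred behaviour amounts to a trivial routing rule, sketched below with hypothetical node names (a simulation of the semantics, not driver code):

```python
def choose_read_target(primary_up: bool, secondaries: list) -> str:
    """Route a read: prefer the primary, fall back to any secondary."""
    if primary_up:
        return "primary"
    if secondaries:
        return secondaries[0]  # any reachable secondary will do
    raise RuntimeError("no members available for reads")

print(choose_read_target(True, ["sec1", "sec2"]))   # primary
print(choose_read_target(False, ["sec1", "sec2"]))  # sec1
```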

3. Partition Tolerance : the cluster continues to function even if there is a “partition” (communications break) between two nodes (both nodes are up, but can’t communicate).

How does MongoDB survive Network Partition ?

(i) automatically elect a new primary when the primary is down and quorum is satisfied

(ii) automatically resync a node when it rejoins the replica set, i.e. becomes pingable again!

(iii) automatic shard sync-up

(iv) if the primary is down, or a majority of members are down so that quorum is not satisfied, then no new primary will be elected and the current primary (if any) will step down to secondary.
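Steps (i) and (iv) amount to the primary continuously re-checking the quorum rule against the members it can reach. An illustrative sketch:

```python
def primary_action(total_members: int, pingable_members: int) -> str:
    """Decide what a primary does based on how many replica-set members
    it can currently reach (pingable_members includes itself)."""
    majority = total_members // 2 + 1
    if pingable_members >= majority:
        return "remain primary"      # quorum satisfied
    return "step down to secondary"  # cannot guarantee strong consistency

print(primary_action(5, 3))  # majority reachable
print(primary_action(3, 1))  # cut off from the majority
```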

Now let's consider this scenario : suddenly network communication breaks between Node X and Node Y, so they can't sync updates.

At this point, a DB can either :

A) allow the nodes to get out of sync (giving up consistency; nodes have no close affinity with their peers), or

B) consider the cluster to be “down” (giving up availability; nodes have very close affinity and refuse to operate when unable to ping their peers)

MongoDB prefers option B) and lets the primary step down to secondary (still allowing clients to query data).

Why ? It's the age-old question : “if something is not pingable, is it down?”

Ref : >> most CP systems implement quorum as the basis of their leader-election algorithm

>> A quorum is the minimum number of votes that a distributed transaction has to obtain to enforce consistent operation in a distributed system.

Out of 3 nodes only 1 is pingable => the primary can't ping a majority (i.e. at least 2 nodes)

> so the PRIMARY doesn’t know :

    — if the other nodes are all still up

    — if it is the only primary or just a cut-off node

    — if the other nodes have elected a new PRIMARY

    — if there is a network partition

    — how to fulfill the promise of strong consistency

> So the PRIMARY eventually steps down, precisely to preserve the integrity of the replica set as a whole.

Deployment Architecture Reference :

What should we do to force a secondary to become PRIMARY again ?
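One answer is a forced reconfiguration: connect to a surviving member and push a new replica-set config containing only the reachable members. A hedged sketch with the pymongo driver, assuming a running member and hypothetical host names prefixed "dc1-" (this requires a live replica set, so it is shown as a config sketch only):

```python
from pymongo import MongoClient

# Connect directly to a surviving member in DC1.
client = MongoClient("mongodb://dc1-node1:27017", directConnection=True)

# Fetch the current replica-set configuration.
cfg = client.admin.command("replSetGetConfig")["config"]

# Keep only the members that are still reachable (hypothetical hosts).
cfg["members"] = [m for m in cfg["members"] if m["host"].startswith("dc1-")]
cfg["version"] += 1

# force=True lets the reconfig proceed without a majority, so the
# surviving members can elect a primary among themselves.
client.admin.command({"replSetReconfig": cfg, "force": True})
```

Once a primary is elected, an arbiter can be added to restore an odd number of voters. Use forced reconfiguration with care: it can roll back writes that only the lost DC had acknowledged.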