In Part 1 of this blog series, "Adventures in NoSQL", I deployed a single instance of MongoDB and used Python's tweetstream module to fill a collection with a data feed from Twitter.

In the real world you wouldn't ever run a single instance of MongoDB (or rely on Twitter data :-) ) because there is no redundancy: if the instance fails, all your data is gone, or at best you need to take some time to restore it from a backup.

However, we can harness the power of a private Eucalyptus IaaS cloud as our infrastructure, which means we can quickly scale out resources using direct EC2 API calls, the euca2ools command line utilities or the Eucalyptus Web interface.

In this post, I'll explore using Replication to spread your data across multiple MongoDB servers for redundancy.

Starting new instances

When you are scaling out a service, you do not want to be concerned with configuring and installing each virtual instance manually.

You could use configuration management tools like Chef, Puppet or Ansible, but to keep things simple in this example I'll use Cloud-init, a tool that has become the de facto initial instance run-time configuration tool in most public clouds.

Cloud-init runs on boot and can interpret and run scripts passed in the EC2 user-data available to an instance; with euca2ools this is passed to the instance with the -f option. It's written by the awesome Scott Moser at Canonical and it can configure a number of services, install packages, run arbitrary commands and even bootstrap your favourite configuration management tool.

I've created a cloud-config file that Cloud-init can use to install and configure MongoDB and set the name of the replica set:

#cloud-config
# Update apt database on first boot
apt_update: true
# Upgrade the instance on first boot
apt_upgrade: true
# Add 10gen MongoDB repo
apt_sources:
 - source: "deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen"
   keyid: 7F0CEB10
   filename: 10gen.list
# Install packages
packages:
 - mongodb-10gen
 - ntp
runcmd:
 - [ sed, -i, "s/# replSet = setname/replSet = twitterdata/g", /etc/mongodb.conf ]
 - [ restart, mongodb ]

This cloud-config script automates the commands we ran in my last blog post and upgrades the system so all pending updates are applied. It leaves us with an instance already running MongoDB, with the replica set name configured, ready to join the first server.

Start two new instances running Ubuntu 12.04 LTS using the cloud-init script and the security group and keypair we created in the first post:

euca-run-instances -n 2 -k mongodb -g mongodb -t c1.xlarge -f ~/path/to/mongo-db.config emi-87F63CE5

After a few moments our instances should show as 'running':

$  euca-describe-instances i-E2DF4157 i-4D433A97
RESERVATION r-8FA84324  985725263417    mongodb
INSTANCE    i-4D433A97  emi-87F63CE5    78.152.43.15    172.30.66.28    running mongodb 1       c1.xlarge   2013-02-05T18:22:10.310Z    emea-testlab-cluster1   eki-222540D6    eri-A5753DBE        monitoring-disabled 78.152.43.15    172.30.66.28            instance-store                                  
INSTANCE    i-E2DF4157  emi-87F63CE5    78.152.43.12    172.30.66.17    running mongodb 0       c1.xlarge   2013-02-05T18:22:10.297Z    emea-testlab-cluster1   eki-222540D6    eri-A5753DBE        monitoring-disabled 78.152.43.12    172.30.66.17            instance-store
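
As an aside, because Eucalyptus speaks the EC2 API you could script the same launch instead of typing it. Here's a rough sketch using boto; the cloud controller address and credentials below are placeholders for your own cloud, and the user-data file is the cloud-config we just wrote:

import boto

# Sketch only: swap in your own cloud controller, credentials, EMI,
# keypair and security group.
conn = boto.connect_euca(
    host="your-cloud-controller",
    aws_access_key_id="YOUR-ACCESS-KEY",
    aws_secret_access_key="YOUR-SECRET-KEY")

# Pass the cloud-config file as EC2 user-data (the -f option in euca2ools)
user_data = open("mongo-db.config").read()

reservation = conn.run_instances(
    "emi-87F63CE5",
    min_count=2,
    max_count=2,
    key_name="mongodb",
    security_groups=["mongodb"],
    instance_type="c1.xlarge",
    user_data=user_data)

for instance in reservation.instances:
    print("Launched %s" % instance.id)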

Setting up MongoDB replication

Replication in MongoDB allows you to store multiple copies of your data across multiple mongod instances.

The MongoDB documentation clearly outlines how to convert a standalone system to a replica set; the commands I use below are adapted from the documentation for an Ubuntu package install.

On the standalone instance (Instance 1) where we already configured MongoDB and imported our Twitter data, set up replication, install NTP and restart MongoDB:

# Stop MongoDB service
sudo stop mongodb
# Install NTP
sudo apt-get install ntp -y
# Setup a replica set with the name 'twitterdata'
sudo sed -i 's/# replSet = setname/replSet = twitterdata/g' /etc/mongodb.conf
# Start MongoDB service
sudo start mongodb

Next, connect to the mongo shell and initiate the replica set (make sure you can ping your instance hostnames; if not, set up entries in /etc/hosts):

$ mongo
MongoDB shell version: 2.2.3
connecting to: test
> rs.initiate()
{
    "info2" : "no configuration explicitly specified -- making one",
    "me" : "instance1:27017",
    "info" : "Config now saved locally.  Should come online in about a minute.",
    "ok" : 1
}

Again, in the mongo shell we can now tell our first instance which members should join the replica set:

twitterdata:PRIMARY> rs.add("instance2")
{ "ok" : 1 }
twitterdata:PRIMARY> rs.add("instance3")
{ "ok" : 1 }
twitterdata:PRIMARY> 

That's it! Replication is configured and our data is making its way across to our other systems.

Replication Status

On Instance 2, let's check the status. Below you can see that the prompt in the 'mongo' shell shows 'SECONDARY'; back on Instance 1 it will show 'PRIMARY'. If the member hasn't fully synced yet, the prompt may show 'STARTUP2'.

twitterdata:SECONDARY> show dbs
local   1.203125GB
twitterstream   3.9521484375GB

On any of the three replicas we can use the command 'rs.conf()' to return the replica set configuration:

twitterdata:PRIMARY> rs.conf()
{
        "_id" : "twitterdata",
        "version" : 3,
        "members" : [
                {
                        "_id" : 0,
                        "host" : "instance1:27017"
                },
                {
                        "_id" : 1,
                        "host" : "instance2:27017"
                },
                {
                        "_id" : 2,
                        "host" : "instance3:27017"
                }
        ]
}

Or we can see the status of the replica set with 'rs.status()':

twitterdata:PRIMARY> rs.status()
{
        "set" : "twitterdata",
        "date" : ISODate("2013-02-06T10:52:08Z"),
        "myState" : 1,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "instance1:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 58958,
                        "optime" : Timestamp(1360147928000, 4),
                        "optimeDate" : ISODate("2013-02-06T10:52:08Z"),
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "instance2:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 58667,
                        "optime" : Timestamp(1360147926000, 3),
                        "optimeDate" : ISODate("2013-02-06T10:52:06Z"),
                        "lastHeartbeat" : ISODate("2013-02-06T10:52:06Z"),
                        "pingMs" : 0
                },
                {
                        "_id" : 2,
                        "name" : "instance3:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 58667,
                        "optime" : Timestamp(1360147926000, 3),
                        "optimeDate" : ISODate("2013-02-06T10:52:06Z"),
                        "lastHeartbeat" : ISODate("2013-02-06T10:52:06Z"),
                        "pingMs" : 0
                }
        ],
        "ok" : 1
}
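
If you'd rather check this from Python (handy for a monitoring script), rs.status() in the shell is just a wrapper around the replSetGetStatus admin command, which pymongo can run directly. A quick sketch, assuming the instance hostnames used above resolve from wherever you run it:

from pymongo import MongoClient

# Connect by replica set name so the driver discovers every member
client = MongoClient(
    "mongodb://instance1,instance2,instance3/?replicaSet=twitterdata")

# rs.status() in the mongo shell wraps this admin command
status = client.admin.command("replSetGetStatus")
for member in status["members"]:
    print("%-20s %s" % (member["name"], member["stateStr"]))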

OK, so we are replicating our data. Let's start our Twitter streaming script again to keep feeding data into MongoDB:

python tweet2mongo.py
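
The script connects with pymongo, and it's worth making that connection replica-set aware so the driver always talks to whichever member is currently primary. The write path then boils down to something like this (a sketch, not the actual tweet2mongo.py; the 'tweets' collection name is illustrative):

from pymongo import MongoClient

# Point the driver at the set rather than a single host; inserts are
# routed to whichever member is primary at the time
client = MongoClient(
    "mongodb://instance1,instance2,instance3/?replicaSet=twitterdata")
tweets = client.twitterstream.tweets   # illustrative collection name

def save_tweet(tweet):
    # tweet is the decoded JSON dict from the Twitter stream
    tweets.insert(tweet)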

Failure of Instances in the replica set

If we now stop MongoDB on the primary replica (Instance 1), we notice that our script stops printing out tweets because it has lost its connection:

sudo stop mongodb

On the secondary MongoDB replicas, we can check the status of replication:

twitterdata:SECONDARY> rs.status()
{
    "set" : "twitterdata",
    "date" : ISODate("2013-02-06T10:53:13Z"),
    "myState" : 2,
    "syncingTo" : "instance3:27017",
    "members" : [
        {
            "_id" : 0,
            "name" : "instance1:27017",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(1360147972000, 19),
            "optimeDate" : ISODate("2013-02-06T10:52:52Z"),
            "lastHeartbeat" : ISODate("2013-02-06T10:52:52Z"),
            "pingMs" : 0,
            "errmsg" : "socket exception [CONNECT_ERROR] for instance1:27017"
        },
...

We can see from the 'errmsg' field above that instance1 is down (no surprise, we stopped it!), but fortunately our data is safe: it lives on the other servers in the replica set!

To bring it back up we can start the instance1 MongoDB service again:

sudo start mongodb

Now that we've started it, our script is again adding data into MongoDB. However, on closer inspection the writes are landing on Instance 3, and Instance 1 is busy catching up from it:

    "errmsg" : "syncing to: instance3:27017"

That's because Instance 3 has been elected as the new primary and Instance 1 needs to sync back any data it missed while it was down; once it has caught up it will become healthy again. In the meantime, our writes are being accepted by the current primary, Instance 3.

We can force Instance 1 to be primary again using a documented procedure.

In the mongo shells on Instance 2 and Instance 3, run one of the following commands each:

rs.freeze(120)
rs.stepDown(120)

Run the rs.freeze command on the server currently marked as secondary (Instance 2 in our case) and the rs.stepDown command on the server currently marked as primary (Instance 3).

This will stop Instance 2 from being elected as primary for 120 seconds and cause Instance 3 to step down from being primary and not be electable for 120 seconds, allowing Instance 1 to be elected as the primary of the replica set. Both Instance 2 and Instance 3 will show as syncing from Instance 1 in 'rs.status()' again and then return to a healthy state as they catch up.
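
If you want to watch the election settle without sitting in the mongo shell, a small pymongo loop polling the isMaster command does the trick (again, just a sketch):

import time
from pymongo import MongoClient

client = MongoClient(
    "mongodb://instance1,instance2,instance3/?replicaSet=twitterdata")

# isMaster reports which member the set currently considers primary
while True:
    primary = client.admin.command("ismaster").get("primary")
    print("Current primary: %s" % primary)
    if primary and primary.startswith("instance1"):
        break
    time.sleep(5)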

Conclusion

In the real world, instances die. Sometimes for an obvious reason, perhaps they run out of resources (CPU, memory, disk), and sometimes because the hardware or the availability zone fails. Sh*t happens; we should be prepared for it.

In part 3, I'll look at using the mongos service so that when a replica dies our application reads and writes are sent to the correct server, and I'll have a look at using sharding to split our data into ranges and spread it across systems for increased horizontal write capacity.

Any thoughts so far? Drop me a comment in the box below!

