NOTE: This is a sub-page off the main redundant cloud hosting configuration guide. If you've arrived at this page first, I recommend reviewing the parent page before continuing.
End goal of this step
When you've completed this portion of the guide, you should have the following:
- a drbd volume in sync between both database servers
- mysql should be serving requests with its data stored on the drbd volume
- heartbeat will be managing the failover of mysql, drbd, and memcached
Getting DRBD off the ground
All of these steps will need to be done on both nodes unless otherwise specified.
You'll first need a drbd kernel module. If you're on Rackspace's Cloud Servers or Slicehost and you're using kernel 2.6.32.12-rscloud or later, then this has already been done for you. Otherwise, you'll need to install or build a DRBD kernel module for your distribution (Fedora users can run yum install drbd).
Install the relevant drbd and heartbeat packages on each node:
yum -y install vim heartbeat drbd-heartbeat drbd-utils
If you don't already have a block device ready for DRBD to use, create a loop file on each node. I'm using a 5GB DRBD volume on each node:
dd if=/dev/zero of=/drbd-loop.img bs=1024 count=5242880
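The count value is just the volume size expressed in 1 KiB blocks. Here's a minimal sketch of that arithmetic in case you want a different size. The helper name is made up for illustration, and the dd command is only printed, not run:

```shell
# Hypothetical helper: how many 1024-byte blocks dd needs for a loop file
# of the given size in GiB (1 GiB = 1024 * 1024 blocks of 1 KiB).
dd_count_for_gib() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print the dd command for a 5 GiB image instead of running it.
echo "dd if=/dev/zero of=/drbd-loop.img bs=1024 count=$(dd_count_for_gib 5)"
```

Whatever size you pick, use the same one on both nodes.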
You'll need an init script to set up the loopback devices at boot time for drbd.
$ cat /etc/init.d/loop-for-drbd
#!/bin/sh
#
# Startup script for drbd loop device setup
#
# chkconfig: 2345 50 50
# description: Startup script for drbd loop device setup
#
### BEGIN INIT INFO
# Provides: drbdloop
# Required-Start:
# Required-Stop:
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: set up drbd loop devices
# Description: Startup script for drbd loop device setup
### END INIT INFO
DRBD_FILEDATA_SRC="/drbd-loop.img"
DRBD_FILEDATA_DEVICE="/dev/loop7"
LOSETUP_CMD=/sbin/losetup
# Source function library
. /etc/rc.d/init.d/functions
start () {
echo -n $"Setting up DRBD loop devices..."
$LOSETUP_CMD $DRBD_FILEDATA_DEVICE $DRBD_FILEDATA_SRC
echo
}
stop() {
echo -n $"Tearing down DRBD loop devices..."
$LOSETUP_CMD -d $DRBD_FILEDATA_DEVICE
echo
}
restart() {
stop
start
}
case "$1" in
start)
start
RETVAL=$?
;;
stop)
stop
RETVAL=$?
;;
restart)
restart
RETVAL=$?
;;
*)
echo $"Usage: $0 {start|stop|restart}"
exit 1
esac
exit $RETVAL
Make it executable, ensure it comes up at boot time, and make the loop devices now:
chmod +x /etc/init.d/loop-for-drbd
chkconfig loop-for-drbd on
/etc/init.d/loop-for-drbd start
Create a mountpoint for drbd and add it to your /etc/fstab:
mkdir -p /mnt/drbd
echo "/dev/drbd0 /mnt/drbd ext3 noauto 0 0" >> /etc/fstab
For Fedora, individual drbd resources are defined in separate files within /etc/drbd.d/. We will need one for our DRBD volume, /etc/drbd.d/r0.res:
resource r0 {
on db1.mydomain.com {
device /dev/drbd0;
disk /dev/loop7;
address 10.1.100.10:7789;
meta-disk internal;
}
on db2.mydomain.com {
device /dev/drbd0;
disk /dev/loop7;
address 10.1.100.15:7789;
meta-disk internal;
}
}
Open up /etc/drbd.d/global_common.conf and set the max rate for rebuilds in the syncer section, which sits inside the common block:
syncer {
rate 100M;
}
Create the resource, start DRBD and check that it's running:
drbdadm create-md r0
/etc/init.d/drbd start && chkconfig drbd on
cat /proc/drbd:
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:5242684
On the primary node only, force DRBD to realize that it's the primary node:
drbdsetup /dev/drbd0 primary -o
drbdadm primary all
If you run cat /proc/drbd on both nodes, you should see that they're syncing data across the network. If they're not syncing, then you might want to scroll up to ensure you didn't miss a step. You can make a filesystem on the DRBD volume while it's still syncing:
mke2fs -j /dev/drbd0
mount /dev/drbd0    # should already be in your /etc/fstab
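If you'd rather not eyeball /proc/drbd every time, here's a rough sketch that pulls a single status field out of it. The function name is my own invention, and it assumes the 8.3-style status line shown earlier (cs: for connection state, ro: for roles, ds: for disk states):

```shell
# Hypothetical helper: pull one status field (cs, ro or ds) out of
# /proc/drbd-style text read from stdin.
drbd_field() {
    sed -n "s/.*[[:space:]]$1:\([^[:space:]]*\).*/\1/p" | head -n 1
}

# Usage on a live node (assumes the drbd module is loaded):
#   drbd_field cs < /proc/drbd    # connection state
#   drbd_field ds < /proc/drbd    # disk states, local/peer
```

While the initial sync is running you'd expect the connection state to be something like SyncSource on the primary and SyncTarget on the secondary, rather than plain Connected.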
Install MySQL on both nodes:
yum -y install mysql-server
Set up the MySQL server on the primary node:
mysql_install_db
mv /var/lib/mysql /mnt/drbd
ln -s /mnt/drbd/mysql /var/lib/mysql
/etc/init.d/mysqld start
On the secondary node, make sure MySQL has its data directory on the DRBD mount (even though it isn't mounted right now):
ln -s /mnt/drbd/mysql /var/lib/mysql
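A quick sanity check for the symlink can save some head-scratching during failover. This is only a sketch with a made-up function name. It compares the link target as a string, so it works even while the DRBD volume isn't mounted:

```shell
# Hypothetical check: succeeds only if the given path is a symlink that
# points at the expected target.
is_symlink_to() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
}

# Usage on either node:
#   is_symlink_to /var/lib/mysql /mnt/drbd/mysql && echo "datadir OK"
```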
Installing memcached
Installing memcached is quite straightforward:
yum -y install memcached
I use iptables to limit access to memcached, so this step is complete.
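For reference, here's roughly what such rules could look like. This is a sketch, not my exact ruleset: the helper name and the two web node addresses are invented, it only covers TCP on memcached's default port 11211 (memcached may also listen on UDP), and it prints the rules rather than applying them:

```shell
# Hypothetical helper: print iptables rules that allow memcached's default
# port (11211/tcp) only from the listed source addresses, then drop the rest.
memcached_rules() {
    for src in "$@"; do
        echo "iptables -A INPUT -p tcp -s $src --dport 11211 -j ACCEPT"
    done
    echo "iptables -A INPUT -p tcp --dport 11211 -j DROP"
}

# Example with two made-up web node addresses. Review the output first,
# then run the printed commands as root.
memcached_rules 10.1.100.20 10.1.100.21
```

Since the rules are appended in order, the ACCEPT lines have to come before the final DROP.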
Configuring heartbeat
Heartbeat should have been installed in the first step, so we should be able to begin configuring it on both nodes:
chkconfig heartbeat on
ln -s /etc/ha.d /etc/heartbeat
On the primary node, write out an authkeys file:
echo "auth 1" >> /etc/heartbeat/authkeys
echo "1 sha1 `dd if=/dev/urandom count=1024 | sha1sum | awk '{print $1}'`" >> /etc/heartbeat/authkeys
chmod 0600 /etc/heartbeat/authkeys
Take the authkeys file you made on your primary node and copy it to the same location on the secondary node. Don't forget to run chmod 0600 on the secondary node. Now we will need a heartbeat configuration file on both nodes:
$ cat /etc/heartbeat/ha.cf
autojoin none
logfacility local0
use_logd off
keepalive 1
deadtime 10
warntime 5
initdead 20
ucast eth1 10.1.100.10
ucast eth1 10.1.100.15
auto_failback on
node db1.mydomain.com db2.mydomain.com
debug 1
Add a heartbeat resources file (/etc/heartbeat/haresources) on both nodes as well:
db1.mydomain.com drbddisk::r0 Filesystem::/dev/drbd0::/mnt/drbd::ext3::rw mysqld memcached
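The order on that line matters: the first field is the preferred node, and heartbeat starts the remaining resources from left to right (and stops them in reverse). A small sketch to make the order explicit, using a function name I made up:

```shell
# Hypothetical helper: read one haresources line on stdin and print the
# preferred node followed by the resources in the order heartbeat starts them.
haresources_order() {
    awk '{
        printf "preferred node: %s\n", $1
        for (i = 2; i <= NF; i++) printf "start %d: %s\n", i - 1, $i
    }'
}

echo "db1.mydomain.com drbddisk::r0 Filesystem::/dev/drbd0::/mnt/drbd::ext3::rw mysqld memcached" | haresources_order
```

So DRBD is promoted first, then the filesystem is mounted, and only then do MySQL and memcached come up.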
At this point, you need to test heartbeat, which also means testing DRBD, MySQL, and memcached. Wait until DRBD has completed its sync before proceeding any further.
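If you want to script the wait, something along these lines should do. It's a sketch: the function name is mine, and it assumes /proc/drbd reports ds:UpToDate/UpToDate once both sides are in sync:

```shell
# Hypothetical helper: succeeds when the status text in the given file
# (default /proc/drbd) reports both disks as UpToDate.
drbd_synced() {
    grep -q 'ds:UpToDate/UpToDate' "${1:-/proc/drbd}"
}

# Usage: poll every 10 seconds until the initial sync has finished.
#   until drbd_synced; do sleep 10; done
```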
Start heartbeat on the primary node first and allow it to settle:
/etc/init.d/heartbeat start
Do the same on the secondary node and allow it to settle again.
I'd recommend testing some sample scenarios to make sure your configuration is airtight. Try knocking one of the nodes offline abruptly, block one node from the other via iptables, and ensure that all services start properly after a reboot. It's better to know you have a small configuration error now rather than at 4AM when you can barely see.
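For the iptables test, here's a sketch of the kind of rule I mean. The helper name is made up, and it only prints the command so you can review it before running it as root. Keep in mind that dropping everything from the peer's address also cuts off DRBD replication, so expect to clean up after a split brain when you unblock:

```shell
# Hypothetical helper: print the iptables command that blocks (or unblocks)
# all inbound traffic from the peer node's address.
peer_block_cmd() {  # $1 = block|unblock, $2 = peer address
    case "$1" in
        block)   echo "iptables -I INPUT -s $2 -j DROP" ;;
        unblock) echo "iptables -D INPUT -s $2 -j DROP" ;;
        *)       echo "usage: peer_block_cmd {block|unblock} ADDRESS" >&2
                 return 1 ;;
    esac
}

# Example, run on db1 to cut it off from db2:
peer_block_cmd block 10.1.100.15
peer_block_cmd unblock 10.1.100.15
```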
GlusterFS servers
I usually pick up the latest GlusterFS RPMs from Gluster's FTP server. Once you have them downloaded, just install them via yum or rpm.
You'll need some configuration files to get started and glusterfs-volgen can help with that:
glusterfs-volgen --name storage --raid 1 db1.mydomain.com:/export/glusterfs db2.mydomain.com:/export/glusterfs/
A few files will pop out after the script is finished. Two of them will be named after your server nodes (db1 and db2). Copy each one to the corresponding server in /etc/glusterfs/glusterfsd.vol. There should be another file with "client" in the name -- you'll need that later. You can delete the file that has "booster" in the name.
Start up the GlusterFS servers and ensure they start at boot time:
mkdir -p /export/glusterfs
/etc/init.d/glusterfsd start
chkconfig glusterfsd on
Conclusion
At this point, you should have quite a few things running. MySQL and memcached should be running on the primary node (via heartbeat) and GlusterFS servers should be running on both nodes. Ensure you've tested your failover thoroughly and you should be ready to move on to the next step.

Be careful running memcached on the same node as MySQL. Both are memory hungry - each by itself could fill a small server's memory without trying too hard.
You mention that DRBD kernel modules are already built for you if using the rscloud kernel, where to find them?
To answer my question, it is already done, as in "modprobe drbd"
I have a couple of questions. The first is, you've got two different hosts accessing the same ext3 partition via DRBD (which I hadn't heard about until I read this, so thank you...); wouldn't this cause a conflict from both systems trying to access the partition at the same time? I worry that reads could read outdated data thanks to block caching in the kernel, for instance. I think there are filesystems out there that can handle this case, but I haven't heard of ext3 being in that class...
My second question is more of pointing out a mistake: in the instructions for installing the MySQL server on the primary, you have "mv /var/lib/mysql /mnt/drbd", followed immediately by starting mysql, but for the secondary, you mention the need to do "ln -s /mnt/drbd/mysql /var/lib/mysql"; shouldn't you need to do the same ln command on the primary prior to starting mysql?
Vek -
Only one host has the DRBD filesystem mounted at one time. If the primary server fails, heartbeat takes over the DRBD resource on the secondary and it's then mounted there. You can set up DRBD to have filesystems up on both servers, but you will need a clustered filesystem wrapped around it, like OCFS or GFS2.
As for the MySQL instructions - good catch. I'll get that fixed up soon.
I have followed your guide exactly on CentOS 5.5, but when I reach where I need to edit /etc/drbd.d/global_common.conf I can't find the file, there is no drbd.d in etc
yum search drbd reveals:
drbd.x86_64 : Distributed Redundant Block Device driver for Linux
drbd82.x86_64 : Distributed Redundant Block Device driver for Linux
drbd83.x86_64 : Distributed Redundant Block Device driver for Linux
kmod-drbd.x86_64 : drbd kernel module(s)
kmod-drbd-xen.x86_64 : drbd kernel module(s)
kmod-drbd82.x86_64 : drbd82 kernel module(s)
kmod-drbd82-xen.x86_64 : drbd82 kernel module(s)
kmod-drbd83.x86_64 : drbd83 kernel module(s)
kmod-drbd83-xen.x86_64 : drbd83 kernel module(s)
and:
yum install drbd.x86_64 gives:
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* addons: mirrors.liquidweb.com
* base: mirrors.adams.net
* extras: holmes.umflint.edu
* updates: mirrors.cmich.edu
Setting up Install Process
Package drbd is obsoleted by drbd82, trying to install drbd82-8.2.6-1.el5.centos.x86_64 instead
Package drbd82-8.2.6-1.el5.centos.x86_64 already installed and latest version
Nothing to do
Where can I find the configuration file on CentOS 5.5? What steps do I need to do differently? Thanks!!!
After the failure on CentOS I set up Fedora boxes instead, but now I'm getting down to this step:
"Create the resource, start DRBD and check that it's running"
But I get this error on one box:
drbdadm create-md r0
drbd.d/global_common.conf:36: Parse error: 'an option keyword' expected,
but got '100M'
And this on the other:
drbdadm create-md r0
drbd.d/r0.res:5: Parse error: ':' expected,
but got ';' (TK 59)
I've been mirroring the steps on both brand new servers and have done EXACTLY the same thing on both servers, command for command.
Any ideas?
Hi Major,
I found your blog some months back, and being a RackspaceCloud customer, I find it quite interesting.
I've got a quick question for you about secure connectivity between the web nodes and the database/cache server. Would you send all data in plain text between web nodes and db/cache servers or would you secure it in some way?
Currently we've set up ssh tunnels from web nodes to db/cache servers, but I guess this isn't doing well with failover services.
Have you got any ideas of how to utilize this?
Best regards - and thanks for a lot of great reading!
/Jesper
Jesper -
One option would be to use SSL encryption on the MySQL server. This would allow your web applications to talk to MySQL over SSL and you could always set up MySQL replication over SSL later.
Hi Major, and thanks for your feedback,
MySQL over SSL is absolutely an option, but as far as I've found out, there is no native SSL layer within memcached.
I've been thinking of setting up an OpenVPN network between our servers on the ServiceNet. As far as I can see, it eliminates the need for setting up secure connections for each service between the balancer and web nodes, and between the web nodes and db/cache servers. Can you, with your experience with cloud servers, tell me if that would be a robust solution?
Best regards
/Jesper
Hi Jesper,
I did test this setup, using some different packages because I use CentOS, and it seemed to work fine.
However, when I tested failover, MySQL sometimes crashed on the backup machine after a minute or so. In the logs I see that there was a restore going on which failed.
What am I doing wrong? Wrong paths maybe? Do we need to alter the my.cnf also?
Great post thank for share
Your article is good; however, at some point my database will become big and I will need to expand my device. With a Cloud Server the expansion is easy, but what about the loop image? How do I grow it without losing data and uptime?
You know that you didn't ever correct this issue...