Version 2.0 has arrived

As some of you may have noticed, I haven't made as many posts lately as I normally would. It's probably due to this little guy you see on the right.

Evan Michael was born earlier this month and although he's had a bit of a rough start, he's proving that he's a strong fellow. I'll try to pick up where I left off with my posts when we have him settled.


A modern implementation and explanation of Linux Virtual Server (LVS)

Load balancing via proxy

Typical configuration for a proxy-type load balancer

A typical load balancing configuration, whether it uses hardware devices or software implementations, is organized like the diagram at the right. I usually call this a proxy-type load balancing solution since the load balancer proxies your request to some other nodes. The standard order of operations looks like this:

  • client makes a request
  • load balancer receives the request
  • load balancer sends request to a web node
  • the web server sends content back to the load balancer
  • the load balancer responds to the client

If you're not familiar with load balancing, here's an analogy. Consider a fast food restaurant. When you walk up to the counter and place an order, you're asking the person at the counter (the load balancer) for a hamburger. The person at the counter is going to submit your order, and then a group of people (web nodes) are going to work on it. Once your hamburger (web request) is ready, your order will be given to the person at the counter and then back to you.

This style of organization can become a problem as your web nodes begin to scale. It requires you to ensure that your load balancers can keep up with the requests and sustain higher transfer rates that come from having more web nodes serving a greater number of requests. Imagine the fast food restaurant where you have one person taking the orders but you have 30 people working on the food. The person at the counter may be able to take orders very quickly, but they may not be able to keep up with the orders coming out of the kitchen.

Load balancing via Linux Virtual Server

LVS allows for application servers to respond to clients directly


This is where Linux Virtual Server (LVS) really shines. LVS operates a bit differently:

  • client makes a request
  • load balancer receives the request
  • load balancer sends request to a web node
  • the web server sends the response directly to the client

The key difference is that the load balancer sends the unaltered request to the web server and the web server responds directly to the client. Here's the fast food analogy again. If you ask the person at the counter (the load balancer) for a hamburger, that person is going to take your order and give it to the kitchen staff (the web nodes) to work on it. This time around, the person at the counter is going to advise the kitchen staff that the order needs to go directly to you once it's complete. When your hamburger is ready, a member of the kitchen staff will walk to the counter and give it directly to you.

In the fast food analogy, what are the benefits? As the number of orders and kitchen staff increases, the job of the person at the counter doesn't drastically increase in difficulty. While that person will have to handle more orders and keep tabs on which of the kitchen staff is working on the fewest orders, they don't have to worry about returning food to customers. Also, the kitchen staff doesn't need to waste time handing orders to the person at the counter. Instead, they can pass these orders directly to the customers who ordered them.

In the world of servers, this is a large benefit. Since the web servers' responses no longer pass through the load balancers, the load balancers can spend more time on what they do best -- balancing traffic. This allows for smaller, lower-powered load balancing servers from the beginning. It also allows you to add web nodes without big changes to the load balancers.
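Keeping tabs on which node has the fewest orders in flight is roughly what LVS calls least-connection scheduling. Here's a minimal sketch of the idea in shell. The node names and connection counts are made up for illustration, and a real LVS load balancer tracks this inside the kernel rather than in a text stream:

```shell
# Pick the node with the fewest active connections.
# Input: lines of "<node> <active_connections>" on stdin (a made-up format).
pick_least_loaded() {
  sort -k2,2n | head -n 1 | cut -d' ' -f1
}

# Example with hypothetical nodes and counts:
printf '%s\n' 'web1 12' 'web2 4' 'web3 9' | pick_least_loaded
```

The next request would go to web2 here, since it has the fewest connections open.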

There are three main implementations of LVS to consider:

LVS-DR: Direct Routing
The load balancer receives the request and sends the packet directly to a waiting real server to process. LVS-DR has the best performance, but all of your servers must be on the same network segment and share the same router, with no other routing devices in between them.

LVS-TUN: Tunneling
This is very similar to the direct routing approach, but the packets are encapsulated and sent directly to the real servers once the load balancer receives them. This removes the restriction that all of the devices must be on the same network. Thanks to encapsulation, you can use this method to load balance between multiple datacenters.

LVS-NAT: Network Address Translation
Using NAT for LVS yields the lowest performance and scalability of all of the implementation options. In this configuration, the load balancer rewrites the incoming requests so that they will be transported correctly in a NAT environment, and the responses have to travel back through the load balancer as well. This puts a bigger burden on the load balancer as it must rewrite packets quickly while still keeping up with how much work is being done by each web server.
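To give a rough idea of how the three modes differ in practice, here's a minimal sketch built around ipvsadm, the usual command line tool for configuring LVS. The function name is made up, the addresses are whatever the caller passes in, and the weighted least-connection scheduler (wlc) is just one reasonable choice. It's a sketch under those assumptions, not a full HOWTO:

```shell
# Command used to configure LVS; set IPVSADM=echo to preview the commands instead.
IPVSADM=${IPVSADM:-ipvsadm}

# Usage: lvs_add_service <mode> <vip:port> <realserver:port>...
# mode is one of: dr (direct routing), tun (tunneling), nat (masquerading)
lvs_add_service() (
  mode=$1; vip=$2; shift 2
  case $mode in
    dr)  flag=-g ;;   # gatewaying, which is ipvsadm's name for direct routing
    tun) flag=-i ;;   # IP-in-IP encapsulation
    nat) flag=-m ;;   # masquerading (NAT)
    *)   echo "unknown mode: $mode" >&2; exit 1 ;;
  esac
  # Create the virtual service with the weighted least-connection scheduler.
  $IPVSADM -A -t "$vip" -s wlc || exit 1
  # Attach each real server using the chosen forwarding method.
  for rs in "$@"; do
    $IPVSADM -a -t "$vip" -r "$rs" "$flag" || exit 1
  done
)
```

Calling it as lvs_add_service dr followed by the virtual IP and the real servers sets up direct routing. Swapping dr for tun or nat is the only change needed on the load balancer side, although each mode still needs its own setup on the real servers.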


Looking for a Linux Virtual Server HOWTO? Stay tuned. I'm preparing one for my next post.


Reincarnation of Twitter's realtime XMPP search term tracking with ruby

When Twitter was still in its early stages, you could track certain search terms in near-realtime via Jabber. It was quite popular, but its performance degraded over time as more users signed up and began posting updates. Eventually, Twitter killed the jabber bot altogether. Many users have asked when it will return.

Well, it hasn't returned, but you can build your own replacement with Ruby, a jabber account, and a few gems. While it won't do everything that the original jabber bot did, you can still track tweets mentioning certain terms very quickly.

Here's how to get started:

First, install the tweetstream and xmpp4r-simple gems:

gem install tweetstream xmpp4r-simple

Next, you'll need a jabber account. You'll probably want to make one for the exclusive use of your jabber bot. I chose to make up a quick account at ChatMask for mine.

The last step is to drop a copy of this script on your server:

#!/usr/bin/ruby
require 'rubygems'
require 'tweetstream/client'
require 'tweetstream/hash'
require 'tweetstream/status'
require 'tweetstream/user'
require 'tweetstream/daemon'
require 'xmpp4r-simple'

# Log in as the jabber bot that will send the messages
jabber = Jabber::Simple.new('jabberbot@yourjabberserver.com','jabberpassword')

# Twitter account that monitors the stream (the credentials must be quoted strings)
tweets = TweetStream::Client.new('twitterusername','twitterpassword')

# Send an IM for every status that matches one of the tracked terms
tweets.track('celtics','lakers','finals','nba') do |status, client|
  imtext = "#{status.user.screen_name}: #{status.text} \r\n"
  imtext += "[http://twitter.com/#{status.user.screen_name}/status/#{status.id}]"
  jabber.deliver("yourjabberusername@yourjabberserver.com",imtext)
end

# Only reached once the stream ends
jabber.disconnect

You'll want to be sure to fill in the following:

  • your jabber bot's username and password
  • the username and password for the twitter account that will monitor the stream
  • the search terms you want to track
  • the destination jabber account where the messages should be sent

Ensure that your jabber account has authorized the jabber bot's account so that you'll actually receive the messages. Also, Twitter is very strict with their streaming API tracking terms. It's a good idea to review their entire Streaming API documentation to ensure that you're not going to end up having your account temporarily or permanently blacklisted.

Once everything is ready to go, you can just run the script within GNU screen or via nohup. There's still a bit more error checking to do around jabber reconnections, but the script has run non-stop for well over two weeks at a time without a failure.
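Until that reconnection handling exists, one stopgap is to wrap the script in a small restart loop. This is only a minimal sketch: the function name and the RESTART_DELAY variable are made up for illustration, and you'd substitute whatever filename you saved the script under:

```shell
# Re-run a command whenever it exits with an error, up to a maximum number of restarts.
# Usage: run_with_restart <max_restarts> <command> [args...]
run_with_restart() (
  max_restarts=$1; shift
  restarts=0
  until "$@"; do
    if [ "$restarts" -ge "$max_restarts" ]; then
      echo "giving up after $restarts restarts" >&2
      exit 1
    fi
    restarts=$((restarts + 1))
    # Pause before retrying so a persistent failure doesn't hammer the remote servers.
    sleep "${RESTART_DELAY:-10}"
  done
)
```

Put this in a wrapper script that ends by calling run_with_restart with a restart limit, then ruby and the name of your script. Start that wrapper under nohup or screen instead of the Ruby script itself.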


Parsing mdadm output with paste

My curiosity is always piqued when I find new ways to manipulate command line output in simple ways. While working on a solution to parse /proc/mdstat output, I stumbled upon the paste utility.

The man page offers a very simple description of its features:

Write lines consisting of the sequentially corresponding lines from each FILE, separated by TABs, to standard output.

Here's an example of how it works. Let's say you want to parse some software RAID output that looks like this:

# mdadm --brief --verbose --detail /dev/md0
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=00.90 UUID=7bea4601:d5a02f5c:2da69848:3184a367
   devices=/dev/sda1,/dev/sdb1

It would be handy if we had both lines joined into one, as that would make the output easier to parse with a script. Of course, you can do this with utilities like awk and tr, but paste makes it so much easier:

# mdadm --brief --verbose --detail /dev/md0 | paste - -
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=00.90 UUID=7bea4601:d5a02f5c:2da69848:3184a367	   devices=/dev/sda1,/dev/sdb1

By default, paste uses tabs to separate the lines, but you can use the -d argument to specify any delimiter you like:

# mdadm --brief --verbose --detail /dev/md0 | paste -d"*" - -
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=00.90 UUID=7bea4601:d5a02f5c:2da69848:3184a367*   devices=/dev/sda1,/dev/sdb1
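
Once the two lines are joined, pulling out individual fields is straightforward. Here's a minimal sketch that combines paste with awk. The sample_output function simply replays the mdadm output shown above so the example runs without a real array, and summarize_arrays is a name I made up for illustration:

```shell
# Replay the mdadm output from above so this runs without a real array.
sample_output() {
  printf '%s\n' \
    'ARRAY /dev/md0 level=raid1 num-devices=2 metadata=00.90 UUID=7bea4601:d5a02f5c:2da69848:3184a367' \
    '   devices=/dev/sda1,/dev/sdb1'
}

# Join each pair of lines, then print "<array> <level> <devices>".
summarize_arrays() {
  paste - - | awk '{
    level = ""; devices = ""
    for (i = 3; i <= NF; i++) {
      if ($i ~ /^level=/)   level = substr($i, 7)     # strip "level="
      if ($i ~ /^devices=/) devices = substr($i, 9)   # strip "devices="
    }
    print $2, level, devices
  }'
}

sample_output | summarize_arrays
```

For the sample above, this prints /dev/md0 raid1 /dev/sda1,/dev/sdb1. On a real system you would pipe the mdadm command into summarize_arrays instead of sample_output.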

GlusterFS on the cheap with Rackspace's Cloud Servers or Slicehost

High availability is certainly not a new concept, but if there's one thing that frustrates me with high availability VM setups, it's storage. If you don't mind going active-passive, you can set up DRBD, toss your favorite filesystem on it, and you're all set.

If you want to go active-active, with multiple nodes active at the same time, you need to use a clustered filesystem like GFS2, OCFS2 or Lustre. These are certainly good options to consider but they're not trivial to implement. They usually rely on additional systems and scripts to provide reliable fencing and STONITH capabilities.

What about the rest of us who want multiple active VMs with simple replicated storage that doesn't require any additional elaborate systems? This is where GlusterFS really shines. GlusterFS can ride on top of whichever filesystem you prefer, and that's a huge win for those who want a simple solution. However, that means that it has to use FUSE, and that will limit your performance.

Let's get this thing started!

Consider a situation where you want to run a WordPress blog on two VMs with load balancers out front. You'll probably want to use GlusterFS's replicated volume mode (RAID 1-ish) so that the same files are on both nodes all of the time. To get started, build two small Slicehost slices or Rackspace Cloud Servers. I'll be using Fedora 13 in this example, but the instructions for other distributions should be very similar.

First things first -- be sure to set a new root password and update all of the packages on the system. This should go without saying, but it's important to remember. We can clear out the default iptables ruleset since we will make a customized set later:

# iptables -F
# /etc/init.d/iptables save
iptables: Saving firewall rules to /etc/sysconfig/iptables:        [  OK  ]

GlusterFS communicates over the network, so we will want to ensure that traffic only moves over the private network between the instances. We will need to add the private IPs and a special hostname for each instance to /etc/hosts on both instances. I'll call mine gluster1 and gluster2:

10.xx.xx.xx gluster1
10.xx.xx.xx gluster2
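
If you'd rather script this step, here's a minimal sketch of a helper that appends an entry only when the hostname isn't already present. The function name is hypothetical, and the optional third argument exists only so you can try it against a scratch file before touching /etc/hosts:

```shell
# Append "<ip> <name>" to a hosts file unless the name is already listed.
# Usage: add_host_entry <ip> <name> [hosts_file]
add_host_entry() (
  ip=$1; name=$2; hosts_file=${3:-/etc/hosts}
  # Look for the name as a whole word preceded by whitespace.
  if grep -Eq "[[:space:]]$name([[:space:]]|\$)" "$hosts_file" 2>/dev/null; then
    exit 0
  fi
  printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
)
```

Run it once per instance name on both instances, passing each instance's private IP followed by gluster1 or gluster2.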

You're now ready to install the required packages on both instances:

yum install glusterfs-client glusterfs-server glusterfs-common glusterfs-devel

Make the directories for the GlusterFS volumes on each instance:

mkdir -p /export/store1

We're ready to make the configuration files for our storage volumes. Since we want the same files on each instance, we will use the --raid 1 option. The command only needs to be run on the first node:

# glusterfs-volgen --name store1 --raid 1 gluster1:/export/store1 gluster2:/export/store1
Generating server volfiles.. for server 'gluster2'
Generating server volfiles.. for server 'gluster1'
Generating client volfiles.. for transport 'tcp'

Once that's done, you'll have four new files:

  • booster.fstab - you won't need this file
  • gluster1-store1-export.vol - server-side configuration file for the first instance
  • gluster2-store1-export.vol - server-side configuration file for the second instance
  • store1-tcp.vol - client-side configuration file for GlusterFS clients

Copy the gluster1-store1-export.vol file to /etc/glusterfs/glusterfsd.vol on your first instance. Then, copy gluster2-store1-export.vol to /etc/glusterfs/glusterfsd.vol on your second instance. The store1-tcp.vol should be copied to /etc/glusterfs/glusterfs.vol on both instances.

At this point, you're ready to start the GlusterFS servers on each instance:

/etc/init.d/glusterfsd start

You can now mount the GlusterFS volume on both instances:

mkdir -p /mnt/glusterfs
glusterfs /mnt/glusterfs/

You should now be able to see the new GlusterFS volume in both instances:

# df -h /mnt/glusterfs
Filesystem            Size  Used Avail Use% Mounted on
/etc/glusterfs/glusterfs.vol
                      9.4G  831M  8.1G  10% /mnt/glusterfs

As a test, you can create a file on your first instance and verify that your second instance can read the data:

[root@gluster1 ~]# echo "We're testing GlusterFS" > /mnt/glusterfs/test.txt
.....
[root@gluster2 ~]# cat /mnt/glusterfs/test.txt
We're testing GlusterFS

If you remove that file on your second instance, it should disappear from your first instance as well.

Obviously, this is a very simple and basic implementation of GlusterFS. You can increase performance by making dedicated VMs just for serving data and you can adjust the default performance options when you mount a GlusterFS volume. Limiting access to the GlusterFS servers is also a good idea.
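
As a starting point for limiting access, here's a minimal sketch that accepts traffic on the private interface only from the other GlusterFS instance and drops everything else arriving there. The function name is made up, the interface name is an assumption you should check on your own instances, and the peer address is whatever you pass in:

```shell
# Command used to change the firewall; set IPTABLES=echo to preview the rules instead.
IPTABLES=${IPTABLES:-iptables}

# Usage: restrict_private_iface <interface> <peer_ip>...
restrict_private_iface() (
  iface=$1; shift
  # Accept traffic from each trusted peer on the private interface.
  for peer in "$@"; do
    $IPTABLES -A INPUT -i "$iface" -s "$peer" -j ACCEPT || exit 1
  done
  # Anything else arriving on the private interface is dropped.
  $IPTABLES -A INPUT -i "$iface" -j DROP
)
```

On gluster1 you would pass the private interface and gluster2's private IP, and the reverse on gluster2. If anything else on the private network needs to reach the instance, add it as another peer. Save the result with /etc/init.d/iptables save once you're happy with it.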

If you want to read more, I'd recommend reading the GlusterFS Technical FAQ and the GlusterFS User Guide.


Thank you for your e-mails! I'll be expanding on this post later with some sample benchmarks and additional tips/tricks, so please stay tuned.