<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Racker Hacker &#187; filesystem</title>
	<atom:link href="http://rackerhacker.com/tag/filesystem/feed/" rel="self" type="application/rss+xml" />
	<link>http://rackerhacker.com</link>
	<description>Words of wisdom from a server administrator</description>
	<lastBuildDate>Wed, 16 May 2012 12:55:04 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Dual-primary DRBD with OCFS2</title>
		<link>http://rackerhacker.com/2011/02/13/dual-primary-drbd-with-ocfs2/</link>
		<comments>http://rackerhacker.com/2011/02/13/dual-primary-drbd-with-ocfs2/#comments</comments>
		<pubDate>Mon, 14 Feb 2011 02:12:58 +0000</pubDate>
		<dc:creator>Major Hayden</dc:creator>
				<category><![CDATA[Blog Posts]]></category>
		<category><![CDATA[centos]]></category>
		<category><![CDATA[cluster]]></category>
		<category><![CDATA[command line]]></category>
		<category><![CDATA[fedora]]></category>
		<category><![CDATA[filesystem]]></category>
		<category><![CDATA[high availability]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[ocfs2]]></category>
		<category><![CDATA[red hat]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[sysadmin]]></category>

		<guid isPermaLink="false">http://rackerhacker.com/?p=2197</guid>
		<description><![CDATA[As promised in one of my previous posts about dual-primary DRBD and OCFS2, I've compiled a step-by-step guide for Fedora. These instructions should be somewhat close to what you would use on CentOS or Red Hat Enterprise Linux. However, CentOS and Red Hat don't provide some of the packages needed, so you will need to [...]<p><a href="http://rackerhacker.com/2011/02/13/dual-primary-drbd-with-ocfs2/">Dual-primary DRBD with OCFS2</a> is a post from: Major Hayden's <a href="http://rackerhacker.com">Racker Hacker</a> blog. 
<p>Thanks for following the blog via the RSS feed. Please don't copy my posts or quote portions of them without attribution.</p></p>
]]></description>
			<content:encoded><![CDATA[<p>As promised in one of my <a href="/2010/12/02/keep-web-servers-in-sync-with-drbd-and-ocfs2/">previous posts</a> about dual-primary DRBD and OCFS2, I've compiled a step-by-step guide for Fedora.  These instructions should be somewhat close to what you would use on CentOS or Red Hat Enterprise Linux.  However, CentOS and Red Hat don't provide some of the packages needed, so you will need to use other software repositories like <a href="http://rpmfusion.org/">RPMFusion</a> or <a href="http://fedoraproject.org/wiki/EPEL">EPEL</a>.</p>
<p>In this guide, I'll be using two Fedora 14 instances in the <a href="http://rackspacecloud.com/">Rackspace Cloud</a> with separate public and private networks.  The instances are called server1 and server2 to make things easier to follow.  </p>
<p><strong>NOTE: All of the instructions below should be done on both servers unless otherwise specified.</strong></p>
<hr />
<p>First, we need to set up DRBD with two primary nodes.  I'll be using loop files for this setup since I don't have access to raw partitions.</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">yum -y install drbd-utils
dd if=/dev/zero of=/drbd-loop.img bs=1M count=1000</pre></div></div>

<p>Put this <a href="/wp-content/uploads/2011/02/loop-for-drbd.txt">loop file initialization init script</a> in /etc/init.d/loop-for-drbd and finish setting it up:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">chmod a+x /etc/init.d/loop-for-drbd
chkconfig loop-for-drbd on
/etc/init.d/loop-for-drbd start</pre></div></div>

<p>Place this DRBD resource file in <code>/etc/drbd.d/r0.res</code>.  Be sure to adjust the server names and IP addresses for your servers.</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">resource r0 {
	meta-disk internal;
	device /dev/drbd0;
	disk /dev/loop7;
&nbsp;
	syncer { rate 1000M; }
	net {
		allow-two-primaries;
		after-sb-0pri discard-zero-changes;
		after-sb-1pri discard-secondary;
		after-sb-2pri disconnect;
	}
	startup { become-primary-on both; }
&nbsp;
	on server1 { address 10.181.76.0:7789; }
	on server2 { address 10.181.76.1:7789; }
}</pre></div></div>

<p>The <code>net</code> section is telling DRBD to do the following:</p>
<ul>
<li><em>allow-two-primaries</em> - Generally, DRBD has a primary and a secondary node.  In this case, we will allow both nodes to have the filesystem mounted at the same time.  <strong>Do this only with a clustered filesystem. If you do this with a non-clustered filesystem like ext2/ext3/ext4 or reiserfs, <em>you will have data corruption</em>. Seriously!</strong></li>
<li><em>after-sb-0pri discard-zero-changes</em> - DRBD detected a split-brain scenario, but neither node is currently a primary.  If one node made no changes during the split, DRBD will take the modifications from the other node and apply them to the node that didn't have any changes.</li>
<li><em>after-sb-1pri discard-secondary</em> - DRBD detected a split-brain scenario, but one node is the primary and the other is the secondary.  In this case, DRBD will decide that the secondary node is the victim and it will sync data from the primary to the secondary automatically.</li>
<li><em>after-sb-2pri disconnect</em> - DRBD detected a split-brain scenario while both nodes were primary, so it can't figure out which node has the right data.  It tries to protect the consistency of both nodes by disconnecting the DRBD volume entirely.  You'll have to tell DRBD which node has the valid data in order to reconnect the volume.  <strong>Use extreme caution if you find yourself in this scenario.</strong></li>
</ul>
<p>If you'd like to read about DRBD split-brain behavior in more detail, <a href="http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html">review the documentation</a>.</p>
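<p>For reference, the manual recovery that the <em>after-sb-2pri disconnect</em> policy leaves to you usually looks something like the sketch below.  Treat it as a rough outline based on the DRBD 8.3 documentation linked above, not a tested procedure for your cluster.  It assumes the resource is named <code>r0</code> as in the configuration above and that the filesystem on the node being discarded is already unmounted.  The function names and the <code>DRBDADM</code> dry-run variable are hypothetical and exist only for illustration.</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># DRBDADM is a hypothetical override: set DRBDADM=echo to print the commands instead of running them.
DRBDADM="${DRBDADM:-drbdadm}"

# Run on the node whose changes will be thrown away (the split-brain victim).
discard_local_changes() {
    $DRBDADM secondary "$1"
    $DRBDADM -- --discard-my-data connect "$1"
}

# Run on the node whose data you are keeping, if it has dropped to StandAlone.
reconnect_survivor() {
    $DRBDADM connect "$1"
}</pre></div></div>

<p>On the victim you would call <code>discard_local_changes r0</code>, then <code>reconnect_survivor r0</code> on the other node.  Double-check which node is which before running anything, because the victim's changes are lost.</p>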
<p>I generally turn off the usage reporting functionality in DRBD within <code>/etc/drbd.d/global_common.conf</code>:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">global {
	usage-count no;
}</pre></div></div>

<p>Now we can create the volume and start DRBD:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">drbdadm create-md r0
/etc/init.d/drbd start &amp;&amp; chkconfig drbd on</pre></div></div>

<p>You may see some errors about having two primaries when neither is up to date.  That can be fixed by running the following command on the <strong>primary node only</strong>, meaning the one node whose data should overwrite its peer during the initial sync:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">drbdsetup /dev/drbd0 primary -o</pre></div></div>

<p>If you run <code>cat /proc/drbd</code> on the secondary node, you should see the DRBD sync running:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">version: 8.3.8 (api:88/proto:86-94)
srcversion: 299AFE04D7AFD98B3CA0AF9
 0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r----
    ns:0 nr:210272 dw:210272 dr:0 al:0 bm:12 lo:1 pe:2682 ua:0 ap:0 ep:1 wo:b oos:813660  
        [===&gt;................] sync'ed: 20.8% (813660/1023932)K queue_delay: 0.0 ms
        finish: 0:01:30 speed: 8,976 (6,368) want: 1024,000 K/sec</pre></div></div>

<p>Before you go any further, wait for the DRBD sync to fully finish. When it completes, it should look like this:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">version: 8.3.8 (api:88/proto:86-94)
srcversion: 299AFE04D7AFD98B3CA0AF9
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
    ns:0 nr:1023932 dw:1023932 dr:0 al:0 bm:63 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0</pre></div></div>
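
<p>If you'd rather not keep re-running <code>cat /proc/drbd</code> by hand, a small polling loop can do the waiting for you.  This is only a rough sketch: it assumes the DRBD 8.3 <code>/proc/drbd</code> format shown above and a single resource, and the helper names are made up for this example.</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># Hypothetical helper: succeeds when the status file reports both disks as UpToDate.
# The status file defaults to /proc/drbd; pass another path to check a saved copy.
drbd_in_sync() {
    grep -q 'ds:UpToDate/UpToDate' "${1:-/proc/drbd}"
}

# Poll (every five seconds by default) until the initial sync has finished.
wait_for_drbd_sync() {
    until drbd_in_sync "${1:-/proc/drbd}"; do
        sleep "${2:-5}"
    done
    echo "DRBD sync complete"
}</pre></div></div>

<p>Running <code>wait_for_drbd_sync</code> with no arguments blocks until the <code>ds:</code> field reads <code>UpToDate/UpToDate</code>.</p>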

<p>Now, <strong>on the secondary node only</strong>, promote it to a primary node as well:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">drbdadm primary r0</pre></div></div>

<p>You should see this on the secondary node if you've done everything properly:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">version: 8.3.8 (api:88/proto:86-94)
srcversion: 299AFE04D7AFD98B3CA0AF9 
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r----
    ns:1122 nr:1119 dw:2241 dr:4550 al:2 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0</pre></div></div>

<p>We're now ready to move on to configuring OCFS2.  Only one package is needed:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">yum -y install ocfs2-tools</pre></div></div>

<p>Ensure that you have your servers and their private IP addresses in <code>/etc/hosts</code> before proceeding.  Create the <code>/etc/ocfs2</code> directory and place the following configuration in <code>/etc/ocfs2/cluster.conf</code> (adjust the server names and IP addresses):</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">cluster:
	node_count = 2          
	name = web
&nbsp;
node:
	ip_port = 7777
	ip_address = 10.181.76.0
	number = 1
	name = server1
	cluster = web
&nbsp;
node:
	ip_port = 7777
	ip_address = 10.181.76.1
	number = 2
	name = server2
	cluster = web</pre></div></div>
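
<p>Before going any further, it's worth a quick check that both node names actually made it into <code>/etc/hosts</code>.  Here's a minimal sketch of such a check.  The function name is hypothetical, and it is deliberately naive: it only looks for each name as a whole word, so a commented-out entry would still pass.</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># Hypothetical check: succeed only if every given node name appears
# as a whole word somewhere in the hosts file.
hosts_has_nodes() {
    hosts_file="$1"
    shift
    for node in "$@"; do
        grep -qw -- "$node" "$hosts_file" || return 1
    done
}</pre></div></div>

<p>With the names used in this guide, that would be <code>hosts_has_nodes /etc/hosts server1 server2</code>, which returns a non-zero exit status if either name is missing.</p>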

<p>Now it's time to configure OCFS2.  Run <code>service o2cb configure</code> and follow the prompts.  Use the defaults for all of the responses except for two questions:</p>
<ul>
<li>Answer "y" to "Load O2CB driver on boot"</li>
<li>Answer "web" to "Cluster to start on boot"</li>
</ul>
<p>Start OCFS2 and enable it at boot up:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">chkconfig o2cb on &amp;&amp; chkconfig ocfs2 on
/etc/init.d/o2cb start &amp;&amp; /etc/init.d/ocfs2 start</pre></div></div>

<p>Create an OCFS2 filesystem <strong>on the primary node only</strong>:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">mkfs.ocfs2 -L &quot;web&quot; /dev/drbd0</pre></div></div>

<p>Mount the volumes and configure them to automatically mount at boot time.  You might be wondering why I do the mounting within <code>/etc/rc.local</code>.  I chose to go that route since mounting via fstab was often unreliable for me due to the incorrect ordering of events at boot time.  Using rc.local allows the mounts to work properly upon every reboot.</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">mkdir /mnt/storage
echo &quot;/dev/drbd0  /mnt/storage  ocfs2  noauto,noatime  0 0&quot; &gt;&gt; /etc/fstab
mount /dev/drbd0
echo &quot;mount /dev/drbd0&quot; &gt;&gt; /etc/rc.local</pre></div></div>

<p>At this point, you should be all done.  If you want to test OCFS2, copy a file into your /mnt/storage mount on one node and check that it appears on the other node.  If you remove it, it should be gone instantly on both nodes.  This is a great opportunity to test reboots of both machines to ensure that everything comes up properly at boot time.</p>
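<p>The file test described above can be scripted if you'd like something repeatable.  This is just a sketch: the function names and the <code>replication-test.txt</code> file name are made up, and the mount point defaults to the <code>/mnt/storage</code> path used in this guide.</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># Hypothetical smoke test for the shared mount (defaults to /mnt/storage).
# Run write_marker on one node: it writes a timestamp file and prints its checksum.
write_marker() {
    date | tee "${1:-/mnt/storage}/replication-test.txt" | cksum
}

# Run read_marker on the other node: it should print the same checksum.
read_marker() {
    cat "${1:-/mnt/storage}/replication-test.txt" | cksum
}</pre></div></div>

<p>If the two checksums differ, or <code>read_marker</code> can't find the file, the second node isn't seeing the first node's writes.</p>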
]]></content:encoded>
			<wfw:commentRss>http://rackerhacker.com/2011/02/13/dual-primary-drbd-with-ocfs2/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Keep web servers in sync with DRBD and OCFS2</title>
		<link>http://rackerhacker.com/2010/12/02/keep-web-servers-in-sync-with-drbd-and-ocfs2/</link>
		<comments>http://rackerhacker.com/2010/12/02/keep-web-servers-in-sync-with-drbd-and-ocfs2/#comments</comments>
		<pubDate>Fri, 03 Dec 2010 02:01:12 +0000</pubDate>
		<dc:creator>Major Hayden</dc:creator>
				<category><![CDATA[Blog Posts]]></category>
		<category><![CDATA[command line]]></category>
		<category><![CDATA[filesystem]]></category>
		<category><![CDATA[kernel]]></category>
		<category><![CDATA[network]]></category>
		<category><![CDATA[sysadmin]]></category>

		<guid isPermaLink="false">http://rackerhacker.com/?p=1846</guid>
		<description><![CDATA[The guide to redundant cloud hosting that I wrote recently will need some adjustments as I've fallen hard for the performance and reliability of DRBD and OCFS2. As a few of my sites were gaining in popularity, I noticed that GlusterFS simply couldn't keep up. High I/O latency and broken replication threw a wrench into [...]
]]></description>
			<content:encoded><![CDATA[<p>The <a href="/redundant-cloud-hosting-configuration-guide/">guide to redundant cloud hosting</a> that I wrote recently will need some adjustments as I've fallen hard for the performance and reliability of DRBD and OCFS2.  As a few of my sites were gaining in popularity, I noticed that GlusterFS simply couldn't keep up.  High I/O latency and broken replication threw a wrench into my love affair with GlusterFS and I knew there had to be a better option.<br />
<div id="attachment_1987" class="wp-caption alignright" style="width: 310px"><a href="http://rackerhacker.com/wp-content/uploads/2010/12/drbd-and-ocfs2-e1291337653403.png"><img src="http://rackerhacker.com/wp-content/uploads/2010/12/drbd-and-ocfs2-e1291337653403.png" alt="DRBD, OCFS2, apache, varnish, and LVS" title="DRBD, OCFS2, apache, varnish, and LVS" width="300" height="300" class="size-full wp-image-1987" /></a><p class="wp-caption-text">Diagram of two web nodes with a replicated filesystem using DRBD &#038; OCFS2</p></div>I've shared my configuration with my coworkers and I've received many good questions about it.  Let's get to the Q&#038;A:</p>
<p><b>How does the performance compare to GlusterFS?</b><br />
On Gluster's best days, the data throughput speeds were quite good, but the latency to retrieve the data was often much too high.  Page loads on this site were taking upwards of 3-4 seconds with GlusterFS latency accounting for well over 75% of the delays.  For small files, GlusterFS was about 20-25x slower than accessing the disk natively.  The performance hit for DRBD and OCFS2 is usually between 1.5x and 3x for small files and difficult to notice for large file transfers.</p>
<p><b>Couldn't you keep the data separate and then sync it with rsync?</b><br />
Everyone knows that rsync can be a resource-consuming monster, and it seems wasteful to call rsync via a cron job to keep my data in sync.  There are some periods of the day where the actual data on the web root rarely changes.  There are other times where it changes rapidly and I'd end up with nodes out of sync for a few minutes.</p>
<p>To get the just-in-time synchronization that I want, I'd have to run rsync at least once a minute.  If the data isn't changing over a long period, rsync would end up crushing the disk and consuming CPU for no reason.  DRBD only syncs data when data changes.  Also, all reads with DRBD are done locally.  This makes it a highly efficient and effective choice for instant synchronization.</p>
<p><b>Why OCFS2? Isn't that overkill?</b><br />
When you use DRBD in dual-primary mode, it's functionally equivalent to having a raw storage device (like a SAN) mounted in two places.  If you threw an ext4 filesystem onto a LUN on your SAN and then mounted it on two different servers, you'd be in bad shape very quickly.  Non-clustered filesystems like ext3 or ext4 can't handle being mounted in more than one environment.</p>
<p>OCFS2 is built primarily to be mounted in more than one place and it comes with its own distributed locking manager (DLM).  The configuration files for OCFS2 are extremely simple and you mount it like any other filesystem.  It's been part of the mainline Linux kernel since 2.6.19.</p>
<p><b>What happens when you lose one of the nodes?</b><br />
The configuration shown above can operate with just one node in an emergency.  When the failed node comes back online, DRBD will resync the block device and you can mount the OCFS2 filesystem as you normally would.</p>
<p><b>You're using an Oracle product? Really?</b><br />
You've got me there.  I'm not a fan of how they treat the open source community with regards to some of their projects, but the OCFS2 filesystem is robust, free, and it meets my needs.</p>
<p><b>Where's the how-to?</b><br />
It's coming soon!  Stay tuned.</p>
]]></content:encoded>
			<wfw:commentRss>http://rackerhacker.com/2010/12/02/keep-web-servers-in-sync-with-drbd-and-ocfs2/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Switching from GlusterFS to DRBD and OCFS2</title>
		<link>http://rackerhacker.com/2010/11/10/switching-from-glusterfs-to-drbd-and-ocfs2/</link>
		<comments>http://rackerhacker.com/2010/11/10/switching-from-glusterfs-to-drbd-and-ocfs2/#comments</comments>
		<pubDate>Wed, 10 Nov 2010 13:55:50 +0000</pubDate>
		<dc:creator>Major Hayden</dc:creator>
				<category><![CDATA[Blog Posts]]></category>
		<category><![CDATA[command line]]></category>
		<category><![CDATA[drbd]]></category>
		<category><![CDATA[filesystem]]></category>
		<category><![CDATA[glusterfs]]></category>
		<category><![CDATA[ocfs2]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://rackerhacker.com/?p=1850</guid>
		<description><![CDATA[As my uptime reports have shown, and as some of you have reported, my blog's load time has increased steadily over the past few weeks. It turns out that one of my VM's was on a physical machine that had some trouble and I was reaching a point where GlusterFS's replicate functionality couldn't meet my [...]
]]></description>
			<content:encoded><![CDATA[<p>As my uptime reports have shown, and as some of you have reported, my blog's load time has increased steadily over the past few weeks.  It turns out that one of my VM's was on a physical machine that had some trouble and I was reaching a point where GlusterFS's replicate functionality couldn't meet my performance needs.</p>
<p>Instead of using <a href="http://en.wikipedia.org/wiki/GlusterFS">GlusterFS</a> as I had before in my <a href="/redundant-cloud-hosting-configuration-guide/">redundant cloud hosting guide</a>, I decided to use <a href="http://en.wikipedia.org/wiki/DRBD">DRBD</a> in dual-primary mode with <a href="http://en.wikipedia.org/wiki/OCFS">OCFS2</a> as the clustering filesystem on top of it.  The performance is quite good so far:</p>
<div id="attachment_1851" class="wp-caption aligncenter" style="width: 630px"><a href="http://rackerhacker.com/wp-content/uploads/2010/11/pingdomresponsetime-rackerhacker.com_.png"><img src="http://rackerhacker.com/wp-content/uploads/2010/11/pingdomresponsetime-rackerhacker.com_.png" alt="Pingdom Response Time Graph for rackerhacker.com" title="Pingdom Response Time Graph for rackerhacker.com" width="620" height="339" class="size-full wp-image-1851" /></a><p class="wp-caption-text">Pingdom Response Time Graph for rackerhacker.com</p></div>
<p>I switched over the DNS late last night and the response time has fallen from the two to three second range (during times of low load) to right around one second per request.  In addition to the reduced load times, I can support higher concurrency without significant performance degradation.</p>
<p>Don't worry - I'll make a detailed post on this topic later along with a guide on how to set it up yourself.</p>
]]></content:encoded>
			<wfw:commentRss>http://rackerhacker.com/2010/11/10/switching-from-glusterfs-to-drbd-and-ocfs2/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>A simple guide to redundant cloud hosting</title>
		<link>http://rackerhacker.com/2010/08/17/a-simple-guide-to-redundant-cloud-hosting/</link>
		<comments>http://rackerhacker.com/2010/08/17/a-simple-guide-to-redundant-cloud-hosting/#comments</comments>
		<pubDate>Wed, 18 Aug 2010 00:41:16 +0000</pubDate>
		<dc:creator>Major Hayden</dc:creator>
				<category><![CDATA[Blog Posts]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[cloud servers]]></category>
		<category><![CDATA[command line]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[fedora]]></category>
		<category><![CDATA[filesystem]]></category>
		<category><![CDATA[high availability]]></category>
		<category><![CDATA[iptables]]></category>
		<category><![CDATA[kernel]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[load balancing]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[network]]></category>
		<category><![CDATA[networking]]></category>
		<category><![CDATA[optimization]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[rackspace]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[slicehost]]></category>
		<category><![CDATA[ssl]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[yum]]></category>

		<guid isPermaLink="false">http://rackerhacker.com/?p=1771</guid>
		<description><![CDATA[Today, on my 28th birthday, I'm finally delivering on a promise to my readers which I made about two months ago. I've written a guide on how to host a web application redundantly in a cloud environment. While it's still a bit of a rough draft, it should be a good starting point for those [...]
]]></description>
			<content:encoded><![CDATA[<p>Today, on my 28th birthday, I'm finally delivering on a promise to my readers which I made about two months ago.  I've <a href="/redundant-cloud-hosting-configuration-guide/">written a guide</a> on how to host a web application redundantly in a cloud environment.  While it's still a bit of a rough draft, it should be a good starting point for those who haven't worked in virtualized environments before.  Also, it may show some of the more experienced systems administrators a new way to do things.</p>
<p>The guide: <a href="/redundant-cloud-hosting-configuration-guide/">Redundant Cloud Hosting Guide</a></p>
<p>As always, if you find anything in the guide that needs improvement, I'm all ears. <img src='http://rackerhacker.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://rackerhacker.com/2010/08/17/a-simple-guide-to-redundant-cloud-hosting/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>GlusterFS on the cheap with Rackspace&#039;s Cloud Servers or Slicehost</title>
		<link>http://rackerhacker.com/2010/05/27/glusterfs-on-the-cheap-with-rackspaces-cloud-servers-or-slicehost/</link>
		<comments>http://rackerhacker.com/2010/05/27/glusterfs-on-the-cheap-with-rackspaces-cloud-servers-or-slicehost/#comments</comments>
		<pubDate>Fri, 28 May 2010 00:34:10 +0000</pubDate>
		<dc:creator>Major Hayden</dc:creator>
				<category><![CDATA[Blog Posts]]></category>
		<category><![CDATA[command line]]></category>
		<category><![CDATA[filesystem]]></category>
		<category><![CDATA[glusterfs]]></category>
		<category><![CDATA[high availability]]></category>
		<category><![CDATA[rackspace]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[sysadmin]]></category>

		<guid isPermaLink="false">http://rackerhacker.com/?p=1464</guid>
		<description><![CDATA[NOTE: This post is out of date and is relevant only for GlusterFS 2.x. High availability is certainly not a new concept, but if there's one thing that frustrates me with high availability VM setups, it's storage. If you don't mind going active-passive, you can set up DRBD, toss your favorite filesystem on it, and [...]
]]></description>
			<content:encoded><![CDATA[<p><em><b style="color: red">NOTE:</b> This post is out of date and is relevant only for GlusterFS 2.x.</em></p>
<hr />
<p>High availability is certainly not a new concept, but if there's one thing that frustrates me with high availability VM setups, it's storage.  If you don't mind going active-passive, you can set up <a href="http://en.wikipedia.org/wiki/Drbd">DRBD</a>, toss your favorite filesystem on it, and you're all set.</p>
<p>If you want to go active-active, or if you want multiple nodes active at the same time, you need to use a clustered filesystem like <a href="http://en.wikipedia.org/wiki/Global_File_System">GFS2</a>, <a href="http://en.wikipedia.org/wiki/OCFS">OCFS2</a> or <a href="http://en.wikipedia.org/wiki/Lustre_(file_system)">Lustre</a>.  These are certainly good options to consider but they're not trivial to implement.  They usually rely on additional systems and scripts to provide reliable <a href="http://en.wikipedia.org/wiki/Fencing_(computing)">fencing</a> and <a href="http://en.wikipedia.org/wiki/STONITH">STONITH</a> capabilities.</p>
<p>What about the rest of us who want multiple active VM's with simple replicated storage that doesn't require any additional elaborate systems?  This is where <a href="http://en.wikipedia.org/wiki/GlusterFS">GlusterFS</a> really shines.  GlusterFS can ride on top of whichever filesystem you prefer, and that's a huge win for those who want a simple solution.  However, that means that it has to use <a href="http://en.wikipedia.org/wiki/Filesystem_in_Userspace">FUSE</a>, and that will limit your performance.</p>
<p><strong>Let's get this thing started!</strong></p>
<p>Consider a situation where you want to run a WordPress blog on two VM's with load balancers out front.  You'll probably want to use GlusterFS's replicated volume mode (RAID 1-ish) so that the same files are on both nodes all of the time.  To get started, build two small Slicehost slices or Rackspace Cloud Servers.  I'll be using Fedora 13 in this example, but the instructions for other distributions should be very similar.</p>
<p>First things first -- be sure to set a new root password and update all of the packages on the system.  This should go without saying, but it's important to remember.  We can clear out the default iptables ruleset since we will make a customized set later:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># iptables -F
# /etc/init.d/iptables save
iptables: Saving firewall rules to /etc/sysconfig/iptables:        [  OK  ]</pre></div></div>

<p>GlusterFS communicates over the network, so we will want to ensure that traffic only moves over the private network between the instances.  We will need to add the private IP's and a special hostname for each instance to <code>/etc/hosts</code> on both instances.  I'll call mine <code>gluster1</code> and <code>gluster2</code>:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">10.xx.xx.xx gluster1
10.xx.xx.xx gluster2</pre></div></div>

<p>You're now ready to install the required packages on both instances:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">yum install glusterfs-client glusterfs-server glusterfs-common glusterfs-devel</pre></div></div>

<p>Make the directories for the GlusterFS volumes on each instance:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">mkdir -p /export/store1</pre></div></div>

<p>We're ready to make the configuration files for our storage volumes.  Since we want the same files on each instance, we will use the <code>--raid 1</code> option.  <strong>This only needs to be run on the first node:</strong></p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># glusterfs-volgen --name store1 --raid 1 gluster1:/export/store1 gluster2:/export/store1
Generating server volfiles.. for server 'gluster2'
Generating server volfiles.. for server 'gluster1'
Generating client volfiles.. for transport 'tcp'</pre></div></div>

<p>Once that's done, you'll have four new files:</p>
<ul>
<li><code>booster.fstab</code> - you won't need this file</li>
<li><code>gluster1-store1-export.vol</code> - server-side configuration file for the first instance</li>
<li><code>gluster2-store1-export.vol</code> - server-side configuration file for the second instance</li>
<li><code>store1-tcp.vol</code> - client side configuration file for GlusterFS clients</li>
</ul>
<p>Copy the <code>gluster1-store1-export.vol</code> file to <code>/etc/glusterfs/glusterfsd.vol</code> on your first instance.  Then, copy <code>gluster2-store1-export.vol</code> to <code>/etc/glusterfs/glusterfsd.vol</code> on your second instance.  The <code>store1-tcp.vol</code> should be copied to <code>/etc/glusterfs/glusterfs.vol</code> on both instances.</p>
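<p>If you'd like to script those copies, something like the following would work.  It's a minimal sketch that assumes you run it from the directory where <code>glusterfs-volgen</code> wrote its files, and that you've already transferred the generated files to the second instance.  The function name and the optional destination argument are hypothetical.</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># install_volfiles NODE [DEST]
# Copy the generated server volfile for NODE and the shared client volfile
# into DEST (defaults to /etc/glusterfs).
install_volfiles() {
    node="$1"
    dest="${2:-/etc/glusterfs}"
    cp "${node}-store1-export.vol" "${dest}/glusterfsd.vol" || return 1
    cp store1-tcp.vol "${dest}/glusterfs.vol"
}</pre></div></div>

<p>Run <code>install_volfiles gluster1</code> on the first instance and <code>install_volfiles gluster2</code> on the second.</p>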
<p>At this point, you're ready to start the GlusterFS servers on each instance:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">/etc/init.d/glusterfsd start</pre></div></div>

<p>You can now mount the GlusterFS volume on both instances:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">mkdir -p /mnt/glusterfs
glusterfs /mnt/glusterfs/</pre></div></div>

<p>You should now be able to see the new GlusterFS volume on both instances:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># df -h /mnt/glusterfs
Filesystem            Size  Used Avail Use% Mounted on
/etc/glusterfs/glusterfs.vol
                      9.4G  831M  8.1G  10% /mnt/glusterfs</pre></div></div>

<p>As a test, you can create a file on your first instance and verify that your second instance can read the data:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">[root@gluster1 ~]# echo &quot;We're testing GlusterFS&quot; &gt; /mnt/glusterfs/test.txt
.....
[root@gluster2 ~]# cat /mnt/glusterfs/test.txt
We're testing GlusterFS</pre></div></div>

<p>If you remove that file on your second instance, it should disappear from your first instance as well.</p>
<p>This is a very basic implementation of GlusterFS.  You can increase performance by dedicating VMs to serving data, and you can tune the default performance options when you mount a GlusterFS volume.  Limiting access to the GlusterFS servers is also a good idea.</p>
<p>If you want to read more, I'd recommend reading the <a href="http://www.gluster.com/community/documentation/index.php/GlusterFS_Technical_FAQ">GlusterFS Technical FAQ</a> and the <a href="http://www.gluster.com/community/documentation/index.php/GlusterFS_User_Guide">GlusterFS User Guide</a>.</p>
<hr />
<p><strong>Thank you for your e-mails!</strong> I'll be expanding on this post later with some sample benchmarks and additional tips/tricks, so please stay tuned.</p>
]]></content:encoded>
			<wfw:commentRss>http://rackerhacker.com/2010/05/27/glusterfs-on-the-cheap-with-rackspaces-cloud-servers-or-slicehost/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>ext3_dx_add_entry: Directory index full!</title>
		<link>http://rackerhacker.com/2008/10/13/ext3_dx_add_entry-directory-index-full/</link>
		<comments>http://rackerhacker.com/2008/10/13/ext3_dx_add_entry-directory-index-full/#comments</comments>
		<pubDate>Mon, 13 Oct 2008 17:00:51 +0000</pubDate>
		<dc:creator>Major Hayden</dc:creator>
				<category><![CDATA[Blog Posts]]></category>
		<category><![CDATA[emergency]]></category>
		<category><![CDATA[filesystem]]></category>
		<category><![CDATA[fsck]]></category>

		<guid isPermaLink="false">http://rackerhacker.com/?p=510</guid>
		<description><![CDATA[I found a server last week that was having severe issues with disk I/O to the point where most operations were taking many minutes to complete. The server wasn't under much load, but a quick run of dmesg threw quite a bit of these lines out onto the screen: EXT3-fs warning (device sda5): ext3_dx_add_entry: Directory [...]
]]></description>
			<content:encoded><![CDATA[<p>I found a server last week that was having severe disk I/O issues, to the point where most operations were taking many minutes to complete.  The server wasn't under much load, but a quick run of <code>dmesg</code> printed quite a few of these lines:</p>
<p><code>EXT3-fs warning (device sda5): ext3_dx_add_entry: Directory index full!</code></p>
<p>After a good deal of searching, I still couldn't find out what the error actually meant.  As with most errors starting with <code>EXT3-fs warning</code>, I figured that a fsck might be the best option.</p>
<p>During the fsck, several inodes were repaired, and the check completed after 10-15 minutes.  I jotted down some notes about the directories that popped up on the screen during the fsck.  The server then rebooted and came up without any problems.</p>
<p>I reviewed the directories that appeared during the fsck and they were full of files.  Some of the directories contained upwards of 200,000 files.  Many of the files were moved into lost+found after the fsck, so they had to be restored from their backups.  I still don't know what caused the original issue as the hardware checked out fine.  If you run into this error, a fsck should help, but make sure that you have backups handy.</p>
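<p>If you want to look for oversized directories before they cause trouble, a small helper along these lines may do the job.  This is only a sketch: the <code>largest_dirs</code> name is made up for this example, and it assumes GNU <code>find</code> (for <code>-printf</code>):</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># List the ten directories that hold the most entries under a starting point.
# -xdev keeps find on one filesystem; %h prints each entry's parent directory.
largest_dirs() {
    find &quot;$1&quot; -xdev -mindepth 1 -printf '%h\n' | sort | uniq -c | sort -rn | head -n 10
}</pre></div></div>

<p>Calling something like <code>largest_dirs /home</code> prints an entry count next to each directory, largest first.</p>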
]]></content:encoded>
			<wfw:commentRss>http://rackerhacker.com/2008/10/13/ext3_dx_add_entry-directory-index-full/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>What is the difference between file data and metadata?</title>
		<link>http://rackerhacker.com/2008/03/12/what-is-the-difference-between-file-data-and-metadata/</link>
		<comments>http://rackerhacker.com/2008/03/12/what-is-the-difference-between-file-data-and-metadata/#comments</comments>
		<pubDate>Wed, 12 Mar 2008 18:01:59 +0000</pubDate>
		<dc:creator>Major Hayden</dc:creator>
				<category><![CDATA[command line]]></category>
		<category><![CDATA[filesystem]]></category>

		<guid isPermaLink="false">http://rackerhacker.com/2008/03/12/what-is-the-difference-between-file-data-and-metadata/</guid>
		<description><![CDATA[Just in case some of you out there enjoy nomenclature and theory behind Linux filesystems, here's some things to keep in mind. The modification time (mtime) of a file describes when the actual data blocks that hold the file changed. The changed time (ctime) of a file describes when the metadata was last changed. Also, [...]
]]></description>
			<content:encoded><![CDATA[<p>Just in case some of you out there enjoy the nomenclature and theory behind Linux filesystems, here are some things to keep in mind.  The modification time (mtime) of a file describes when the file's data was last changed.  The change time (ctime) of a file describes when its metadata (the inode) was last changed, which also happens whenever the data itself is modified.</p>
<p>Also, metadata is stored in a different location from the file's data.  The metadata lives in the inode, while the file's contents go into data blocks.  The inode holds the owner, the owner's group, the timestamps (atime, ctime, mtime), and the mode (permissions).</p>
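<p>A quick way to watch these timestamps move independently is a scratch file and <code>stat</code>.  The sketch below assumes GNU coreutils (<code>stat -c</code>), and the file and variable names are invented for the example:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># Scratch file in a throwaway directory
dir=$(mktemp -d)
file=&quot;$dir/demo.txt&quot;
echo &quot;first version&quot; &gt; &quot;$file&quot;
mtime_start=$(stat -c %y &quot;$file&quot;)   # %y = mtime, human-readable
ctime_start=$(stat -c %z &quot;$file&quot;)   # %z = ctime, human-readable
# Metadata-only change: chmod updates ctime and leaves mtime alone
sleep 1
chmod 600 &quot;$file&quot;
mtime_chmod=$(stat -c %y &quot;$file&quot;)
ctime_chmod=$(stat -c %z &quot;$file&quot;)
# Data change: appending updates mtime, and ctime moves along with it
sleep 1
echo &quot;second version&quot; &gt;&gt; &quot;$file&quot;
mtime_write=$(stat -c %y &quot;$file&quot;)
ctime_write=$(stat -c %z &quot;$file&quot;)
echo &quot;start:       mtime=$mtime_start ctime=$ctime_start&quot;
echo &quot;after chmod: mtime=$mtime_chmod ctime=$ctime_chmod&quot;
echo &quot;after write: mtime=$mtime_write ctime=$ctime_write&quot;
rm -r &quot;$dir&quot;</pre></div></div>

<p>The mtime should stay put after the <code>chmod</code> while the ctime moves; after the append, both move.</p>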
<p>The name of the file is actually stored in the directory that contains it, not in the inode.  The directory itself is simply a special file that maps names to inodes; it only looks like a directory once the filesystem is mounted and read.</p>
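<p>Hard links make this easy to see: give one file two names and both directory entries lead to the same inode.  A minimal sketch, assuming GNU <code>stat</code> and using made-up file names:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">dir=$(mktemp -d)
echo &quot;same data&quot; &gt; &quot;$dir/first-name&quot;
# A hard link adds a second directory entry; no data is copied
ln &quot;$dir/first-name&quot; &quot;$dir/second-name&quot;
inode_first=$(stat -c %i &quot;$dir/first-name&quot;)    # %i = inode number
inode_second=$(stat -c %i &quot;$dir/second-name&quot;)
links=$(stat -c %h &quot;$dir/first-name&quot;)          # %h = number of hard links
echo &quot;first-name:  inode $inode_first&quot;
echo &quot;second-name: inode $inode_second&quot;
echo &quot;link count:  $links&quot;
rm -r &quot;$dir&quot;</pre></div></div>

<p>Both names should report the same inode number, with a link count of 2.</p>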
]]></content:encoded>
			<wfw:commentRss>http://rackerhacker.com/2008/03/12/what-is-the-difference-between-file-data-and-metadata/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>EXT3-fs error (device hda3) in start_transaction: Journal has aborted</title>
		<link>http://rackerhacker.com/2007/11/20/ext3-fs-error-device-hda3-in-start_transaction-journal-has-aborted/</link>
		<comments>http://rackerhacker.com/2007/11/20/ext3-fs-error-device-hda3-in-start_transaction-journal-has-aborted/#comments</comments>
		<pubDate>Tue, 20 Nov 2007 18:23:40 +0000</pubDate>
		<dc:creator>Major Hayden</dc:creator>
				<category><![CDATA[Blog Posts]]></category>
		<category><![CDATA[command line]]></category>
		<category><![CDATA[emergency]]></category>
		<category><![CDATA[filesystem]]></category>

		<guid isPermaLink="false">http://rackerhacker.com/2007/11/20/ext3-fs-error-device-hda3-in-start_transaction-journal-has-aborted/</guid>
		<description><![CDATA[If your system abruptly loses power, or if a RAID card is beginning to fail, you might see an ominous message like this within your logs: EXT3-fs error (device hda3) in start_transaction: Journal has aborted Basically, the system is telling you that it's detected a filesystem/journal mismatch, and it can't utilize the journal any longer. [...]
]]></description>
			<content:encoded><![CDATA[<p>If your system abruptly loses power, or if a RAID card is beginning to fail, you might see an ominous message like this within your logs:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">EXT3-fs error (device hda3) in start_transaction: Journal has aborted</pre></div></div>

<p>Basically, the system is telling you that it's detected a filesystem/journal mismatch, and it can't utilize the journal any longer.  When this situation pops up, the filesystem gets mounted read-only almost immediately.  To fix the situation, you can remount the partition as ext2 (if it isn't your active root partition), or you can commence the repair operations.</p>
<p>If you're working with an active root partition, you will need to boot into some rescue media and perform these operations there.  If this error occurs with an additional partition besides the root partition, simply unmount the broken filesystem and proceed with these operations.</p>
<p>Remove the journal from the filesystem (effectively turning it into ext2):</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># tune2fs -O ^has_journal /dev/hda3</pre></div></div>

<p>Now, you will need to fsck it to correct any possible problems (throw in a <code>-y</code> flag to say yes to all repairs, and <code>-C 0</code> for a progress bar):</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># e2fsck /dev/hda3</pre></div></div>

<p>Once that's finished, make a new journal, which effectively makes the partition an ext3 filesystem again:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># tune2fs -j /dev/hda3</pre></div></div>

<p>You should be able to mount the partition as an ext3 partition at this time:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># mount -t ext3 /dev/hda3 /mnt/fixed</pre></div></div>

<p>Be sure to check your dmesg output for any additional errors after you're finished!</p>
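<p>If you'd rather rehearse the procedure before touching a real partition, the same commands work on a scratch image file.  This is just a practice sketch: the image is a temporary file created for the example, the extra <code>-f</code> forces a check even though the fresh filesystem is clean, and none of it touches <code>/dev/hda3</code>:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;"># Build a small ext3 filesystem inside a temporary file (no real disk involved)
img=$(mktemp)
truncate -s 64M &quot;$img&quot;
mkfs.ext3 -q -F &quot;$img&quot;
# Same three steps as above: drop the journal, check, re-create the journal
tune2fs -O ^has_journal &quot;$img&quot;
e2fsck -f -y &quot;$img&quot;
tune2fs -j &quot;$img&quot;
# has_journal should be back in the feature list
features=$(tune2fs -l &quot;$img&quot; | grep 'Filesystem features')
echo &quot;$features&quot;
rm &quot;$img&quot;</pre></div></div>
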
]]></content:encoded>
			<wfw:commentRss>http://rackerhacker.com/2007/11/20/ext3-fs-error-device-hda3-in-start_transaction-journal-has-aborted/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
	</channel>
</rss>

