Rackspace Cloud Tech Podcast Episode 2

I participated in a podcast for the Rackspace Cloud with Robert Collazo last week. We covered some important topics including network security and convenient deployment tools.


Sticky shift key with synergy in Fedora 12

My synergy setup at work is relatively simple. I have a MacBook Pro running Snow Leopard that acts as a synergy server and a desktop running Fedora 12 as a synergy client. On the Mac, I use SynergyKM to manage the synergy server. The Fedora box uses my gdm strategy for starting synergy at the login screen and in GNOME.

I kept having an issue where the shift key would become stuck regardless of the settings I set for the client or server. The halfDuplexCapsLock configuration option had no effect. After installing xkeycaps, I found that both shift keys were getting stuck if I brought the mouse back and forth between Mac and Fedora twice.

I decided to run a test. I started the client with the debug argument and moved the mouse to my Fedora box. At that point, I pressed the letter 'a' and saw:

DEBUG1: CXWindowsKeyState.cpp,195:   032 (00000000) up
DEBUG1: CXWindowsKeyState.cpp,195:   03e (00000000) up
DEBUG1: CXWindowsKeyState.cpp,195:   026 (00000000) down
DEBUG1: CXWindowsKeyState.cpp,195:   032 (00000000) down
DEBUG1: CXWindowsKeyState.cpp,195:   03e (00000000) down
DEBUG1: CXWindowsKeyState.cpp,195:   026 (00000000) up

I brought the mouse back to the Mac and then back to Fedora. I pressed 'a' again and saw:

DEBUG1: CXWindowsKeyState.cpp,195:   026 (00000000) down
DEBUG1: CXWindowsKeyState.cpp,195:   026 (00000000) up
DEBUG1: CXWindowsKeyState.cpp,195:   026 (00000000) down
DEBUG1: CXWindowsKeyState.cpp,195:   026 (00000000) up

After dumping the keyboard layout with xmodmap I found the keys that corresponded with the key numbers:

  • 032 - Left shift
  • 03e - Right shift
  • 026 - a

If I tapped the left shift, I could clear the key press, but I couldn't clear the right shift key (it was stuck down according to Fedora's X server). When I hooked up a physical keyboard and mouse, I was able to use them normally without any keybinding problems.

The root cause: When synergy started in /etc/gdm/PreSession/Default after the gdm login, the keyboard layout wasn't set up properly. The X server was setting up the keyboard layout later in the startup process and this confusion caused the shift keys to get stuck. Fedora 12 uses evdev to probe for keyboards during X's startup and eventually settles on a default layout if none are explicitly defined.

The fix: I added the synergy startup to the GNOME startup items and it works flawlessly.


Private network interfaces: the forgotten security hole

Regardless of the type of hosting you're using - dedicated or cloud - it's important to take network interface security seriously. Most often, threats from the internet are the only ones mentioned. However, if you share a private network with other customers, you have just as much risk on that interface.

Many cloud providers allow you access to a private network environment where you can exchange data with other instances or other services offered by the provider. The convenience of this access comes with a price: other instances can access your instance on the private network just as easily as they could on the public interface.

Here are some security tips for your private interfaces:

Disable the private interface
This one is pretty simple. If you have only one instance or server, and you don't need to communicate privately with any other instances, just disable the interface. Remember to configure your networking scripts to leave the interface disabled after reboots.

Use packet filtering
The actual mechanism will vary based on your operating system, but filtering packets is the one of the simplest ways to secure your private interface. You can take some different approaches with them, but I find the easiest method is to allow access from your other instances and reject all other traffic.

For additional security, you can limit access based on ports as well as source IP addresses. This could prevent an attacker from having easy access to your other instances if they're able to break into one of them.

Configure your daemons to listen on the appropriate interfaces
If there are services that don't need to be listening on the private network, don't allow them to listen on your private interface. For example, MySQL might need to listen on the private interface so the web server can talk to it, but apache won't need to listen on the private interface. This reduces the profile of your instance on the private network and makes it a less likely target for attack.

Use hosts.allow and hosts.deny
Many new systems administrators forget about how handy tcpwrappers can be for limiting access. If your firewall is down in error, host.allow and hosts.deny could be an extra layer of protection. It's important to ensure that the daemons you are attempting to control are build with tcpwrappers support. Daemons like sshd support it, but apache and MySQL do not.

Encrypt all traffic on the private network
Just because it's called a "private" network doesn't mean that your traffic can traverse the network privately. You should always err on the side of caution and encrypt all traffic traversing the private network. You can use ssh tunnels, stunnel, or the built-in SSL features found in most daemons.

This also brings up an important point: you should know how your provider's private network works. Are there safeguards to prevent sniffing? Could someone else possibly ARP spoof your instance's private IP addresses? Is your private network's subnet shared among many customers?

With all of that said, it's also very important to have proper change control policies so that administrators working after you are fully aware of the security measures in place and why they are important. This will ensure that all of the administrators on your instances will understand the security of the system and they should be able to make sensible adjustments later for future functionality.


System Administration Inspiration: If it's broken, break it a little more

Earlier this year, I started a series of posts to encourage systems administrators to refine their troubleshooting abilities. This is the second post in that series.

Almost every system administrator has found themselves in a situation where they're confronted with a server which has a problem. However, if you're not the primary administrator for the server, you may not always know what has changed recently or you may not be aware of changes in the server's environment. In these situations, if the fix isn't obvious, try going through these steps:

Localize the problem to a specific daemon or service
In the case of a problem where a website isn't loading properly, is it a problem with the web server itself? Could something other than the actual web server daemon be having an issue?

As an example, consider a ruby on rails application which runs through apache's mod_proxy_balancer and queries data from MySQL. If any of those individual puzzle pieces were not functioning correctly, you'd get a different result. A downed MySQL instance could make the application throw errors or appear to be unresponsive. If the mongrel cluster had failed, apache might be returning internal server errors. Your browser might return a connection refused if apache was down. These are all relatively easy to determine.

What if you are unable to determine which daemon is causing the problem?

If it's broken, break it a little more
Let's say that you've reviewed the process list and all of the appropriate daemons appear to be running. However, the website is still not loading properly. What do you do? Bring down a service and try again. Did something change? Did a new error appear? If not, bring that daemon back up and try taking down one of the other ones.

I've also had some good results by making small adjustments in the web server's configuration file. If you have a virtual host that isn't returning the correct data, try commenting it out temporarily. For rewrite rules, try removing them temporarily or strip them down to a more basic form. Test again, and then begin adding lines back incrementally. As much as a single period or quotation mark can derail a perfectly good set of rewrite rules.

In short - try to think outside the box when you're troubleshooting a difficult issue on an unfamiliar system. Always remember to back up your configurations before making changes and ensure your daemons will start properly if you bring them down.


MySQL: The total number of locks exceeds the lock table size

If you're running an operation on a large number of rows within a table that uses the InnoDB storage engine, you might see this error:

ERROR 1206 (HY000): The total number of locks exceeds the lock table size

MySQL is trying to tell you that it doesn't have enough room to store all of the row locks that it would need to execute your query. The only way to fix it for sure is to adjust innodb_buffer_pool_size and restart MySQL. By default, this is set to only 8MB, which is too small for anyone who is using InnoDB to do anything.

If you need a temporary workaround, reduce the amount of rows you're manipulating in one query. For example, if you need to delete a million rows from a table, try to delete the records in chunks of 50,000 or 100,000 rows. If you're inserting many rows, try to insert portions of the data at a single time.

Further reading: