This problem has cropped up for me a few times, but I’ve always forgotten to make a post about it. If you’re working with a large InnoDB table and you’re updating, inserting, or deleting a large volume of rows, you may stumble upon this error:

ERROR 1206 (HY000): The total number of locks exceeds the lock table size

InnoDB stores its lock tables in the main buffer pool. This means that the number of locks you can have at the same time is limited by the innodb_buffer_pool_size variable that was set when MySQL was started. By default, MySQL leaves this at 8MB, which is pretty useless if you’re doing anything with InnoDB on your server.

Luckily, the fix for this issue is very easy: adjust innodb_buffer_pool_size to a more reasonable value. However, that fix does require a restart of the MySQL daemon. There’s simply no way to adjust this variable on the fly (with the current stable MySQL versions as of this post’s writing).

Before you adjust the variable, make sure that your server can handle the additional memory usage. The innodb_buffer_pool_size variable is a server wide variable, not a per-thread variable, so it’s shared between all of the connections to the MySQL server (like the query cache). If you set it to something like 1GB, MySQL won’t use all of that up front. As MySQL finds more things to put in the buffer, the memory usage will gradually increase until it reaches 1GB. At that point, the oldest and least used data begins to get pruned when new data needs to be present.

So, you need a workaround without a MySQL restart?

If you’re in a pinch, and you need a workaround, break up your statements into chunks. If you need to delete a million rows, try deleting 5-10% of those rows per transaction. This may allow you to sneak under the lock table size limitations and clear out some data without restarting MySQL.

To learn more about InnoDB’s parameters, visit the MySQL documentation.


About a year ago, I was introduced to the joys of using irssi and screen to access irc servers. Before that time, I’d usually used graphical clients like Colloquy, and I always enjoyed getting Growl notifications when someone mentioned a word or string that I set up as a trigger.

Once I started using irssi in screen, I found that the visual bell in screen didn’t get my attention quickly. Luckily, someone in the #slicehost channel let me know about screen’s audible bell. You can flip between the visual and audible bell with CTRL-A and then CTRL-G. If you keep repeating that key combination, you’ll switch back and forth between the two (with a status update at the bottom left).

You can also set up your visual bell configuration in your .screenrc via some configuration parameters:

vbell [on|off]
vbell_msg [message]
vbellwait sec

Thanks to a recommendation from Michael and Florian, I’ve been using dsh with a lot of success for quite some time. In short, dsh is a small application which will allow you to run commands across many servers via ssh very quickly.

You may be wondering: “Why not just use ssh in a for loop?” Sure, you could do something like this in bash:

for i in`cat ~/myhosts.txt`; do ssh $i 'uptime'; done

But dsh allows you to do this:

dsh -g myhosts 'uptime'

In addition, dsh allows you to run the commands concurrently (-c) or one after the other (-w). You can tell it to prepend each line with the machine’s name (-M) or it can omit the machine name from the output (-H). If you need to pass extra options, such as which ssh key to use, or an alternative port, you can do that as well (-o). All of these command line options can be tossed into a configuration file if you have a default set of options you prefer.

Another thing that makes dsh more powerful is the groups feature. Let’s say you have three groups of servers – some are in California, others in Texas, and still others in New York. You could make three files for the groups:

  • ~/.dsh/group/california
  • ~/.dsh/group/texas
  • ~/.dsh/group/newyork

Inside each file, you just need to list the hosts one after the other. Here’s the ~/.dsh/group/texas group file:

db1.tx.mydomain.com
db2.tx.mydomain.com
web1.tx.mydomain.com
web2.tx.mydomain.com
#web3.tx.mydomain.com

As you can see, dsh handles comments in the hosts file. In the above example, the web3 server will be skipped since it’s prepended with a comment. Let’s say you want to check the uptime on all of the Texas servers as fast as possible:

dsh -c -g texas 'uptime'

That will run the uptime command on all of the servers in the Texas group concurrently. If you need to run it on two groups at once, just pass another group (eg. -g texas -g california) as an argument. You can also run the commands against all of your groups (-a).

The dsh command can really help you if you need to gather information or run simple commands on many remote servers. If you find yourself using it often for systems management, you may want to consider something like puppet.


One of my favorite (and most used) applications on any Linux machine is screen. Once you fire up a screen session, you can start something and keep it running indefinitely. Even if your internet connection drops or you accidentally close your terminal window, the screen session will remain open on the remote server.

Detaching from a screen session is done by pressing CTRL-A and then d (for detach). However, when I’m on my Mac, CTRL-A and CTRL-E send my cursor to the beginning and end of lines, respectively. Once I launch screen, I lose the CTRL-A functionality because screen thinks I’m trying to send it a command.

Luckily, this can be changed in your ~/.screenrc:

escape ^Ww

With this change, you can press CTRL-W, then press d, and you’ll detach from the screen session. For all of the screen options, run man screen on your local machine or review the man page online.


Happy New Year! I certainly hope it’s a great one for you, your family, and your business. As the new year begins, I figured it would be a good time to sit down and answer a question that I hear very often:

How do I become a better systems administrator?

The best way to become a better systems administrator is to fully understand the theory of what’s happening in your server’s environment.

What do I mean by that? Learn why things aren’t happening as you expected and think about all of the factors that could possibly be involved. Instead of thinking purely about cause and effect, you’ll find it much easier and rewarding to consider everything inside and outside your environment before you make any changes.

This still may be a little difficult to fully understand, so he’s an example. Let’s say you’re handling an issue where a customer can’t reach a website hosted on their server. When you ask them for more details, they might give you the dreaded reply: “It’s not coming up.” Start by making a mental list of the problems that are easiest to check:

  • Is the web server daemon running?
  • If a database server is being used, is it running and accessible?
  • Is there a software/hardware firewall blocking port 80?
  • Is a script stuck on the server tying up resources?
  • Could there be a DNS resolution problem?
  • Is the server up?
  • Did a switch fail?
  • Is the server’s hard disk out of space?
  • Can the customer reach other websites like Google or Yahoo?
  • If SELinux is involved, have the appropriate contexts been set?
  • Could the site be a target of a denial of service attack?
  • Has the server reached its connection tracking limit?

Of course, this is a relatively short list, but these are all easy to check. If you’re thinking about cause and effect, you might only consider the web server daemon and some basic network issues. By considering all of the other factors that may be related, you’ve ensured that all of the basics are covered before you consider more complex problems.

Most systems administrators have taken an error message and tossed it in en masse into Google before. Occasionally, no results will appear for the search. If you find yourself in this situation, try to understand the individual parts of the error message. Work outward from what you know already. You should know which daemon said it, and you may have an idea of what the application was doing when the error occurred. Take time to consider what the daemon is trying to tell you within the context of what it was doing at the time.

One of the easiest ways to force yourself to be immersed into this way of thinking is to host applications for non-technical people. You’ll find that many customers want things done differently, and they’re all at different levels of technical aptitude. Some may find it a frustrating experience at first, but you’ll think yourself later. It will force you to consider all aspects of how a server operates since you might not always know what’s happening within a customer’s application.

As always, if you find yourself stumbling, remember to ask your peers and colleagues. Even if they haven’t seen the particular issue, they will probably be able to guide you closer to the solution you seek.