High Availability Apache on Ubuntu 8.04

It's nice when your website keeps serving pages even after something catastrophic happens. Running two Apache nodes with Heartbeat gets you there -- if one server blows up, the other takes over in short order.

## Prelude

You'll need two boxes and *three* IP addresses. I'm using [virtual machines from Xeriom Networks](http://xeriom.net/). Both have been [firewalled](http://barkingiguana.com/2008/06/22/firewall-a-pristine-ubuntu-804-box), and I've opened the HTTP port to the world:

```bash
sudo iptables -I INPUT 3 -p tcp --dport http -j ACCEPT
sudo sh -c "iptables-save -c > /etc/iptables.rules"
```

For this post, let's assume the following IP addresses are available:

* 193.219.108.236 -- Node 1 (craig-02.vm.xeriom.net)
* 193.219.108.237 -- Node 2 (craig-03.vm.xeriom.net)
* 193.219.108.238 -- Not assigned (this becomes our floating IP)

## Simple service

First, install Apache on both boxes. Nothing fancy -- we just want to confirm we can serve *something* over HTTP. Run this on both boxes:

```bash
sudo apt-get install apache2 --yes
```

Open a browser and hit the IP addresses for Node 1 and Node 2. You should see the default Apache page saying "It works!".

If you don't, check your firewall allows `www` traffic. Your rules should look like this -- note the line ending `tcp dpt:www`:

```
sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:ssh
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:www
DROP       all  --  anywhere             anywhere

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
```

## Adding resilience

Apache can serve pages from both machines now, which is great -- but it doesn't protect against one of them dying. For that, we use Heartbeat.

Install Heartbeat on both boxes:

```bash
sudo apt-get install heartbeat
```

Copy the sample configuration files to Heartbeat's config directory:

```bash
sudo cp /usr/share/doc/heartbeat/authkeys /etc/ha.d/
sudo sh -c "zcat /usr/share/doc/heartbeat/ha.cf.gz > /etc/ha.d/ha.cf"
sudo sh -c "zcat /usr/share/doc/heartbeat/haresources.gz > /etc/ha.d/haresources"
```

Lock down `authkeys` -- it's going to contain a password:

```bash
sudo chmod go-wrx /etc/ha.d/authkeys
```

Edit `/etc/ha.d/authkeys` and add a password of your choice:

```
auth 2
2 sha1 your-password-here
```

Configure `ha.cf` for your network. The node names **must** match the output of `uname -n` on each box:

```
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
initdead 120
bcast eth0
udpport 694
auto_failback on
node craig-02.vm.xeriom.net
node craig-03.vm.xeriom.net
```

Now tell Heartbeat to manage Apache. Edit `haresources` on both machines -- the contents must be identical on both nodes, and the hostname should be the output of `uname -n` on Node 1:

```
craig-02.vm.xeriom.net 193.219.108.238 apache2
```

The IP address here is the unassigned one from the prelude -- it becomes the floating virtual IP.

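
Mismatched node names and out-of-sync `haresources` files are easy mistakes to make at this point, so here is a rough sanity check you could run on each box. It's only a sketch: `node_listed` and `show_resources` are helper names I've made up for this post, not part of Heartbeat, and they assume the simple file layouts shown above.

```bash
#!/bin/sh
# Hypothetical helpers for checking the Heartbeat config files above.
# They are not part of Heartbeat; the names are made up for illustration.

# Succeeds if the ha.cf file ($2) has a "node" directive naming host $1.
node_listed() {
    awk -v host="$1" '
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == host) found = 1 }
        END { exit !found }
    ' "$2"
}

# Prints the preferred node and floating IP from the first line of a
# haresources file ($1) that is neither blank nor a comment.
show_resources() {
    awk 'NF > 0 && $1 !~ /^#/ { print "primary=" $1 " vip=" $2; exit }' "$1"
}

# Example usage on each node:
#   node_listed "$(uname -n)" /etc/ha.d/ha.cf || echo "this host is not in ha.cf"
#   show_resources /etc/ha.d/haresources
#   md5sum /etc/ha.d/haresources    # the checksum should match on both nodes
```

If `node_listed` fails on either box, fix the `node` lines in `ha.cf` before starting Heartbeat.
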
Since we told Heartbeat to use UDP port 694, we need to open it in the firewall on both boxes and save the rules again:

```bash
sudo iptables -I INPUT 2 -p udp --dport 694 -j ACCEPT
sudo sh -c "iptables-save -c > /etc/iptables.rules"
```

Your iptables rules should now look like:

```
sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED
ACCEPT     udp  --  anywhere             anywhere            udp dpt:694
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:ssh
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:www
DROP       all  --  anywhere             anywhere

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
```

Create a file on each box so we can tell which server is responding. `/var/www` is owned by root, so the redirect has to run under `sudo` too:

```bash
# Node 1 (craig-02.vm.xeriom.net)
sudo sh -c 'echo "craig-02.vm.xeriom.net" > /var/www/index.html'
```

```bash
# Node 2 (craig-03.vm.xeriom.net)
sudo sh -c 'echo "craig-03.vm.xeriom.net" > /var/www/index.html'
```

Hit each node's IP address in your browser to confirm the right content is showing. If it works, it's time to flip the switch.

## Bringing it to life

Start Heartbeat on the master (Node 1) first, then the slave (Node 2):

```bash
sudo /etc/init.d/heartbeat start
```

This takes a while to start up. Run `tail -f /var/log/ha-log` on both boxes to watch progress. After a bit, you should see Node 1 report something like:

```
heartbeat[6792]: 2008/06/24_11:06:21 info: Initial resource acquisition complete (T_RESOURCES(us))
IPaddr[6867]: 2008/06/24_11:06:22 INFO: Running OK
heartbeat[6832]: 2008/06/24_11:06:22 info: Local Resource acquisition completed.
```

## Testing for a broken heart

Check `ifconfig eth0:0` on both boxes.
You should see output like this:

```
# Node 1
sudo ifconfig eth0:0
eth0:0    Link encap:Ethernet  HWaddr 00:16:3e:3c:70:25
          inet addr:193.219.108.238  Bcast:193.219.108.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
```

```
# Node 2
sudo ifconfig eth0:0
eth0:0    Link encap:Ethernet  HWaddr 00:16:3e:92:ad:78
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
```

Node 1 has claimed the virtual IP address. If Node 1 dies, Node 2 takes over. Simulate a failure by stopping Heartbeat on Node 1:

```bash
# Node 1
sudo /etc/init.d/heartbeat stop
```

Check `ifconfig` again -- the virtual IP should now be on Node 2. Bring Node 1 back up and it should reclaim the IP.

If this all worked, congratulations -- Heartbeat is running and your web tier will survive a node failure. Skip ahead to see it in the browser.

If you see messages about the message queue filling up, the two nodes can't talk to each other. Double-check that UDP port 694 is open on *both* boxes:

```
heartbeat[6148]: 2008/06/24_11:05:09 ERROR: Message hist queue is filling up (500 messages in queue)
```

Verify the firewall rules -- the important line ends with `udp dpt:694`:

```
sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED
ACCEPT     udp  --  anywhere             anywhere            udp dpt:694
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:ssh
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:www
DROP       all  --  anywhere             anywhere

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
```

## The proof is in the pudding

Open your browser and hit the virtual IP address (193.219.108.238 in this example). You should see Node 1's page.

Stop Heartbeat on Node 1 (or shut it down entirely) and refresh. You should now see Node 2.

Bring Node 1 back up and refresh once more. You're back on Node 1.

That's high availability in action. If one server goes down, the other picks up the virtual IP, and the worst your users see is a short interruption while the takeover happens.
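
If you'd rather watch the takeover from a terminal than keep hitting refresh, something like the following would do it. Treat it as a sketch: `fetch_node` and `watch_failover` are names I've invented here, it assumes `curl` is installed on the machine you run it from, and it relies on the `index.html` files created earlier containing each node's hostname.

```bash
#!/bin/sh
# Hypothetical helpers for watching a failover; not part of Heartbeat.

# Prints the page body served at URL $1, or "DOWN" if the request fails
# (connection refused, timeout, HTTP error status, or curl not installed).
fetch_node() {
    curl --silent --fail --max-time 2 "$1" || echo "DOWN"
}

# Polls URL $1 once a second, $2 times, printing a timestamp and whatever
# fetch_node reports. With the index.html files from this post, that is
# the hostname of the node currently holding the virtual IP.
watch_failover() {
    url="$1"
    count="$2"
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s %s\n' "$(date +%H:%M:%S)" "$(fetch_node "$url")"
        sleep 1
        i=$((i + 1))
    done
}

# Example usage, polling the virtual IP for one minute:
#   watch_failover http://193.219.108.238/ 60
```

Start it, then stop Heartbeat on Node 1 from another terminal. The hostname in the output should change from Node 1's to Node 2's, possibly with a few `DOWN` lines in between.
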