Wednesday, February 13, 2013

High availability in IPv6: Harder than it should be

For almost every problem, there is more than one way to solve it.  The specific nut that I was trying to crack this time was how to share a network address between 2 nodes that are going to be set up for active/passive load balancing services.

In Linux, there are several options to choose from.  You can use the built-in piranha, keepalived, heartbeat, vrrpd, or ucarp (and there may be more options).

My needs are modest.  I just need a floating IP address and some relatively easy-to-configure software that I can write a puppet module for.  I don't need monitoring, health checks, or any other feature.  Bonus points if the software does not have an abundance of package requirements, as I like to keep the system's footprint slim.

I've worked with piranha before, but it's essentially abandonware at this point.  Red Hat developed it to a point, then shut down the project and put the code in the wild.  No notable improvements have been made since the RHEL 2.1 timeframe.  Don't get me wrong, it does work (or did the last time I used it) and it comes with a passable management interface.  But like I said, I plan on using puppet to manage the configuration, so the PHP-based management interface is actually a minus for my needs.  I could have been convinced to use it if it had evolved a conf.d-style file layout, rather than a monolithic configuration file.  So that ruled out piranha, as I'm just not inspired enough to generate a puppet configuration for this project from scratch using storeconfigs to communicate cross-node data.

Next up was keepalived.  Good software, lots of features, but too many for this project, and it suffered from the same monolithic configuration file as piranha.  I looked at a few existing puppet modules for keepalived which may have worked, but I quickly wound up in module dependency hell, and none of the existing modules lived up to the implicit promise of "puppet module install" and run.  Besides, keepalived does a lot more than I need for this project, so it seemed overkill.

Next in line: heartbeat.  I've not worked with this software before, but it is something that I'm going to keep in mind for other projects.  I liked what I saw about process handling as part of the managed resources, but again, far too sophisticated for my current needs.

Down to vrrpd and ucarp.  Vrrpd is the implementation of the VRRP protocol from Cisco.  Unfortunately, it has not been updated in a long time.  So long in fact that RFC 5798 (VRRP v3 for IPv4 and IPv6) has been published since its last update 4 years ago.  I need IPv6 support, so sorry vrrpd, you were a non-starter for me.

That just leaves ucarp, OpenBSD's answer to VRRP in case Cisco decides to enforce their VRRP patents.  I've always liked OpenBSD, so I decided to kick the tires on it.

The good news: ucarp is in EPEL, installs as a single stand-alone binary, does not pull in half of the package universe for support, and each shared IP is configured in a separate configuration file.  There go most of my requirements right there.

The bad news: ucarp does not support IPv6.  At least not the portable version of it.  I found some references to FreeBSD users talking about some issues with IPv6 support in ucarp, so someone is doing some work on IPv6 support, but it's not made it back into the main release line yet.

So now the quandary: do I roll up my sleeves, pull down the FreeBSD source, and try to massage the IPv6 support into a workable state, or do I find another solution?

While I enjoy some good low-level programming, for now I found another option.  Ucarp calls shell scripts when it brings a VIP up and down.  The scripts that ship with the EPEL package are simple wrappers around the /sbin/ip command to add/delete the addresses from the interface.

In short order, I was able to massage these scripts to synthesize IPv6 support:
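
#!/bin/sh
# vip-up: ucarp passes the interface as $1 and the virtual IPv4 address as $2.
exec 2>/dev/null
PREFIX="fc00::"
# Add the IPv4 VIP, as the stock script does.
/sbin/ip address add "${2}"/32 dev "${1}"
# Derive the IPv6 VIP from the IPv4 one by turning the dots into colons.
V6ADDR=$( echo "${PREFIX}${2}" | sed -e 's/\./:/g' )
/sbin/ip address add "${V6ADDR}"/128 dev "${1}"

I take advantage of the fact that a dotted-quad IPv4 address can be translated into a valid IPv6 address by simply substituting "." with ":" and appending the morphed IPv4 address to an IPv6 prefix; each decimal octet is also a valid hexadecimal group.  With the prefix above, 192.168.1.10 becomes fc00::192:168:1:10.  It's a handy trick, IMNSHO.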
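
To preview which IPv6 address a given VIP will end up with, the substitution can be pulled out into a small function.  This is only a sketch: v4_to_v6 is a made-up helper name, not something that ships with ucarp, and the addresses are examples.

```shell
#!/bin/sh
# Sketch: build the synthetic IPv6 address the same way the vip-up script does.
# v4_to_v6 is a hypothetical helper name, not part of ucarp.
v4_to_v6() {
    prefix="$1"   # e.g. fc00::
    v4addr="$2"   # e.g. 192.168.1.10
    # Each decimal octet (0-255) is also a valid hexadecimal group, so
    # replacing the dots with colons yields a syntactically valid address.
    printf '%s%s\n' "${prefix}" "${v4addr}" | sed -e 's/\./:/g'
}

v4_to_v6 "fc00::" "192.168.1.10"   # prints fc00::192:168:1:10
v4_to_v6 "fc00::" "10.0.0.254"     # prints fc00::10:0:0:254
```

Note that the octets are reinterpreted as hexadecimal groups, so this is a readable one-to-one mapping rather than a numeric conversion of the IPv4 address.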

Voilà!  Instant IPv6 support.  Not the prettiest thing you'll ever see, but it took far less time than rewriting part of the network code inside ucarp.
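
The matching vip-down script needs the same treatment so that both addresses are removed when the node goes passive.  What follows is a minimal sketch, assuming it simply mirrors vip-up with "ip address del"; the vip_down function and the IP_CMD override are my own illustration (IP_CMD exists only so the logic can be dry-run with echo), not the script shipped in EPEL.

```shell
#!/bin/sh
# Hypothetical vip-down counterpart to the vip-up script above.
# ucarp passes the interface name as $1 and the virtual IPv4 address as $2.
IP_CMD="${IP_CMD:-/sbin/ip}"   # overridable so the logic can be dry-run with "echo"
PREFIX="${PREFIX:-fc00::}"     # must match the prefix used in vip-up

vip_down() {
    dev="$1"
    v4addr="$2"
    v6addr=$(printf '%s%s\n' "${PREFIX}" "${v4addr}" | sed -e 's/\./:/g')
    # Remove both addresses; the masks must match the ones used when adding.
    ${IP_CMD} address del "${v4addr}/32" dev "${dev}"
    ${IP_CMD} address del "${v6addr}/128" dev "${dev}"
}

# Only act when called the way ucarp calls it: <interface> <address>.
if [ "$#" -eq 2 ]; then
    vip_down "$1" "$2"
fi
```

Running it with IP_CMD=echo prints the two "address del" commands instead of executing them, which is a cheap way to check the derived IPv6 address before letting ucarp call the script.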

There's one last piece to this puzzle, though, and even had I used one of the other solutions, I would have likely run into this part as well.  The final piece has to do with IPv6's Neighbor Discovery Protocol.

NDP is IPv6's answer to ARP in IPv4.  In IPv4 you could send out a gratuitous ARP packet to preemptively announce to the network that an IP address has moved, which effectively flushes the stale MAC address from the ARP cache of routers, switches and same-network hosts.

I experimented with the "valid lifetime" and "preferred lifetime" options on the address, thinking that there may be a DHCP-esque release option built into NDP, but it does not appear to be handled that way.  It's too bad, really, because it seems like it would have been a perfect use of those values -- change the address to advertise that it is good for 1 second to the network, let it expire and the network would re-discover the address on the new host.

The quick-fix blunt-force answer is to drop the NDP expiration timer on the network equipment, but studying RFC 5798 Section 8.2, it seems there is a more elegant and IPv6-friendly way of doing this without having to muck with the network's NDP expiration timers.  It looks like a Neighbor Advertisement message with the override flag set should do the trick.  Fortunately there is a security research tool out there: SI6 Networks IPv6 Toolkit, and the na6 command from that software appears to be just what I need to generate one of those packets.

Quite a bit of trial-and-error later, and the vip-up script now looks like this:

#!/bin/sh
# vip-up: ucarp passes the interface as $1 and the virtual IPv4 address as $2.
exec 2>/dev/null
# Source the interface configuration to pick up HWADDR (this host's MAC address).
[ -f "/etc/sysconfig/network-scripts/ifcfg-${1}" ] && . "/etc/sysconfig/network-scripts/ifcfg-${1}"
PREFIX="fc00::"
/sbin/ip address add "${2}"/32 dev "${1}"
V6ADDR=$( echo "${PREFIX}${2}" | sed -e 's/\./:/g' )
/sbin/ip address add "${V6ADDR}"/128 dev "${1}"
IP6ROUTER="<router IPv6 address>"
ROUTER_MAC="<router mac address>"
# Send a Neighbor Advertisement with the override flag set to the router.
/usr/sbin/na6 -i "${1}" -s "${V6ADDR}"/128 -d "${IP6ROUTER}" -t "${IP6ROUTER}" -S "${HWADDR}" -D "${ROUTER_MAC}" -o -c -v
# Follow up with ICMPv6 traffic in the background so ucarp is not kept waiting.
/bin/ping6 -c 10 "${IP6ROUTER}" &

There are a lot more moving parts in that script now.  I found it interesting that the Neighbor Advertisement message with the override flag was not enough on its own to kick the entry out of the neighbor table on the router; I had to send a few ICMPv6 packets as well before the network equipment would pick up the address on the new master host.  I would have expected the "Hey! I'm here" override packet to be adequate.

I'm not convinced of the absolute stability of this solution yet.  Before I backgrounded the ping6 command, I was seeing a few Duplicate Address Detection failures, caused by the time that the script was taking to execute and by how ucarp was handling the extended script execution time.  I may yet search the FreeBSD sources for the state of their IPv6 support and see how it looks, but for now ucarp has Frankenstein support for IPv6.
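
To keep an eye on those Duplicate Address Detection failures after a failover, the address flags can be checked from the shell.  This is a rough sketch rather than anything I have run in production: has_dad_failure is a made-up helper name, and it relies on "ip -6 address show" marking affected addresses with the word dadfailed.

```shell
#!/bin/sh
# has_dad_failure is a hypothetical helper: it reads `ip -6 address show`
# output on stdin and succeeds if any address carries the dadfailed flag.
has_dad_failure() {
    grep -q 'dadfailed'
}

# Assumed usage: pass the interface name as the first argument.
if [ -n "${1:-}" ]; then
    if /sbin/ip -6 address show dev "$1" | has_dad_failure; then
        echo "DAD failure detected on $1" >&2
    else
        echo "no DAD failures on $1"
    fi
fi
```

Keeping the check as a filter on stdin means it can also be pointed at saved output from an earlier failover.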
