4/1/11

DNS Anycasting with IP SLA

Anycasting is a common way to steer traffic to the nearest server, while providing failover and load distribution without a hardware load balancer.  This technique is commonly used with DNS servers.  

DNS server IPs are hard-coded in many servers, end-user machines, and a slew of other network devices, so server maintenance and unexpected outages are noticeable.  With anycasting, we can automatically steer traffic to other servers with our dynamic routing protocol.

This is often done by running the user-visible IP on a loopback adapter and letting the server inject this IP into your network with routing software like Quagga.  With at least two servers injecting the same IP, any one can die and traffic will fail over to a working server.

Unfortunately, it's not always that easy.  There are many environments in which the server admins are not network savvy or the network admins are uncomfortable extending a dynamic routing capability to the server.  Luckily, IOS has some tools to help solve this problem, specifically IP SLA and reliable static routing.

Using IP SLA, we can set up a DNS probe to query the server's physical interface IP.  In this example, we're going to ask the DNS server (172.16.1.10) for an A record (dnscheck1.ventrefamily.com):

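ip sla 1
 ! Ask 172.16.1.10 to resolve the test A record
 dns dnscheck1.ventrefamily.com name-server 172.16.1.10
 ! timeout is in milliseconds, frequency is in seconds;
 ! raise the timeout if the server may need longer than 20 ms to answer
 timeout 20
 frequency 20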

ip sla schedule 1 life forever start-time now

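With object tracking, we can monitor the IP SLA for failures.  We're specifying a delay of 61 seconds, just over three 20-second probe intervals, so a new probe result has to hold for that long before the track state changes to down or up.  A single missed probe won't change the state.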

track 1 ip sla 1
 delay down 61 up 61
 
With reliable static routing, we can install a static route that stays in the routing table only while the track object is up.  192.168.1.1 is the IP of the loopback interface on the server; this will be the anycast IP.  It's the IP that your clients should be using for DNS.

ip route 192.168.1.1 255.255.255.255 Vlan110 172.16.1.10 track 1
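
To check that each piece is doing its job, these exec commands are a reasonable place to start.  The operation, track, and route numbers match the example above; I'm not showing output here since it varies by IOS release.

show ip sla statistics 1
show track 1
show ip route 192.168.1.1

While the track object is down, the tracked static route should no longer appear in the routing table.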

Now, with a simple redistribution, we can advertise this route into our network, and automatically remove it if the server stops responding to DNS queries.
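
As a rough sketch of what that redistribution could look like, here is one way to do it, assuming OSPF process 1 is your IGP.  The process number and the ANYCAST-DNS prefix-list and route-map names are placeholders of mine, not part of the original setup.  The route-map is also my addition: it limits the redistribution to the anycast /32 so other static routes don't leak into the IGP.

! Hypothetical names; match only the anycast host route
ip prefix-list ANYCAST-DNS seq 5 permit 192.168.1.1/32
!
route-map ANYCAST-DNS permit 10
 match ip address prefix-list ANYCAST-DNS
!
! Assumes OSPF process 1; adjust for your IGP
router ospf 1
 redistribute static subnets route-map ANYCAST-DNS

When track 1 goes down, the static route is removed, and the redistributed route is withdrawn from the IGP along with it.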

I would recommend querying a different A record in the probe for each server.  If every probe checks the same record and that record is accidentally deleted, all of the probes fail at once, every route is withdrawn, and you end up with no service.
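
For illustration, the configuration for a second server might look like the following.  Everything specific to the second server is made up for the example: the dnscheck2 record, the 172.16.2.10 address, Vlan120, and the use of number 2 for the probe and track object.  Only the anycast IP, 192.168.1.1, is the same as before.

! Hypothetical second server: different A record, different physical IP
ip sla 2
 dns dnscheck2.ventrefamily.com name-server 172.16.2.10
 timeout 20
 frequency 20
!
ip sla schedule 2 life forever start-time now
!
track 2 ip sla 2
 delay down 61 up 61
!
! Same anycast /32, pointed at the second server
ip route 192.168.1.1 255.255.255.255 Vlan120 172.16.2.10 track 2

This would normally live on the router closest to the second server, with the same redistribution applied there.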

When you deploy this with multiple servers, IOS and your IGP automate the failover, which should improve the availability of your infrastructure.
