ALICE has come up with a special request for the monitoring system: DNS-based load balancing of the central services that takes the monitoring information into account. Since all the monitoring is written in Java, it would be nice to have the DNS server in Java as well.

MonALISA (ML) is installed on every machine, so the DNS server can make use of the following monitoring parameters:

  • machine availability (whether or not ML is alive)
  • service availability (ML periodically checks each service's status)
  • machine parameters (CPU, memory and swap usage, load, etc.)

Simple load balancing in DNS is easy to achieve with any server by specifying multiple IN A entries for a host name. The server then returns these entries to the clients in round-robin order. This is good enough if:

  • the servers are always up and running
  • the machines are identical and client requests load the machines in more or less the same way

In our case neither of these conditions holds. The servers are different, can become overloaded at any time and can (and do) crash. So the DNS server has to return the entries not in round-robin order but in an order consistent with the load, excluding services that are not available at all.
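
To make that requirement concrete, here is a minimal sketch of "skip what is down, put the least loaded first". The Host class and its fields are hypothetical placeholders that I made up for the illustration; they are not MonALISA or dnsjava types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: order hosts by load, skipping the ones that are down. */
final class LoadOrdering {

    /** Illustrative host description; the fields are assumptions, not MonALISA types. */
    static final class Host {
        final String address;
        final boolean alive; // machine and service both reported as up
        final double load;   // e.g. the 1-minute load average

        Host(String address, boolean alive, double load) {
            this.address = address;
            this.alive = alive;
            this.load = load;
        }
    }

    /** Returns the addresses of the live hosts, least loaded first. */
    static List<String> order(List<Host> hosts) {
        List<Host> live = new ArrayList<>();
        for (Host h : hosts) {
            if (h.alive) {
                live.add(h);
            }
        }
        live.sort(Comparator.comparingDouble((Host h) -> h.load));

        List<String> result = new ArrayList<>();
        for (Host h : live) {
            result.add(h.address);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Host> hosts = Arrays.asList(
                new Host("1.2.3.4", true, 2.0),
                new Host("2.3.4.5", true, 0.5),
                new Host("3.4.5.6", false, 0.1));
        System.out.println(order(hosts)); // prints [2.3.4.5, 1.2.3.4]
    }
}
```

A strict ordering like this would send every client to the least loaded machine first, which is one reason to prefer a randomized, weighted order in practice.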

The implementation is based on the excellent dnsjava package, which ML already used for direct lookups. I started from the sample server shipped with the package (jnamed.java). It needed some minor modifications, since it relied on configuration files and I wanted everything to be driven from the main ML configuration and from code. So I added a new constructor that looks up the bind address and ports in the ML configuration. A new method that directly adds a zone was also needed. The result is the lia.util.dns.DNSServer class.

At this point we have a DNS server with the default behaviour that can be customized dynamically. But the package only offers round-robin lookups for the aliases. A very simple solution is to subclass RRset and change the implementation so that it returns the aliases in a random order that is biased by a weight associated with each entry. In our case the weights are based on a score compiled from several service metrics, so for example a machine with twice the load of another will be half as likely to be returned. Load is only one of the factors considered, so a machine that is really more loaded than another will end up with a very different score. See the attached code for lia.util.dns.WeightedRRSet for details.
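
For readers who do not want to open the attachment, here is a rough sketch of the general idea of a weighted random order. It is not the actual WeightedRRSet code: the class name WeightedOrder, the method names and the "weight = 1 / load" rule are all assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of weighted random ordering; not the actual WeightedRRSet code. */
final class WeightedOrder {

    /** Assumed scoring rule: weight inversely proportional to load. */
    static double weightFromLoad(double load) {
        return 1.0 / Math.max(load, 0.01); // clamp to avoid division by zero
    }

    /**
     * Returns the entries in random order; at each step an entry is drawn
     * with probability proportional to its weight among those still left.
     */
    static <T> List<T> shuffle(List<T> entries, List<Double> weights, Random rnd) {
        List<T> pool = new ArrayList<>(entries);
        List<Double> w = new ArrayList<>(weights);
        List<T> out = new ArrayList<>(pool.size());

        while (!pool.isEmpty()) {
            double total = 0;
            for (double x : w) {
                total += x;
            }
            double pick = rnd.nextDouble() * total;
            int idx = 0;
            // walk the cumulative weights until the drawn value is covered
            while (idx < pool.size() - 1 && pick >= w.get(idx)) {
                pick -= w.get(idx);
                idx++;
            }
            out.add(pool.remove(idx));
            w.remove(idx);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("1.2.3.4", "2.3.4.5");
        List<Double> weights = Arrays.asList(1d, 2d);
        Random rnd = new Random();
        int heavyFirst = 0;
        for (int i = 0; i < 30000; i++) {
            if (shuffle(hosts, weights, rnd).get(0).equals("2.3.4.5")) {
                heavyFirst++;
            }
        }
        System.out.println("2.3.4.5 listed first in " + heavyFirst + " of 30000 answers");
    }
}
```

With weights 1 and 2 the heavier entry should come first in roughly two thirds of the answers. Under the assumed scoring rule, a machine with twice the load gets half the weight.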

Now the only thing missing is the integration. This is very simple and it goes like this:


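// needs java.net.InetAddress, lia.util.dns.{DNSServer, WeightedRRSet} and, from
// org.xbill.DNS (dnsjava): ARecord, DClass, Name, NSRecord, Record, SOARecord, Zone

// the server is a singleton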
final DNSServer dns = DNSServer.getInstance();

// host name for which we can answer
final Name name = Name.fromConstantString("alien-proxy.cern.ch.");

// the server that can answer for this name = this machine
final Name myName = Name.fromConstantString("pcalice295.cern.ch.");

// server administrator in DNS notation (account.server)
final Name adminName = Name.fromConstantString("grigoras.cern.ch.");

// add some basic records
final Record[] zoneRecords = new Record[2];

zoneRecords[0] = new SOARecord(name, DClass.IN, 1, myName, adminName, System.currentTimeMillis()/1000, 5, 5, 5, 5);
zoneRecords[1] = new NSRecord(name, DClass.IN, 5, myName);

final Zone zone = new Zone(name, zoneRecords);

// create a couple of weighted records
final WeightedRRSet rr = new WeightedRRSet();

final Record r1 = new ARecord(name, DClass.IN, 5, InetAddress.getByName("1.2.3.4"));
final Record r2 = new ARecord(name, DClass.IN, 5, InetAddress.getByName("2.3.4.5"));

rr.addRR(r1, 1d);
rr.addRR(r2, 2d);

// put the weighted round-robin structure in the zone
zone.addRRset(rr);

// make the DNS server aware of this zone
dns.addZone(name, zone);


That’s all folks, now the server on my own machine will answer IN A requests for alien-proxy.cern.ch with 2.3.4.5 twice as likely to be returned as 1.2.3.4.

One more trick left to disclose. Since we always run from a regular user account we cannot bind to privileged ports (below 1024). With other services (http for example) this is not a problem since we can specify a different port number in the URLs. But DNS has to be on 53… We have two options here:

  • give Java sudo privileges (and kill as well, since otherwise the process will belong to root and we will not be able to kill it from the user account afterwards). This is a pretty messy solution.
  • the alternative is to have a port redirect in place. This I like 🙂

A port redirect is very easy to implement with iptables, like this (as root, obviously):

# /sbin/iptables -t nat -I PREROUTING -p tcp --dport 53 -j REDIRECT --to-ports 1053
# /sbin/iptables -t nat -I PREROUTING -p udp --dport 53 -j REDIRECT --to-ports 1053

Then add the following lines to the ML repository configuration, so that the server listens on the redirected port:

lia.util.dns.DNSServer.ports=1053
lia.util.dns.DNSServer.bind=0.0.0.0
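
In case you wonder how such settings might be consumed, here is a small sketch that reads the two keys with plain java.util.Properties. It is only an illustration: the real DNSServer constructor goes through the ML configuration, which is not shown here, and both the DnsSettings class and the assumption that "ports" can hold a comma-separated list are mine.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical sketch of reading the DNS settings; ML's own configuration API is not used. */
final class DnsSettings {
    final InetAddress bind;
    final List<Integer> ports;

    DnsSettings(InetAddress bind, List<Integer> ports) {
        this.bind = bind;
        this.ports = ports;
    }

    static DnsSettings from(Properties p) {
        // The default values are illustrative assumptions
        String bindValue = p.getProperty("lia.util.dns.DNSServer.bind", "0.0.0.0");
        String portsValue = p.getProperty("lia.util.dns.DNSServer.ports", "53");

        // Assumption: the plural "ports" allows a comma-separated list
        List<Integer> ports = new ArrayList<>();
        for (String s : portsValue.split(",")) {
            String t = s.trim();
            if (!t.isEmpty()) {
                ports.add(Integer.parseInt(t));
            }
        }
        try {
            // a literal IP address is parsed without any DNS lookup
            return new DnsSettings(InetAddress.getByName(bindValue), ports);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("bad bind address: " + bindValue, e);
        }
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("lia.util.dns.DNSServer.ports", "1053");
        p.setProperty("lia.util.dns.DNSServer.bind", "0.0.0.0");
        DnsSettings s = from(p);
        System.out.println(s.bind.getHostAddress() + " " + s.ports); // prints 0.0.0.0 [1053]
    }
}
```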

The end.

Recipe:

  • The implementation for the AliEn Proxy load balancing: ProxyLoadBalancer.java. The static part is initialized at the first usage of the class. To make sure the DNS server starts as soon as possible you can add the following line to the repository’s App.properties file:
    lia.Monitor.JiniClient.Store.load_on_startup = loadbalancer.ProxyLoadBalancer
  • DNSServer.java – modified version of the library sample code.
  • WeightedRRSet.java – weights attached to the entries.
  • DNSServerTest.java – testing code.
  • Add the latest and greatest dnsjava package.
  • Code for 30 minutes and serve it hot 😉
  • Highlight your special code.

Big thanks to Ramiro for the support!