Thursday, August 23, 2012

Building a Diskless Tinycore Linux Cluster


A major part of the network for our new turn-based strategy game, Just Tactics, is the pool of game servers. As the user base grows, this pool will have to scale up first. In designing our network, one important consideration was the ability to add new machines to the pool as easily as possible. To that end, all game servers boot the same shared Linux image over the internal network. With no configuration on the new server, it looks for a DHCP server, which supplies its network configuration and the location of the files needed to boot. It then downloads and boots a minimal operating system and starts the game server.


This document is not a complete step-by-step guide, and assumes some basic knowledge of the Linux command line.



Requirements:

Scalability:

One major goal is to make it as easy as possible to add additional game servers as our user base grows. By booting a blank machine directly into the OS over the network, we can bring new machines online without having a per-machine installation process.

Manageability:

As the number of installed machines grows, time spent administering them becomes more important. All game servers boot from the same Linux install, so a change only needs to be made in one place. Each machine can then be rebooted into the new image as soon as the time is appropriate.

Since we are booting the same linux install on all machines, some things must be configured at boot time. We statically assign the public interface's IP address from the internal DHCP server, and set the hostname on each boot.

Security:

Since we boot a read-only ramdisk install, any filesystem changes made to a running machine are lost when it is rebooted. Because of this, if an attacker gains access to a machine, a reboot restores it to its previous state.


Efficiency:

To keep operating system overhead minimal, we are using Tinycore Linux, an 8 MB Linux installation based on BusyBox. It is very minimal, but is appropriate for single-purpose machines such as these game servers.

Ease of Updates:

To let us quickly update the game server software, it cannot be part of the base operating system install. We start the server software from an NFS share, which allows us to update the software quickly without rebooting the machine.



The Basic Steps:

  • The BIOS must be set to boot from the network ("PXE boot")
  • The BIOS broadcasts for a DHCP server
  • The DHCP server provides IP information and the location of the PXELinux bootloader
  • The BIOS downloads the PXELinux bootloader via TFTP
  • PXELinux downloads its configuration from the TFTP server
  • PXELinux downloads the specified kernel and ramdisk from the TFTP server
  • PXELinux loads the ramdisk into memory and boots the downloaded kernel, which uses the ramdisk as its root filesystem
  • The Tinycore install mounts an NFS share and runs the game server software from the share
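
The last step is not shown in detail in this post, so here is a minimal sketch of what it could look like on the client. The export name, mount point, and binary path are hypothetical placeholders, and mounting read-only with nolock is an assumption that suits a minimal client without a lock daemon.

```shell
#!/bin/sh
# All three values are hypothetical placeholders, not our real paths.
NFS_EXPORT="nfs.example.com:/export/gameserver"
MOUNT_POINT="/mnt/gameserver"
SERVER_BIN="$MOUNT_POINT/bin/gameserver"

# Mount the share and start the game server from it.
# Needs root and an NFS client on the booted image.
start_game_server() {
    mkdir -p "$MOUNT_POINT" || return 1
    # ro: the clients only read the software (assumption)
    # nolock: skip NFS file locking, which needs extra daemons
    mount -t nfs -o ro,nolock "$NFS_EXPORT" "$MOUNT_POINT" || return 1
    "$SERVER_BIN" &
}
```

Calling start_game_server from a boot-time script would cover the final step. Because the binary lives on the share, replacing it on the NFS server and restarting the process is enough to roll out an update.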


Software

Server Software Needed:

The server hosting the files for this netboot setup is running Ubuntu 11.10 server. We needed to install the following packages:

isc-dhcp-server
tftpd-hpa
nfs-kernel-server
syslinux

rsyslog is also necessary, but it is installed by default.
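
As a rough sketch, the installation could be scripted as below. The package names come from the list above. The location of pxelinux.0 inside the syslinux package is an assumption that may differ between releases, so check it before relying on the copy step.

```shell
#!/bin/sh
# Packages named in this post.
PACKAGES="isc-dhcp-server tftpd-hpa nfs-kernel-server syslinux"

# Install the packages and place the bootloader in the TFTP root.
# Needs root. Defined as a function so that sourcing this file has
# no side effects.
install_netboot_server() {
    apt-get update || return 1
    # Unquoted on purpose so each package becomes its own argument.
    apt-get install -y $PACKAGES || return 1
    mkdir -p /tftpboot || return 1
    # Assumed location of the bootloader shipped by the syslinux package.
    cp /usr/lib/syslinux/pxelinux.0 /tftpboot/
}
```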


Client Software Needed:

To minimize operating system overhead, we decided to use Tinycore Linux to boot the game servers, and run the server software from an NFS share. We needed to add a few packages to the 8 MB image; the process is documented here.

To help with large-memory machines, Tinycore has a 64-bit kernel available to be used with the 32-bit binaries.


The 64-bit kernel and ramdisk are vmlinuz64 and core64.gz, respectively. For the current release of Tinycore, they are located here:



Configuration

DHCP Server

These config options needed to be added to the global section of /etc/dhcp/dhcpd.conf to enable PXE booting. The filename statement gives the location of the pxelinux.0 bootloader relative to the TFTP server's root directory.

allow booting;
allow bootp;
filename "/pxelinux.0";

To allow us to physically locate game servers, IPs must be assigned by MAC address. To avoid running two instances of dhcpd, we are statically assigning the secondary external IP addresses via dhcpd.

We need to define a custom DHCP option in the global section to pass the second IP to the client. This statement creates an option called second-ip of type ip-address:

option second-ip code 187 = ip-address;

This is the host stanza that defines the host name and both IP addresses, based on the MAC address of the second interface (eth1):

host server1 {
                hardware ethernet ab:cd:ef:12:34:56;
                fixed-address 10.1.1.5;
                option host-name "server1.example.com";
                option second-ip 172.16.1.5;
}
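
Since every new game server needs a stanza like this one, a small helper can generate it. This is only a sketch: the function name and argument order are made up for illustration, and the domain defaults to the example.com placeholder used above.

```shell
#!/bin/sh
# Print a dhcpd.conf host stanza in the same shape as the one above.
# usage: dhcp_host_stanza NAME MAC INTERNAL_IP EXTERNAL_IP [DOMAIN]
dhcp_host_stanza() {
    name=$1 mac=$2 internal_ip=$3 external_ip=$4 domain=${5:-example.com}
    cat <<EOF
host $name {
        hardware ethernet $mac;
        fixed-address $internal_ip;
        option host-name "$name.$domain";
        option second-ip $external_ip;
}
EOF
}
```

For example, dhcp_host_stanza server2 ab:cd:ef:12:34:57 10.1.1.6 172.16.1.6 prints a stanza for a second machine. The output can be reviewed and appended to /etc/dhcp/dhcpd.conf before restarting dhcpd.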

DHCP Client

To assign the secondary IP and default route, we needed to use the ISC DHCP client's exit-hooks feature.

The second-ip option must be defined in /etc/dhclient.conf, and the option name needs to be added to the 'request' statement:


option second-ip code 187 = ip-address;

request subnet-mask, broadcast-address, time-offset, routers,
        domain-name, domain-name-servers, host-name, second-ip;

/usr/local/etc/dhclient-exit-hooks is also used to run a few machine-specific commands. This is where the external IP is assigned:

#configure second IP address
#netmask and broadcast are site-specific; make sure they describe the same subnet
ifconfig eth0 "${new_second_ip}" netmask 255.255.0.0 broadcast 172.16.1.255 up
#change the default route to the external network
/sbin/ip route del default via 10.1.1.1 dev eth1
/sbin/ip route add 0.0.0.0/0 via 172.16.1.1 dev eth0
#set the hostname based on the one received from DHCP
/usr/bin/sethostname "${new_host_name}"
#update the Nagios NRPE client to only listen on the internal interface, and start the daemon
sed -i "s/server_address=.*/server_address=${new_ip_address}/g" /usr/local/etc/nagios/nrpe.cfg
/usr/local/sbin/nrpe -c /usr/local/etc/nagios/nrpe.cfg -d &
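
The hook hard-codes both the netmask and the broadcast address, and the two have to agree. One way to avoid a mismatch is to derive the broadcast address from the netmask. The helper below is a sketch of that idea in plain POSIX shell, which should also run under BusyBox ash. The function name is made up for illustration.

```shell
#!/bin/sh
# Derive the IPv4 broadcast address from an address and a dotted netmask.
# usage: broadcast_for ADDRESS NETMASK
# The body runs in a subshell (note the parentheses), so the IFS change
# does not leak into the calling hook script.
broadcast_for() (
    IFS=.
    # Unquoted on purpose: splits both arguments into eight octets.
    set -- $1 $2
    # Each broadcast octet is the address octet with all host bits set.
    printf '%d.%d.%d.%d\n' \
        $(( $1 | (255 - $5) )) $(( $2 | (255 - $6) )) \
        $(( $3 | (255 - $7) )) $(( $4 | (255 - $8) ))
)
```

For 172.16.1.5 with a 255.255.0.0 netmask this prints 172.16.255.255, while a 255.255.255.0 netmask gives 172.16.1.255. In the hook, the result could be passed to ifconfig in place of the literal broadcast value.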


TFTP Server

The TFTP server must be installed and running. In our case, we changed it to serve files out of /tftpboot, but the default config should work fine. Any files served must be world-readable.


PXELinux Config

The PXELinux bootloader's default config is located at /tftpboot/pxelinux.cfg/default. Similar to a GRUB config file, it specifies paths for the kernel, the initial ramdisk, and any arguments to pass to the kernel.

All paths are relative to the TFTP server's root directory. For example, /file.txt will have a full path of /tftpboot/file.txt.


#this specifies which label is booted by default
DEFAULT linux
LABEL linux
#this specifies the relative path of the kernel
KERNEL /vmlinuz64
#this specifies the relative path of the initial ramdisk
INITRD /core.hts.gz
#kernel arguments are specified here
APPEND base nodhcp cron

The PXELinux config can get significantly more advanced; this is a very straightforward example.
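
A missing or unreadable file in the TFTP root is an easy mistake to make, so a quick check before powering on a new machine can help. The following is a sketch only: the function name is made up, and the file list in the usage note simply repeats the names used in this post.

```shell
#!/bin/sh
# Check that each file exists under the TFTP root and is world-readable.
# usage: check_tftp_files ROOT FILE...
# Prints one line per problem and returns non-zero if any were found.
check_tftp_files() {
    root=$1
    shift
    rc=0
    for f in "$@"; do
        path="$root/$f"
        if [ ! -f "$path" ]; then
            echo "missing: $path"
            rc=1
        # find prints the path only if the "other read" bit is set
        elif [ -z "$(find "$path" -prune -perm -004 -print)" ]; then
            echo "not world-readable: $path"
            rc=1
        fi
    done
    return $rc
}
```

For the setup described here, the call would be check_tftp_files /tftpboot pxelinux.0 pxelinux.cfg/default vmlinuz64 core.hts.gz. It only inspects the files themselves, so directories along the path still need to be traversable by the TFTP daemon.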


Log Management

rsyslog is used to collect logs from the PXE-booted servers, since they have no local storage. It is installed by default on Ubuntu, but needs to be reconfigured to listen for messages from other machines.

In /etc/rsyslog.conf, these lines must be uncommented to enable listening on TCP and UDP port 514:


# provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514

# provides TCP syslog reception
$ModLoad imtcp
$InputTCPServerRun 514


We added the following config to /etc/rsyslog.conf to sort logs into different files, based on the hostname of the sending machine:

$template DynaFile,"/var/log/remote/%HOSTNAME%.log"
*.* -?DynaFile


The game server software running on each machine (server1, server2, and so on) uses the logger command to send errors to the syslog host using the local0 facility. rsyslog then separates these messages into their own files, to make sure they get noticed.


if $syslogfacility-text == 'local0' and $hostname == 'server1' and $programname == 'master' then /var/log/remote/server1-err.log

if $syslogfacility-text == 'local0' and $hostname == 'server2' and $programname == 'master' then /var/log/remote/server2-err.log

if $syslogfacility-text == 'local0' and $hostname == 'server3' and $programname == 'master' then /var/log/remote/server3-err.log
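
On the game server side, a message that matches these filters could be sent with a small wrapper around logger. This is a sketch under a few assumptions: the tag master is taken from the $programname test above, the err severity is a guess, and the local syslog daemon is assumed to already forward messages to the log host.

```shell
#!/bin/sh
# LOGGER can be overridden (for example LOGGER=echo) for a dry run.
LOGGER=${LOGGER:-logger}

# Send one error line, tagged "master", to the local0 facility.
# usage: log_error MESSAGE...
log_error() {
    "$LOGGER" -t master -p local0.err "$*"
}
```

A call such as log_error "could not bind port" from server1 would then end up in /var/log/remote/server1-err.log on the log host.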
