Introduction:

This HOW-TO describes, with very few explanations, a minimal Heartbeat configuration that automatically moves a virtual IP (VIP) from one server to another when the default server fails. The VIP always ends up on whichever server is available.

Install heartbeat:
apt-get install heartbeat
– Choose the Virtual IP to be shared by both servers: VIP=192.168.10.10
– Assign a resolvable name to each server: e.g. server1 and server2
Note: The above server names should resolve to their respective fixed IPs and not to the Heartbeat Virtual IPs.

IMPORTANT: All configuration files and the fallback script below have to be identical on both servers.
Create and configure the heartbeat main config file: /etc/ha.d/ha.cf
Note: Below, the words server1 and server2 should be the host names of both servers, resolvable to their fixed IPs.
logfile /var/log/ha-log
debugfile /var/log/ha-debug
autojoin none
ucast eth0 server1
ucast eth0 server2
udpport 694
warntime 3
deadtime 5
initdead 10
keepalive 1
node server1 server2
auto_failback on

Note 1: For more detailed information on the meaning of the above configuration directives, see the link:
https://wiki.archlinux.org/index.php/Simple_IP_Failover_with_Heartbeat
Note 2: If your servers have two NICs and the heartbeats need to reach each other through the other NIC (e.g. eth1), then the ucast lines above would change to:
ucast eth1 server1
ucast eth1 server2

Create and configure the Heartbeat resources config file: /etc/ha.d/haresources
Here we assign server1 as the default server.
server1 IPaddr::192.168.10.10 failback.sh
NOTE: If you are running heartbeat on Debian Wheezy, the following configuration is preferred:
server1 IPaddr::192.168.10.10/24 failback.sh
Create and configure the heartbeat authentication file: /etc/ha.d/authkeys
auth 1
1 sha1 PutYourSuperSecretKeyHere

Set the permissions of /etc/ha.d/authkeys
chmod 600 /etc/ha.d/authkeys
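
If you do not want to invent the key by hand, the authkeys file can be generated. The following is only a sketch: make_authkeys is a hypothetical helper name, not something shipped with Heartbeat, and it assumes head, od and tr are available.

```shell
#!/bin/sh
# Sketch: write an authkeys file with a random sha1 key.
# make_authkeys is a hypothetical helper, not part of Heartbeat.
make_authkeys() {
    target="${1:-/etc/ha.d/authkeys}"
    # 20 random bytes rendered as 40 hex characters
    key="$(head -c 20 /dev/urandom | od -An -tx1 | tr -d ' \n')"
    # Create the file with a restrictive umask, in a subshell so the
    # caller's umask is left untouched
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$target" )
    chmod 600 "$target"
}
```

Run make_authkeys once on one server and copy the resulting file to the other one, since both servers need the same key.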

Create the failback.sh script: /etc/ha.d/resource.d/failback.sh
on both servers:
#!/bin/bash
# Write your code here. Heartbeat calls this script with start, stop or status (see the sequence of events further down).
# Make sure the script exits with the return code of the last critical command run here. Otherwise simply exit with 0
exit 0
# eof

Make the script runnable:
chmod 755 /etc/ha.d/resource.d/failback.sh
Start heartbeat on both servers:
/etc/init.d/heartbeat start
Watch the logs on both servers:
tail -F /var/log/ha-log
Important note: the heartbeat instances talk to each other over UDP, so make sure you leave the ports they use open in your firewall (if you are using one).
e.g. issue the following command to find out the ports heartbeat is using:
netstat -lupn | grep heartbeat
e.g. results:
udp 0 0 0.0.0.0:35600 0.0.0.0:* 29208/heartbeat: wr
udp 0 0 0.0.0.0:694 0.0.0.0:* 29208/heartbeat: wr
udp 0 0 0.0.0.0:694 0.0.0.0:* 29206/heartbeat: wr
udp 0 0 0.0.0.0:34773 0.0.0.0:* 29206/heartbeat: wr
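
As an illustration of the firewall note, here is a sketch that prints an iptables rule for one peer. hb_fw_rule is a hypothetical helper and 192.168.10.2 below is a made-up fixed IP for the other server. The sketch assumes the high-numbered ports in the netstat output are only used for sending, so that only the udpport from ha.cf needs to accept inbound traffic; verify this on your own setup.

```shell
#!/bin/sh
# Sketch: print an iptables rule that accepts Heartbeat traffic from one peer.
# hb_fw_rule is a hypothetical helper; review its output before applying it.
hb_fw_rule() {
    peer="$1"
    port="${2:-694}"   # default matches "udpport 694" in ha.cf
    echo "iptables -A INPUT -p udp -s $peer --dport $port -j ACCEPT"
}
```

On server1 you would call it with the fixed IP of server2, e.g. hb_fw_rule 192.168.10.2, and run the printed command as root once you are happy with it.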

That's it!
server1 should take over the VIP by default. If it fails, server2 takes the VIP over until server1 is back alive, at which point server1 reclaims it.
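
To see which server currently holds the VIP, you can look for the address in the output of ip. A small sketch, where has_vip is a hypothetical helper that reads the output of "ip -o -4 addr show" on standard input:

```shell
#!/bin/sh
# Sketch: succeed if the given IPv4 address appears in
# `ip -o -4 addr show` output read from stdin.
# has_vip is a hypothetical helper, not part of Heartbeat.
has_vip() {
    awk -v vip="$1" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "inet") {
                    # The field after "inet" looks like 192.168.10.10/24
                    split($(i + 1), a, "/")
                    if (a[1] == vip) found = 1
                }
            }
        }
        END { exit !found }
    '
}
```

Usage on either server: ip -o -4 addr show | has_vip 192.168.10.10 && echo "this server holds the VIP"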

Extra useful info:

Changing the hostname

If you ever change the hostname of any of the servers under the control of heartbeat, you need to do the following:
– Bring down heartbeat on all of the servers
– Delete the following file: /var/lib/heartbeat/hb_uuid
– Restart heartbeat on all of the concerned servers
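
The steps above could be scripted roughly as follows. This is a sketch only: hb_down and hb_reset_uuid are hypothetical helper names, and HB_INIT and HB_UUID are illustrative variables that exist just so the steps can be rehearsed against other paths. The steps above do not say whether the file is deleted on every server or only on the renamed one, so the sketch leaves that choice to you.

```shell
#!/bin/sh
# Sketch of the per-server commands for a hostname change.
# HB_INIT and HB_UUID are illustrative variables, not Heartbeat settings.
HB_INIT="${HB_INIT:-/etc/init.d/heartbeat}"
HB_UUID="${HB_UUID:-/var/lib/heartbeat/hb_uuid}"

# Step 1, on every server: stop heartbeat.
hb_down() {
    "$HB_INIT" stop
}

# Steps 2 and 3, once heartbeat is down everywhere:
# delete the stored UUID file, then start heartbeat again.
hb_reset_uuid() {
    rm -f "$HB_UUID" && "$HB_INIT" start
}
```

Call hb_down on all servers first, and only then call hb_reset_uuid where the UUID file has to go. On the remaining servers a plain /etc/init.d/heartbeat start is enough.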

The Fallback script:
It is important to know that the above script /etc/ha.d/resource.d/failback.sh is run in a particular sequence whenever the connection to the heartbeat on the default server is lost or regained. Depending on what the fallback server is used for, this script can differ between the default and the fallback servers.

Sequence of events:

Transfer of IP from Default server to Fallback server
On Default Server:
==================
- Heartbeat stops responding
- Script 'failback.sh stop' is run
- VIP is no longer configured on eth0:0
On Fallback Server:
===================
- No response from heartbeat on Default server
- 'failback.sh status' is run
- eth0:0 is configured with the VIP
- 'failback.sh start' is run

Transfer of IP back from Fallback server to Default server
On Fallback Server:
===================
- Response from heartbeat on Default server is back
- 'failback.sh stop' is run
- The VIP is un-configured from eth0:0
On Default Server:
==================
- Script 'failback.sh status' is run
- VIP is configured on eth0:0
- 'failback.sh start' is run
- Everything should be back to normal operation
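
Since the script is called with start, stop or status as listed in the sequences above, a failback.sh that reacts to each call could look like the sketch below. It only writes a line to a log file; failback_action, FAILBACK_LOG and the default log path are made up for this example and are not Heartbeat settings. The status branch prints nothing and succeeds, just like the minimal script earlier.

```shell
#!/bin/bash
# Hypothetical failback.sh sketch: logs each action Heartbeat requests.
# FAILBACK_LOG is an illustrative variable, not a Heartbeat setting.

failback_action() {
    local log="${FAILBACK_LOG:-/var/log/failback.log}"
    case "$1" in
        start)
            # Called after the VIP has been configured on this server
            echo "$(date '+%F %T') start: VIP acquired" >> "$log"
            ;;
        stop)
            # Called before the VIP is removed from this server
            echo "$(date '+%F %T') stop: VIP released" >> "$log"
            ;;
        status)
            # Print nothing and succeed, as the minimal script does
            :
            ;;
        *)
            echo "usage: failback.sh {start|stop|status}" >&2
            return 1
            ;;
    esac
}

# Act only when an action was passed on the command line, as Heartbeat does.
# The script's exit code is then that of the last command in the branch.
if [ "$#" -gt 0 ]; then
    failback_action "$1"
fi
```

Replace the echo lines with whatever has to happen on your servers when the VIP arrives or leaves. If the last command in a branch fails, the script returns its non-zero code.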

Extras:

During Normal operation:
- If there is no response from the heartbeat on the fallback server: nothing happens on either server
- If the heartbeat on the fallback server responds again: still nothing happens on either server