There are many tools available to synchronize the content of directories between hosts, e.g. rsync, lsyncd, duplicity, etc. The difficult part, though, is synchronizing them bidirectionally without creating conflicts. One option is Unison, which works well, but if a file gets modified differently on both servers between two syncs you end up with a conflict that Unison cannot resolve on its own. lsyncd could also be a solution, but it only synchronizes in one direction. Another tool that is asynchronous and syncs in both directions is csync2, but just like Unison it only synchronizes once every time it is run.

Here I will give instructions for a very powerful synchronization system, GlusterFS, which can scale enormously and can therefore also be complex to configure. To keep things simple I only show the installation and configuration of a scenario with 2 servers that need a single directory bidirectionally synchronized between them.

Installing GlusterFS

Introduction:
These instructions cover a single directory that is shared and automatically synchronized between 2 hosts. If one of the hosts goes down, GlusterFS will automatically resynchronize the hosts with each other when the temporarily down host comes back up.

Principle of this installation:
– Each host is a GlusterFS server (exporting its source directory) and at the same time a client of itself.
– The server directories are left alone (never written to directly); only the synchronized client directories are used (read/write).

Info:
– GlusterFS server directories: /home/export
– Client mounted directories: /mnt/data
– The servers are called server1 and server2
IMPORTANT:
The GlusterFS server directories must never be written to directly; only the client mounted directories (client side) should be used for reading and writing, and all other client directories will then be synchronized automatically.

Steps:

Create the needed directories on both servers:
mkdir /home/export
mkdir /mnt/data

Install the Debian Squeeze packages:
apt-get install glusterfs-server
Load the fuse kernel module permanently:
vim /etc/modules
Content:
fuse
Load the module manually (the setting above takes care of loading the module at boot time, but loading it manually now means there is no need to reboot):
modprobe fuse
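To verify that the module is now loaded, a quick optional check:
lsmod | grep fuse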

SERVERS CONFIGURATION

(identical config file on both servers)
Save the Debian original config file:
mv /etc/glusterfs/glusterd.vol /etc/glusterfs/glusterd.vol.orig
Create a new empty config file and edit it:
> /etc/glusterfs/glusterd.vol
vim /etc/glusterfs/glusterd.vol

Content:
volume posix
  type storage/posix                           # POSIX FS translator
  option directory /home/export                # Export this directory
end-volume
#
volume locks
  type features/locks
  option mandatory on                          # To force locks on all the files. Recommended for MySQL databases
  subvolumes posix
end-volume
#
volume brick
  type performance/io-threads
  option thread-count 8
  subvolumes locks
end-volume
#
volume server
  type protocol/server
  option transport-type tcp
  subvolumes brick
  option auth.addr.brick.allow 192.168.100.*
end-volume

Note: Make sure the directory (/home/export) exists, otherwise the glusterfsd daemon will not start. We also assume here that both servers are on the same class C subnet (192.168.100.0/24).
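If your servers live on a different network, adjust the auth.addr.brick.allow option accordingly. For example, to allow only a single client IP instead of a whole subnet (the address below is purely illustrative):
option auth.addr.brick.allow 192.168.100.10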

CLIENTS CONFIGURATION

(identical on both servers)
– Create and edit the glusterfs client config file
> /etc/glusterfs/glusterfs.vol
vim /etc/glusterfs/glusterfs.vol

Content:
volume remoteFS1
  type protocol/client
  option transport-type tcp
  option remote-host server1.srv
  option remote-subvolume brick
end-volume
#
volume remoteFS2
  type protocol/client
  option transport-type tcp
  option remote-host server2.srv
  option remote-subvolume brick
end-volume
#
volume replicate
  type cluster/replicate
  subvolumes remoteFS1 remoteFS2
end-volume
#
volume writebehind
  type performance/write-behind
  option window-size 1MB
  subvolumes replicate
end-volume
#
volume cache
  type performance/io-cache
  option cache-size 512MB
  subvolumes writebehind
end-volume
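Note that the client configuration refers to the servers as server1.srv and server2.srv, so these names must resolve on both hosts. If they are not in DNS, one way is to add them to /etc/hosts on both servers (the IP addresses below are only examples taken from the 192.168.100.0/24 subnet assumed above):
192.168.100.11 server1.srv
192.168.100.12 server2.srv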

Start the daemon on both servers:
/etc/init.d/glusterfs-server start
Check on both servers that the daemon is running:
ps aux | grep gluster | grep -v grep
Example of result:
root 7488 0.0 0.3 60512 7920 ? Ssl 17:28 0:00 /usr/sbin/glusterd -p /var/run/glusterd.pid

Mount the Gluster volumes manually on both hosts:
mount -t glusterfs /etc/glusterfs/glusterfs.vol /mnt/data
To mount the GlusterFS volume automatically at boot time, add the following line to /etc/fstab:
/etc/glusterfs/glusterfs.vol /mnt/data glusterfs defaults,_netdev 0 0
Any change made in either server1.srv:/mnt/data or server2.srv:/mnt/data will be synchronized bidirectionally to the other host.
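A quick way to confirm this is to create a file on one client mount and check that it appears on the other (the file name is just an example):
# on server1
touch /mnt/data/sync-test.txt
# on server2
ls -l /mnt/data/sync-test.txt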

Configuring GlusterFS 3.2.6

The GlusterFS package shipped with regular Debian Squeeze has a bug which prevents the use of geo-replication.
To circumvent this bug you need to install a newer version of GlusterFS. In this article I use version 3.2.6, taken from the Debian Squeeze backports repositories.
Debian Wheezy ships version 3.2.7-3, so this is not an issue there.

At some point the configuration of GlusterFS changed drastically compared to the version above,
so here are the instructions for installing and configuring this newer version.
This version introduces the extra concept of creating the configuration with commands run on the console.
These console commands generate volume files in the background, which are then used as the configuration files.
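For reference, on these versions the generated volume files typically end up under /var/lib/glusterd/vols/<volume-name>/ (the exact path may vary slightly between versions); for the export1 volume created below you could inspect them with:
ls /var/lib/glusterd/vols/export1/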

IMPORTANT: The following instructions and commands need to be executed identically on both servers.

Note: In case you are upgrading from the regular (3.0.5-1) version of GlusterFS, it is recommended to run the following command first in order to make sure the older version does not interfere with the new one.
apt-get purge glusterfs-client glusterfs-server

Depending on whether you have Debian or Ubuntu, use one of the following installations:

Installing GlusterFS version 3.2.6 and up (Debian Squeeze via backports)

The following package commands are only needed when upgrading from the stock Debian Squeeze GlusterFS package.
Debian Wheezy already ships 3.2.7 and Ubuntu 14.04 Server is already at version 3.4.2.
Edit /etc/apt/sources.list and add the following line:
deb http://www.backports.org/debian/ squeeze-backports main contrib non-free
Install the Backport Debian Packages:
apt-get update
apt-get dist-upgrade
reboot
apt-get install glusterfs-server glusterfs-client

Installing GlusterFS 3.6 in Ubuntu 14.04 Server LTS

Add the following 2 lines to the file /etc/apt/sources.list:
deb http://ppa.launchpad.net/gluster/glusterfs-3.6/ubuntu trusty main
deb-src http://ppa.launchpad.net/gluster/glusterfs-3.6/ubuntu trusty main

Run the commands:
gpg --keyserver pgpkeys.mit.edu --recv-key 13E01B7B3FE869A9
gpg -a --export 13E01B7B3FE869A9 | apt-key add -
apt-get update
apt-get install glusterfs-client glusterfs-server

INFO: Dependencies: attr glusterfs-client glusterfs-common glusterfs-server libdevmapper-event1.02.1 libibverbs1 liblvm2app2.2 libpython2.7 librdmacm1
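Whichever installation path you used, you can confirm the installed version afterwards (optional check):
glusterfs --version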

COMMON TO ALL INSTALLATIONS
Load the fuse kernel module permanently:
vim /etc/modules
Content:
fuse
Load the module manually (the setting above takes care of loading the module at boot time, but loading it manually now means there is no need to reboot):
modprobe fuse

Configuring the GlusterFS servers

Prepare the directories:
Server Directory: /home/export
Client Directory: /mnt/data
mkdir /home/export
mkdir /mnt/data

Server Configuration:
Make sure the GlusterFS service is running on both servers before you run the following commands.

Commands to run to check that the 2 servers communicate well and to initiate a Gluster connection (each server probes the other; see the example after the commands):
gluster peer probe server1.srv
gluster peer status
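With the hostnames used here that means the following (probing from one side is normally sufficient; running the reverse probe as well does no harm):
# on server1
gluster peer probe server2.srv
gluster peer status
# on server2
gluster peer probe server1.srv
gluster peer status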

The following 2 commands need to be run on only one of the GlusterFS servers.
gluster volume create export1 replica 2 transport tcp server1.srv:/home/export server2.srv:/home/export
gluster volume start export1
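You can then verify that the volume was created and started correctly (optional check, using the volume name export1 chosen above):
gluster volume info export1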

Client configuration:
The following commands need to be run on both GlusterFS servers.
mount -t glusterfs localhost:/export1 /mnt/data
To mount at boot time, add the following line to /etc/fstab:
localhost:/export1 /mnt/data glusterfs defaults,_netdev 0 0
Note: The exported volume name export1 used above can be replaced with any other name.
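The volume can also be mounted from any other machine that has the glusterfs-client package installed, by pointing it at one of the servers; a hypothetical third client, not part of the 2-server setup above, would mount it like this:
mount -t glusterfs server1.srv:/export1 /mnt/data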

Displaying volume status info

gluster volume info

Extra Note for a special problem when cloning virtual machines:
Based on: http://blog.night-shade.org.uk/2013/01/glusterfs-and-cloned-systems/
I cloned the nodes from a single master but got an error about
overlapping export directories from the same peer
when creating a new volume on GlusterFS.
It turns out that if you have cloned the nodes, you need to make sure the UUID in /etc/glusterd/glusterd.info is updated so that it is actually unique again.

The quick fix is to generate a new UUID on one of the machines so that the UUIDs are no longer identical:
service glusterfs-server stop
echo "UUID=$(perl -e 'use UUID; UUID::generate($uuid); UUID::unparse($uuid, $string); print "$string";')" > /etc/glusterd/glusterd.info
service glusterfs-server start
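If the Perl UUID module is not installed, the same result can be obtained with uuidgen from util-linux (an alternative sketch, assuming uuidgen is available):
service glusterfs-server stop
echo "UUID=$(uuidgen)" > /etc/glusterd/glusterd.info   # write a fresh random UUID
service glusterfs-server start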

NOTE 1

IMPORTANT Note for using GlusterFS for MySQL databases:
MySQL does not work well with InnoDB tables on GlusterFS: MySQL complains that the InnoDB databases are locked by another instance of MySQL. So if you want to use GlusterFS to synchronize your databases instead of using the MySQL replication process, then do the following:
– Turn OFF the use of InnoDB tables
– Use external locking of tables
Extract of a MySQL configuration (/etc/mysql/my.cnf) addressing this issue:
external_locking
skip-innodb
default-storage-engine=myisam

NOTE 2

GlusterFS ports usage:
(ref: http://crashmag.net/setting-up-a-2-node-glusterfs-filesystem)
24007 – GlusterFS Daemon
24008 – Management
24009 – Each brick of every volume on your host requires its own port. For every new brick, one new port will be used, starting at 24009. (For GlusterFS versions earlier than 3.4)
49152 – Each brick of every volume on your host requires its own port. For every new brick, one new port will be used, starting at 49152. (GlusterFS 3.4 and later)
38465:38467 – This is required if you use the GlusterFS NFS service.

So in summary, if you have a GlusterFS version earlier than 3.4, you need to open ports 24007, 24008 and 24009 to your other GlusterFS server.
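As an illustration, on a plain iptables firewall those ports could be opened for the other server's subnet with a rule like the one below (adapt the source network and the firewall tool to your own environment):
iptables -A INPUT -p tcp -s 192.168.100.0/24 --dport 24007:24009 -j ACCEPT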

Some useful GLUSTERFS admin commands

gluster peer status
gluster volume info all
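On GlusterFS 3.3 and later (not available in the older 3.0/3.2 releases described first) the following are also useful; export1 is the volume created above:
gluster volume status
gluster volume heal export1 info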

Deleting a GlusterFS volume

# Clean-up after deleting a volume, so that the same directory can be reused for a new volume later.
# We assume here that the directory of the deleted volume is /home/shareM

voldir=/home/shareM/
service glusterfs-server stop
cd $voldir
for i in `attr -lq .`; do setfattr -x trusted.$i .; done   # remove the trusted.* extended attributes set by GlusterFS
attr -lq $voldir   # for testing: the output should now be empty
rm -rf $voldir/.glusterfs
cd /var/lib/glusterd/
rm -rf *
service glusterfs-server start
gluster peer probe {other peer host}
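After this clean-up the directory can be reused; for example, a new replicated volume on the same directory could be created again just as shown earlier (the volume name shareM is only an example, assuming both servers use /home/shareM and the peer probe above succeeded):
gluster volume create shareM replica 2 transport tcp server1.srv:/home/shareM server2.srv:/home/shareM
gluster volume start shareM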

Happy syncing.