Note: In case you have a regular (3.0.5-1) version of GlusterFS to upgrade, it is recommended to run the following command to make sure the older version gets cleaned up and does not interfere with the new one.
apt-get purge glusterfs-client glusterfs-server
Depending on whether you have Debian or Ubuntu, use one of the following installations:
INFO: Dependencies: attr glusterfs-client glusterfs-common glusterfs-server libdevmapper-event1.02.1 libibverbs1 liblvm2app2.2 libpython2.7 librdmacm1
Installing the newest GlusterFS server and client packages:
Run the commands:
echo "deb http://ppa.launchpad.net/gluster/glusterfs-3.6/ubuntu trusty main" >>/etc/apt/sources.list
echo "deb-src http://ppa.launchpad.net/gluster/glusterfs-3.6/ubuntu trusty main" >>/etc/apt/sources.list
gpg --keyserver pgpkeys.mit.edu --recv-key 13E01B7B3FE869A9
gpg -a --export 13E01B7B3FE869A9 | apt-key add -
apt-get update
apt-get install glusterfs-client glusterfs-server
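To confirm which version actually got installed (a quick optional check), you can run:
apt-cache policy glusterfs-server
glusterfs --version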
Example of syncing directories between 2 servers:
Introduction: The principle of GlusterFS is quite special. In the case of syncing directories between 2 different Linux Boxes, we install the server and client parts of GlusterFS on each Linux Box. The server part on each Linux Box gets configured to host a volume, which is then mounted locally on a ‘mountpoint’. This ‘mountpoint’ is the working directory for applications and users. To prevent corruption, the server volume directory should NEVER get modified by users or applications, only by the GlusterFS daemon itself. As soon as the volumes are mounted on their mountpoints on each Linux Box, the content of the mountpoints is bidirectionally mirrored between the Linux Boxes.
How to start the sync? I make sure that each server is working well and that the directories to be synchronized contain exactly the same files/directories. Then I rename each of those directories with an ‘M’ at the end (for GlusterFS Master) and create a new empty directory with the original directory name (client directory). Later, the GlusterFS master directory (with the ‘M’ at the end) will form a volume which gets mounted on the client directory (original directory name).
For example, in the tutorial below each Linux Box will get 2 directories:
/dataM/ (Server Volume: workspace of ONLY the GlusterFS daemon)
/data/ (Client directory: workspace of users and applications)
Final result:
/dataM/ ==> (assigned as 'export1' volume) ==mounted on ==>> /data/
CONFIGURATION
To be executed on both Linux Boxes
Load the fuse kernel module permanently:
echo "fuse" >> /etc/modules
Load the module manually.
The above setting ensures the module is loaded at boot time, but for now we load it manually so there is no need to reboot.
modprobe fuse
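To verify that the fuse module is now loaded (a quick optional check):
lsmod | grep fuse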
Configuring the GlusterFS servers
Here we assume that we have 2 Linux boxes: server1.srv and server2.srv
Preparing the directories in each Linux Box:
Server Directory: /dataM/
Client Directory: /data/
mkdir /dataM/
mkdir /data/
Server Configuration:
Make sure both GlusterFS services are running before you start the following commands.
The following commands should be run on each server to check that the 2 servers communicate well and to initiate a Gluster connection.
In server1.srv run:
gluster peer probe server2.srv
gluster peer status
In server2.srv run:
gluster peer probe server1.srv
gluster peer status
The following 2 commands need to be run on only one Linux Box.
gluster volume create export1 replica 2 transport tcp server1.srv:/dataM server2.srv:/dataM
gluster volume start export1
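To confirm that the replicated volume has been created and started (an optional check; the exact output depends on the GlusterFS version), run on either box:
gluster volume info export1
gluster volume status export1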
Client configuration:
The following commands need to be run on both GlusterFS Linux Boxes.
mount -t glusterfs -o defaults,_netdev localhost:/export1 /data/
Add the following line in /etc/fstab:
localhost:/export1 /data glusterfs defaults,_netdev 0 0
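To verify that the client mount is active (an optional check):
mount | grep 'localhost:/export1'
df -h /data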
Note 1: In some distributions (like Ubuntu 14.04), the network is very often not ready at boot time when the glusterfs daemon is loaded. This prevents the above mount line in /etc/fstab from taking effect. To remedy that, I create the following @reboot root cronjob which waits a few seconds after boot, restarts the GlusterFS daemon and mounts the client mountpoint:
root cronjob:
@reboot /bin/sleep 20 ; /root/bin/Mount_GlusterFS
Content of /root/bin/Mount_GlusterFS script:
#!/bin/bash
if ! (mount | grep -q 'localhost:/export1'); then
mount -t glusterfs -o defaults,_netdev localhost:/export1 /data
fi
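One way to put this in place (assuming the script is saved as /root/bin/Mount_GlusterFS, as referenced in the cronjob above) is to make it executable and append the @reboot entry to the root crontab; you can of course also add the line manually with crontab -e:
chmod 755 /root/bin/Mount_GlusterFS
(crontab -l 2>/dev/null; echo '@reboot /bin/sleep 20 ; /root/bin/Mount_GlusterFS') | crontab -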
Note: The above exported device name export1 can be renamed to something else.
Displaying volumes status info:
gluster volume info
Practical example
Here is a practical example of a setup synchronizing 3 different directories in a newly installed redundant GitLab system (load balancer and 2 GitLab servers). I gave each GitLab server a short name, gitlab1 and gitlab2, which need to be declared (see below) in /etc/hosts.
Assumptions:
– GlusterFS master .ssh directory: /home/git/.sshM
– GitLab client .ssh directory: /home/git/.ssh
– GlusterFS master Git repositories: /home/git/repositoriesM
– GitLab client Git repositories: /home/git/repositories
– GlusterFS master uploads directory: /home/git/gitlab/public/uploadsM
– GitLab client uploads directory: /home/git/gitlab/public/uploads
The following commands were run to configure Gitlab:
#on Gitlab1 and Gitlab2
===================
# Added in /etc/hosts
#-----FOR GLUSTERFS-------
192.168.100.11 gitlab1
192.168.100.12 gitlab2
#-------------------------
#
#on Gitlab1
=========
gluster peer probe gitlab2
gluster peer status
#
#on Gitlab2
=========
gluster peer probe gitlab1
gluster peer status
#
#on Gitlab1 or Gitlab2(but only on one of them)
======================================
gluster volume create ssh replica 2 transport tcp gitlab1:/home/git/.sshM gitlab2:/home/git/.sshM
gluster volume create repos replica 2 transport tcp gitlab1:/home/git/repositoriesM gitlab2:/home/git/repositoriesM
gluster volume create uploads replica 2 transport tcp gitlab1:/home/git/gitlab/public/uploadsM gitlab2:/home/git/gitlab/public/uploadsM
#
gluster volume start ssh
gluster volume start repos
gluster volume start uploads
#
#on Gitlab1
==========
mount -t glusterfs localhost:/ssh /home/git/.ssh
mount -t glusterfs localhost:/repos /home/git/repositories
mount -t glusterfs localhost:/uploads /home/git/gitlab/public/uploads
#
#on Gitlab2
=========
mount -t glusterfs localhost:/ssh /home/git/.ssh
mount -t glusterfs localhost:/repos /home/git/repositories
mount -t glusterfs localhost:/uploads /home/git/gitlab/public/uploads
#
#Display the volumes information:
gluster volume info
#
# Root crontab(on both gitlab1 and gitlab2)
==================================
@reboot sleep 30; service glusterfs-server restart ; mount -t glusterfs localhost:/ssh /home/git/.ssh ;mount -t glusterfs localhost:/repos /home/git/repositories; mount -t glusterfs localhost:/uploads /home/git/gitlab/public/uploads
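As an alternative to the long chained one-liner above, the same idea could be wrapped in a small script in the spirit of the earlier Mount_GlusterFS script, mounting each volume only if it is not already mounted. This is just a sketch; the script itself is hypothetical and not part of the original setup:
#!/bin/bash
# Hypothetical helper: mount the three GitLab GlusterFS volumes if not already mounted
for vol in ssh:/home/git/.ssh repos:/home/git/repositories uploads:/home/git/gitlab/public/uploads; do
name=${vol%%:*}
dir=${vol#*:}
if ! (mount | grep -q "localhost:/$name "); then
mount -t glusterfs "localhost:/$name" "$dir"
fi
done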
Extra Note for a special problem when cloning virtual machines:
Based on: http://blog.night-shade.org.uk/2013/01/glusterfs-and-cloned-systems/
I’ve cloned the nodes from a single master but got an error about
overlapping export directories from the same peer
when creating a new volume on GlusterFS.
It turns out that if you have cloned the nodes, you need to make sure the UUID in /etc/glusterd/glusterd.info (/var/lib/glusterd/glusterd.info on newer versions) is actually unique again.
A quick fix is to generate a new UUID on one of the machines so that they are no longer identical:
service glusterfs-server stop
echo "UUID=$(perl -e 'use UUID; UUID::generate($uuid); UUID::unparse($uuid, $string); print "$string";')" > /var/lib/glusterd/glusterd.info
service glusterfs-server start
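If the Perl UUID module is not installed, the same result can be achieved with uuidgen (a sketch, assuming the file is /var/lib/glusterd/glusterd.info as in the commands above; this variant only rewrites the UUID= line):
service glusterfs-server stop
sed -i "s/^UUID=.*/UUID=$(uuidgen)/" /var/lib/glusterd/glusterd.info
service glusterfs-server start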
NOTE 1
MySQL doesn’t work well with InnoDB tables on GlusterFS. MySQL complains that the InnoDB databases are locked by another instance of MySQL. So if you want to use GlusterFS to synchronize your databases instead of using the MySQL replication process, do the following:
– Turn OFF the use of InnoDB tables
– Use external locking of tables as follows:
Extract of the MySQL configuration (/etc/mysql/my.cnf) addressing this issue:
external_locking
skip-innodb
default-storage-engine=myisam
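Before switching the default storage engine, it can be useful to see which databases still contain InnoDB tables (an optional check, run with a MySQL user that can read information_schema):
mysql -e "SELECT table_schema, table_name FROM information_schema.tables WHERE engine='InnoDB';"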
NOTE 2
GlusterFS ports usage:
(ref: http://crashmag.net/setting-up-a-2-node-glusterfs-filesystem)
24007 – GlusterFS Daemon
24008 – Management
24009 – Each brick for every volume on your host requires its own port. For every new brick, one new port will be used, starting at 24009. (For GlusterFS versions earlier than 3.4)
49152 – Each brick for every volume on your host requires its own port. For every new brick, one new port will be used, starting at 49152. (GlusterFS 3.4 and later)
38465:38467 – This is required if you use the GlusterFS NFS service.
In summary, if you have a GlusterFS version earlier than 3.4, you need to open ports 24007, 24008 and 24009 towards your other GlusterFS server.
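For example, with iptables one possible way to allow those ports from the other GlusterFS server, assuming server2.srv is the peer and that you manage your firewall with iptables, could be:
iptables -A INPUT -p tcp -s server2.srv --dport 24007:24009 -j ACCEPT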
Some useful GLUSTERFS admin commands
gluster peer status
gluster volume info all
Deleting a GlusterFS volume
If, after having created a volume with a local directory, you decide to delete the volume and reuse the same directory for a new volume, you will most likely be presented with an error saying that the directory is already in use by a volume. Well, damn!! You just deleted this volume. So you check again whether the volume has really been deleted, and it IS deleted. The reason is that when you delete a volume via the gluster volume delete command, some metadata doesn’t get deleted and needs to be cleaned up before you can reuse the same directory. It took me some research to find this one out. Here is one solution I found:
Clean-up everything of GlusterFS
The following script will do a full clean-up of the GlusterFS environment, including the metadata on synced files/directories related to the use of GlusterFS. This should leave your system totally clean of GlusterFS traces. You can then start over again by installing and configuring it as you want. At least that is what I recommend when your GlusterFS starts to behave in very unpredictable ways, especially after deleting a volume and wanting to create a new volume based on the same directories again. Of course it’s possible to just delete the volume and do some manual clean-up, but I recommend doing a ‘tabula rasa’ after spending a lot of time trying to fix a broken GlusterFS setup.
Note: We are assuming here that the directory of the GlusterFS volume deleted is /dataM
Create a script with the following content, do a chmod 755 on it and run it.
#!/bin/bash
# Purpose: Cleans-up a GlusterFS environment before starting fresh over again
# Syntax: gluster_cleanup.sh vol_dir1 [vol_dir2] [....];
#--------------------------------------------------------------
# Check the command syntax
if [ $# -lt 1 ]; then
echo "ERROR: Wrong number of arguments"
echo "Usage: gluster_cleanup.sh vol_dir1 [vol_dir2] [....]"
exit 1
fi
# Check the validity of the given directories
for dir ; do
if ! [ -d "$dir" ]; then
echo "ERROR: Given directory ($dir) is invalid"
echo "Usage: gluster_cleanup.sh vol_dir1 [vol_dir2] [....]"
exit 1
fi
done
# Delete the previous line up.
deleteline () {
dellineup="\033[A\033[2K"
echo ; echo -en $dellineup
}
# Getting confirmation from user
echo "WARNING!!!: This script will COMPLETELY uninstall glusterFS and its components"
echo "including the volume(s), glusterFS mount(s)s, bricks, GlusterFS configuration and volume directories metadata"
echo "It will leave the glusterFS volumes directories with original clean files/dirs., ready to reinstall and configure GlusterFS again"
echo "Is it what you want to do? [y/N]: " ; read answer
# exit if answer is anything else than 'y' or 'Y'
if [ "$answer" != "y" -a "$answer" != "Y" ] ; then exit 2 ; fi
# ------- now do the clean-up ------------
echo "Stopping the glusterFS service daemon"
service glusterfs-server stop
echo "De-installing and purging GlusterFS install packages"
apt-get -y purge glusterfs-server glusterfs-common glusterfs-client
echo "Deleting the left-over GlusterFS compiled python scripts directory: /usr/lib/x86_64-linux-gnu/glusterfs/"
rm -rf /usr/lib/x86_64-linux-gnu/glusterfs/
echo "Deleting the GlusterFS work directory: /var/lib/glusterd"
rm -rf /var/lib/glusterd
echo "Starting the clean-up recursively of file/directories attributes"
for dir ; do
echo "Deleting the work file $dir/.glusterfs"
rm -rf $dir/.glusterfs
echo "Deleting recursively the file/dir. attributes in $dir"
for file in $(find $dir); do
deleteline
echo -ne $file
for att in $(attr -lq $file) ; do
setfattr -h -x trusted.${att} $file
done
done
deleteline
echo;echo "Verifying that all the attibutes have been deleted."
if [ -z $(for file in $(find $dir); do attr -lq $file; done) ]; then echo -e "Success\n----------------------------" ; fi
done
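For example, for the /dataM volume directory used earlier in this tutorial, the script (assuming it was saved as gluster_cleanup.sh) could be run as:
chmod 755 gluster_cleanup.sh
./gluster_cleanup.sh /dataM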
Troubleshooting GlusterFS
Trouble:
When copying a file into the mounted volume (client side) the following error appears:
stale file handle
Solution:
Some might not see this as a real solution, but for me the error disappeared after I ran the following commands, although no files got rebalanced:
gluster v rebalance {volumename} fix-layout start
gluster v rebalance {volumename} status
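For example, with the export1 volume created earlier in this tutorial (v is simply the short form of volume):
gluster volume rebalance export1 fix-layout start
gluster volume rebalance export1 status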
Happy syncing.