10. Containers

While installing gemBS is quite straightforward, in some cases it may be difficult or undesirable to install the required third-party libraries, development support etc., as they might clash with previously installed versions that are required by other tools. Incompatibilities between versions of third-party libraries can lead to subtle bugs that are difficult to track down because they depend on the specific environment of a user. Containers simplify the distribution and installation of tools and pipelines because each tool is distributed as a package together with all of its required dependencies. Previously installed packages should have no effect on the running of the container and, conversely, installing a new container should have no effect on previously installed packages.

Two commonly used container frameworks are Docker and Singularity. A Singularity container for gemBS is available, and this provides a simple and rapid means of installing and using gemBS with the assurance that the package is installed and working correctly.

10.1. Singularity and gemBS

The first requirement is to install Singularity on your system, following the instructions on the Singularity site. After that, install the gemBS Singularity container, which you can either download pre-built or build yourself.

Getting and installing the gemBS singularity container

There are two options for installing the gemBS Singularity container. The first (and simplest) is to download a pre-built container from Singularity Hub, where you should find pre-built images of the latest versions present in the GitHub repository. The container can be downloaded using the following command:

singularity pull shub://heathsc/gemBS-rs

Once complete you should have a file called something like heathsc-gemBS-master-latest.simg. Renaming the image to gemBS and making it executable gives a gemBS binary that should work almost like a native installation of gemBS. In the commands below, replace gemBS.simg with the actual name of the downloaded file.

mv gemBS.simg gemBS
chmod 755 gemBS
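
Because the exact name of the downloaded image can vary, the rename and chmod steps can be wrapped in a small helper that takes the image name as an argument. This is only a sketch: the function name install_image and its optional destination directory argument are assumptions made for illustration and are not part of gemBS.

```shell
# install_image IMAGE [DEST_DIR]
# Moves IMAGE to DEST_DIR/gemBS (default: the current directory)
# and makes it executable, as in the two commands above.
install_image() {
  image=$1
  dest_dir=${2:-.}
  if [ ! -f "$image" ]; then
    echo "install_image: no such image: $image" >&2
    return 1
  fi
  mv "$image" "$dest_dir/gemBS" && chmod 755 "$dest_dir/gemBS"
}

# Example (hypothetical file name and destination):
#   install_image heathsc-gemBS-master-latest.simg "$HOME/bin"
```

Placing the resulting gemBS file in a directory that is on your PATH lets it be run like any other command.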

The second method is to build the image yourself from the gemBS Singularity recipe in the base directory of the gemBS GitHub repository. Download the recipe from the repository. The image can then be built using the following command:

sudo singularity build gemBS.simg Singularity

where Singularity is the name of the gemBS recipe that you downloaded. If everything worked correctly then it should create the file gemBS.simg in the current directory. This can then be renamed and made executable as described above.

Using the gemBS singularity container

By default when the gemBS container is run, the filesystem containing the user’s home directory is mounted in the singularity container and the working directory inside the container is set to the current working directory outside of the container. From the user’s perspective, therefore, running gemBS from the container behaves the same way as running a native installation. This is not always true if the working directory is outside of the user’s home directory filesystem.

In some cases the working directory inside the container is set to the user’s home directory, and the working directory outside of the container (which is on a different file system) will not be accessible by default to the container. If this problem occurs, we can get around it as described in the following section.

We can instruct singularity to mount additional disks inside the container (assuming the user has read access to the disks). To do this we cannot use the shortcut of simply executing the container image; instead we have to use the singularity exec command with the --bind host_disk:singularity_mount_point option to mount the desired host disk at a given mount point in the container. Depending on how singularity has been configured on the host system, singularity_mount_point may need to exist already in the container; for that purpose the gemBS container provides the mount points /ext/disk1, /ext/disk2, ..., /ext/disk9.

singularity exec --bind host_disk:singularity_mount_point /path/to/gemBS/container.simg /usr/local/bin/gemBS <gemBS commands>
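
As a concrete illustration of the --bind syntax, the following shell sketch assembles the command line and prints it instead of running it, so that it can be inspected first. The host disk /disk1/project, the image location and the helper name bind_command are assumptions made for the example; map stands in for whichever gemBS command you want to run.

```shell
#!/bin/sh
# Hypothetical values; adjust them to your own system.
host_disk="/disk1/project"     # disk on the host to expose (assumed)
mount_point="/ext/disk1"       # one of the mount points provided by the gemBS container
image="$HOME/gemBS.simg"       # location of the container image (assumed)

# bind_command HOST_DISK MOUNT_POINT IMAGE [gemBS arguments...]
# Prints the singularity command line without executing it.
bind_command() {
  printf 'singularity exec --bind %s:%s %s /usr/local/bin/gemBS' "$1" "$2" "$3"
  shift 3
  for arg in "$@"; do
    printf ' %s' "$arg"
  done
  printf '\n'
}

bind_command "$host_disk" "$mount_point" "$image" map
```

Once the printed command looks right, it can be pasted into the shell and run directly.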

The host_disk will then be available from within the container under the mount point singularity_mount_point. As noted earlier, if the working directory is not on the home file system then the working directory within the container will be set to the user's home directory. It is therefore simplest to put the configuration directory for gemBS (that is, the directory where the gemBS commands are run) on the home file system. If this is not desired, however, it is possible to run gemBS entirely from a different disk using a combination of the --bind option to singularity and the --dir option to gemBS, which allows the working directory for gemBS to be specified.

The short Perl script below shows how this could work in practice. The script should be edited so that $host_dir is set to a directory on the host disk, $singularity_mount_point to the mount point in the container, and $container to the full path (on the host) to the gemBS container image. The script should be saved as gemBS, made executable and placed in a directory on the user's PATH. When run, the script uses the --bind option to mount the required host disk. It also checks whether the working directory is on the requested host disk (rather than on the home file system) and, if so, automatically sets the --dir option for gemBS. Using the script it is possible to run gemBS on Singularity on a different filesystem from home in a more or less transparent manner.

#!/usr/bin/env perl
use strict;
use warnings;
use Cwd;

my $host_dir = "/disk1/heath";
my $singularity_mount_point = "/ext/disk1";
my $container = "/home/heath/heathsc-gemBS-master-latest.simg";

my @dcom = ();
my $wd = getcwd();

# Check if working directory is on same filesystem as $host_dir
if ((stat($wd))[0] == (stat($host_dir))[0]) {
  # If it lies under $host_dir, translate it to the matching path in the container
  if ($wd =~ m{^\Q$host_dir\E(?:/(.*))?$}) {
    my $rel = defined $1 ? $1 : "";
    @dcom = ("--dir", "$singularity_mount_point/$rel");
  }
}
# Hand over to singularity; the list form keeps arguments containing spaces intact
exec("singularity", "exec", "--bind", "$host_dir:$singularity_mount_point",
     $container, "/usr/local/bin/gemBS", @dcom, @ARGV);
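
The path translation at the heart of the script can also be tried in isolation. The shell sketch below is not part of gemBS; the function name to_container_path is hypothetical, and it only rewrites paths that are equal to the host directory or lie beneath it.

```shell
# to_container_path HOST_DIR MOUNT_POINT PATH
# Prints PATH as seen from inside the container when PATH is HOST_DIR
# or lies beneath it; prints nothing otherwise.
to_container_path() {
  host_dir=${1%/}       # drop a trailing slash, if any
  mount_point=${2%/}
  case "$3" in
    "$host_dir")   printf '%s\n' "$mount_point" ;;
    "$host_dir"/*) printf '%s\n' "$mount_point/${3#"$host_dir"/}" ;;
  esac
}

# With the values used in the Perl script:
to_container_path /disk1/heath /ext/disk1 /disk1/heath/project/run1
```

With these values, a working directory of /disk1/heath/project/run1 on the host corresponds to /ext/disk1/project/run1 in the container, which is the value that would be passed to the --dir option.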

10.2. Docker and gemBS

The first requirement is to install Docker, following the instructions on the Docker site. After that it is necessary to install the gemBS Docker container. The simplest way to do this is to pull the image from Docker Hub.

docker pull heathsc/gembs-rs

This image has been built using the Dockerfile recipe in the root of the gemBS GitHub repository. As an alternative to pulling the image from Docker Hub, the image can be built using the docker command below (the Dockerfile should be in the current directory for this to work).

docker build -t 'heathsc/gembs-rs:latest' .

Once the container is installed, it is simplest to invoke it using a short script that mounts the working directory into the container and then runs the gemBS command. An example script is given below:

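#!/bin/sh
# Mount the current working directory at /home in the container and run gemBS.
# "$@" is quoted so that arguments containing spaces are passed through intact.
docker run \
-it \
--rm \
--mount type=bind,source="$(pwd)",destination=/home \
heathsc/gembs-rs:latest "$@"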
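
One practical point: docker run refuses the -t option when standard input is not a terminal, which can happen when the script is called from a batch job or a pipeline. The variant below is a sketch that adds -it only for interactive use. The function name run_gembs and the DOCKER override variable are assumptions introduced for this example (the override makes it possible to substitute echo and inspect the command without running it); the image name and mount options are those used above.

```shell
# run_gembs [gemBS arguments...]
# Runs gemBS in the Docker container with the current directory mounted at /home.
run_gembs() {
  flags="--rm"
  # Request an interactive terminal only when stdin and stdout are both terminals
  if [ -t 0 ] && [ -t 1 ]; then
    flags="-it $flags"
  fi
  # $flags is deliberately unquoted so that it splits into separate options
  ${DOCKER:-docker} run $flags \
    --mount type=bind,source="$(pwd)",destination=/home \
    heathsc/gembs-rs:latest "$@"
}

# Dry run that only prints the command:
#   DOCKER=echo run_gembs map
```

To use this as a wrapper script, add a final line that calls run_gembs "$@".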