Difference between revisions of "Singularity"

From Research Computing Center Wiki

Revision as of 16:34, 22 December 2017

Using Singularity containers on Sapelo2

The Sapelo2 cluster can run Singularity containers. Singularity containers are Docker-like containers designed to be HPC friendly.


Loading Singularity

Singularity is installed on all compute nodes on Sapelo2. To access the executable, you must first load the Singularity module.

$ module load Singularity
$ singularity --help
USAGE: singularity [global options...] <command> [command options...] ...

GLOBAL OPTIONS:
    -d|--debug    Print debugging information
    -h|--help     Display usage summary
    -s|--silent   Only print errors
    -q|--quiet    Suppress all normal output
       --version  Show application version
    -v|--verbose  Increase verbosity +1
    -x|--sh-debug Print shell wrapper debugging information

GENERAL COMMANDS:
    help       Show additional help for a command or container                  
    selftest   Run some self tests for singularity install                      

CONTAINER USAGE COMMANDS:
    exec       Execute a command within container                               
    run        Launch a runscript within container                              
    shell      Run a Bourne shell within container                              
    test       Launch a testscript within container                             

CONTAINER MANAGEMENT COMMANDS:
    apps       List available apps within a container                           
    bootstrap  *Deprecated* use build instead                                   
    build      Build a new Singularity container                                
    check      Perform container lint checks                                    
    inspect    Display container's metadata                                     
    mount      Mount a Singularity container image                              
    pull       Pull a Singularity/Docker container to $PWD                      

COMMAND GROUPS:
    image      Container image command group                                    
    instance   Persistent instance command group                                


CONTAINER USAGE OPTIONS:
    see singularity help <command>

For any additional help or support visit the Singularity
website: http://singularity.lbl.gov/
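
As a quick sanity check before running any container command, a script can confirm that the module has actually been loaded. The following is a minimal sketch; the function name have_singularity is hypothetical and not part of Singularity or the module system.

```shell
# Hypothetical helper: report whether a 'singularity' command is available
# in the current shell (i.e. whether the module has been loaded).
have_singularity() {
    if command -v singularity >/dev/null 2>&1; then
        echo "singularity found: $(command -v singularity)"
    else
        echo "singularity not found; run 'module load Singularity' first" >&2
        return 1
    fi
}
```

Calling have_singularity at the top of a script lets it stop early with a readable message instead of failing later with "command not found".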

Using BioContainer containers

BioContainers is an open-source, community-driven framework that provides system-agnostic executable environments for bioinformatics software. The BioContainers framework allows software to be installed and executed in an isolated and controllable environment.

Blast container example

This example is based on the BioContainers example at http://biocontainers.pro/docs/101/running-example/.


Pulling a container

To use a prebuilt container, you first have to "pull" it from a container registry. The following command pulls a blast container from the BioContainers Docker registry:

$ mkdir /home/raj76/test
$ cd /home/raj76/test
$ singularity  pull docker://biocontainers/blast
WARNING: pull for Docker Hub is not guaranteed to produce the
WARNING: same image on repeated pull. Use Singularity Registry
WARNING: (shub://) to pull exactly equivalent images.
Docker image path: index.docker.io/biocontainers/blast:latest
Cache folder set to /home/raj76/.singularity/docker
[1/1] |===================================| 100.0% 
Importing: base Singularity environment
Importing: /home/raj76/.singularity/docker/sha256:660c48dd555dcbfdfe19c80a30f557ac57a15f595250e67bfad1e5663c1725bb.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:4c7380416e7816a5ab1f840482c9c3ca8de58c6f3ee7f95e55ad299abbfe599f.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:421e436b5f80d876128b74139531693be9b4e59e4f1081c9a3c379c95094e375.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:e4ce6c3651b3a090bb43688f512f687ea6e3e533132bcbc4a83fb97e7046cea3.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:be588e74bd348ce48bb7161350f4b9d783c331f37a853a80b0b4abc0a33c569e.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:7513e23e94e042f38785a66fba7302331c39e92491ea8057a6d0ec2094fda7c0.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:1f1169998bd08c33903c56348ad438f90e69d517d0507b7fba7432decee61eb8.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:3c20a4bba592a988108453b44eefbd6fe2022151542b9883221b658145311681.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:9bfd0812e2d6aae116904d6c301511ef441b26e766811bcb82528941b416ed1b.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:1c6ae521538275a62e5e949c49ca7ca611477d8638df15b6c6ab83b76d01da2f.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:7d1b4609c9a59e46ac2c106aa79b4707bd905526df25288f595d90bf01b6e26c.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:a8022c247944f5c801b34c391e19aa1792d20b3d55d6a55821cf757c0df43e80.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:10c8f0ff4e5a382068d7480f76ce323b131721ae960b880c64ba94f6c856c77a.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:a8823f25a74a25d2943097ead4b7a90fc1dadcdd40dce298097ea52fc617981e.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:aedae76aecd1555418bdb6d202c3464aebad9c44d366533218fb21b883570d2a.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:f3124b62dbfe35c1349d0c38f1df851c0e536a00f15d085e53e3031eff2a94a5.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:d3f0d697543a6e83f259022cfa1b74e324b523a7e02057774df35751a5881bc2.tar.gz
Importing: /home/raj76/.singularity/metadata/sha256:944beb8dfe363519e9e9ba35cc76ba618aa58d15d1ed941b6a9b0a74fef88746.tar.gz
WARNING: Building container as an unprivileged user. If you run this container as root
WARNING: it may be missing some functionality.
Building Singularity image...
Singularity container built: ./blast.simg
Cleaning up...
Done. Container is at: ./blast.simg
$ ls
blast.simg
$

You now have a blast.simg image that can be used to run blast using Singularity.

Accessing the cluster filesystem from inside the container

The Singularity application on Sapelo2 has been pre-configured to mount the /home/$USER, /db, /lscratch, and /lustre1 filesystems inside the container.

$ singularity exec blast.simg df -h
Filesystem                                 Size  Used Avail Use% Mounted on
OverlayFS                                  1.0M     0  1.0M   0% /
/dev/mapper/vg00-lv.root                    20G  7.3G   13G  37% /tmp
/dev/mapper/vg00-lv.lscratch               869G   33M  869G   1% /lscratch
10.55.49.1@o2ib:10.55.49.2@o2ib:/csx0009g  452T  360T   88T  81% /lustre1
sn0.storage:/storage/xcluster/db            27T  3.7T   23T  14% /db
devtmpfs                                    32G     0   32G   0% /dev
tmpfs                                       32G     0   32G   0% /dev/shm
sn0.storage:/storage/xcluster/home/raj76    97G   24G   74G  25% /home/raj76
tmpfs                                       16M  8.0K   16M   1% /etc/group
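
The df output above can also be checked path by path. The sketch below defines a hypothetical helper, check_mounts, that asks the container whether each directory is visible from inside it. It assumes the image provides the standard test utility, which is an assumption about the image rather than something Singularity guarantees.

```shell
# check_mounts IMAGE DIR...
# Hypothetical helper: for each DIR, print "ok" if it is a directory as seen
# from inside the container, "missing" otherwise.
# Assumes the image ships the standard 'test' utility.
check_mounts() {
    image=$1
    shift
    for dir in "$@"; do
        if singularity exec "$image" test -d "$dir"; then
            echo "ok       $dir"
        else
            echo "missing  $dir"
        fi
    done
}
```

For the filesystems listed above, the call would be check_mounts blast.simg "$HOME" /db /lscratch /lustre1.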

Running Blast

$ singularity exec blast.simg blastp -help

This prints the help page for the blastp tool. The prefix singularity exec blast.simg tells Singularity to execute the command that follows inside the blast.simg container.
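
Because every blast command in this example carries the same singularity exec blast.simg prefix, a small shell function can supply it. This is only a convenience sketch; the names in_blast and BLAST_IMAGE are hypothetical and not defined by Singularity or BioContainers.

```shell
# Hypothetical wrapper: run any command inside the blast image.
# BLAST_IMAGE may be set to another image path; it defaults to blast.simg
# in the current directory.
in_blast() {
    singularity exec "${BLAST_IMAGE:-blast.simg}" "$@"
}
```

With this defined, in_blast blastp -help is equivalent to singularity exec blast.simg blastp -help.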

Downloading the example datasets

Let us download the example datasets to run blast.

$ wget http://www.uniprot.org/uniprot/P04156.fasta
$ curl -O ftp://ftp.ncbi.nih.gov/refseq/D_rerio/mRNA_Prot/zebrafish.1.protein.faa.gz
$ gunzip zebrafish.1.protein.faa.gz
$ ls
blast.simg  P04156.fasta  zebrafish.1.protein.faa
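
When the example is repeated, the downloads can be skipped for files that are already present. The sketch below uses a hypothetical helper, fetch_if_missing, which derives the local file name from the last component of the URL and only calls curl when that file is missing or empty.

```shell
# fetch_if_missing URL
# Hypothetical helper: download URL into the current directory unless a
# non-empty file with the same name is already there.
fetch_if_missing() {
    url=$1
    file=${url##*/}          # last path component of the URL
    if [ -s "$file" ]; then
        echo "skip $file (already present)"
    else
        # -f: fail on HTTP errors, -L: follow redirects, -O: keep remote name
        curl -fLO "$url"
    fi
}
```

For example, fetch_if_missing http://www.uniprot.org/uniprot/P04156.fasta. Note that gunzip removes the .gz file after unpacking, so the zebrafish archive would be fetched again on a second run.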

Preparing the database

We need to prepare the zebrafish database with makeblastdb for the search.

$ singularity exec blast.simg makeblastdb -in zebrafish.1.protein.faa -dbtype prot
$ ls -lt
total 779935
-rw-r--r-- 1 raj76 jlmlab  38605311 Dec 22 15:25 zebrafish.1.protein.faa.psq
-rw-r--r-- 1 raj76 jlmlab   7101626 Dec 22 15:25 zebrafish.1.protein.faa.phr
-rw-r--r-- 1 raj76 jlmlab    424888 Dec 22 15:25 zebrafish.1.protein.faa.pin
-rw-r--r-- 1 raj76 jlmlab  42849679 Dec 22 11:47 zebrafish.1.protein.faa
-rwxr-xr-x 1 raj76 jlmlab 753582111 Dec 22 11:38 blast.simg
-rw-r--r-- 1 raj76 jlmlab       334 Nov 21 19:00 P04156.fasta
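
A script can confirm that makeblastdb produced its index files before starting a search. The sketch below checks only for the three files shown in the listing above (.phr, .pin and .psq); the helper name db_ready is hypothetical, and other BLAST versions or larger databases may produce a different set of files.

```shell
# db_ready BASENAME
# Hypothetical helper: succeed only if the three protein database files seen
# in the listing above exist and are non-empty.
db_ready() {
    for ext in phr pin psq; do
        if [ ! -s "$1.$ext" ]; then
            echo "missing $1.$ext" >&2
            return 1
        fi
    done
}
```

For example, db_ready zebrafish.1.protein.faa && echo "database is ready".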

Running the alignment

Now that the database has been created, we can run an alignment against it as follows.

$ singularity exec blast.simg blastp -query P04156.fasta -db zebrafish.1.protein.faa -out results.txt
$ ls -lt
total 779936
-rw-r--r-- 1 raj76 jlmlab     17515 Dec 22 15:28 results.txt
-rw-r--r-- 1 raj76 jlmlab  38605311 Dec 22 15:25 zebrafish.1.protein.faa.psq
-rw-r--r-- 1 raj76 jlmlab   7101626 Dec 22 15:25 zebrafish.1.protein.faa.phr
-rw-r--r-- 1 raj76 jlmlab    424888 Dec 22 15:25 zebrafish.1.protein.faa.pin
-rw-r--r-- 1 raj76 jlmlab  42849679 Dec 22 11:47 zebrafish.1.protein.faa
-rwxr-xr-x 1 raj76 jlmlab 753582111 Dec 22 11:38 blast.simg
-rw-r--r-- 1 raj76 jlmlab       334 Nov 21 19:00 P04156.fasta

The file results.txt contains the blastp output.
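
The default blastp report is a human-readable pairwise listing. If the search is repeated with the -outfmt 6 option, blastp writes a tab-separated table whose twelfth column is the bit score, which is easier to process in a script. The sketch below assumes such a table has been written to a file (the name results.tsv in the usage note is hypothetical) and prints the highest-scoring rows; the helper name top_hits is also hypothetical.

```shell
# top_hits FILE [N]
# Hypothetical helper: print the N rows (default 5) with the highest bit
# score from a tabular (-outfmt 6) BLAST report. Column 12 is the bit score
# in the default -outfmt 6 column layout.
top_hits() {
    LC_ALL=C sort -t "$(printf '\t')" -k12,12nr "$1" | head -n "${2:-5}"
}
```

A possible use, after running blastp with -outfmt 6 -out results.tsv, is top_hits results.tsv 10.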

You could have used the existing blast databases in /db to perform the alignment by specifying