Singularity: Difference between revisions

From Research Computing Center Wiki
(3 intermediate revisions by one other user not shown)
Line 5: Line 5:


=== Loading Singularity ===
=== Loading Singularity ===
Singularity is installed on all compute nodes on Sapelo2. In order to access that executables you have to load the Singularity module.
Singularity is installed on all compute nodes on Sapelo2 and you don't need to load any modules in order to access it. For example, you can run the following from an interactive session or in a batch job:
<pre>
<pre>
$ module load Singularity
$ singularity --help
$ singularity --help
USAGE: singularity [global options...] <command> [command options...] ...
USAGE: singularity [global options...] <command> [command options...] ...
Line 51: Line 50:


</pre>
</pre>
More information on how to use Singularity on Sapelo2 is available on the [[Software on Sapelo2]] page.


=== Using BioContainer containers ===
=== Using BioContainer containers ===
Line 104: Line 105:


You now have a blast.simg image that can be used to run blast using Singularity.
You now have a blast.simg image that can be used to run blast using Singularity.
===== Accessing the cluster filesystem from inside the container =====
The Singularity application on Sapelo2 has been pre-configured to mount the /home/$USER, /db, /lscratch, and /lustre1 filesystems inside the container.
<pre>
$ singularity exec blast.simg df -h
Filesystem                                Size  Used Avail Use% Mounted on
OverlayFS                                  1.0M    0  1.0M  0% /
/dev/mapper/vg00-lv.root                    20G  7.3G  13G  37% /tmp
/dev/mapper/vg00-lv.lscratch              869G  33M  869G  1% /lscratch
10.55.49.1@o2ib:10.55.49.2@o2ib:/csx0009g  452T  360T  88T  81% /lustre1
sn0.storage:/storage/xcluster/db            27T  3.7T  23T  14% /db
devtmpfs                                    32G    0  32G  0% /dev
tmpfs                                      32G    0  32G  0% /dev/shm
sn0.storage:/storage/xcluster/home/raj76    97G  24G  74G  25% /home/raj76
tmpfs                                      16M  8.0K  16M  1% /etc/group
</pre>


===== Running Blast =====
===== Running Blast =====
Line 120: Line 137:
</pre>
</pre>
====== Preparing the database ======
====== Preparing the database ======
We need to prepare the zebrafish database with '''makeblastdb''' for the search. Before we do that let's understand how to access the Sapelo2 file systems from inside the container.
We need to prepare the zebrafish database with '''makeblastdb''' for the search.
Execute the following command:
<pre>
<pre>
$ singularity exec blast.simg df -h
$ singularity exec blast.simg makeblastdb -in zebrafish.1.protein.faa -dbtype prot
Filesystem                                Size Used Avail Use% Mounted on
$ ls -lt
OverlayFS                                1.0M    0 1.0M  0% /
total 779935
/dev/mapper/vg00-lv.root                  234G  7.8G  227G  4% /tmp
-rw-r--r-- 1 raj76 jlmlab 38605311 Dec 22 15:25 zebrafish.1.protein.faa.psq
devtmpfs                                  63G     0  63G  0% /dev
-rw-r--r-- 1 raj76 jlmlab  7101626 Dec 22 15:25 zebrafish.1.protein.faa.phr
tmpfs                                      63G 428M  63G  1% /dev/shm
-rw-r--r-- 1 raj76 jlmlab    424888 Dec 22 15:25 zebrafish.1.protein.faa.pin
sn0.storage:/storage/xcluster/home/raj76  97G  24G  74G  25% /home/raj76
-rw-r--r-- 1 raj76 jlmlab 42849679 Dec 22 11:47 zebrafish.1.protein.faa
tmpfs                                      16M 8.0K  16M  1% /etc/group
-rwxr-xr-x 1 raj76 jlmlab 753582111 Dec 22 11:38 blast.simg
-rw-r--r-- 1 raj76 jlmlab      334 Nov 21 19:00 P04156.fasta
</pre>
 
====== Running blastp search ======
Now that we have a database created we can do an alignment against it as follows.
<pre>
$ singularity exec blast.simg blastp -query P04156.fasta -db zebrafish.1.protein.faa -out results.txt
$ ls -lt
total 779936
-rw-r--r-- 1 raj76 jlmlab     17515 Dec 22 15:28 results.txt
-rw-r--r-- 1 raj76 jlmlab 38605311 Dec 22 15:25 zebrafish.1.protein.faa.psq
-rw-r--r-- 1 raj76 jlmlab   7101626 Dec 22 15:25 zebrafish.1.protein.faa.phr
-rw-r--r-- 1 raj76 jlmlab    424888 Dec 22 15:25 zebrafish.1.protein.faa.pin
-rw-r--r-- 1 raj76 jlmlab 42849679 Dec 22 11:47 zebrafish.1.protein.faa
-rwxr-xr-x 1 raj76 jlmlab 753582111 Dec 22 11:38 blast.simg
-rw-r--r-- 1 raj76 jlmlab      334 Nov 21 19:00 P04156.fasta
</pre>
</pre>
You will see in the output above that my home directory '''(/home/raj76)''' is mounted and accessible from inside the container. By default, Singularity mounts the users home directory inside the container. It will also let us mount other directories inside the container.
 
The files results.txt has the blastp output.
 
You could have used the existing blast databases in /db to perform the alignment by specifying
<pre>
<pre>
$ singularity exec -B /db -B /lustre1 -B /lscratch  blast.simg df -h
$ singularity exec blast.simg blastp -query P04156.fasta -db /db/uniprot/latest/uniprot_sprot -out results2.txt
Filesystem                                Size  Used Avail Use% Mounted on
$ ls -lth
OverlayFS                                  1.0M    0 1.0M  0% /
total 762M
/dev/mapper/vg00-lv.root                    20G  7.3G  13G  37% /tmp
-rw-r--r-- 1 raj76 jlmlab 110K Dec 22 15:34 results2.txt
devtmpfs                                    32G    0  32G  0% /dev
-rw-r--r-- 1 raj76 jlmlab 18K Dec 22 15:28 results.txt
tmpfs                                      32G    0  32G  0% /dev/shm
-rw-r--r-- 1 raj76 jlmlab  37M Dec 22 15:25 zebrafish.1.protein.faa.psq
sn0.storage:/storage/xcluster/home/raj76    97G  24G  74G  25% /home/raj76
-rw-r--r-- 1 raj76 jlmlab 6.8M Dec 22 15:25 zebrafish.1.protein.faa.phr
sn0.storage:/storage/xcluster/db            27T  3.7T  23T  14% /db
-rw-r--r-- 1 raj76 jlmlab 415K Dec 22 15:25 zebrafish.1.protein.faa.pin
10.55.49.1@o2ib:10.55.49.2@o2ib:/csx0009g  452T  361T  87T  81% /lustre1
-rw-r--r-- 1 raj76 jlmlab  41M Dec 22 11:47 zebrafish.1.protein.faa
/dev/mapper/vg00-lv.lscratch              869G  33M  869G  1% /lscratch
-rwxr-xr-x 1 raj76 jlmlab 719M Dec 22 11:38 blast.simg
tmpfs                                      16M 8.0K  16M  1% /etc/group
-rw-r--r-- 1 raj76 jlmlab 334 Nov 21 19:00 P04156.fasta
</pre>
</pre>
In the above example, the '''-B /db''' option to the exec command.

Latest revision as of 10:02, 10 January 2022

Using Singularity containers on Sapelo2

The Sapelo2 cluster can run Singularity containers. Singularity containers are Docker-like containers designed to be HPC-friendly.


Loading Singularity

Singularity is installed on all compute nodes on Sapelo2 and you don't need to load any modules in order to access it. For example, you can run the following from an interactive session or in a batch job:

$ singularity --help
USAGE: singularity [global options...] <command> [command options...] ...

GLOBAL OPTIONS:
    -d|--debug    Print debugging information
    -h|--help     Display usage summary
    -s|--silent   Only print errors
    -q|--quiet    Suppress all normal output
       --version  Show application version
    -v|--verbose  Increase verbosity +1
    -x|--sh-debug Print shell wrapper debugging information

GENERAL COMMANDS:
    help       Show additional help for a command or container                  
    selftest   Run some self tests for singularity install                      

CONTAINER USAGE COMMANDS:
    exec       Execute a command within container                               
    run        Launch a runscript within container                              
    shell      Run a Bourne shell within container                              
    test       Launch a testscript within container                             

CONTAINER MANAGEMENT COMMANDS:
    apps       List available apps within a container                           
    bootstrap  *Deprecated* use build instead                                   
    build      Build a new Singularity container                                
    check      Perform container lint checks                                    
    inspect    Display container's metadata                                     
    mount      Mount a Singularity container image                              
    pull       Pull a Singularity/Docker container to $PWD                      

COMMAND GROUPS:
    image      Container image command group                                    
    instance   Persistent instance command group                                


CONTAINER USAGE OPTIONS:
    see singularity help <command>

For any additional help or support visit the Singularity
website: http://singularity.lbl.gov/

More information on how to use Singularity on Sapelo2 is available on the Software on Sapelo2 page.
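
Because no module has to be loaded, a job script can call singularity directly. The following is a minimal sketch of a guard that a script might start with, assuming a Bash shell; the function name require_singularity is hypothetical and not part of Singularity or Sapelo2.

```shell
#!/bin/bash
# Hypothetical helper: confirm that the singularity executable is on PATH
# before the rest of a job script depends on it.
require_singularity() {
    if command -v singularity >/dev/null 2>&1; then
        echo "singularity found: $(command -v singularity)"
        return 0
    fi
    echo "singularity not found on PATH (are you on a compute node?)" >&2
    return 1
}

# Report the installed version only when the executable is available.
if require_singularity; then
    singularity --version
fi
```

On a login node or workstation without Singularity the guard only prints a message to standard error, so the script can decide how to proceed.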

Using BioContainer containers

BioContainers is an open-source, community-driven framework that provides system-agnostic executable environments for bioinformatics software. The BioContainers framework allows software to be installed and executed in an isolated and controllable environment.

Blast container example

This example is based on the BioContainers example at http://biocontainers.pro/docs/101/running-example/.


Pulling a container

To use a prebuilt container, you first have to "pull" it from a container registry. The commands below pull a blast container from the BioContainers Docker registry:

$ mkdir /home/raj76/test
$ cd /home/raj76/test
$ singularity  pull docker://biocontainers/blast
WARNING: pull for Docker Hub is not guaranteed to produce the
WARNING: same image on repeated pull. Use Singularity Registry
WARNING: (shub://) to pull exactly equivalent images.
Docker image path: index.docker.io/biocontainers/blast:latest
Cache folder set to /home/raj76/.singularity/docker
[1/1] |===================================| 100.0% 
Importing: base Singularity environment
Importing: /home/raj76/.singularity/docker/sha256:660c48dd555dcbfdfe19c80a30f557ac57a15f595250e67bfad1e5663c1725bb.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:4c7380416e7816a5ab1f840482c9c3ca8de58c6f3ee7f95e55ad299abbfe599f.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:421e436b5f80d876128b74139531693be9b4e59e4f1081c9a3c379c95094e375.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:e4ce6c3651b3a090bb43688f512f687ea6e3e533132bcbc4a83fb97e7046cea3.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:be588e74bd348ce48bb7161350f4b9d783c331f37a853a80b0b4abc0a33c569e.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:7513e23e94e042f38785a66fba7302331c39e92491ea8057a6d0ec2094fda7c0.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:1f1169998bd08c33903c56348ad438f90e69d517d0507b7fba7432decee61eb8.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:3c20a4bba592a988108453b44eefbd6fe2022151542b9883221b658145311681.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:9bfd0812e2d6aae116904d6c301511ef441b26e766811bcb82528941b416ed1b.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:1c6ae521538275a62e5e949c49ca7ca611477d8638df15b6c6ab83b76d01da2f.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:7d1b4609c9a59e46ac2c106aa79b4707bd905526df25288f595d90bf01b6e26c.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:a8022c247944f5c801b34c391e19aa1792d20b3d55d6a55821cf757c0df43e80.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:10c8f0ff4e5a382068d7480f76ce323b131721ae960b880c64ba94f6c856c77a.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:a8823f25a74a25d2943097ead4b7a90fc1dadcdd40dce298097ea52fc617981e.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:aedae76aecd1555418bdb6d202c3464aebad9c44d366533218fb21b883570d2a.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:f3124b62dbfe35c1349d0c38f1df851c0e536a00f15d085e53e3031eff2a94a5.tar.gz
Importing: /home/raj76/.singularity/docker/sha256:d3f0d697543a6e83f259022cfa1b74e324b523a7e02057774df35751a5881bc2.tar.gz
Importing: /home/raj76/.singularity/metadata/sha256:944beb8dfe363519e9e9ba35cc76ba618aa58d15d1ed941b6a9b0a74fef88746.tar.gz
WARNING: Building container as an unprivileged user. If you run this container as root
WARNING: it may be missing some functionality.
Building Singularity image...
Singularity container built: ./blast.simg
Cleaning up...
Done. Container is at: ./blast.simg
$ ls
blast.simg
$

You now have a blast.simg image that can be used to run blast using Singularity.
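
The pull output above warns that repeated pulls from Docker Hub are not guaranteed to produce the same image. One way to notice a changed image is to record a checksum next to it. This is a sketch only, assuming GNU coreutils sha256sum is available; the helper names record_image_checksum and verify_image_checksum are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers: record and later verify the SHA-256 checksum of a
# pulled image, since repeated pulls from Docker Hub may differ.

record_image_checksum() {
    # $1: path to the image file; writes "<image>.sha256" beside it.
    local image="$1"
    sha256sum "$image" > "${image}.sha256"
}

verify_image_checksum() {
    # $1: path to the image file; returns 0 only if it matches the record.
    local image="$1"
    sha256sum --check --quiet "${image}.sha256"
}

# Example run, performed only if the image from the pull above is present.
if [ -f blast.simg ]; then
    record_image_checksum blast.simg
    verify_image_checksum blast.simg && echo "blast.simg matches its recorded checksum"
fi
```

The checksum file stores the image path exactly as it was given, so verify from the same directory or use absolute paths.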

Accessing the cluster filesystem from inside the container

The Singularity application on Sapelo2 has been pre-configured to mount the /home/$USER, /db, /lscratch, and /lustre1 filesystems inside the container.

$ singularity exec blast.simg df -h
Filesystem                                 Size  Used Avail Use% Mounted on
OverlayFS                                  1.0M     0  1.0M   0% /
/dev/mapper/vg00-lv.root                    20G  7.3G   13G  37% /tmp
/dev/mapper/vg00-lv.lscratch               869G   33M  869G   1% /lscratch
10.55.49.1@o2ib:10.55.49.2@o2ib:/csx0009g  452T  360T   88T  81% /lustre1
sn0.storage:/storage/xcluster/db            27T  3.7T   23T  14% /db
devtmpfs                                    32G     0   32G   0% /dev
tmpfs                                       32G     0   32G   0% /dev/shm
sn0.storage:/storage/xcluster/home/raj76    97G   24G   74G  25% /home/raj76
tmpfs                                       16M  8.0K   16M   1% /etc/group
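
Instead of reading the df output by eye, the same check can be scripted. The sketch below assumes that the container provides the standard test utility (as Ubuntu-based images normally do); the function name check_container_mounts is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report which directories are visible inside a
# container image by running "test -d" within the container.
check_container_mounts() {
    # $1: image file; remaining arguments: directories to test.
    local image="$1"; shift
    local dir missing=0
    for dir in "$@"; do
        if singularity exec "$image" test -d "$dir"; then
            printf '%-8s %s\n' ok "$dir"
        else
            printf '%-8s %s\n' missing "$dir"
            missing=1
        fi
    done
    return "$missing"
}

# Run the check only where both singularity and the image are available.
if command -v singularity >/dev/null 2>&1 && [ -f blast.simg ]; then
    check_container_mounts blast.simg "/home/$USER" /db /lscratch /lustre1
fi
```

The function returns a non-zero status if any directory is missing, so it can gate the rest of a job script.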

Running Blast

$ singularity exec blast.simg blastp -help

This prints the help page for the blastp tool. The prefix singularity exec blast.simg tells Singularity to run the command that follows inside the blast.simg container.
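
Since every BLAST command in this example is prefixed with singularity exec blast.simg, a small shell function can shorten the calls. This is a convenience sketch, not something Sapelo2 provides; the function name blast and the variable BLAST_IMAGE are hypothetical.

```shell
#!/bin/bash
# Hypothetical convenience wrapper: run any command inside the BLAST image.
# BLAST_IMAGE is an assumed variable name; it defaults to blast.simg in the
# current directory.
blast() {
    singularity exec "${BLAST_IMAGE:-blast.simg}" "$@"
}

# Usage (on a node where singularity and the image are available):
#   blast blastp -help
#   BLAST_IMAGE=/home/raj76/test/blast.simg blast makeblastdb -help
```

All arguments are passed through unchanged, so the wrapper behaves like typing the full singularity exec command.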

Downloading the example datasets

Let us download the example datasets to run blast.

$ wget http://www.uniprot.org/uniprot/P04156.fasta
$ curl -O ftp://ftp.ncbi.nih.gov/refseq/D_rerio/mRNA_Prot/zebrafish.1.protein.faa.gz
$ gunzip zebrafish.1.protein.faa.gz
$ ls
blast.simg  P04156.fasta  zebrafish.1.protein.faa

Preparing the database

We need to prepare the zebrafish database with makeblastdb for the search.

$ singularity exec blast.simg makeblastdb -in zebrafish.1.protein.faa -dbtype prot
$ ls -lt
total 779935
-rw-r--r-- 1 raj76 jlmlab  38605311 Dec 22 15:25 zebrafish.1.protein.faa.psq
-rw-r--r-- 1 raj76 jlmlab   7101626 Dec 22 15:25 zebrafish.1.protein.faa.phr
-rw-r--r-- 1 raj76 jlmlab    424888 Dec 22 15:25 zebrafish.1.protein.faa.pin
-rw-r--r-- 1 raj76 jlmlab  42849679 Dec 22 11:47 zebrafish.1.protein.faa
-rwxr-xr-x 1 raj76 jlmlab 753582111 Dec 22 11:38 blast.simg
-rw-r--r-- 1 raj76 jlmlab       334 Nov 21 19:00 P04156.fasta
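
When the same steps are rerun, makeblastdb does not need to be repeated if the index files already exist. The sketch below assumes a single-volume protein database with the three index files shown in the listing above (.phr, .pin, .psq); other BLAST versions or larger databases may produce additional or differently named files. The function name ensure_protein_db is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: build the protein database only when its index files
# (.phr, .pin, .psq) are not all present and non-empty.
ensure_protein_db() {
    # $1: image file; $2: FASTA file used as the database.
    local image="$1" fasta="$2" ext
    for ext in phr pin psq; do
        if [ ! -s "${fasta}.${ext}" ]; then
            # At least one index file is missing: (re)build the database.
            singularity exec "$image" makeblastdb -in "$fasta" -dbtype prot
            return
        fi
    done
    echo "database for $fasta already exists; skipping makeblastdb"
}

# Usage (on a node where singularity and the image are available):
#   ensure_protein_db blast.simg zebrafish.1.protein.faa
```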

Running blastp search

Now that the database has been created, we can run an alignment against it as follows:

$ singularity exec blast.simg blastp -query P04156.fasta -db zebrafish.1.protein.faa -out results.txt
$ ls -lt
total 779936
-rw-r--r-- 1 raj76 jlmlab     17515 Dec 22 15:28 results.txt
-rw-r--r-- 1 raj76 jlmlab  38605311 Dec 22 15:25 zebrafish.1.protein.faa.psq
-rw-r--r-- 1 raj76 jlmlab   7101626 Dec 22 15:25 zebrafish.1.protein.faa.phr
-rw-r--r-- 1 raj76 jlmlab    424888 Dec 22 15:25 zebrafish.1.protein.faa.pin
-rw-r--r-- 1 raj76 jlmlab  42849679 Dec 22 11:47 zebrafish.1.protein.faa
-rwxr-xr-x 1 raj76 jlmlab 753582111 Dec 22 11:38 blast.simg
-rw-r--r-- 1 raj76 jlmlab       334 Nov 21 19:00 P04156.fasta

The file results.txt contains the blastp output.

Alternatively, you could have used one of the existing BLAST databases in /db for the alignment by specifying its path:

$ singularity exec blast.simg blastp -query P04156.fasta -db /db/uniprot/latest/uniprot_sprot -out results2.txt
$ ls -lth
total 762M
-rw-r--r-- 1 raj76 jlmlab 110K Dec 22 15:34 results2.txt
-rw-r--r-- 1 raj76 jlmlab  18K Dec 22 15:28 results.txt
-rw-r--r-- 1 raj76 jlmlab  37M Dec 22 15:25 zebrafish.1.protein.faa.psq
-rw-r--r-- 1 raj76 jlmlab 6.8M Dec 22 15:25 zebrafish.1.protein.faa.phr
-rw-r--r-- 1 raj76 jlmlab 415K Dec 22 15:25 zebrafish.1.protein.faa.pin
-rw-r--r-- 1 raj76 jlmlab  41M Dec 22 11:47 zebrafish.1.protein.faa
-rwxr-xr-x 1 raj76 jlmlab 719M Dec 22 11:38 blast.simg
-rw-r--r-- 1 raj76 jlmlab  334 Nov 21 19:00 P04156.fasta
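
When one query is searched against several databases, it helps to derive each output file name from the query and the database so that results do not overwrite each other. The following sketch uses only the blastp options shown above; the function name blastp_to_file and the naming scheme are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: run blastp inside the container and name the output
# file after the query and the database, e.g. P04156_vs_uniprot_sprot.txt.
blastp_to_file() {
    # $1: image file; $2: query FASTA file; $3: database path.
    local image="$1" query="$2" db="$3"
    local out
    out="$(basename "$query" .fasta)_vs_$(basename "$db").txt"
    # Print the output file name only if blastp succeeded.
    singularity exec "$image" blastp -query "$query" -db "$db" -out "$out" \
        && echo "$out"
}

# Usage (on a node where singularity and the image are available):
#   blastp_to_file blast.simg P04156.fasta zebrafish.1.protein.faa
#   blastp_to_file blast.simg P04156.fasta /db/uniprot/latest/uniprot_sprot
```

The output file is written to the current working directory, which is the home directory tree in this example and therefore visible from inside the container.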