R-Sapelo2

From Research Computing Center Wiki
Latest revision as of 15:27, 9 January 2026

Category

Statistics

Program On

Sapelo2

Version

4.2.1, 4.3.1, 4.3.2, 4.4.1, 4.4.2

Author / Distributor

See http://www.r-project.org/

Description

R is a language and environment for statistical computing and graphics. It provides a wide variety of statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques.

Running Program

Also refer to Running Jobs on Sapelo2

R version 4.4.2

R version 4.4.2 is installed in /apps/eb/R/4.4.2-gfbf-2024a, and its add-on packages are installed in /apps/eb/R/4.4.2-gfbf-2024a/lib64/R/library. List the contents of this directory to see the packages installed with this version of R.
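
Each installed R package appears as a subdirectory of the library directory and contains a DESCRIPTION file, so a quick check for one particular package can be sketched as follows. The helper name has_r_package and the package name ggplot2 are illustrative assumptions; nothing here confirms that ggplot2 is installed in this directory.

```shell
# Library directory for R 4.4.2, as given above.
R_LIB=/apps/eb/R/4.4.2-gfbf-2024a/lib64/R/library

# has_r_package LIBDIR PKG (hypothetical helper): succeeds if PKG is installed
# in LIBDIR, i.e. LIBDIR/PKG exists and contains a DESCRIPTION file.
has_r_package() {
    [ -f "$1/$2/DESCRIPTION" ]
}

# ggplot2 is only an example package name.
if has_r_package "$R_LIB" ggplot2; then
    echo "ggplot2 is available in $R_LIB"
else
    echo "ggplot2 not found in $R_LIB; it may be provided by another module"
fi
```

A package that is missing here is not necessarily unavailable, since other module files provide additional packages.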

To use this version of R, please first load the module with

module load R/4.4.2-gfbf-2024a

Please note that loading this R module also loads many other modules, because R depends on those libraries and packages.

A number of R packages are provided by other module files. For example, many CRAN packages are provided by the R-bundle-CRAN modules and many Bioconductor packages are provided by the R-bundle-Bioconductor modules. When you load an R-bundle-CRAN or an R-bundle-Bioconductor module, a compatible R module is loaded automatically. So if you want to use Bioconductor 3.20, it suffices to load the module with:

module load R-bundle-Bioconductor/3.20-foss-2024a-R-4.4.2
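
The bundle's module name records the R version it was built for (the trailing -R-4.4.2 above). As a small sketch, assuming other bundle modules follow the same naming pattern, the R version can be read off a module name like this; the helper name r_version_of_bundle is hypothetical.

```shell
# r_version_of_bundle MODULE (hypothetical helper): print the R version encoded
# after the final "-R-" in a bundle module name.
r_version_of_bundle() {
    case "$1" in
        *-R-*) echo "${1##*-R-}" ;;
        *)     echo "unknown"; return 1 ;;
    esac
}

# Prints 4.4.2 for the module shown above.
r_version_of_bundle R-bundle-Bioconductor/3.20-foss-2024a-R-4.4.2
```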


How to run R programs with multiple threads

Some functions in R packages are written with OpenMP and can run in parallel using multiple threads. The R module files set the variable OMP_NUM_THREADS=1, which restricts these functions to a single thread. If you are using a function that supports multithreading, set OMP_NUM_THREADS to the number of threads you want to use and request the same number of cores for your job. You must set this variable after you load the R module for it to take effect. For example, to use 6 threads, request 6 cores on the same node with

#SBATCH --ntasks=1
#SBATCH --cpus-per-task=6

and set

export OMP_NUM_THREADS=6

You can make OMP_NUM_THREADS automatically match the number of cores per task requested for the job by setting it to $SLURM_CPUS_PER_TASK, with

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
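
According to the Slurm documentation, SLURM_CPUS_PER_TASK is only set when the job requests --cpus-per-task, so in a job without that option the export above would leave OMP_NUM_THREADS empty. A minimal sketch of a fallback to one thread is shown here; the helper name resolve_threads is hypothetical.

```shell
# resolve_threads (hypothetical helper): print the cores per task granted by
# Slurm, or 1 when SLURM_CPUS_PER_TASK is unset or empty.
resolve_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Run this after "module load R/...", because the R module sets the variable to 1.
OMP_NUM_THREADS=$(resolve_threads)
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```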

A sample job submission script for a multithreaded job is provided below.


Sample Job Submission Scripts

Here is an example of a shell script, sub.sh, to run a serial R script on the batch queue:

#!/bin/bash
#SBATCH --job-name=testRjob
#SBATCH --partition=batch        
#SBATCH --ntasks=1
#SBATCH --mem=10gb   
#SBATCH --time=08:00:00   
#SBATCH --output=%x.%j.out
#SBATCH --error=%x.%j.err
 
cd $SLURM_SUBMIT_DIR

module load R/4.4.2-gfbf-2024a 

R CMD BATCH program.R   

where program.R is just a sample program name and needs to be replaced by the name of the program you wish to run. The settings in the header lines, such as the job name, the number of cores, the memory, and the walltime limit, need to be changed appropriately as well.
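
By default, R CMD BATCH writes the program's output to a file named after the input with .Rout appended (program.Rout here), so repeated runs in the same directory overwrite each other. A minimal sketch, assuming you would like one output file per job, is to pass an explicit output file name built from the Slurm job ID; the helper name rout_name is hypothetical.

```shell
# rout_name SCRIPT JOBID (hypothetical helper): build an output file name such
# as program.12345.Rout from program.R and a job ID.
rout_name() {
    echo "$(basename "$1" .R).$2.Rout"
}

# Inside a job script, the optional second argument of R CMD BATCH sets the
# output file:
#   R CMD BATCH program.R "$(rout_name program.R "$SLURM_JOB_ID")"

# Prints program.12345.Rout
rout_name program.R 12345
```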


Here is an example of a shell script, sub.sh, to run a multithreaded R script on the batch queue:

#!/bin/bash
#SBATCH --job-name=testRjob
#SBATCH --partition=batch        
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=6
#SBATCH --mem=10gb   
#SBATCH --time=08:00:00   
#SBATCH --output=%x.%j.out
#SBATCH --error=%x.%j.err
 
cd $SLURM_SUBMIT_DIR

module load R/4.4.2-gfbf-2024a  

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

R CMD BATCH program.R   

where program.R is just a sample program name and needs to be replaced by the name of the program you wish to run. The settings in the header lines, such as the job name, the number of cores, the memory, and the walltime limit, need to be changed appropriately as well.
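
If you set OMP_NUM_THREADS by hand instead of using $SLURM_CPUS_PER_TASK, it is easy to ask for more threads than the cores requested. The following sketch, which could be placed in the job script just before the R command, prints a warning in that case; the helper name threads_fit is hypothetical.

```shell
# threads_fit (hypothetical helper): succeed when OMP_NUM_THREADS does not
# exceed the cores per task granted by Slurm (each defaults to 1 when unset).
threads_fit() {
    [ "${OMP_NUM_THREADS:-1}" -le "${SLURM_CPUS_PER_TASK:-1}" ]
}

if ! threads_fit; then
    echo "Warning: OMP_NUM_THREADS=${OMP_NUM_THREADS} exceeds the requested cores" >&2
fi
```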


Here is an example of a job submission command:

sbatch ./sub.sh 
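
On success, sbatch prints a line of the form "Submitted batch job" followed by the job ID. With the --parsable option it prints only the job ID, optionally followed by a semicolon and the cluster name, which is easier to use in scripts. The sketch below captures the ID and then queries the job's state with squeue; the helper name job_id_from_parsable is hypothetical, and the submission step is skipped when sbatch is not available.

```shell
# job_id_from_parsable OUTPUT (hypothetical helper): strip an optional
# ";cluster" suffix from the output of "sbatch --parsable".
job_id_from_parsable() {
    echo "${1%%;*}"
}

# Only submit when sbatch is available, i.e. on the cluster.
if command -v sbatch >/dev/null 2>&1; then
    jobid=$(job_id_from_parsable "$(sbatch --parsable ./sub.sh)")
    echo "Submitted job $jobid"
    squeue -j "$jobid"   # show the job's current state
fi
```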

Documentation

Please see http://www.r-project.org/

Installation

Source code downloaded from https://www.r-project.org/

System

64-bit Linux