Tmp: Difference between revisions

From Research Computing Center Wiki
__TOC__


= Pending or Running Jobs =
The easiest way to monitor pending or running jobs is with the Slurm <code>squeue</code> command. Like most Slurm commands, <code>squeue</code> lets you control which columns appear in its output (see <code>man squeue</code> for more information). To save you that trouble, we've created the <code>sq</code> command, which runs <code>squeue</code> with pre-formatted output and some additional convenience options.
The default <code>squeue</code> columns are as follows:
<pre class="gcomment">
JOBID PARTITION    NAME    USER ST      TIME  NODES NODELIST(REASON)
</pre>
Using <code>sq</code> runs the <code>squeue</code> command but provides the following columns:
<pre class="gcomment">
JOBID      TIME            TIME_LIMIT      NAME            PARTITION        USER      NODES  CPUS  MIN_MEMORY  PRIORITY  STATE      NODELIST(REASON)
</pre>
As you can see, <code>sq</code> shows considerably more information than the default <code>squeue</code> formatting.
'''Output Columns Explained'''
* '''JOBID''': The unique ID of the job.
* '''TIME''': How much (wall) time has elapsed since the job started, in the format DAYS-HOURS:MINUTES:SECONDS.
* '''TIME_LIMIT''': The maximum time given for the job to run, in the format DAYS-HOURS:MINUTES:SECONDS.
* '''NAME''': The name of the job.  If not specified in one's submission script, it will default to the name of the submission script (e.g. "sub.sh").
* '''PARTITION''': The partition to which the job was sent (e.g. batch, highmem_p, gpu_p, etc.).
* '''USER''': The user who submitted the job.
* '''NODES''': The number of nodes allocated to the job.
* '''CPUS''': The number of CPU cores allocated to the job.
* '''MIN_MEMORY''': The minimum amount of memory requested for the job.
* '''PRIORITY''': The job's priority per Slurm's [https://slurm.schedmd.com/priority_multifactor.html Multifactor Priority Plugin].
* '''STATE''': The job's state (e.g. Running, Pending, etc.).
* '''NODELIST(REASON)''': The name of the node(s) on which the job is running or the reason the job has not started yet, if it is pending.
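For readers curious how a wrapper like <code>sq</code> could produce these columns, the following is a minimal sketch of an equivalent <code>squeue</code> format string. The format specifiers are the ones documented in <code>man squeue</code>; the variable name <code>SQ_FORMAT</code> and the column widths are assumptions made for illustration, and the actual <code>sq</code> script may be written differently.

```shell
# Hypothetical approximation of sq's column layout; the real sq may differ.
# Specifiers (see man squeue):
#   %i JOBID   %M TIME   %l TIME_LIMIT   %j NAME   %P PARTITION   %u USER
#   %D NODES   %C CPUS   %m MIN_MEMORY   %Q PRIORITY   %T STATE
#   %R NODELIST(REASON)
# The numbers are minimum field widths (guesses); output is left-justified
# unless the width is preceded by a dot.
SQ_FORMAT='%10i %15M %15l %15j %16P %9u %6D %5C %11m %9Q %10T %R'

# Only call squeue where Slurm is actually installed.
if command -v squeue >/dev/null 2>&1; then
    squeue --format="$SQ_FORMAT"
fi
```

On a system without Slurm the block sets the variable and prints nothing.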
<code>sq</code> also has a -h/--help option:
<pre class="gcomment">
bc06026@ss-sub3 ~$ sq --help
sq - preformatted wrapper for squeue.  See man squeue for more information.
Usage: sq [-T][-p PARTITION][-u USER][--me][-h | --help]
-T displays submit and start time columns
-p displays squeue output for a given partition
-u displays squeue output for a given user
--me displays squeue output for the user executing this command
-h displays this help output
--help displays this help output
</pre>
<big><big>'''Examples'''</big></big>
* See all pending and running jobs: <code>sq</code>
* See all of your pending and running jobs: <code>sq --me</code>
* See all pending and running jobs in the highmem_p partition: <code>sq -p highmem_p</code>
* See all of your pending and running jobs in the batch partition: <code>sq --me -p batch</code>
* See all of your pending and running jobs, including submit time and start time columns: <code>sq --me -T</code> (Note: this requires a wide monitor or a small font to display without the columns wrapping.)
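Because <code>sq</code> prints plain text, its output can be piped into standard tools. The sketch below counts jobs per state. The helper name <code>count_by_state</code> is hypothetical, and it assumes that STATE is the second-to-last whitespace-separated field, which only holds when NODELIST(REASON) contains no spaces.

```shell
# Count jobs per STATE from sq-style output read on standard input.
# Assumption: STATE is the second-to-last field on every job line.
count_by_state() {
    # NR > 1 skips the header; NF >= 2 skips blank or single-word lines.
    awk 'NR > 1 && NF >= 2 { n[$(NF-1)]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# On the cluster this would be used as:
#   sq --me | count_by_state
```

Each output line contains a state name followed by the number of jobs in that state.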
<big>'''Example <code>sq</code> output:'''</big>
<pre class="gcomment">
bc06026@ss-sub3 ~$ sq
JOBID      TIME            TIME_LIMIT      NAME            PARTITION        USER      NODES  CPUS  MIN_MEMORY  PRIORITY  STATE      NODELIST(REASON)
4581410    2:10:56         10:00:00        Bowtie2-test    batch            zp21982   1      1     12G         6003      RUNNING    c5-4
4584815    1:51:03         2:00:00         test-job        highmem_p        rt12352   1      12    300G        5473      RUNNING    d3-9
4578428    4:57:15         1-2:00:00       PR6_Cd3         batch            un12354   1      1     40G         5449      RUNNING    c4-16
4583491    1:57:38         12:00:00        interact        inter_p          ai38821   1      4     2G          5428      RUNNING    d5-21
4580374    2:54:41         12:00:00        BLAST           batch            gh98762   1      1     10G         5397      RUNNING    b1-9
...
</pre>
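The TIME and TIME_LIMIT values can be compared to estimate how long a job has left. The following is a minimal sketch that converts such a value to seconds. The function name <code>slurm_to_seconds</code> is hypothetical, and it assumes the value is either DAYS-HOURS:MINUTES:SECONDS, HOURS:MINUTES:SECONDS, or MINUTES:SECONDS; other values (for example UNLIMITED) are rejected with a non-zero exit status.

```shell
# Convert a Slurm duration to seconds.
# Accepted forms (assumption): D-H:M:S, H:M:S, and M:S.
slurm_to_seconds() {
    printf '%s\n' "$1" | awk '{
        days = 0; rest = $0
        i = index(rest, "-")              # a dash separates the day count
        if (i > 0) {
            days = substr(rest, 1, i - 1)
            rest = substr(rest, i + 1)
        }
        n = split(rest, f, ":")
        if (n == 3)
            secs = f[1] * 3600 + f[2] * 60 + f[3]
        else if (n == 2 && i == 0)
            secs = f[1] * 60 + f[2]
        else
            exit 1                        # unsupported form
        print days * 86400 + secs
    }'
}

# Remaining time for job 4581410 in the example output above:
limit=$(slurm_to_seconds 10:00:00)
used=$(slurm_to_seconds 2:10:56)
echo $(( limit - used ))   # prints 28144
```

awk treats fields such as <code>08</code> as decimal numbers, which avoids the octal-parsing errors that shell arithmetic can raise on leading zeros.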
----
[[#top|Back to Top]]
= Previously Run Jobs =
''Insert info about sacct/sacct-gacrc here''

Latest revision as of 11:27, 17 September 2021