Tmp

From Research Computing Center Wiki
Revision as of 13:43, 14 September 2021 by Ben

Pending or Running Jobs

The easiest way to monitor pending or running jobs is with the Slurm squeue command. Like most Slurm commands, squeue lets you control which columns appear in its output (see man squeue for more information). To save you that trouble, we've created the sq command, a pre-formatted wrapper around squeue with a few additional options.

The default squeue columns are as follows:

JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)

Using sq runs the squeue command but provides the following columns:

JOBID      TIME            TIME_LIMIT      NAME            PARTITION        USER       NODES  CPUS   MIN_MEMORY   PRIORITY   STATE      NODELIST(REASON)

As you can see, sq shows considerably more information than the default squeue formatting, including each job's time limit, CPU count, memory, and priority.
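For reference, a similar layout can be requested from plain squeue through its --format option. The following is a minimal sketch, not the actual sq source: the function name sq_like_format, the column widths, and the choice of %Q for the priority column are assumptions picked to resemble the header shown above, and the real wrapper may differ.

```shell
# Sketch only: approximates the sq column layout with standard squeue format codes.
# sq_like_format is a hypothetical name; widths and the use of %Q are assumptions.
sq_like_format() {
    # %i JOBID, %M TIME, %l TIME_LIMIT, %j NAME, %P PARTITION, %u USER,
    # %D NODES, %C CPUS, %m MIN_MEMORY, %Q PRIORITY, %T STATE, %R NODELIST(REASON)
    printf '%s' '%10i %15M %15l %15j %16P %10u %6D %6C %12m %10Q %10T %R'
}

# Run squeue only where it is installed (i.e. on the cluster).
if command -v squeue >/dev/null 2>&1; then
    squeue --format="$(sq_like_format)"
fi
```

See man squeue for the full list of format codes if you want to add or drop columns.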


Output Columns Explained

  • JOBID: The unique ID of the job.
  • TIME: How much (wall) time has elapsed since the job started, in the format DAYS-HOURS:MINUTES:SECONDS.
  • TIME_LIMIT: The maximum time given for the job to run, in the format DAYS-HOURS:MINUTES:SECONDS.
  • NAME: The name of the job. If not specified in one's submission script, it will default to the name of the submission script (e.g. "sub.sh").
  • PARTITION: The partition to which the job was sent (e.g. batch, highmem_p, gpu_p, etc.).
  • USER: The user who submitted the job.
  • NODES: The number of nodes allocated to the job.
  • CPUS: The number of CPU cores allocated to the job.
  • MIN_MEMORY: The minimum amount of memory requested for the job.
  • PRIORITY: The job's priority, as calculated by Slurm's Multifactor Priority Plugin.
  • STATE: The job's state (e.g. Running, Pending, etc.).
  • NODELIST(REASON): The name of the node(s) on which the job is running or the reason the job has not started yet, if it is pending.
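The TIME and TIME_LIMIT values are easy to read but awkward to compare in scripts. As a rough sketch, the helper below converts a string in the DAYS-HOURS:MINUTES:SECONDS format to seconds. The name slurm_time_to_seconds is hypothetical (it is not part of sq or Slurm), and the sketch assumes the days and hours parts may be omitted when they are zero, as in "3:04".

```shell
# slurm_time_to_seconds is a hypothetical helper, not part of sq or Slurm.
# The body runs in a subshell "( ... )" so its variables and IFS change stay local.
slurm_time_to_seconds() (
    t=$1
    days=0
    # Split off the optional "DAYS-" prefix.
    case $t in
        *-*) days=${t%%-*}; t=${t#*-} ;;
    esac
    # Split the remainder on ":" into positional parameters.
    IFS=:
    set -- $t
    case $# in
        3) hours=$1; minutes=$2; seconds=$3 ;;
        2) hours=0;  minutes=$1; seconds=$2 ;;
        *) exit 1 ;;
    esac
    # Reject empty or non-numeric fields (e.g. a limit shown as UNLIMITED).
    for field in "$days" "$hours" "$minutes" "$seconds"; do
        case $field in
            ''|*[!0-9]*) exit 1 ;;
        esac
    done
    # Strip one leading zero ("08" -> "8") so the value is not read as octal.
    days=${days#0}; hours=${hours#0}; minutes=${minutes#0}; seconds=${seconds#0}
    echo "$(( ((${days:-0} * 24 + ${hours:-0}) * 60 + ${minutes:-0}) * 60 + ${seconds:-0} ))"
)
```

For example, slurm_time_to_seconds 1-02:03:04 prints 93784, and a non-numeric value makes the function return a non-zero status without printing anything.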


sq also has a -h/--help option:

bc06026@ss-sub3 ~$ sq --help

sq - preformatted wrapper for squeue.  See man squeue for more information.

Usage: sq [-T][-p PARTITION][-u USER][--me][-h | --help]

-T	displays submit and start time columns
-p	displays squeue output for a given partition
-u	displays squeue output for a given user
--me	displays squeue output for the user executing this command
-h	displays this help output
--help	displays this help output


Examples

  • See all pending and running jobs: sq
  • See all of your pending and running jobs: sq --me
  • See all pending and running jobs in the highmem_p partition: sq -p highmem_p
  • See all of your pending and running jobs in the batch partition: sq --me -p batch
  • See all of your pending and running jobs, including submit time and start time columns: sq --me -T (note: this requires a wide terminal window or a small font to display without the columns wrapping)
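In scripts, where wide human-oriented columns get in the way, it can be simpler to call squeue directly and ask for a single field. As a sketch, the hypothetical function below tallies job states read from standard input; on the cluster it could be fed by squeue with --noheader and a one-column format. This assumes a Slurm release whose squeue supports --me (otherwise -u "$USER" does the same). Note that in squeue itself -h is short for --noheader, whereas in sq it prints the help text.

```shell
# count_job_states is a hypothetical helper: it reads one job state per line
# and prints "STATE COUNT" pairs sorted by state name.
count_job_states() {
    awk 'NF { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}

# On the cluster only: summarize your own jobs by state.
if command -v squeue >/dev/null 2>&1; then
    squeue --me --noheader --format=%T | count_job_states
fi
```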



Previously Run Jobs

Insert info about sacct here