Difference between revisions of "SLURM-Monitor Jobs"

From ALICE Documentation

Revision as of 12:20, 8 April 2020

Monitor Jobs

To monitor the status of your jobs in the Slurm partitions, use the squeue command. You can only see your own jobs in the queue. The command's options filter and format the output to suit your needs. See the squeue man page (man squeue) for more information.

Squeue Option                   Action
--user=<username>               Lists only entries belonging to username (available to administrators only)
--jobs=<job_id>                 Lists the entry, if any, for job_id
--partition=<partition_name>    Lists only entries belonging to partition_name
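
As a rough illustration of the options above, the sketch below filters the queue by job ID and by partition. The values 537 and quick are placeholders chosen for illustration, and the guard around squeue is only there so the snippet does nothing harmful on a machine without Slurm installed.

```shell
# Placeholder values for illustration; substitute your own job ID and partition.
job_id=537
partition=quick

if command -v squeue >/dev/null 2>&1; then
    # Show only the entry for one job, if it is still in the queue.
    squeue --jobs="$job_id"
    # Show only entries in one partition.
    squeue --partition="$partition"
else
    echo "squeue not found; run this on a cluster login node" >&2
fi
```

The filters can also be combined in a single call, for example squeue --partition=quick --jobs=537.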

Here is an example of using squeue.

         [login1 ~]$ squeue
             JOBID PARTITION      NAME     USER ST      TIME  NODES NODELIST(REASON)
               537     quick  helloWor     user  R      0:47      2 node[004,010]
        

The output of squeue provides the following information:

Squeue Output Column Header    Definition
JOBID                          Unique number assigned to each job
PARTITION                      Partition the job is scheduled to run on, or is running on
NAME                           Name of the job, typically the job script name
USER                           User ID of the job's owner
ST                             Current state of the job (see table below for meaning)
TIME                           Amount of time the job has been running
NODES                          Number of nodes the job is scheduled to run across
NODELIST(REASON)               If running, the list of nodes the job is running on; if pending, the reason the job is waiting
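
The column layout above can be used to pull individual fields out of the default squeue output. The following is a minimal sketch that works on a copy of the example output stored in a variable, so it runs without Slurm; the variable names are hypothetical, and the whitespace splitting assumes the job name contains no spaces.

```shell
# Copy of the example squeue output (header row plus one job).
sample_output='    JOBID PARTITION      NAME     USER ST      TIME  NODES NODELIST(REASON)
      537     quick  helloWor     user  R      0:47      2 node[004,010]'

# Skip the header row (NR > 1), then print JOBID ($1), PARTITION ($2) and ST ($5).
summary=$(printf '%s\n' "$sample_output" | awk 'NR > 1 { print $1, $2, $5 }')
echo "$summary"
```

On a login node the same awk filter could be fed directly, as in squeue | awk 'NR > 1 { print $1, $2, $5 }'.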