Job in Queue

Sometimes a long queue time is an indication that something is wrong, but the cluster could also simply be busy. You can check how much longer your job is expected to wait in the queue with the command:

squeue --start --job <job_id>

Please note that this is only an estimate based on current and historical utilization, and the result can fluctuate. Here is an example of using squeue with the --start and --job options.

[me@nodelogin01~]$ squeue --start --job 384
  JOBID PARTITION     NAME     USER ST          START_TIME  NODES SCHEDNODES   NODELIST(REASON)
    384      main star-lac     user PD 2018-02-12T16:09:31      2     (null)        (Resources)

In the above example, the job is in a pending state because there are currently no resources available that would allow it to launch. The job is expected to start at approximately 16:09:31 on 2018-02-12. This is an estimate: jobs ahead of it may complete sooner, freeing up the necessary resources earlier. If you believe there is a problem with your job starting, and you have checked your scripts for typos, send an email to helpdesk@alice.leidenuniv.nl. Let us know your job ID along with a description of your problem and we can check whether anything is wrong.
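
If you want more detail than squeue prints, the complete job record (including the Reason, StartTime and SchedNodes fields) can be inspected with scontrol. This is a standard Slurm command that is not covered further on this page; replace 384 with your own job ID:

 [me@nodelogin01~]$ scontrol show job 384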

squeue has extended functionality which can be of use if you are wondering about the place your job has in the waiting list. There are lots of options available:

 # squeue -p cpu-long -o %all
 ACCOUNT|TRES_PER_NODE|MIN_CPUS|MIN_TMP_DISK|END_TIME|FEATURES|GROUP|OVER_SUBSCRIBE|JOBID|NAME|COMMENT|TIME_LIMIT|MIN_MEMORY|REQ_NODES|COMMAND|PRIORITY|QOS|REASON||ST|USER|RESERVATION|WCKEY|EXC_NODES|NICE|S:C:T|JOBID|EXEC_HOST|CPUS|NODES|DEPENDENCY|ARRAY_JOB_ID|GROUP|SOCKETS_PER_NODE|CORES_PER_SOCKET|THREADS_PER_CORE|ARRAY_TASK_ID|TIME_LEFT|TIME|NODELIST|CONTIGUOUS|PARTITION|PRIORITY|NODELIST(REASON)|START_TIME|STATE|UID|SUBMIT_TIME|LICENSES|CORE_SPEC|SCHEDNODES|WORK_DIR
 bio|N/A|1|0|2020-07-02T12:57:00|(null)|bio|OK|24791|Omma_R_test|(null)|7-00:00:00|0||/data/vissermcde/Ommatotriton/Konstantinos_dataset/run_R.sh|0.00010384921918|normal|Nodes required for job are DOWN, DRAINED or reserved for jobs in higher priority partitions||PD|vissermcde|(null)|(null)||0|*:*:*|24791|n/a|1|1||24791|1491|*|*|*|N/A|7-00:00:00|0:00||0|cpu-long|446029|(Nodes required for job are DOWN, DRAINED or reserved for jobs in higher priority partitions)|2020-06-25T12:57:00|PENDING|1585|2020-06-24T12:32:22|(null)|N/A|node010|/data/vissermcde/Ommatotriton/Konstantinos_dataset

From this output you can read that this job is planned to execute on node010 (SCHEDNODES) and that it will start at or earlier than 2020-06-25T12:57:00 (START_TIME).

One can also print just those two items:

 # squeue -p cpu-long -o "%u|%S"
 USER|START_TIME
 vissermcde|2020-06-25T12:57:00
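
The format options can also be combined with the usual squeue filters and with sorting. As a sketch (the <username> placeholder and the choice of fields are examples, not output from the cluster), the following lists only your own jobs in the cpu-long partition, showing job ID, user and expected start time, sorted by start time:

 # squeue -p cpu-long -u <username> -o "%i|%u|%S" --sort=S

See man squeue for the full list of format and sort fields.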