SLURM-Partition

From ALICE Documentation

Revision as of 15:15, 9 September 2020 by Schulzrf (talk | contribs) (Partition)

Partition

Slurm organises the resources of a cluster into so-called partitions. Every job is submitted to a partition: either the default partition or one that the user specifies.

The command sinfo lists the available partitions, their state and resources. Its output might look like this:

 [me@nodelogin02]$ sinfo
 PARTITION      AVAIL  TIMELIMIT  NODES  STATE NODELIST
 testing           up    1:00:00      2   idle nodelogin[01-02]
 cpu-short*        up    3:00:00     11    mix node[002-007,013-014,018-020]
 cpu-short*        up    3:00:00      1  alloc node001
 cpu-short*        up    3:00:00      8   idle node[008-012,015-017]
 cpu-medium        up 1-00:00:00     11    mix node[002-007,013-014,018-020]
 cpu-medium        up 1-00:00:00      1  alloc node001
 cpu-medium        up 1-00:00:00      8   idle node[008-012,015-017]
 cpu-long          up 7-00:00:00     11    mix node[002-007,013-014,018-020]
 cpu-long          up 7-00:00:00      1  alloc node001
 cpu-long          up 7-00:00:00      8   idle node[008-012,015-017]
 gpu-short         up    3:00:00     10    mix node[851-860]
 gpu-medium        up 1-00:00:00     10    mix node[851-860]
 gpu-long          up 7-00:00:00     10    mix node[851-860]
 mem               up   infinite      1   idle node801
 notebook-cpu      up   infinite      4    mix node[002-005]
 notebook-cpu      up   infinite      1  alloc node001
 notebook-gpu      up   infinite      2    mix node[851-852]
 playground-cpu    up 7-00:00:00      4    mix node[002-005]
 playground-cpu    up 7-00:00:00      1  alloc node001
 playground-gpu    up 7-00:00:00      2    mix node[851-852]
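Because sinfo prints one line per partition and node state, the total number of nodes in a partition is spread over several lines. The following is a minimal sketch of how these lines could be summed up. It assumes the default sinfo column layout shown above; the function name count_nodes is only illustrative and is not part of Slurm.

```shell
# Sum the NODES column per partition from default `sinfo` output read on stdin.
# count_nodes is a hypothetical helper name, not a Slurm command.
count_nodes() {
    awk 'NR > 1 {                      # skip the header line
             name = $1
             sub(/\*$/, "", name)      # strip the "*" that marks the default partition
             total[name] += $4         # column 4 is NODES
         }
         END { for (name in total) print name, total[name] }' | sort
}

# On a login node (assumes sinfo is on the PATH):
#   sinfo | count_nodes
```

If you only care about a single partition, sinfo also accepts --partition=<partition-name> to restrict its output.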

Currently, partitions on ALICE differ primarily in terms of the available nodes and time limit for a job:

Partition     Timelimit   Nodes  Nodelist          Description
testing       1:00:00     2      nodelogin[01-02]  For basic, short tests of batch scripts
cpu-short     3:00:00     20     node[001-020]     For jobs that require CPU nodes and run for at most 3 hours. This is the default partition
cpu-medium    1-00:00:00  20     node[001-020]     For jobs that require CPU nodes and run for at most 1 day
cpu-long      7-00:00:00  20     node[001-020]     For jobs that require CPU nodes and run for at most 7 days
gpu-short     3:00:00     10     node[851-860]     For jobs that require GPU nodes and run for at most 3 hours
gpu-medium    1-00:00:00  10     node[851-860]     For jobs that require GPU nodes and run for at most 1 day
gpu-long      7-00:00:00  10     node[851-860]     For jobs that require GPU nodes and run for at most 7 days
mem           infinite    1      node801           For jobs that require the high-memory node. There is no time limit for this partition
notebook-cpu  infinite    5      node[001-005]     For interactive jobs that require CPU nodes. There is no time limit for this partition
notebook-gpu  infinite    2      node[851-852]     For interactive jobs that require GPU nodes. There is no time limit for this partition
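One way to read the time limits listed above is to pick the CPU partition with the tightest limit that still covers the expected run time. The sketch below encodes that reading for whole hours. The function pick_cpu_partition is a hypothetical helper written for illustration, and the thresholds are copied from the limits listed above (3 hours, 1 day, 7 days), so they would need updating if the limits change.

```shell
# Hypothetical helper: print the CPU partition with the tightest time limit
# that still covers the requested run time, given in whole hours.
pick_cpu_partition() {
    hours=$1
    if [ "$hours" -le 3 ]; then
        echo cpu-short            # limit 3:00:00
    elif [ "$hours" -le 24 ]; then
        echo cpu-medium           # limit 1-00:00:00
    elif [ "$hours" -le 168 ]; then
        echo cpu-long             # limit 7-00:00:00
    else
        echo "no CPU partition allows ${hours}h" >&2
        return 1
    fi
}

pick_cpu_partition 12
```

The same pattern would apply to the gpu-short, gpu-medium and gpu-long partitions, which share these limits.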

In your batch script, select the partition with the following #SBATCH directive. The same option can also be passed to sbatch on the command line:

#SBATCH --partition=<partition-name>
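As a sketch of where the partition option sits in a complete batch script, the example below requests the cpu-short partition for a ten-minute, single-task job. The job name, the time request and the task count are assumptions chosen for illustration; adapt them to your own job.

```shell
#!/bin/bash
#SBATCH --job-name=partition_demo   # hypothetical job name
#SBATCH --partition=cpu-short       # partition to submit to
#SBATCH --time=00:10:00             # must not exceed the partition's time limit
#SBATCH --ntasks=1

# Slurm sets SLURM_JOB_PARTITION inside a job; fall back to "unknown" elsewhere.
partition="${SLURM_JOB_PARTITION:-unknown}"
echo "Running in partition: ${partition}"
```

Saved under a name such as partition_demo.sh (a hypothetical file name), the script would be submitted with sbatch partition_demo.sh. An option given on the sbatch command line, for example sbatch --partition=cpu-medium partition_demo.sh, takes precedence over the directive in the script.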