GPUs

Some programs can take advantage of the unique hardware architecture of a graphics processing unit (GPU); check your program's documentation for GPU support. A number of nodes on the ALICE cluster are each equipped with multiple GPUs (see the hardware description). We strongly recommend that you always specify how many GPUs your job needs, so that Slurm can schedule other jobs that use the remaining GPUs on the same node.

To request a node with GPUs, choose one of the gpu partitions and add one of the following lines to your script:

--gres=gpu:<number>

or

--gres=gpu:<GPU_type>:<number>

where:

  • <number> is the number of GPUs per node requested.
  • <GPU_type> is one of the following: 2080ti
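
For example, a minimal batch script that asks for one GPU of type 2080ti could look like the sketch below. The partition name, job name, time limit and the final command are only placeholders; adjust them to your own job and to the gpu partitions available on ALICE.

  #!/bin/bash
  #SBATCH --job-name=gpu_example
  #SBATCH --partition=<gpu_partition>   # replace with one of the gpu partitions
  #SBATCH --ntasks=1
  #SBATCH --time=00:10:00
  #SBATCH --gres=gpu:2080ti:1           # one GPU of type 2080ti on the node

  # nvidia-smi prints the GPU(s) that Slurm has assigned to this job
  nvidia-smi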

Just as for CPUs, you can specify the amount of memory that you need per GPU with

 --mem-per-gpu=<number>
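
For example, to request two GPUs and reserve 8 GB of memory for each of them (the values here are only illustrative), you could combine the two options in your batch script:

  #SBATCH --gres=gpu:2
  #SBATCH --mem-per-gpu=8G

As with the other Slurm memory options, <number> is taken to be in megabytes unless you append a unit suffix such as K, M, G or T.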