Running jobs on ALICE

From ALICE Documentation

Revision as of 13:16, 1 September 2020 by Schulzrf

Running a job on ALICE using Slurm

The ALICE cluster uses Slurm (Simple Linux Utility for Resource Management) for job scheduling. Slurm is an open-source job scheduler that allocates compute resources on clusters to jobs. It has been deployed at various national and international computing centres and is used by approximately 60% of the TOP500 supercomputers in the world.

The following pages will give you a basic overview of Slurm on ALICE. You can learn much more about Slurm and its commands from the official Slurm website.

To use Slurm commands, you must first log in to ALICE. For information on how to log in to the ALICE login nodes, see section Login to cluster.

Requesting job resources

ATTENTION: We recommend that you submit sbatch Slurm jobs with the #SBATCH --export=NONE option to establish a clean environment; otherwise Slurm will propagate your current environment variables to the job. This could impact the behavior of the job, particularly for MPI jobs.

To use the HPC Slurm compute nodes, you must first log in to a head node, hpc-login3 or hpc-login2, and submit a job.

  • To request an interactive job, use the salloc command.
  • To submit a job script, use the sbatch command.
  • To check on the status of a job already in the Slurm queue, use the squeue command. To check the state of partitions and nodes, use the sinfo command.
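
As a minimal sketch of the batch workflow described above, the script below requests a small job with a clean environment. The job name, the resource values and the file name mentioned afterwards are illustrative assumptions, not ALICE defaults.

```shell
#!/bin/bash
# Hypothetical job name, chosen for illustration only.
#SBATCH --job-name=hello
# Start from a clean environment, as recommended above.
#SBATCH --export=NONE
# One process and five minutes of walltime (illustrative values).
#SBATCH --ntasks=1
#SBATCH --time=00:05:00

# Record which machine the script ran on (a compute node when run as a job).
node_name="$(uname -n)"
echo "Hello from $node_name"
```

Saved as hello.slurm (a hypothetical file name), the script could be submitted with sbatch hello.slurm and then monitored with squeue -u $USER.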

Slurm basics

Common Slurm commands

The following is a list of common Slurm commands that will be discussed in more detail in this chapter and the following ones.

Command    Definition
sbatch     Submit a job script for execution (queued)
scancel    Cancel a pending or running job
scontrol   Show detailed job status; several options are only available to root
sinfo      Display the state of partitions and nodes
squeue     Display the state of jobs in the queue
salloc     Request resources for an interactive job and start it in real time

For a full overview, have a look at the Slurm documentation or enter man <command> while logged in to ALICE.
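
The following sketch shows how two of these commands might be combined to inspect a single job. The function name job_status is a hypothetical helper, not a Slurm command, and the check at the top only guards against running it on a machine without Slurm.

```shell
# Hypothetical helper (not part of Slurm): inspect one job by its ID.
job_status() {
    if ! command -v scontrol >/dev/null 2>&1; then
        echo "Slurm commands not found; log in to ALICE first." >&2
        return 1
    fi
    # Detailed record of the job with the given ID.
    scontrol show job "$1"
    # One-line queue entry for the same job.
    squeue --jobs="$1"
}
```

On ALICE, job_status 12345 would print both views for a job with the made-up ID 12345, and scancel 12345 would cancel that job.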

Environment variables

Any environment variables that are set in your shell when you run the sbatch command will be passed to your job. Because the job then depends on the state of that shell, it is best to set any environment variables that your program needs in the job script itself. This also makes it easier to reproduce your job results later, if necessary.

In addition to the environment variables that you set yourself, Slurm provides some environment variables of its own that you can use in your job scripts. Some of the common Slurm environment variables are listed in the table below. For additional information, see the man page for sbatch.

Environment variable    Definition
$SLURM_JOB_ID           ID of the job allocation
$SLURM_SUBMIT_DIR       Directory from which the job was submitted
$SLURM_JOB_NODELIST     List of nodes allocated to the job
$SLURM_NTASKS           Total number of tasks for the job (equal to the number of cores when each task uses one CPU)

NOTE: Slurm's input environment variables (for example SBATCH_PARTITION) override any options set in a batch script. Command-line options override any previously set environment variables.
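
A job script might read the variables from the table as sketched below. The fallback values after ":-" are assumptions added here so that the script also runs outside Slurm; inside a job, Slurm supplies the real values.

```shell
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=00:05:00

# Fallbacks (after :-) are used only when the script runs outside Slurm.
job_id="${SLURM_JOB_ID:-none}"
submit_dir="${SLURM_SUBMIT_DIR:-$PWD}"
nodes="${SLURM_JOB_NODELIST:-$(uname -n)}"
ntasks="${SLURM_NTASKS:-1}"

# Work in the directory from which the job was submitted.
cd "$submit_dir"
echo "Job $job_id: $ntasks task(s) on $nodes, submitted from $submit_dir"
```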

Specifying resources for jobs

Slurm has its own syntax for requesting compute resources. Below is a summary table of some commonly requested resources and the Slurm syntax to request them. For a complete listing of request syntax, run the command man sbatch.

Syntax                            Meaning
sbatch/salloc                     Submit batch/interactive job
  --ntasks=<number>               Number of processes to run (default is 1)
  --time=<hh:mm:ss>               The walltime or running time of your job (default is 00:30:00)
  --mem=<number>                  Total memory (single node)
  --mem-per-cpu=<number>          Memory per processor core
  --constraint=<attribute>        Node property to request (e.g. avx, IB)
  --partition=<partition_name>    Request specified partition/queue

For more details on Slurm syntax, see below or the Slurm documentation at slurm.schedmd.com/sbatch.html
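
Several of these options could be combined in one job script, as in the sketch below. All values are illustrative assumptions. The lines starting with "##SBATCH" are disabled placeholders that sbatch ignores until one "#" is removed and, for the partition, a real name is filled in.

```shell
#!/bin/bash
# Four processes.
#SBATCH --ntasks=4
# Two hours of walltime.
#SBATCH --time=02:00:00
# Total memory on a single node.
#SBATCH --mem=8G
# Placeholder: fill in a partition name, then remove one '#'.
##SBATCH --partition=<partition_name>
# Optional node property, disabled here.
##SBATCH --constraint=avx

# SLURM_NTASKS mirrors --ntasks inside a job; the fallback applies outside Slurm.
ntasks="${SLURM_NTASKS:-4}"
echo "Requested $ntasks task(s)"
```

Options given on the command line, for example sbatch --time=00:30:00 resources.slurm (with a hypothetical file name), take precedence over the directives in the script.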

Determining what resources to request

Requesting the right amount of resources for jobs is one of the most essential aspects of using Slurm (or running any jobs on an HPC system).

Before you submit a job for batch processing, it is important to know what the requirements of your program are so that it can run properly. Each program and workflow has unique requirements so we advise that you determine what resources you need before you write your script.

Keep in mind that increasing the amount of compute resources may also increase the amount of time that your job spends waiting in the queue. Within some limits, you may request whatever resources you need but bear in mind that other researchers need to be able to use those resources as well.

It is vital that you specify the resources you need in as much detail as possible. This will help Slurm to schedule your job better and to allocate free resources to other users.

Below are some ways to specify the resources to ask for in your job script. These are options defined for the sbatch and salloc commands. There are additional options that you can find by checking the man pages for each command.

Nodes, Tasks and CPUs per task

In Slurm terminology, a task is an instance of a running program.

If your program supports communication across computers or you plan on running independent tasks in parallel, request multiple tasks with the following option. The default value is 1.

--ntasks=<number>

For more advanced programs, you can request multiple nodes, multiple tasks, and multiple CPUs per task and/or per node.

If you need multiple nodes, you can define the number of nodes like this:

--nodes=<number>
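
As a sketch of how these requests multiply, the script below asks for two nodes with four tasks each and two CPUs per task. The sbatch options --ntasks-per-node and --cpus-per-task are not described above and are used here as an assumption about what your program needs. The fallback values only apply when the script runs outside Slurm.

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=2

# Inside a job Slurm sets these variables; the fallbacks mirror the directives.
nodes="${SLURM_JOB_NUM_NODES:-2}"
tasks_per_node="${SLURM_NTASKS_PER_NODE:-4}"
cpus_per_task="${SLURM_CPUS_PER_TASK:-2}"

# Total number of cores requested across all nodes.
total_cpus=$(( nodes * tasks_per_node * cpus_per_task ))
echo "$nodes node(s) x $tasks_per_node task(s) x $cpus_per_task CPU(s) = $total_cpus cores"
```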

Memory

All programs require a certain amount of memory to function properly. To see how much memory your program needs, you can check its documentation or run it in an interactive session and use the top command to profile it. To specify the memory for your job, use the --mem-per-cpu option.

--mem-per-cpu=<number>

Here, <number> is the memory per processor core. It is interpreted as megabytes unless you add a suffix such as K, M, G or T (for example 2G). The default is 1GB.
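
Because the request is per core, the total memory of a job grows with the number of cores. The sketch below works this out for an illustrative request of four tasks with 2 GB each, assuming one CPU per task. The fallback values only apply when the script runs outside Slurm.

```shell
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --mem-per-cpu=2G

# Inside a job Slurm sets these variables; SLURM_MEM_PER_CPU is in megabytes.
ntasks="${SLURM_NTASKS:-4}"
mem_per_cpu_mb="${SLURM_MEM_PER_CPU:-2048}"

# Total memory requested, assuming one CPU per task.
total_mb=$(( ntasks * mem_per_cpu_mb ))
echo "Total memory request: ${total_mb} MB"
```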

Walltime

If you do not define how long your job will run, it will default to 30 minutes. The maximum walltime that is available depends on the partition that you use.

To specify the walltime for your job, use the --time option.

--time=<hh:mm:ss>

Here, <hh:mm:ss> represents the hours, minutes and seconds requested. If a job does not complete within the walltime specified in the script, Slurm will terminate it.
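
One way to choose a walltime is to time a short test run and add a safety margin. The sketch below assumes a measured runtime of 4500 seconds and a 20% margin. Both numbers and the helper name seconds_to_walltime are hypothetical.

```shell
# Hypothetical helper: format a number of seconds as hh:mm:ss for --time.
seconds_to_walltime() {
    printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( ($1 % 3600) / 60 )) $(( $1 % 60 ))
}

# Assumed runtime of a test run, in seconds.
measured=4500
# Add a 20% safety margin using integer arithmetic.
requested=$(( measured * 120 / 100 ))
echo "--time=$(seconds_to_walltime "$requested")"
```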

GPUs

Some programs can take advantage of the unique hardware architecture of a graphics processing unit (GPU). Check the documentation of your program for compatibility. A number of nodes on the ALICE cluster are each equipped with multiple GPUs (see the hardware description). We strongly recommend that you always specify how many GPUs you need for your job. This way, Slurm can schedule other jobs on the node that use the remaining GPUs.

To request a node with GPUs, choose one of the gpu partitions and add one of the following lines to your script:

--gres=gpu:<number>

or

--gres=gpu:<GPU_type>:<number>

where:

  • <number> is the number of GPUs per node requested.
  • <GPU_type> is one of the following: 2080ti

Just as with --mem-per-cpu for CPUs, you can specify the amount of memory that you need for each allocated GPU with

--mem-per-gpu=<number>
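
A GPU job script might look like the sketch below. The memory and walltime values are illustrative assumptions, and the partition line is a disabled placeholder that needs the name of one of the gpu partitions. On many Slurm installations the variable CUDA_VISIBLE_DEVICES lists the GPUs assigned to the job. Whether ALICE sets it is an assumption here.

```shell
#!/bin/bash
# Placeholder: fill in one of the gpu partitions, then remove one '#'.
##SBATCH --partition=<partition_name>
# One GPU of the type listed above.
#SBATCH --gres=gpu:2080ti:1
# Illustrative memory request per allocated GPU.
#SBATCH --mem-per-gpu=8G
#SBATCH --time=00:10:00

# Assumption: CUDA_VISIBLE_DEVICES is set inside GPU jobs; "none" is a fallback.
gpus="${CUDA_VISIBLE_DEVICES:-none}"
echo "GPUs visible to this job: $gpus"

# Show GPU details only if the NVIDIA tool is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi
fi
```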