
Difference between revisions of "When will my job start?"

From ALICE Documentation

 
(One intermediate revision by the same user not shown)
Line 1: Line 1:
 
===When will my job start?===
 
  
In practice it’s impossible to predict when your job(s) will start, since most currently running jobs will finish before their requested walltime expires, and other users may submit new jobs that are assigned a higher priority than your job(s).
 
 
{{:Job schduling}}
 
 
ALICE uses a fair-share scheduling policy (see [[Policies]]). There is no guarantee on when a job will start, since it depends on a number of factors. One of these factors is the priority of the job, which is determined by:
* historical use: the aim is to balance usage over users, so infrequent (in terms of total compute time used) users get a higher priority
 
* requested resources (number of cores, walltime, memory, ...)
 
* time waiting in queue: queued jobs get a higher priority over time
 
* user limits: this avoids having a single user use the entire cluster. This means that each user can only use a part of the cluster.
 
 
Some other factors are how busy the cluster is, how many workernodes are active, the resources (e.g., number of cores, memory) provided by each workernode, ...
 
  
 +
{{:Job scheduling}}
 
It might be beneficial to request fewer resources (e.g., not requesting all cores in a workernode), since the scheduler can then more easily find a “gap” to fit the job into.

Latest revision as of 12:46, 20 April 2020

When will my job start?

ALICE uses a fair-share scheduling policy (see Policies). There is no guarantee on when a job will start, since it depends on a number of factors. One of these factors is the priority of the job.

  • All our resources use a fair-share scheduling policy.
  • There is no guarantee on when a job will start, so plan ahead!
  • Job priority is determined by:
    • historical use
      • the aim is to balance usage over users
      • infrequent users (in terms of total compute time used) get a higher priority
    • requested resources (number of nodes, jobs, walltime)
      • larger resource request => lower priority
    • time waiting in queue
      • queued jobs get higher priority over time
    • user limits
      • prevent a single user from filling up an entire cluster
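The interplay of these factors can be illustrated with a small toy model. This is only a sketch: the `Job` fields, the `toy_priority` function, and all weights are hypothetical and chosen for illustration. They are not the formula the ALICE scheduler actually uses.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    user_recent_usage: float   # hypothetical: user's share of recent cluster usage, 0..1
    requested_fraction: float  # hypothetical: fraction of the cluster requested, 0..1
    hours_in_queue: float      # time the job has been waiting so far


def toy_priority(job: Job, max_age_hours: float = 168.0) -> float:
    """Illustrative score only; a higher value means the job is considered earlier."""
    fair_share = 1.0 - job.user_recent_usage             # infrequent users score higher
    size = 1.0 - job.requested_fraction                  # larger requests score lower
    age = min(job.hours_in_queue / max_age_hours, 1.0)   # waiting raises the score, capped
    # The weights below are invented for this example.
    return 0.5 * fair_share + 0.2 * size + 0.3 * age


jobs = [
    Job("heavy_user_big_job", user_recent_usage=0.6, requested_fraction=0.5, hours_in_queue=1.0),
    Job("light_user_small_job", user_recent_usage=0.05, requested_fraction=0.05, hours_in_queue=1.0),
]
ranked = sorted(jobs, key=toy_priority, reverse=True)
print([job.name for job in ranked])
```

In this toy model the small job of the infrequent user is ranked first, and any job's score grows the longer it waits, which mirrors the qualitative rules listed above.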
Partitions/queues

Partition       MaxNodesPerUser  MaxJobsinQueue (running and waiting)  MaxTime
cpu-short       no-limit         100                                   2 hours
cpu-medium      18               30                                    1 day
cpu-long        15               30                                    7 days
gpu-short       no-limit         100                                   2 hours
gpu-medium      8                16                                    1 day
gpu-long        7                14                                    7 days
mem             15               30                                    7 days
test            no-limit         4                                     1 hour
playground-cpu  no-limit         1                                     infinite
playground-gpu  no-limit         1                                     infinite
notebook-cpu    no-limit         1                                     infinite
notebook-gpu    no-limit         1                                     infinite
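
As a rough aid for reading the MaxTime column, the sketch below looks up the CPU partition with the tightest time limit that still fits a requested walltime. The helper name `shortest_cpu_partition` is hypothetical, and the limits are simply copied from the listing above. Whether a shorter partition actually leads to an earlier start is not stated here and depends on the current load.

```python
from typing import Optional

# MaxTime per CPU partition in hours, copied from the partition listing above.
CPU_PARTITION_MAX_HOURS = {
    "cpu-short": 2,
    "cpu-medium": 24,
    "cpu-long": 7 * 24,
}


def shortest_cpu_partition(walltime_hours: float) -> Optional[str]:
    """Return the CPU partition with the smallest MaxTime that fits, or None."""
    by_limit = sorted(CPU_PARTITION_MAX_HOURS.items(), key=lambda item: item[1])
    for name, max_hours in by_limit:
        if walltime_hours <= max_hours:
            return name
    return None  # the request exceeds every CPU partition's MaxTime


print(shortest_cpu_partition(1.5))
print(shortest_cpu_partition(30))
print(shortest_cpu_partition(200))
```

A request of 1.5 hours fits `cpu-short`, 30 hours needs `cpu-long`, and 200 hours exceeds the 7-day limit, so the helper returns `None`.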

It might be beneficial to request fewer resources (e.g., not requesting all cores in a workernode), since the scheduler can then more easily find a “gap” to fit the job into.
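
The “gap” idea can be made concrete with a minimal sketch. It assumes, purely for illustration, that a gap is a number of idle cores available for a limited time, and that a job fits when it needs no more cores and no more walltime than the gap offers. The function `fits_in_gap` and all numbers are hypothetical; the real scheduler's logic is not described here.

```python
def fits_in_gap(free_cores: int, free_hours: float, job_cores: int, job_hours: float) -> bool:
    """Return True if the job needs no more cores and no more time than the gap offers."""
    return job_cores <= free_cores and job_hours <= free_hours


# Hypothetical gap: 8 cores are idle for the next 3 hours.
gap = {"free_cores": 8, "free_hours": 3.0}

print(fits_in_gap(**gap, job_cores=24, job_hours=2.0))  # large request: False
print(fits_in_gap(**gap, job_cores=4, job_hours=2.0))   # small request: True
```

Under these assumptions the 4-core job can start in the gap right away, while the 24-core job has to wait until more cores become free.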