When will my job start?

In practice it is impossible to predict exactly when your job(s) will start, since most currently running jobs finish before their requested walltime expires, and other users may submit new jobs that are assigned a higher priority than yours.
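Assuming ALICE uses the Slurm scheduler (the fair-share and walltime terminology suggests this, but check the cluster documentation), you can ask the scheduler for its current estimate of when a pending job will start; treat the output as indicative only, since the estimate shifts whenever running jobs finish early or new jobs arrive:

  # Ask Slurm for its current start-time estimate of your pending jobs
  # (purely indicative: it changes whenever running jobs finish early or
  #  higher-priority jobs are submitted)
  squeue --start -u $USER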

ALICE uses a fair-share scheduling policy (see Policies). There is no guarantee on when a job will start, since it depends on a number of factors. One of these factors is the priority of the job, which is determined by the following (see the sketch after this list for how to inspect these factors):

  • historical use: the aim is to balance usage over users, so users with little historical use (in terms of total compute time) get a higher priority
  • requested resources (number of cores, walltime, memory, ...)
  • time waiting in queue: queued jobs gain priority over time
  • user limits: to prevent a single user from occupying the entire cluster, each user can only use a part of it at any time
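If the cluster runs Slurm (assumed here), the priority components listed above can be inspected for your pending jobs; the exact columns shown depend on the site's priority configuration, so this is only a sketch:

  # Break a pending job's priority down into its components
  # (AGE = time waiting in queue, FAIRSHARE = historical use, JOBSIZE = requested resources)
  sprio -u $USER

  # Show your fair-share usage and resulting fair-share factor
  sshare -u $USER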

Some other factors are how busy the cluster is, how many worker nodes are active, the resources (e.g., number of cores, memory) provided by each worker node, and so on.
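Under the same Slurm assumption, a quick way to see how busy the cluster is and what each worker node provides is the sinfo command; partition names and node counts will differ per cluster:

  # Overview of partitions and node states
  # (alloc = fully in use, mix = partially in use, idle = free, drain/down = unavailable)
  sinfo

  # Node-oriented long listing, including CPUs and memory per node
  sinfo -N -l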

It might be beneficial to request fewer resources (e.g., not requesting all cores in a worker node), since the scheduler can then more easily find a “gap” to fit the job into.
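As a sketch of this idea (again assuming Slurm; the partition name, resource figures, and program are placeholders, not ALICE-specific values), a batch script that requests only part of a worker node could look like:

  #!/bin/bash
  #SBATCH --job-name=small_job       # descriptive job name
  #SBATCH --partition=<partition>    # placeholder: pick a partition that exists on the cluster
  #SBATCH --ntasks=1                 # a single task ...
  #SBATCH --cpus-per-task=4          # ... using only 4 cores instead of a whole node
  #SBATCH --mem=8G                   # request only the memory the job actually needs
  #SBATCH --time=02:00:00            # a realistic walltime makes it easier to fit the job in

  srun ./my_program                  # placeholder executable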