Difference between revisions of "When will my job start?"
From ALICE Documentation
Latest revision as of 12:46, 20 April 2020
When will my job start?
ALICE uses a fair-share scheduling policy (see Policies). There is no guarantee on when a job will start, since this depends on a number of factors. One of these factors is the priority of the job; the points below summarise how it is determined.
- All our resources use a fair-share scheduling policy.
- There are no guarantees on when a job will start, so plan ahead!
- Job priority is determined by:
  - historical use: the aim is to balance usage over users, so infrequent users (in terms of total compute time used) get a higher priority
  - requested resources (number of nodes, jobs, and walltime)
    - larger resource request => lower priority
  - time waiting in queue
    - queued jobs get higher priority over time
  - user limits (per partition, see the table below)
    - these prevent a single user from filling up an entire cluster
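To make the direction of these effects concrete, here is a toy model in Python. It is only an illustrative sketch: the `Job` fields, the `toy_priority` function, the weights, and the formulas are all hypothetical and are not the scheduler's actual calculation. It only reproduces the qualitative rules listed above (less historical use, smaller requests, and longer waiting each raise priority).

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job description, used for illustration only."""
    name: str
    recent_usage_hours: float  # owner's historical compute time
    requested_nodes: int
    walltime_hours: float
    hours_in_queue: float


def toy_priority(job, w_fairshare=1.0, w_size=1.0, w_age=1.0):
    """Return an illustrative priority score; higher means earlier start.

    This is NOT the real scheduler formula. Each term lies between 0 and 1
    and moves in the direction described in the list above.
    """
    # Less historical use -> higher fair-share term.
    fairshare = 1.0 / (1.0 + job.recent_usage_hours)
    # Larger request (nodes x walltime) -> lower size term.
    size = 1.0 / (1.0 + job.requested_nodes * job.walltime_hours)
    # Longer wait in the queue -> higher age term.
    age = job.hours_in_queue / (1.0 + job.hours_in_queue)
    return w_fairshare * fairshare + w_size * size + w_age * age


if __name__ == "__main__":
    jobs = [
        Job("heavy-user-small-job", 500.0, 1, 2.0, 1.0),
        Job("light-user-small-job", 5.0, 1, 2.0, 1.0),
        Job("light-user-big-job", 5.0, 10, 24.0, 1.0),
    ]
    # Print the jobs from highest to lowest toy priority.
    for job in sorted(jobs, key=toy_priority, reverse=True):
        print(f"{job.name}: {toy_priority(job):.3f}")
```

The equal default weights are arbitrary; on a real cluster the relative importance of each factor is set by the administrators.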
Partition | MaxNodesPerUser | MaxJobsinQueue (running and waiting) | MaxTime |
---|---|---|---|
cpu-short | no-limit | 100 | 2 hours |
cpu-medium | 18 | 30 | 1 day |
cpu-long | 15 | 30 | 7 days |
gpu-short | no-limit | 100 | 2 hours |
gpu-medium | 8 | 16 | 1 day |
gpu-long | 7 | 14 | 7 days |
mem | 15 | 30 | 7 days |
test | no-limit | 4 | 1 hour |
playground-cpu | no-limit | 1 | infinite |
playground-gpu | no-limit | 1 | infinite |
notebook-cpu | no-limit | 1 | infinite |
notebook-gpu | no-limit | 1 | infinite |
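The limits in the table can be checked before submitting a job. The following Python sketch copies the values from the table into a dictionary and reports which limits a planned request would exceed. The names `LIMITS` and `check_request` are hypothetical helpers, not part of any ALICE tool, and the sketch assumes that `nodes` is the total number of nodes the user would occupy in that partition, including the new job.

```python
import math

# (MaxNodesPerUser, MaxJobsinQueue, MaxTime in hours), copied from the table.
# None means "no-limit"; math.inf means "infinite".
LIMITS = {
    "cpu-short": (None, 100, 2),
    "cpu-medium": (18, 30, 24),
    "cpu-long": (15, 30, 7 * 24),
    "gpu-short": (None, 100, 2),
    "gpu-medium": (8, 16, 24),
    "gpu-long": (7, 14, 7 * 24),
    "mem": (15, 30, 7 * 24),
    "test": (None, 4, 1),
    "playground-cpu": (None, 1, math.inf),
    "playground-gpu": (None, 1, math.inf),
    "notebook-cpu": (None, 1, math.inf),
    "notebook-gpu": (None, 1, math.inf),
}


def check_request(partition, nodes, walltime_hours, jobs_in_queue=0):
    """Return a list of limit violations (empty if the request fits).

    jobs_in_queue is the number of jobs the user already has running or
    waiting in this partition; the new job is counted on top of that.
    """
    max_nodes, max_jobs, max_hours = LIMITS[partition]
    problems = []
    if max_nodes is not None and nodes > max_nodes:
        problems.append(f"{nodes} nodes exceeds MaxNodesPerUser={max_nodes}")
    if jobs_in_queue + 1 > max_jobs:
        problems.append(f"queue would hold {jobs_in_queue + 1} jobs, limit is {max_jobs}")
    if walltime_hours > max_hours:
        problems.append(f"{walltime_hours} h exceeds MaxTime={max_hours} h")
    return problems


if __name__ == "__main__":
    # A 3-hour job does not fit in cpu-short, but does fit in cpu-medium.
    print(check_request("cpu-short", nodes=1, walltime_hours=3))
    print(check_request("cpu-medium", nodes=1, walltime_hours=3))
```

An empty list means the request stays within the limits for that partition; it says nothing about when the job will actually start.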
It might be beneficial to request fewer resources (e.g., not requesting all cores in a worker node), since the scheduler can then more easily find a “gap” to fit the job into.