Best Practices - Submitting Jobs

From ALICE Documentation

Latest revision as of 14:15, 29 June 2020

Best practices

  1. Don't ask for more time than you really need.  The scheduler will have an easier time finding a slot for a 2-hour request than for a 48-hour one.  After a job runs, it reports the time used, which you can use as a reference for future jobs.  However, don't cut the time too tight: if something like shared I/O activity slows the job down and it runs out of time, it will fail.
  2. Test your submission scripts.  Start small.  You can use the testing queue, which has a higher priority but a short run time.
  3. Respect memory limits.  If your application needs more memory than is available, your job could fail and leave the node in a state that requires manual intervention.
  4. Use the testing queue.  It has a higher priority, which makes it useful for running tests that can complete in less than 10 minutes.
  5. Do not run scripts that automate job submissions.  Executing large numbers of sbatch commands in rapid succession can overload the system's scheduler and degrade overall system performance.  A better alternative is to submit job arrays.
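
As an illustration of the points above, a job array lets a single sbatch call queue many similar tasks while still stating modest time and memory requests. The following is a minimal sketch, not a site-approved template: the partition name `testing`, the 1G memory request, the array range, and the `input_N.txt` naming scheme are all assumptions made for this example and should be replaced with values appropriate for your job.

```shell
#!/bin/bash
# Hypothetical job-array script. The partition name, limits, and file
# names below are assumptions for illustration only.
#SBATCH --job-name=array_demo
#SBATCH --partition=testing            # assumed name of the testing queue
#SBATCH --time=00:10:00                # request only the time you need
#SBATCH --mem=1G                       # stay within the node's memory limit
#SBATCH --array=1-10                   # ten tasks from one sbatch call
#SBATCH --output=array_demo_%A_%a.out  # %A = array job ID, %a = task index

# Hypothetical helper: map an array task index to its own input file.
input_for_task() {
    echo "input_${1}.txt"
}

# Slurm sets SLURM_ARRAY_TASK_ID for each task; default to 1 so the
# script can also be dry-run outside the scheduler.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

echo "Task ${task_id} would process $(input_for_task "${task_id}")"
```

Saved as, say, `array_demo.sh` (a hypothetical file name), the script would be submitted once with `sbatch array_demo.sh`, and the scheduler would then start the ten tasks itself instead of receiving ten separate submissions.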