Best Practices - Shared File System

From ALICE Documentation

Your I/O activity can have dramatic effects on the performance of your jobs and on those of other users. The general advice is: if you are uncertain how to improve your I/O activity, ask for help. The time spent can be saved many times over in faster job execution.

  1. Be aware of I/O load. If your workflow creates a lot of I/O activity then creating dozens of jobs doing the same thing may be detrimental.
  2. Avoid storing many files in a single directory. Hundreds of files is probably ok; tens of thousands is not.
  3. Avoid opening and closing files repeatedly in tight loops.  If possible, open files once at the beginning of your workflow / program, then close them at the end.
  4. Watch your quotas.  You are limited in both capacity and file count; use "uquota" to check them. Note that in /home the scheduler writes files to a hidden directory assigned to you, which counts against your quota.
  5. Avoid writing frequent snapshot files, which can stress the storage.
  6. Limit file copy sessions. You share the bandwidth with others.  Two or three scp sessions are probably ok; >10 is not.
  7. Consolidate files. If you are transferring many small files consider collecting them in a tarball first.
  8. Use parallel I/O where it is available, e.g. "module load phdf5".
  9. Use local storage for working space. Using the local storage on each node for your data will improve the performance of your job and reduce I/O load on the shared file systems.
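The local-storage pattern above can be sketched as: stage heavy intermediate I/O in node-local scratch, then copy only the final result back to the shared file system. On many HPC systems $TMPDIR points at node-local scratch, but check the ALICE documentation for the actual path; the environment variable, file names, and function name here are assumptions.

```python
# Sketch: do intermediate I/O in node-local scratch, copy one final
# file back to shared storage.  $TMPDIR as the local-scratch location
# is an assumption -- verify against your site's documentation.
import os
import shutil
import tempfile

def run_in_local_scratch(final_dir: str) -> str:
    # Node-local scratch if TMPDIR is set, system default otherwise.
    scratch = tempfile.mkdtemp(dir=os.environ.get("TMPDIR"))
    out = os.path.join(scratch, "output.dat")
    with open(out, "w") as f:           # heavy intermediate I/O stays local
        f.write("...intermediate results...\n")
    os.makedirs(final_dir, exist_ok=True)
    dest = shutil.copy(out, final_dir)  # single copy back to shared storage
    shutil.rmtree(scratch)              # clean up local scratch
    return dest
```

Only one write hits the shared file system, however much intermediate I/O the job performs locally.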