File systems on ALICE
|File system||Name||Quota||Speed||Shared between nodes||Expiration||Backup|
|Home||/home||100 GB||Normal||Yes||None||Nightly incremental|
|Scratch||/scratchdata||10 TB||Fast||No||End of job||No|
|Scratch-shared||/data||(N.A. 57 TB)||Normal||Yes||At most 28 days||No|
|Software||/cm/shared||(N.A. read-only)||Normal||Yes||None||Nightly incremental|
The home file system
The home file system contains the files you normally use. By default, you have a quota of 100 GB. Your current usage is shown when you type in the command:
The home file system is a network file system (NFS) that is available on all login and compute nodes, so your jobs can access it from any node. The downside is that the home file system is not particularly fast, especially when handling metadata: creating and deleting files, opening and closing files, making many small updates to files, and so on.
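One common way to reduce metadata load on a network file system is to keep many small files bundled in a single archive and unpack them only on faster storage. The sketch below is a generic illustration, not an ALICE-specific tool; the function names `pack_dir` and `unpack_to` are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers (not part of ALICE): bundle a directory of small files
# into one archive so the network file system handles a single larger file.

# pack_dir SRC_DIR ARCHIVE
# Store the contents of SRC_DIR in a gzip-compressed tar file.
pack_dir() {
    local src_dir="$1" archive="$2"
    tar -czf "$archive" -C "$src_dir" .
}

# unpack_to ARCHIVE DEST_DIR
# Extract the archive into DEST_DIR, creating the directory if needed.
unpack_to() {
    local archive="$1" dest_dir="$2"
    mkdir -p "$dest_dir" && tar -xzf "$archive" -C "$dest_dir"
}
```

Write the archive outside the directory being packed, otherwise `tar` would try to include the archive in itself.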
Backup & restore
- We do nightly incremental backups.
- Files that are open at the time of backup will be skipped.
- We can restore accidentally removed files and directories from up to 15 days back, provided they already existed at the time of the last successful backup.
The scratch file system
The scratch file system is intended as fast, temporary storage that can be used while running a job. Every compute node in the ALICE system contains a local disk for the scratch file system that can only be accessed by that particular node. There is no quota for the scratch file system; its use is ultimately limited by the capacity of these disks (see the description of the ALICE system). Scratch disks are not backed up and are cleaned at the end of a job.
Since the disks are local, read and write operations on the scratch file system are much faster than on the home file system. This makes the scratch file system very suitable for I/O-intensive operations.
How to best use scratch
In general, the best way to use scratch is to copy your input files from the home or data file system to scratch at the start of a job, create all temporary files needed by your job on scratch (assuming they don't need to be shared with other nodes), and copy all output files back to the home or data file system at the end of the job. There are two things to note:
- A directory will be created for you upon start of a job on a compute node. The directory name is
SLURM_JOB_USER is your ALICE username and
- Don't forget to copy your results back to the home or data file system! Scratch is cleaned and the directory is removed after your job finishes, so your results will be lost if you skip this step.
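The stage-in, compute, stage-out pattern described above could be sketched as a small shell function. This is an illustration under assumptions, not an official ALICE job template: the function name `stage_and_run` and the `input/` and `output/` subdirectory layout are hypothetical, and the scratch directory must be passed in explicitly (use the directory that is created for your job).

```shell
#!/bin/bash
# Hypothetical helper (not part of ALICE):
#   stage_and_run INPUT_DIR SCRATCH_DIR RESULT_DIR COMMAND [ARGS...]
# 1. copies INPUT_DIR to SCRATCH_DIR/input
# 2. runs COMMAND with SCRATCH_DIR as the working directory
# 3. copies SCRATCH_DIR/output to RESULT_DIR, even if COMMAND failed
stage_and_run() {
    local input_dir="$1" scratch_dir="$2" result_dir="$3"
    shift 3   # the remaining arguments form the command to run

    mkdir -p "$scratch_dir/input" "$scratch_dir/output" "$result_dir" || return 1

    # Stage in: copy the input files to the fast local disk.
    cp -r "$input_dir"/. "$scratch_dir/input/" || return 1

    # Compute: run inside scratch so temporary files stay on the local disk.
    ( cd "$scratch_dir" && "$@" )
    local status=$?

    # Stage out: save whatever was written to output/ before scratch is cleaned.
    cp -r "$scratch_dir/output"/. "$result_dir/" || return 1

    return "$status"
}
```

In a job script this might be called as `stage_and_run "$HOME/myinput" "$MY_SCRATCH" "$HOME/myresults" ./my_program`, where `MY_SCRATCH` and the other paths are placeholders. Copying results back even when the command fails keeps partial output available for debugging.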
In addition to temporary storage that is local to each node (like scratch), you may need some temporary storage that is shared among nodes. For this we have a shared scratch disk accessible through
The size of this shared scratch space is currently 1 TB and there is no quota for individual users. Note that this shared scratch has two disadvantages compared to the local scratch disk:
- The speed of /data is similar to the home file system and thus slower than the local scratch disk.
- You share /data with all other users, and there may not be enough space to write all the files you want. Think carefully about how your job will behave if it tries to write to /data and there is insufficient space: it would be a waste of budget if the results of a long computation were lost because of it.
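One way to guard against a full shared disk is to check the free space before writing large results. The sketch below uses the POSIX output format of `df`; the function name `has_free_kb` and the size threshold are hypothetical, and the check is only a snapshot, since other users may fill the disk between the check and the write.

```shell
#!/bin/bash
# Hypothetical helper (not part of ALICE):
#   has_free_kb DIR REQUIRED_KB
# Succeeds if the file system holding DIR reports at least REQUIRED_KB
# kilobytes of available space.
has_free_kb() {
    local dir="$1" required_kb="$2" avail_kb

    # df -P prints one header line and one data line; field 4 is the
    # available space, reported in 1024-byte blocks because of -k.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }') || return 2

    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$required_kb" ]
}

# Example use in a job script (the paths and the 50 GB threshold are placeholders):
#   if has_free_kb /data $((50 * 1024 * 1024)); then
#       cp -r results /data/my_results
#   else
#       echo "Not enough space on /data, keeping results elsewhere" >&2
#   fi
```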
Software file system
- /cm/shared - This mount provides a consistent set of binaries for the entire cluster.
Each worker node has multiple filesystem mounts.
- /dev/shm - On each worker you may also create a virtual file system directly in memory, for extremely fast data access. It is the fastest available file system, but be advised that it counts against the memory used by your job. The maximum size is set to half the physical RAM size of the worker node.
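A minimal sketch of using the in-memory file system from a job script follows. The function name `make_mem_workdir` and the `memwork` prefix are hypothetical. Because files in /dev/shm occupy memory until they are deleted, the usage comments include a cleanup step.

```shell
#!/bin/bash
# Hypothetical helper (not part of ALICE):
#   make_mem_workdir [BASE]
# Creates a private working directory under BASE (default: /dev/shm)
# and prints its path.
make_mem_workdir() {
    local base="${1:-/dev/shm}"
    mktemp -d "$base/memwork.XXXXXX"
}

# Typical use inside a job script:
#   workdir=$(make_mem_workdir) || exit 1
#   trap 'rm -rf "$workdir"' EXIT   # release the memory when the script ends
#   df -h /dev/shm                  # show the size and current usage of the mount
```

Remember to request enough memory for both your program and the data you place in /dev/shm.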