Data storage

From ALICE Documentation

File and I/O Management

The file system is one of the critical components of the service, and users should aim to make the best use of its resources. This chapter describes the different file systems in use, best practices for I/O, and basic housekeeping of your files.

Best Practices

ALICE has a dedicated data transfer node, which is described on the ALICE Data Transfer Server page. For the fastest possible data transfer rates, use this node and transfer files directly to your /data directory. Keep in mind that data on /data is not backed up, so it is good practice to save a copy of the data to your project directory (assuming its size is smaller than your project's disk quota), where it will be backed up nightly.

Summary of available file systems

File system | Directory | Disk quota | Speed | Shared between nodes | Expiration | Backup | Files removed?
Home | /home | 15 GB | Normal | Yes | None | Nightly incremental | No
Scratch | /scratchdata | 10 TB | Fast | No | End of job | No | No automatic deletion currently
Scratch-shared | /data | N.A. (57 TB) | Normal | Yes | At most 28 days | No | No automatic deletion currently
Cluster-wide software | /cm/shared | N/A (not for user storage) | Normal | Yes | None | Nightly incremental | N/A (not for user storage)

The home file system

The home file system contains the files you normally use. By default, you have a 15 GB quota. To see your current usage, run the command:

quota -s

The home file system is a network file system (NFS) that is available on all login and compute nodes, so your jobs can access it from every node. The downside is that the home file system is not particularly fast, especially when handling metadata: creating and deleting files, opening and closing files, making many small updates to files, and so on.
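One common way to reduce metadata load on a network file system is to bundle many small files into a single archive before storing them. The sketch below is only an illustration and not an ALICE-specific recommendation; the directory and file names (WORK_DIR, results, part_N.txt) are hypothetical, and a temporary directory stands in for your real output location.

```shell
# Demo data: a directory with many small files (stands in for real results).
WORK_DIR="$(mktemp -d)"
mkdir "$WORK_DIR/results"
i=1
while [ "$i" -le 100 ]; do
    echo "value $i" > "$WORK_DIR/results/part_$i.txt"
    i=$((i + 1))
done

# Bundle everything into one compressed archive, so the file system
# has to track a single file instead of a hundred.
tar -czf "$WORK_DIR/results.tar.gz" -C "$WORK_DIR" results

# List the archive contents without unpacking it.
tar -tzf "$WORK_DIR/results.tar.gz" | head -n 5
```

To unpack the archive later, for example on scratch, `tar -xzf results.tar.gz -C <target directory>` restores the original files.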

Backup & restore

  • We do nightly incremental backups.
  • Files that are open at the time of the backup will be skipped.
  • We can restore files and directories that you accidentally removed up to 15 days ago, provided they already existed at the time of the last successful backup.

The scratch file system

The scratch file system is intended as fast, temporary storage that can be used while running a job. Every compute node in the ALICE system contains a local disk for the scratch file system that can only be accessed by that particular node. There is no quota for the scratch file system; use of the scratch file system is eventually limited by the capacity of these disks (see the description of the ALICE system). Scratch disks are not backed up and are cleaned at the end of a job.

Since the disks are local, read and write operations on the scratch file system are much faster than on the home file system. This makes the scratch file system very suitable for I/O-intensive operations.

How to best use scratch

In general, the best way to use scratch is to copy your input files from your home or /data directory to scratch at the start of a job, create all temporary files needed by your job on scratch (assuming they don't need to be shared with other nodes), and copy all output files back to the home or /data file system at the end of the job. There are two things to note:

  • A directory will be created for you on the compute node when your job starts. The directory name is /scratchdata/${SLURM_JOB_USER}/${SLURM_JOB_ID}, where SLURM_JOB_USER is your ALICE username and SLURM_JOB_ID is the ID of the job.
  • Don't forget to copy your results back to the home or data file system! Scratch will be cleaned and the directory will be removed after your job finishes and your results will be lost if you forget this step.
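The workflow above can be sketched as a job script. This is a minimal illustration under stated assumptions, not an official ALICE template: the #SBATCH values, PROJECT_DIR, the data.txt input file, and the use of sort as a stand-in for the real program are all hypothetical. Outside a Slurm job the scratch directory does not exist, so the sketch falls back to a temporary directory.

```shell
#!/bin/bash
#SBATCH --job-name=scratch_demo   # hypothetical job name
#SBATCH --time=00:10:00           # hypothetical time limit

# Hypothetical project location; point this at a directory under /home or /data.
PROJECT_DIR="${PROJECT_DIR:-$(mktemp -d)}"
mkdir -p "$PROJECT_DIR/input" "$PROJECT_DIR/output"
# Demo input so the sketch is self-contained.
[ -e "$PROJECT_DIR/input/data.txt" ] || printf 'c\nb\na\n' > "$PROJECT_DIR/input/data.txt"

# Job-specific scratch directory, created by the system when the job starts.
SCRATCH_DIR="/scratchdata/${SLURM_JOB_USER:-none}/${SLURM_JOB_ID:-none}"
# Fallback for running outside a Slurm job (assumption for testing only).
[ -d "$SCRATCH_DIR" ] || SCRATCH_DIR="$(mktemp -d)"

# 1. Stage the input files to scratch.
cp -r "$PROJECT_DIR/input" "$SCRATCH_DIR/"

# 2. Run the computation inside scratch (sort stands in for the real program).
cd "$SCRATCH_DIR" || exit 1
sort input/data.txt > result.txt

# 3. Copy the results back before the job ends and scratch is cleaned.
cp result.txt "$PROJECT_DIR/output/"
```

Copying results back as the last step of the script, rather than by hand afterwards, matters because the scratch directory is removed as soon as the job finishes.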

The scratch-shared file system

In addition to temporary storage that is local to each node (like scratch), you may need temporary storage that is shared among nodes. For this, we have a shared scratch disk, accessible through:

cd /data or cd ~/data

The size of this shared scratch space is currently 1 TB and there is no quota for individual users. Note that this shared scratch has two disadvantages compared to the local scratch disk:

  • The speed of /data is similar to the home file system and thus slower than the local scratch disk at /scratchdata/${SLURM_JOB_USER}/${SLURM_JOB_ID}.
  • You share /data with all other users and there may not be enough space to write all the files you want. Thus, carefully think how your job will behave if it tries to write to /data and there is insufficient space: it would be a waste of budget if the results of a long computation are lost because of it.
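One way to guard against running out of space is to check the available space before a job starts writing. The sketch below uses the standard `df` command; the function name has_free_space and the thresholds are hypothetical, and the fallback to /tmp exists only so the example also runs on a machine without /data.

```shell
# Hypothetical helper: succeeds only if the file system holding
# directory $1 has at least $2 kilobytes available.
has_free_space() {
    hfs_target="$1"
    hfs_needed_kb="$2"
    # df -P gives one line per file system; with -k, column 4 of the
    # second line is the available space in 1 KB blocks.
    hfs_avail_kb="$(df -Pk "$hfs_target" | awk 'NR==2 {print $4}')"
    [ "$hfs_avail_kb" -ge "$hfs_needed_kb" ]
}

TARGET="/data"
[ -d "$TARGET" ] || TARGET="/tmp"   # fallback when run off-cluster (assumption)

# Require roughly 1 GB (1024 * 1024 KB) before writing; adjust to your job.
if has_free_space "$TARGET" $((1024 * 1024)); then
    echo "Enough space on $TARGET"
else
    echo "Not enough space on $TARGET" >&2
fi
```

Because other users write to /data at the same time, such a check only reduces the risk. A long-running job should still handle write errors, for example by saving intermediate results periodically.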

Software file system

  • /cm/shared - This mount provides a consistent set of binaries for the entire cluster.

Compute Local

Each worker node has multiple file system mounts.

  • /dev/shm - On each worker, you may also create a virtual file system directly in memory, for extremely fast data access. It is the fastest available file system, but be advised that it counts against the memory used by your job. The maximum size is set to half the physical RAM size of the worker node.
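A minimal sketch of using /dev/shm from within a job is shown below. The directory template job_XXXXXX and the file name tmp.dat are hypothetical, and the fallback to the default temporary location is an assumption made so the example runs on systems without a writable /dev/shm.

```shell
# Create a private working directory in memory-backed /dev/shm.
if [ -d /dev/shm ] && [ -w /dev/shm ]; then
    SHM_DIR="$(mktemp -d /dev/shm/job_XXXXXX)"
else
    SHM_DIR="$(mktemp -d)"   # fallback where /dev/shm is unavailable
fi

# Remove the directory when the script exits, so the memory is released.
trap 'rm -rf "$SHM_DIR"' EXIT

# Write intermediate data to memory instead of disk.
echo "intermediate data" > "$SHM_DIR/tmp.dat"

# Show how much memory the files currently occupy.
du -sh "$SHM_DIR"
```

Since files in /dev/shm count against the job's memory, request enough memory to cover both the program and the data held there.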