Difference between revisions of "Current Status Overview"

From ALICE Documentation

[[Category:About ALICE]]

Revision as of 19:24, 28 October 2020

ALICE node status

Gateway: OK
Login nodes: OK
CPU nodes: OK
GPU nodes: OK
High-memory nodes: OK
Storage: OK

Current Issues

  • Copying data to the shared scratch via sftp:
    • There is currently an issue on the sftp gateway which prevents users from copying data to their shared scratch directory, i.e., /home/<username>/data
    • A current work-around is to use scp or sftp via the ssh gateway and the login nodes.
    • Status: Work in Progress
    • Last Updated: 19 Apr 2021, 12:17 CET
  • Slurm issue with ssh to compute nodes when more than one job is running:
    • The current Slurm version has a bug which prevents users from logging into the compute node on which their job is running if two or more jobs are running on that node. We are looking into this.
    • If you try to log into a node which has more than one job running, you will see this error message: "Access denied by pam_slurm_adopt: you have no active jobs on this node Authentication failed."
    • If your job is the only one running on the node, ssh to the node should work without a problem.
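
The scp work-around for the sftp gateway issue above could look roughly like the following sketch. It only assembles the command and prints it, so it can be checked before being run. The host names, the user name and the helper build_scp_command are hypothetical placeholders and not part of the ALICE documentation. The sketch also assumes an OpenSSH client recent enough to support the ProxyJump option, and that the ssh gateway may be used as a jump host to a login node.

```python
import shlex


def build_scp_command(local_path, username, gateway, login_node):
    """Return an scp argument list that reaches a login node via the ssh gateway.

    Hypothetical helper for illustration; all host names are supplied by the caller.
    """
    # Destination is the shared scratch directory named in the issue above.
    destination = f"{username}@{login_node}:/home/{username}/data/"
    return [
        "scp",
        "-o", f"ProxyJump={username}@{gateway}",  # hop through the ssh gateway first
        local_path,
        destination,
    ]


# Placeholder values (assumptions): substitute your own user name and the
# actual gateway and login node addresses.
cmd = build_scp_command(
    "results.tar.gz", "jdoe", "gateway.example.org", "login1.example.org"
)

# Print the command for inspection; pass `cmd` to subprocess.run() to execute it.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems if the local path contains spaces.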

See here for other recently solved issues: Solved Issues

ALICE usage statistics past 4 hours

  • Cluster Load

  • Number of running processes