Current Status Overview

From ALICE Documentation

Revision as of 19:24, 28 October 2020 by Schulzrf

ALICE node status

Login nodes: OK
CPU nodes: OK
GPU nodes: OK
High-memory nodes: OK

Current Issues

  • Slurm issue with ssh to compute nodes when more than one job is running:
    • The current Slurm version has a bug that prevents users from logging in to the compute node on which their job is running if two or more jobs are running on that node. We are looking into this.
    • If you try to log in to a node that has more than one job running, you will see this error message: "Access denied by pam_slurm_adopt: you have no active jobs on this node Authentication failed."
    • If your job is the only one running on the node, ssh to the node should work without a problem.
  • E-mail notifications not always working:
    • We discovered an issue with e-mail notifications from ALICE. It seems that e-mails are sometimes not delivered to the recipient. However, most notifications are still being sent properly.
    • E-mail notifications should now work properly again. If you still notice issues, please contact the ALICE Helpdesk.
    • Status: Solved
    • Last Updated: 19 Apr 2021, 12:17 CET
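
For the Slurm ssh issue above, it can help to check how many jobs are running on a node before trying to log in. The following is a minimal sketch, not an official ALICE tool. It assumes that the standard `squeue` command is available on the login node and that your account is allowed to see other users' jobs, which depends on the site configuration. The node name `node001` and the helper function names are hypothetical.

```python
import subprocess


def parse_squeue(output):
    """Parse squeue output lines of the form '<jobid> <user>' into tuples."""
    jobs = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            jobs.append((parts[0], parts[1]))
    return jobs


def ssh_likely_affected(jobs):
    """Return True if two or more jobs are running on the node.

    According to the issue description, ssh is only blocked in that case.
    """
    return len(jobs) >= 2


def running_jobs_on_node(node):
    """Ask Slurm for the running jobs on one node (requires squeue)."""
    result = subprocess.run(
        [
            "squeue",
            "--noheader",          # no header line, only job rows
            "--states=RUNNING",    # ignore pending jobs
            "--nodelist", node,    # restrict to the given node
            "--format", "%i %u",   # job ID and user name
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_squeue(result.stdout)


# Example call on a login node (hypothetical node name):
#   jobs = running_jobs_on_node("node001")
#   print(ssh_likely_affected(jobs))
```

If the check reports two or more running jobs, the "Access denied by pam_slurm_adopt" message is probably caused by the bug described above and not by a problem with your own job.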

See here for other recently solved issues: Solved Issues

ALICE usage statistics past 4 hours

  • Cluster Load

  • Number of running processes