Solved Issues

From ALICE Documentation

This page contains recently solved issues. After some time, entries will be removed.

List of recently solved issues

  • No access to ALICE - SSH gateway failure:
    • The ssh gateway is currently not working.
    • Access to ALICE is not possible. The cluster itself is not affected and processing continues.
    • The gateway is working again. Access is possible.
    • Status: SOLVED
    • Last Updated: 02 Jun 2022, 19:45 CET
  • Node015 out of service:
    • Node015 is out of service because of technical issues. We are in contact with our vendor.
    • The issue has been identified as a broken CPU. A replacement is under way and will be integrated very soon.
    • The replacement CPU has been integrated and node015 is working again.
    • Status: SOLVED
    • Last Updated: 01 Feb 2022, 13:28 CET
  • Infiniband network down:
    • Due to an issue with the Infiniband switch, the Infiniband network is currently down and out of service.
    • The Infiniband switch is being repaired.
    • We have replaced the broken switch and the new one is working. The Infiniband network is available again.
    • Status: SOLVED
    • Last Updated: 08 Oct 2021, 13:54 CEST
  • Slurm issue with ssh to compute nodes when more than one job is running:
    • The current Slurm version has a bug that prevents users from logging in to the compute node on which their job is running if two or more jobs are running on that node. We are looking into this.
    • If you try to log in to a node that has more than one job running, you will see this error message: "Access denied by pam_slurm_adopt: you have no active jobs on this node Authentication failed."
    • If your job is the only one running on the node, ssh to the node should work without a problem.
    • The update to slurm 20.11.7 solved this issue.
    • Status: SOLVED
    • Last Updated: 21 Jul 2021, 15:34 CEST
  • SSH Keys on new Gateway:
    • We have received multiple reports that ssh keys are not working properly on the new gateway because of bad permissions. This issue seems to affect some users but not all. We are looking into it.
    • The permissions in the home directories should be fixed now. If you are currently logged in, please log out and log in again. Should you still encounter issues, please contact the ALICE Helpdesk.
    • Status: SOLVED
    • Last Updated: 27 May 2021, 09:33 CEST
  • Logging in to ALICE ssh gateway:
    • We are experiencing issues with logging in to the ALICE gateway. We are looking into it.
    • It is very likely that you are unable to log in. You might be prompted for a password even though you have set up ssh keys, and your correct password may be rejected.
    • We are deploying a new gateway and tests indicate that it is working properly. We are working on completing the setup so that existing keys continue to work. Once this has been verified, we will switch to the new server.
    • The issue with connecting to the ALICE ssh gateway has been resolved. A new gateway has been deployed and all keys were transferred to the new gateway.
    • We have also changed the domain to point to the IP of the new server. By now the domain should resolve to the IP of the new server. If it does not and you are still connected to the old gateway, you can either wait a bit longer or, if you are in a hurry, replace the domain "ssh-gw.alice.universiteitleiden.nl" with the IP of the new gateway: 132.229.92.133.
    • The new gateway was temporarily not available from outside the Leiden University network. This has been resolved.
    • Status: SOLVED
    • Last Updated: 26 May 2021, 17:00 CEST
  • E-Mail notifications not always working:
    • We discovered an issue with e-mail notifications from ALICE. It seems that sometimes e-mails are not delivered to the recipient. However, most notifications are still being sent properly.
    • E-mail notifications should work properly again. If you still notice issues, please contact the ALICE Helpdesk.
    • Status: SOLVED
    • Last Updated: 19 Apr 2021, 12:17 CET


  • SSH connection breaking up after a few minutes:
    • We have received several reports that since last week ssh connections to ALICE are being closed after a few minutes of being idle. This was not the case before 1 Feb.
    • Changes to the ssh gateway require the client to keep the SSH connection alive. This can be achieved by using the ServerAliveInterval setting (e.g., "ServerAliveInterval 60") in your ssh config settings for ALICE.
    • Status: SOLVED
    • Last Updated: 16 Mar 2021, 12:48 CET
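
The keep-alive setting from the last entry could be added to the OpenSSH client configuration roughly as sketched below. This is only a minimal example, not an official ALICE configuration: the host alias "alice-gw" and the placeholder <username> are assumptions made for illustration, and the file location ~/.ssh/config applies to OpenSSH clients on Linux and macOS. The gateway name is the one mentioned in the entry on logging in to the ssh gateway.

```
# ~/.ssh/config (client side)
# "alice-gw" is a hypothetical alias; replace <username> with your ALICE account name.
Host alice-gw
    HostName ssh-gw.alice.universiteitleiden.nl
    User <username>
    # If no data has been received from the server for 60 seconds,
    # ask the server for a response so the connection does not sit idle.
    ServerAliveInterval 60
```

With an entry like this, "ssh alice-gw" would connect to the gateway with the keep-alive setting applied. Other SSH clients usually offer a comparable keep-alive option under a different name.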