ALICE User Documentation Wiki

Off to research computing Wonderland


Welcome to the ALICE HPC user documentation.

ALICE is a computing facility for research and education at Leiden University. With ALICE you have the world of computing at your fingertips. On this wiki, you will find the information you need to get started and become more skilled in using a compute cluster for research and education.

We appreciate any questions and comments on the content of the documentation so that we can improve the information that we supply here.

If you are unsure about where to go next, have a look below.

What is ALICE?

The About ALICE pages provide background information, a quick overview, and instructions on how to acknowledge ALICE in your publications.

How can I get an account?

The page Getting an Account explains how to request an account on ALICE.

What's new with ALICE?

To get information about updates, upgrades, events, planned maintenance and more, have a look at the News page.

Here is the most recent news:

Latest News

  • 21 Sep 2022 - Access to ALICE: On 26 Sep 2022 between 18:00 and 18:30, access to ALICE will not be possible due to maintenance on the University cloud platform.
  • 24 Aug 2022 - ALICE available again: Maintenance on ALICE is over. The cluster is online again and available to all users. We apologize for the delay.
  • 23 Aug 2022 - ALICE system maintenance not finished and continues tomorrow: We managed to solve many of the issues that we faced yesterday. We are waiting for the completion of synchronization processes which are part of the high-availability setup procedure. If all goes well, we just need to run a few tests to verify that the new high-availability setup is working properly and all the nodes are coming back. Unfortunately, it was not possible to do this today. In case the setup fails after all, we are prepared to revert all the changes and bring ALICE online again. In any case, we expect ALICE to be online again sometime tomorrow afternoon. We are sorry for the delay, but the new high-availability setup is vital for ALICE, which is why we have been working hard to get it done.
  • 22 Aug 2022 - ALICE is offline due to system maintenance - Continues tomorrow: We encountered unexpected technical issues during our highest priority task for this maintenance day, the high-availability setup. Because this is a critical component for the continuing stability of ALICE and we require the cluster to be offline, we decided to continue solving the issues tomorrow and keep the cluster offline.
  • 17 Aug 2022 - REMINDER - ALICE system maintenance on 22 Aug 2022: We will perform system maintenance on ALICE on 22 Aug 2022 between 09:00 and 18:00 CEST. Our primary focus will be the high-availability setup of ALICE in addition to other maintenance tasks. This will require us to take all compute and login nodes of the cluster offline. It will not be possible to run any jobs or access data on ALICE. The login nodes will be rebooted and all active terminal or X2Go sessions will be terminated. Until maintenance starts, you can continue to use ALICE as usual and submit jobs. Slurm will also continue to run your job if the requested running time allows it to finish before the maintenance starts. If you have any questions, please contact the ALICE Helpdesk.
  • 01 Aug 2022 - ALICE system maintenance on 22 Aug 2022 - First announcement: We will perform system maintenance on ALICE on 22 Aug 2022 between 09:00 and 18:00 CEST. Our primary focus will be the high-availability setup of ALICE in addition to other maintenance tasks. This will require us to take all compute and login nodes of the cluster offline. It will not be possible to run any jobs or access data on ALICE. Until maintenance starts, you can continue to use ALICE as usual and submit jobs. Slurm will also continue to run your job if the requested running time allows it to finish before the maintenance starts. If you have any questions, please contact the ALICE Helpdesk.
  • 01 Jun 2022 - Disabled access to old scratch storage: As previously announced, we have disabled access to the old scratch storage. We will keep the data available until 30 June 2022. Afterwards, we will start to delete data so that we can repurpose the storage within ALICE. You can request temporary access by contacting the ALICE Helpdesk. See also the wiki page: Data Storage.

Next Maintenance

System Maintenance on ALICE will take place on 22 Aug 2022 between 09:00 and 18:00 CEST (See the Maintenance Announcement)

Just Getting Started?

If you're new to ALICE, please check out the User Guide.

What more can I do with ALICE?

If you already have experience with ALICE and/or HPC, have a look at the Advanced Guide pages. Please note that many of the pages here are still under construction and subject to change.

What else is there about ALICE?

If you need more information on general topics, such as hardware, storage, and policies, take a look at the Documentation pages. Please note that many of the pages here are still under construction and subject to change.

Have a question or feedback on ALICE?

If you have a question about ALICE, need help using it, or want to give us feedback, see the Support page to find out how you can contact us.

Status of ALICE?

If you would like to know how busy ALICE is and whether all nodes are up, have a look at the Current Status Overview.

This is a quick overview:

ALICE node status

Gateway: UP
Head node: UP
Login nodes: UP
GPU nodes: UP
CPU nodes: UP
High memory nodes: UP
Storage: UP
Network: UP

Current Issues

  • Copying data to the shared scratch via sftp:
    • There is currently an issue on the sftp gateway which prevents users from copying data to their shared scratch directory, i.e., /home/<username>/data
    • A current workaround is to use scp or sftp via the ssh gateway and the login nodes.
    • Status: Work in Progress
    • Last Updated: 30 Nov 2021, 14:56 CET
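
The workaround above could look roughly like the following sketch. It only builds and prints an scp command that reaches a login node through the ssh gateway, using OpenSSH's ProxyJump option; it does not run anything. The host names, user name, and file name are hypothetical placeholders, not the real ALICE addresses, so replace them with the values given in the ALICE documentation. The helper build_scp_command is likewise made up for this illustration.

```python
import shlex


def build_scp_command(local_path, username, gateway_host, login_host, remote_dir):
    """Build an scp command that copies a file to a login node via an ssh gateway.

    All host names passed in are placeholders supplied by the caller;
    this helper is a hypothetical illustration, not an ALICE tool.
    """
    # ProxyJump tells OpenSSH to connect to the gateway first and
    # tunnel the connection to the login node through it.
    jump = f"{username}@{gateway_host}"
    target = f"{username}@{login_host}:{remote_dir}"
    return ["scp", "-o", f"ProxyJump={jump}", local_path, target]


if __name__ == "__main__":
    # Hypothetical values: substitute your own user name and the real hosts.
    cmd = build_scp_command(
        local_path="results.tar.gz",
        username="jdoe",
        gateway_host="gateway.example.org",
        login_host="login1.example.org",
        remote_dir="/home/jdoe/data/",
    )
    # Print the command so it can be reviewed before running it in a shell.
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; once the placeholders are replaced, the printed line can be pasted into a terminal.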


See here for other recently solved issues: Solved Issues