ALICE User Documentation Wiki

Revision as of 19:31, 16 June 2021 by Schulzrf (talk | contribs) (What's new with ALICE?)
Off to research computing Wonderland


Welcome to the ALICE HPC user documentation.

ALICE is Leiden University's computing facility for research and education. With ALICE you have the world of computing at your fingertips. On this wiki, you will find the information you need to get started and to become more skilled in using a compute cluster for research and education.

We welcome questions and comments on the documentation; they help us improve the information provided here.

If you are unsure about where to go next, have a look below.

What is ALICE?

The About ALICE pages give background information and a quick overview, and explain how to acknowledge ALICE in your publications.

How can I get an account?

The page Getting an Account explains how to request an account on ALICE.

What's new with ALICE?

To get information about updates, upgrades, events, planned maintenance and more, have a look at the News page.

Here is the most recent news:

Latest News

  • 8 Oct. 2021 - Infiniband network back in operation. The broken Infiniband switch has been replaced and the Infiniband network is working again. You can make use of the Infiniband network again for your jobs on the CPU partitions.
  • 8 Oct. 2021 - Node020 and node859 used for testing. Node020 and node859 will be reserved from time to time to continue testing the new BeeGFS storage system.
  • 30 Aug. 2021 - Node020 reserved for testing. We have been working on the configuration of the new BeeGFS storage system. For this purpose, we have reserved node020 for running tests.

Next Maintenance

Leiden University network maintenance on 31 Jul/01 Aug

Maintenance on the network of Leiden University will take place on the weekend of 31 July/01 August.

During this time ALICE will continue to run, but in total isolation, i.e., with no internet access. This means that you will not be able to log in to ALICE, and jobs cannot, for example, pull code, download data, or access license servers.

We will use this page to provide updates on the status of the cluster.

If you have any questions, please contact the ALICE Helpdesk.

Just Getting Started?

If you're new to ALICE, please check out the User Guide.

What more can I do with ALICE?

If you already have experience with ALICE and/or HPC, have a look at the Advanced Guide pages. Please note that many of the pages here are still under construction and subject to change.

What else is there about ALICE?

If you need more information on general topics, such as hardware, storage, and policies, take a look at the Documentation pages. Please note that many of the pages here are still under construction and subject to change.

Have a question or feedback on ALICE?

If you have a question about ALICE, need help using it, or want to give us feedback, see the Support page to find out how to contact us.

Status of ALICE?

If you would like to know how busy ALICE is and whether all nodes are up, have a look at the Current Status Overview.

This is a quick overview:

ALICE node status

Gateway: UP
Head node: UP
Login nodes: UP
GPU nodes: UP
CPU nodes: UP
High memory nodes: UP
Storage: UP
Network: UP

Current Issues

  • Infiniband network down:
    • Due to an issue with the Infiniband switch, the Infiniband network is currently down and out of service.
    • The Infiniband switch is being repaired.
    • We have replaced the broken switch and the new one is working. The Infiniband network is available again.
    • Status: SOLVED
    • Last Updated: 08 Oct 2021, 13:54 CEST
  • Copying data to the shared scratch via sftp:
    • There is currently an issue with the sftp gateway that prevents users from copying data to their shared scratch directory, i.e., /home/<username>/data
    • The current workaround is to use scp or sftp via the ssh gateway and the login nodes.
    • Status: Work in Progress
    • Last Updated: 19 Apr 2021, 12:17 CET
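
The scp workaround for the sftp issue could look roughly like the sketch below. It only builds and prints the command; it does not start a transfer. It assumes an OpenSSH client that supports the ProxyJump option. The host names and the user name are placeholders and not the real ALICE addresses, so please take the actual ones from the User Guide. Only the target directory /home/<username>/data is taken from the issue description.

```python
import shlex


def build_scp_command(local_path, username, gateway_host, login_host):
    """Build an scp command that reaches a login node through the ssh gateway.

    All host names passed in are placeholders (assumptions); only the
    target directory layout /home/<username>/data comes from this page.
    """
    jump = f"{username}@{gateway_host}"        # first hop: the ssh gateway
    remote_dir = f"/home/{username}/data"      # shared scratch directory
    target = f"{username}@{login_host}:{remote_dir}"
    # ProxyJump is a standard OpenSSH client option; scp passes it on via -o.
    return ["scp", "-o", f"ProxyJump={jump}", local_path, target]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    cmd = build_scp_command(
        "results.tar.gz", "jdoe", "gateway.example.org", "login.example.org"
    )
    print(shlex.join(cmd))
```

Printing the command first makes it easy to check the hosts and the path before running it in a terminal.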


For other recently solved issues, see Solved Issues.