
Difference between revisions of "Latest News"

From ALICE Documentation

 
*'''8 Oct. 2021 - Infiniband network back in operation'''. The broken Infiniband switch has been replaced and the Infiniband network is working again. You can make use of the Infiniband network again for your jobs on the CPU partitions.
 
 
*'''8 Oct. 2021 - Node020 and node859 used for testing:''' Node020 and node859 will be reserved from time to time to continue testing the new BeeGFS storage system.
 
*'''30 Aug. 2021 - Node020 reserved for testing:''' We have been working on the configuration of the new BeeGFS storage system. For this purpose, we have reserved node020 for running tests.
*'''23 Jul. 2021 - Leiden University network maintenance on 31 Jul/01 Aug:''' Maintenance on the network of Leiden University will take place on the weekend of 31 July/01 August. During this time ALICE will continue to run, but in total isolation, i.e., with no internet access. This means that you will not be able to log in to ALICE, and jobs cannot, for example, pull code, download data, or access license servers. During the maintenance, the status will be tracked here: [[News#Next_Maintenance|Next maintenance]].
 
*'''29 Jun. 2021 - ALICE system maintenance finished (Update):''' System maintenance has finished and ALICE is available again.
 
** However, <s>two</s> one issue remains.
 
** <s>Login node 1 is down due to technical issues on the node. Login2 is running and can be used instead. Connections intended for login1 are automatically routed to login2. There should be no need to change your ssh configs. </s>
 
** The Infiniband network is down due to technical issues on the Infiniband switch.
 
** List of changes:
 
*** Login node 1 is running and the NVIDIA Tesla T4 has been integrated successfully. Instructions on using the T4 will follow soon.
 
*** Slurm version 20.11.7 is now running on ALICE.
 
*** EasyBuild 4.4.0 is used for the Intel and AMD branch
 
*** The partitions notebook-gpu, notebook-cpu, playground-cpu, playground-gpu have been removed.
 
*** The time limit on the mem partition has been changed from Infinite to 14 days.
 
*** Resources on the testing partitions are now limited to 15 CPUs per node, a maximum of 150G of memory per node, and a default of 10G of memory per CPU.
 

Revision as of 11:58, 8 October 2021

Latest News

  • 8 Oct. 2021 - Infiniband network back in operation. The broken Infiniband switch has been replaced and the Infiniband network is working again. You can make use of the Infiniband network again for your jobs on the CPU partitions.
  • 8 Oct. 2021 - Node020 and node859 used for testing: Node020 and node859 will be reserved from time to time to continue testing the new BeeGFS storage system.
  • 30 Aug. 2021 - Node020 reserved for testing: We have been working on the configuration of the new BeeGFS storage system. For this purpose, we have reserved node020 for running tests.