
Difference between revisions of "Latest News"

From ALICE Documentation

(Latest News)
(Latest News)
 
(37 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
=== Latest News ===
 
=== Latest News ===
*'''29 Jun. 2021 - ALICE system maintenance finished (Update):''' System maintenance has finished and ALICE is available again.
+
*'''17 Aug 2022 - REMINDER - ALICE system maintenance on 22 Aug 2022:''' We will perform system maintenance on ALICE on 22 Aug 2022 between 09:00 and 18:00 CEST. Our primary focus will be the high-availability setup of ALICE, in addition to other maintenance tasks. This will require us to take all compute and login nodes of the cluster offline. It will not be possible to run any jobs or access data on ALICE. The login nodes will be rebooted and all active terminal or X2Go sessions will be terminated. Until maintenance starts, you can continue to use ALICE as usual and submit jobs. Slurm will also continue to run your jobs, provided the requested run time allows them to finish before the maintenance starts. If you have any questions, please contact the ALICE Helpdesk.
** However, <s>two</s> one issue remains.
+
*'''01 Aug 2022 - ALICE system maintenance on 22 Aug 2022 - First announcement:''' We will perform system maintenance on ALICE on 22 Aug 2022 between 09:00 and 18:00 CEST. Our primary focus will be the high-availability setup of ALICE, in addition to other maintenance tasks. This will require us to take all compute and login nodes of the cluster offline. It will not be possible to run any jobs or access data on ALICE. Until maintenance starts, you can continue to use ALICE as usual and submit jobs. Slurm will also continue to run your jobs, provided the requested run time allows them to finish before the maintenance starts. If you have any questions, please contact the ALICE Helpdesk.
** <s>Login node 1 is down due to technical issues on the node. Login2 is running and can be used instead. Connections that are intended to login1 are automatically routed to login2. There should be no need to change your ssh configs. </s>
+
*'''01 Jun 2022 - Disabled access to old scratch storage:''' As previously announced, we have disabled access to the old scratch storage. '''We will keep the data available until 30 June 2022'''. Afterwards, we will start to delete data so that we can repurpose the storage within ALICE. You can request temporary access by contacting the ALICE Helpdesk. See also the wiki page: [[Data storage|Data Storage]].
** The Infiniband network is down due to technical issues on the Infiniband switch.
 
** List of changes:
 
*** Login node 1 is running and the NVIDIA Tesla T4 has been integrated successfully. Instructions on using the T4 will follow soon.
 
*** Slurm version 20.11.7 is now running on ALICE
 
*** EasyBuild 4.4.0 is used for the Intel and AMD branch
 
*** The partitions notebook-gpu, notebook-cpu, playground-cpu, playground-gpu have been removed.
 
*** The time limit on the mem partition has been changed from Infinite to 14 days.
 
*** Resources on the testing partitions are now limited to 15 CPUs per node, a maximum of 150G of memory per node, and a default of 10G of memory per CPU.
 
*'''28 Jun. 2021 - ALICE system maintenance continues tomorrow:''' During our maintenance, we encountered a few issues with the Infiniband switch and login node 01. Because of these issues, we also did not finish updating the GPU nodes. We will continue working on these items tomorrow (Tuesday, 29 June 2021) until at least 12:00. ALICE will remain offline for maintenance.
 
*'''27 Jun. 2021 - ALICE offline for system maintenance:''' More information here [[News#Next_Maintenance|Next maintenance]].
 
*'''25 Jun. 2021 - System maintenance on ALICE:''' ALICE will undergo system maintenance on '''28 June 2021'''. More information here [[News#Next_Maintenance|Next maintenance]].
 
*'''2 Jun. 2021 - Rclone available on ALICE:''' Rclone is available on ALICE and there are instructions on how to set it up to transfer files to and from SurfDrive and ResearchDrive: [[Data_Transfer |Data transfer to and from ALICE]]. This is a new feature and feedback on your experience is very welcome.
 
*'''29 Apr. 2021 - ALICE User Survey 2021 closed:''' The ALICE User Survey 2021 is closed. We have received responses from 76 users. We are thrilled to have this many contributions. Thank you very much for participating in the survey. We will go through all the answers now and share results from the survey here on the wiki with you.
 
*'''12 Feb. 2021 (Update 22 Feb. 2021) - SSH Connection Stability:''' If your SSH connection has recently started dropping after a few minutes of being idle, please check the settings below in your SSH configuration for ALICE. If this does not solve the issue, please contact the ALICE Helpdesk.
 
** Linux, macOS, and Windows using an OpenSSH command line connection: Make sure you add "ServerAliveInterval 60" and "ServerAliveCountMax 3" to your ssh config settings.
 
** MobaXterm: Go to Settings -> SSH -> SSH settings and enable "SSH keepalive"
 
** PuTTY: Go to Settings -> Connection -> Set a non-0 value in "Seconds between keepalives" (e.g., 60)
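
For OpenSSH, the two keepalive options named above could be placed in the per-user config file roughly as follows. This is a minimal sketch, not an official ALICE configuration: the host alias "alice" and the angle-bracket placeholders are assumptions and must be replaced with your own values.

```
# ~/.ssh/config
# "alice" is a hypothetical alias; HostName and User are placeholders.
Host alice
    HostName <ALICE login node address>
    User <your username>
    # Send a keepalive message after 60 seconds without data from the server
    ServerAliveInterval 60
    # Close the connection after 3 consecutive unanswered keepalives
    ServerAliveCountMax 3
```

With these values, an unresponsive connection is closed after roughly 3 x 60 seconds, while an idle but healthy connection keeps exchanging keepalive messages and stays open.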
 

Latest revision as of 14:39, 17 August 2022

Latest News

  • 17 Aug 2022 - REMINDER - ALICE system maintenance on 22 Aug 2022: We will perform system maintenance on ALICE on 22 Aug 2022 between 09:00 and 18:00 CEST. Our primary focus will be the high-availability setup of ALICE, in addition to other maintenance tasks. This will require us to take all compute and login nodes of the cluster offline. It will not be possible to run any jobs or access data on ALICE. The login nodes will be rebooted and all active terminal or X2Go sessions will be terminated. Until maintenance starts, you can continue to use ALICE as usual and submit jobs. Slurm will also continue to run your jobs, provided the requested run time allows them to finish before the maintenance starts. If you have any questions, please contact the ALICE Helpdesk.
  • 01 Aug 2022 - ALICE system maintenance on 22 Aug 2022 - First announcement: We will perform system maintenance on ALICE on 22 Aug 2022 between 09:00 and 18:00 CEST. Our primary focus will be the high-availability setup of ALICE, in addition to other maintenance tasks. This will require us to take all compute and login nodes of the cluster offline. It will not be possible to run any jobs or access data on ALICE. Until maintenance starts, you can continue to use ALICE as usual and submit jobs. Slurm will also continue to run your jobs, provided the requested run time allows them to finish before the maintenance starts. If you have any questions, please contact the ALICE Helpdesk.
  • 01 Jun 2022 - Disabled access to old scratch storage: As previously announced, we have disabled access to the old scratch storage. We will keep the data available until 30 June 2022. Afterwards, we will start to delete data so that we can repurpose the storage within ALICE. You can request temporary access by contacting the ALICE Helpdesk. See also the wiki page: Data Storage.
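
The maintenance announcements above say that a job can still run if its requested run time lets it finish before the maintenance starts. The check below is a simplified sketch of that arithmetic, not a description of how Slurm schedules jobs internally. The function name is hypothetical, CEST is modelled as a fixed UTC+2 offset, and treating a job that ends exactly at 09:00 as fitting is an assumption.

```python
from datetime import datetime, timedelta, timezone

# CEST modelled as a fixed UTC+2 offset (assumption: no DST change in the window).
CEST = timezone(timedelta(hours=2), name="CEST")

# Maintenance start taken from the announcement: 22 Aug 2022, 09:00 CEST.
MAINTENANCE_START = datetime(2022, 8, 22, 9, 0, tzinfo=CEST)


def finishes_before_maintenance(start: datetime, time_limit: timedelta) -> bool:
    """Return True if a job starting at `start` with the requested
    time limit would end no later than the maintenance start."""
    return start + time_limit <= MAINTENANCE_START


if __name__ == "__main__":
    start = datetime(2022, 8, 21, 9, 0, tzinfo=CEST)
    # A 12-hour request ends on 21 Aug at 21:00, before the maintenance.
    print(finishes_before_maintenance(start, timedelta(hours=12)))
    # A 2-day request would run into the maintenance window.
    print(finishes_before_maintenance(start, timedelta(days=2)))
```

In practice this means that requesting a realistic, shorter time limit (for example with the `--time` option of `sbatch`) gives a job a better chance of running before the maintenance window.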