Difference between revisions of "Latest News"

From ALICE Documentation

=== Latest News ===
*'''24 Mar. 2022 - New scratch storage available to all users: ''' We are excited to announce that the new scratch storage on ALICE is now available for you to use. It is a BeeGFS-powered parallel file system with a total capacity of 370TB. We have created user directories for all ALICE users on the new scratch storage: <code>/data1/$USER</code>, with a link in your home directory: <code>/home/$USER/data1</code>. By default, you have a quota of 5TB, which can be extended upon request. We ask all users to migrate their data to the new storage and adjust their workflows accordingly. See also the wiki page: [[Data storage|Data Storage]]. We will keep the old scratch storage available for you to use '''until 30 April 2022'''. Then, we will disable access to it and you will have to contact us to gain access. Another two months later, we will start to remove any remaining data on the old scratch storage. Project directories on the old shared scratch have also been set up on the new scratch storage in <code>/data1/projects/</code>, but links in the home directories of project team members have not been changed in order to avoid breaking existing workflows. We ask PIs to also start migrating the data in their project directories. After the migration has been completed, we will change the links in the home directories of team members. If you have any questions or need assistance with migrating your data and workflow, please do not hesitate to contact the ALICE Helpdesk.
*'''09 Mar. 2022 - New short partition amd-short for all users:''' So far, node802 has been exclusive to researchers of MI. In agreement with the PI of node802, we are now making part of the resources of this node available to all users. This will be facilitated through a specific partition called "amd-short" that can run jobs of up to 4h using up to 64 cores and up to 1TB of memory. Node802 is somewhat different from all other nodes on ALICE, which is why you should read the section "[https://wiki.alice.universiteitleiden.nl/index.php?title=Running_jobs_on_ALICE#Important_information_about_partition_amd-short Important information about amd-short]" before you start using the new partition.
*'''01 Feb. 2022 - Node015 is back:''' Node015 has been repaired and is back in service.
*'''12 Jan. 2022 - X2Go available on ALICE:''' We have added a new option to connect to ALICE. With X2Go, it is possible to work on ALICE using a graphical desktop environment. You can find details on how to set it up here: [[Login to ALICE using X2Go|Login to ALICE using X2Go]]

Latest revision as of 12:53, 6 October 2022

Latest News

  • 06 Oct 2022 - New user wiki: So far, there have been separate user wikis for the ALICE HPC cluster and the SHARK HPC cluster at LUMC. However, there is a great deal of overlap in the information that you as a user need to work on ALICE or SHARK. Therefore, the support teams of both clusters are starting to move to a new joint HPC user wiki. The new wiki is live and can be found here: https://pubappslu.atlassian.net/wiki/spaces/HPCWIKI/. The old wikis are now frozen and no new content will be added to them. The new wiki provides information specific to each cluster in addition to a user guide and tutorials which apply to both clusters. There is also a news section, a calendar where we publish events, and information about user meetings and workshops.
  • 21 Sep 2022 - Access to ALICE: On 26 Sept 2022 between 18:00 and 18:30, access to ALICE will not be possible due to maintenance on the University cloud platform.
  • 24 Aug 2022 - ALICE available again: Maintenance on ALICE is over. The cluster is online again and available to all users. We apologize for the delay.
  • 23 Aug 2022 - ALICE system maintenance not finished and continues tomorrow: We managed to solve many of the issues that we faced yesterday. We are waiting for the completion of synchronization processes which are part of the high-availability setup procedure. If all goes well, we just need to run a few tests to verify that the new high-availability setup is working properly and that all the nodes are coming back. Unfortunately, it was no longer possible to do this today. In case the setup fails after all, we are prepared to revert all the changes and bring ALICE online again. In any case, we expect ALICE to be online again sometime tomorrow afternoon. We are sorry for the delay, but the new high-availability setup is vital for ALICE, which is why we have been working hard to get it done.
  • 22 Aug 2022 - ALICE is offline due to system maintenance - Continues tomorrow: We encountered unexpected technical issues during our highest priority task for this maintenance day, the high-availability setup. Because this is a critical component for the continuing stability of ALICE and we require the cluster to be offline, we decided to continue solving the issues tomorrow and keep the cluster offline.
  • 17 Aug 2022 - REMINDER - ALICE system maintenance on 22 Aug 2022: We will perform system maintenance on ALICE on 22 Aug 2022 between 09:00 and 18:00 CEST. Our primary focus will be the high-availability setup of ALICE, in addition to other maintenance tasks. This will require us to take all compute and login nodes of the cluster offline. It will not be possible to run any jobs or access data on ALICE. The login nodes will be rebooted and all active terminal or X2Go sessions will be terminated. Until maintenance starts, you can continue to use ALICE as usual and submit jobs. Slurm will also continue to run your job if the requested running time allows it to finish before the maintenance starts. If you have any questions, please contact the ALICE Helpdesk.
  • 01 Aug 2022 - ALICE system maintenance on 22 Aug 2022 - First announcement: We will perform system maintenance on ALICE on 22 Aug 2022 between 09:00 and 18:00 CEST. Our primary focus will be the high-availability setup of ALICE, in addition to other maintenance tasks. This will require us to take all compute and login nodes of the cluster offline. It will not be possible to run any jobs or access data on ALICE. Until maintenance starts, you can continue to use ALICE as usual and submit jobs. Slurm will also continue to run your job if the requested running time allows it to finish before the maintenance starts. If you have any questions, please contact the ALICE Helpdesk.
  • 01 Jun 2022 - Disabled access to old scratch storage: As previously announced, we have disabled access to the old scratch storage. We will keep the data available until 30 June 2022. Afterwards, we will start to delete data so that we can repurpose the storage within ALICE. You can request temporary access by contacting the ALICE Helpdesk. See also the wiki page: Data Storage.
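
The maintenance announcements of 01 Aug and 17 Aug 2022 state that Slurm keeps running a job only if its requested running time lets it finish before the maintenance starts. The rule can be illustrated with a small sketch. The function name `finishes_before` and the example start time are hypothetical and chosen for illustration only; the real scheduling decision is made by Slurm and depends on how the maintenance window is configured on the cluster.

```python
from datetime import datetime, timedelta


def finishes_before(start_time, time_limit, maintenance_start):
    """Return True if a job starting at start_time with the requested
    time limit would end no later than maintenance_start.

    Hypothetical helper for illustration; not part of Slurm.
    """
    return start_time + time_limit <= maintenance_start


# Maintenance start taken from the announcement: 22 Aug 2022, 09:00 (local time).
maintenance_start = datetime(2022, 8, 22, 9, 0)

# Assumed example: the job could start on 21 Aug 2022 at 20:00 (local time).
start_time = datetime(2022, 8, 21, 20, 0)

# A 12-hour request ends at 08:00 on 22 Aug, before the maintenance starts.
print(finishes_before(start_time, timedelta(hours=12), maintenance_start))

# A 14-hour request would end at 10:00 on 22 Aug, after the maintenance has started.
print(finishes_before(start_time, timedelta(hours=14), maintenance_start))
```

In practice this means that, shortly before a maintenance window, requesting a running time that is as tight as your job realistically allows increases the chance that the job can still start.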