Summary
The HPC cluster Apocrita is currently unavailable due to a critical issue with the underlying storage. As a precaution, the cluster is rejecting SSH logins, Globus transfers and OnDemand access until the issue is resolved. The ITS Research team are working with the storage vendor to resolve the problem as soon as possible.
We would like to apologise for any inconvenience caused.
INC/255642
INC/255649
Update 12/5/23 21:20: We’ve re-enabled access to the HPC cluster, Globus and OnDemand. All storage systems are currently available. We are tracking an ongoing resiliency issue with the vendor and are awaiting new parts. We hope that replacing the parts will not affect service, but to limit potential impact we’re only permitting short jobs to run over the weekend.