Pleione & Titan maintenance
Both clusters are in a pre-maintenance state due to a Slurm upgrade.
MaxTime is set to 1 hour.
Shutdown will be initiated without any further notification.
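For reference, users can check the current limit themselves, and the scontrol line below is a sketch of how an administrator would lower it (the partition name 'normal' is only an example, not necessarily one of our partition names):

    # Show each partition and its time limit (TIMELIMIT column):
    sinfo -o "%P %l"

    # Admin side, example only: lower the limit to one hour.
    scontrol update PartitionName=normal MaxTime=01:00:00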
Added by Timo Eronen almost 8 years ago
Both clusters are ready to use now.
Added by Timo Eronen almost 8 years ago
Due to a new kernel, both clusters will be rebooted as soon as all queues have been drained.
The partition max runtime has been set to 1 hour to allow monitoring jobs to execute.
Depending on the max time settings of the currently running jobs, the reboot will happen at the latest one week from now.
The reboot will be initiated without any further notice!
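For those curious, a drain-and-reboot cycle like this is typically driven with scontrol along these lines (a sketch only; the node range and reason string are illustrative, not the exact commands used here):

    # No new jobs start on draining nodes; running jobs finish normally.
    scontrol update NodeName=ti[1-9] State=DRAIN Reason="kernel upgrade"

    # After the reboot, return the nodes to service.
    scontrol update NodeName=ti[1-9] State=RESUME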
Added by Timo Eronen almost 8 years ago
Both clusters now have a 'fake' compute node #99, which is dedicated to Grid jobs (including cluster monitoring).
This means that all 'normal' compute nodes (Titan nodes 1-9 and Pleione nodes 1-32) can be freely used for real work without disturbing monitoring jobs.
In other words, you can take over all the nodes in the normal, small, big, and all partitions (any partition except grid).
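In slurm.conf terms the setup looks roughly like the fragment below (illustrative names and options only, not the actual cluster configuration):

    # 'Fake' node 99 carries the grid partition; real nodes serve users.
    PartitionName=grid   Nodes=ti99    State=UP
    PartitionName=normal Nodes=ti[1-9] State=UP Default=YES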
Added by Timo Eronen almost 8 years ago
The IB cables for nodes pl17, pl22 and pl99 have been fixed.
The cluster is ready to use.
Added by Timo Eronen almost 8 years ago
The cluster is in a pre-reboot state and the max runtime has been decreased to 1 hour. Once all partitions have been drained, the cluster will be rebooted.
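You can estimate when the drain will finish by checking how much time the remaining jobs still have, e.g.:

    # %L prints the remaining time of each running job.
    squeue -t RUNNING -o "%i %u %L"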
Added by Timo Eronen almost 8 years ago
The Titan reboot scheduled for 23.1 has already been done.
The partitions' MaxTime has been restored to one week, so the cluster is ready to use.
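Restoring the limit is the mirror image of lowering it; in scontrol syntax a one-week limit looks like this (the partition name is again only an example):

    scontrol update PartitionName=normal MaxTime=7-00:00:00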
Added by Timo Eronen almost 8 years ago
The Pleione reboot scheduled for 23.1 has already been done.
The partitions' MaxTime has been restored to one week, so the cluster is ready to use.
Added by Timo Eronen almost 8 years ago
Due to a new kernel, both clusters need a reboot.
All partitions' max runtime is set to 1 hour until the reboot. So, don't panic if (when) your (new) job is put on hold until the reboot has happened and the max runtime has been restored.
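If your job is affected, squeue shows the reason it is pending; for example:

    # The reason column shows PartitionTimeLimit for held jobs.
    squeue -u $USER -o "%i %T %r"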
Added by Timo Eronen about 8 years ago
Titan is now configured according to the FGCI (Finnish Grid and Cloud Infrastructure) rules. The two most significant changes are:
- HyperThreading is disabled
- 20% of resources are reserved for Grid usage
NOTE: The number of logical cores is the same as the number of physical cores for all compute nodes:
- ti1 : 48 cores (four 12-core CPUs)
- ti2 - ti9 : 24 cores (two 12-core CPUs per node)
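In slurm.conf terms the node definitions above correspond to something like this (a sketch; the real configuration may differ):

    # ThreadsPerCore=1 reflects HyperThreading being disabled,
    # so logical cores == physical cores.
    NodeName=ti1     Sockets=4 CoresPerSocket=12 ThreadsPerCore=1  # 48 cores
    NodeName=ti[2-9] Sockets=2 CoresPerSocket=12 ThreadsPerCore=1  # 24 cores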