Georgia Advanced Computing Resource Center
Revision as of 12:19, 21 August 2012
Aug 22 outage
"W" means "don't know When this task will be done"
Soon: /etc/motd on pcluster, zcluster
Sometime Tuesday: message to users
2 PM: VM snapshots
3 PM:
disable logins pcluster
disable logins zcluster (except for jkissing, students)
kick users off pcluster
kick users off zcluster?
drain all nodes or queues, pcluster
disable all queues except somedevq, zcluster
kill all jobs pcluster
kill all jobs zcluster
do GE jobs testing MPI and storage I/O throughput
stop execd on all nodes, zcluster
W: shut down racks 8,9,10,11
4 PM:
shut down VMs (but must have place to do storage shutdowns from)
W: shut down 3070s
W: shut down Panasas
W: enable Panasas jumbo frames
W: VMware updates
W: connect ESX IPMI cat5
W: "final rsyncs" of /db, /usr/local
W: Curtis reconfig storage unit NICs/LAGs
W: PanFS 16K blksize on remaining nodes and zcluster.rcc
**** AFTER ******
CC configs NICs on storage for VLAN 20, maybe
power up 3070s
power up Panasas
power up ESX servers and VMs
power up racks 8,9,10,11
switch and test /db, /usr/local mounts on the zcluster
do GE jobs testing MPI and storage I/O throughput
W: GE upgrade
W: yum updates of nodes
W: yum update of zhead
W: update FW on Dells
W: move some rack15 nodes to rack 16?
W: reinstall rack11?
W: thumper upgrades
W: rccstor upgrades
switch and test NFS mounts on pcluster
W: upgrade PGI compiler
reenable queues on zcluster
resume queues on pcluster
enable logins pcluster
enable logins zcluster
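
The "switch and test mounts" and "storage I/O throughput" items above could be spot-checked from a single node with something like the following. This is only a rough sketch, not the procedure actually used for the outage: the function names `check_mounts` and `write_throughput`, the 16 MB test size, and the use of a temporary directory are all assumptions made for illustration. The paths /db and /usr/local are taken from the checklist.

```python
import os
import tempfile
import time


def check_mounts(paths):
    """Return {path: bool} saying whether each path is currently a mount point."""
    return {p: os.path.ismount(p) for p in paths}


def write_throughput(directory, size_mb=16, block_kb=1024):
    """Write about size_mb of zeros to a temp file in directory.

    Returns MB/s, with the final fsync included in the timing.
    The temp file is removed afterwards. (Hypothetical helper.)
    """
    block = b"\0" * (block_kb * 1024)
    blocks = max(1, (size_mb * 1024) // block_kb)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # make sure the data reached the storage
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    written_mb = blocks * block_kb / 1024
    return written_mb / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Paths from the checklist; adjust for the node being tested.
    for path, mounted in check_mounts(["/db", "/usr/local"]).items():
        print(path, "mounted" if mounted else "NOT mounted")
    # Assumption: a local temp dir; point this at the storage under test.
    print("write MB/s:", round(write_throughput(tempfile.gettempdir()), 1))
```

A single-node write test like this says nothing about MPI or aggregate throughput; those would still need the GE jobs listed in the checklist.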