This week the Cartman cluster is getting a facelift. The hardware we're installing should let this cluster perform at the level of our newest cluster (Porkchop), giving us a lot more breathing room to move users from older clusters to this one as we continue our upgrades.
Unfortunately, we ran into a problem today while doing what should have been an unnoticeable part of the upgrade, and the Cartman database was unavailable while we worked to remedy it. It is back up now, and we apologize for the unannounced outage.
Our replacement hardware is almost ready to go into production, though, so tomorrow (Wednesday) during the day we are going to switch this cluster over to the much faster server.
We don't foresee this causing a lot of downtime for users, but we are announcing it now as a precaution. While we would generally wait until late at night to do work that might cause service interruptions, we feel the benefit of doing this work immediately outweighs the risk of a temporary outage.
Not only are we retrofitting these servers with much better hardware, but each cluster being upgraded is also being set up in a master-master configuration. This means we can take down one master database server for maintenance while the other remains online, with no noticeable difference to users. Being able to perform maintenance more routinely without disrupting you puts us in a better position to prevent outages in the future.
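For the curious, here is a rough sketch of the idea behind a master-master setup. It is only an illustration, not our actual tooling: the host names and the `pick_master` helper are made up for this example, and a real setup also has to keep the two masters replicating to each other, which is not shown.

```python
# Hypothetical host names; the individual servers are not named in this post.
MASTERS = ["cartman-db-a", "cartman-db-b"]


def pick_master(masters, in_maintenance):
    """Return the first master that is not down for maintenance."""
    for host in masters:
        if host not in in_maintenance:
            return host
    raise RuntimeError("no master database server is available")


# Normal operation: either master can serve traffic.
print(pick_master(MASTERS, set()))

# One master is taken down for maintenance: traffic goes to the other.
print(pick_master(MASTERS, {"cartman-db-a"}))
```

The point is simply that with two masters, taking one out of rotation leaves the other to carry on serving requests.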
In addition to all the work being done on existing clusters, we're ordering another brand-new cluster to ensure we stay ahead of our growth.