January 12th, 2004

Cluster 1 slave

My apologies to anybody on cluster 1 whose "recent entries" page was only showing their most recent posts every other reload or so.

The problem was that one of the DB slaves in that cluster stopped updating from its master and our tools didn't detect it. So sometimes you'd hit that stopped database and get old stuff.

That machine is now out of rotation while it catches up. We'll be fixing our tools to catch this type of problem in the future.
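For the curious, here's roughly the kind of check that would catch a stalled slave. This is only a minimal sketch, not our actual tooling: it assumes the master writes a heartbeat timestamp that replicates to each slave, and the names `Replica`, `in_rotation`, and `max_lag_seconds` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    # Most recent master heartbeat timestamp (in seconds) that this
    # replica has applied. Hypothetical field, for illustration only.
    last_heartbeat: float


def in_rotation(replicas, master_heartbeat, max_lag_seconds=30.0):
    """Return the names of replicas that are close enough to the master."""
    healthy = []
    for replica in replicas:
        lag = master_heartbeat - replica.last_heartbeat
        # A slave that has stopped updating keeps an old heartbeat,
        # so its lag grows until it crosses the threshold.
        if lag <= max_lag_seconds:
            healthy.append(replica.name)
    return healthy
```

A slave whose heartbeat stops advancing would drop out of the returned list once its lag passes the threshold, so reads would stop landing on it while it catches up.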

New code

Some new code's going live related to generating BML pages (pages that look like the site and not like a journal). There should be no noticeable changes. If you do notice any, report them to support and they'll aggregate bug reports and pass 'em along. There may be a window of a few seconds where you get errors, so double-check any error before reporting it. Thanks!

New servers

Short version: We're doing upgrades.

Long version:
After our upgrade last week we've been bottlenecked by CPU, not disk, so we ordered 4 new web servers.

lisa's currently at the datacenter, installing 2 new switches (networking equipment) for our new cabinet and the 4 new web servers.

Also in the past week I've been going crazy profiling the code (looking for slow spots) and rewriting it to be more CPU-friendly, which means we get more "miles per gallon" from our existing servers, essentially.

So things were slow for the latter third of today, but we should have enough power come tomorrow.

Also going on currently is a move of everybody from cluster 1 (Cartman) to cluster 8 (Porkchop). Once everybody's off Cartman (all 280k people) we'll be upgrading the cluster 1 machines to be crazy fast (like Porkchop) and then moving users from other clusters to Cartman while we upgrade those. So basically we're shuffling everybody around while we upgrade any hardware that's upgradable.