March 2nd, 2002

Cluster 2 going read-only for a few...

We have a new server, not yet in production, that's been giving us a few problems since we got it. Its identical twin, which we bought at the same time, is working fine and is in production (Chef, the Cluster 2 master).

We ran some diagnostic software on the misbehaving server (Santa), and it's reporting some motherboard/processor errors that may be false. To check whether the diagnostic suite itself is right, I want to take down Chef for about 10 minutes, run that part of the test, and see if it passes or fails. If Chef fails too, the errors are probably bogus, and then I hope the real problem is bad memory in Santa (which I'll also find out when I go down to the NOC... those tests should be done).

Basically, we want to make Santa the new Cluster 3 master, but I can't do that until I know what's been causing its occasional random crashes.

Santa's twin, Chef, is the Cluster 2 master. While it's down, you'll still be able to read Cluster 2 journals, but if you're on that cluster you won't be able to comment in them or post new entries. Right now the userinfo page doesn't say what cluster you're on, so if you're curious, go to this temporary page.

I'll let you all know what I find out. In the meantime, help out my wrists: reply to anybody here who's confused and point them at /support/. Thanks!

Scratch that

The aforementioned DB maintenance won't be happening for a day or so. There are some other things I want to do first.


I'm in the process of making some changes ... there may be intermittent errors for 10 minutes or so, and new comments/posts might not be visible during that time.

All will be well shortly. Just have to wait for some files to copy to the cluster slaves.

Details for geeks here.