February 7th, 2002

neomlee

current stability issues.

hey,

Yes, we are aware and are working on the current stability issues. Our master database server has developed a new and interesting problem. I'm hoping that it's going to get a kernel and software upgrade tonight, but I haven't heard from brad in an hour or two. I assume he's on his way to the NOC, or is already there.

Until then, there will be intermittent problems with the site, and there's absolutely nothing I can do about it. I can't fix that machine from here.

Kernel upgrade

The kernel on Jesus (the master db server) really sucks. It's long overdue for an upgrade. Gonna do that in about 30 minutes.

Downtime will be ~5 minutes while it reboots.

But it'll be worth it... we've had way more than 5 minutes of downtime a day lately because of Jesus doing absolutely stupid things. Hopefully this should fix it all up.

In any case, once all users are moved to clusters, we won't need such a beefy machine for the master, and we can probably find a creative new use for it.

Anyway, too much talk. I'm off with evan to the NOC. We're also racking the 3rd cluster master server. (Right now we just have 1 cluster live... and cluster "0".)