July 15th, 2001

hacking

Site hiccups.

Hey all,

One of our DB servers is being uncooperative, and we've narrowed it down to two possible problems, both of which we're working on.

There is an added delay between when you post and when it shows up. Posts will show up immediately only part of the time, since one of the DB servers is behind. Your posts are there; given some time, they will show up.

[Update: OK. The DB server looks to be caught up now. Posts should be showing up without significant delay. If you're still getting a lot of site errors, I'd suggest coming back in about half an hour.]

Lagged posts

The site isn't as broken right now as it seems.

Most people seem to think posts are getting lost... they're not. They're making it to the main database fine. From there they normally distribute out to both of the slave database servers. However, one of the slave servers stopped replicating, so the other one is handling all the read traffic and is doing so many reads that it doesn't have time to get its writes in.
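For the curious, here's a toy sketch of why a post can be safely stored and still not show up. It is not our actual code: the Replica class, the "budget" knob, and the server names are all made up for illustration. The master keeps an ordered log of writes, and each slave applies only as many of them as it has time for.

```python
class Replica:
    """A slave that copies posts from the master's ordered write log."""

    def __init__(self, name):
        self.name = name
        self.posts = []  # writes this slave has applied so far

    def apply_from(self, master_log, budget):
        """Apply at most `budget` pending writes; return how many were applied."""
        start = len(self.posts)
        pending = master_log[start:start + budget]
        self.posts.extend(pending)
        return len(pending)

    def lag(self, master_log):
        """Number of writes the master has that this slave has not applied yet."""
        return len(master_log) - len(self.posts)


# Five posts have reached the master (hypothetical data).
master_log = [f"post-{i}" for i in range(1, 6)]

stopped = Replica("slave-a")  # replication has stopped: no budget at all
busy = Replica("slave-b")     # swamped with reads: tiny write budget

stopped.apply_from(master_log, budget=0)
busy.apply_from(master_log, budget=2)
```

Every post is in master_log the whole time, so nothing is lost. A read that lands on a lagging slave just doesn't see the newest posts until that slave gets enough breathing room to apply them.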

We've mailed MySQL support about the slave that stopped and expect an answer shortly (we have a support contract with them).

This couldn't have happened at a crappier time ... we've been waiting for the place we order servers from to open on Tuesday. (They were closed Friday and will be closed tomorrow too while they move to a new building.) Anyway, come Tuesday we'll be getting two more database servers and a couple more web servers. We'll also be getting a Cisco 24-port switch since we're outgrowing our existing ones (both in number of ports and in pushing more traffic than their backplanes can handle ...).

I'm reaching to come up with an interim hack solution while we wait to hear from MySQL, but it really just comes down to not having the hardware right now to keep up. Suck.

Fun fun as always, right? Anyway, no posts are getting lost. They're just lagged right now.

Update: Well, a hack solution does exist.... I changed a line in ljconfig.pl and diverted 40% of the slave traffic to the master. So now the master's busy and slow, but the slave is almost caught up. I'll back it down to 30% once the slave is caught up.
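If you're wondering what diverting a percentage of traffic means in practice, here's a rough sketch of the idea. It is not the actual line from ljconfig.pl: READ_WEIGHTS, pick_read_server, and simulate are hypothetical names, and the sketch assumes each read query independently picks a server in proportion to its weight.

```python
import random

# Hypothetical weights: share of read queries sent to each server role.
# 0.40 to the master mirrors the 40% figure above; this is an illustration,
# not the real config format.
READ_WEIGHTS = {"master": 0.40, "slave": 0.60}


def pick_read_server(weights, rng=random):
    """Pick a server role for one read query, in proportion to its weight."""
    roles = list(weights)
    return rng.choices(roles, weights=[weights[r] for r in roles], k=1)[0]


def simulate(weights, n_queries, seed=0):
    """Count how many of n_queries reads land on each role."""
    rng = random.Random(seed)
    counts = {role: 0 for role in weights}
    for _ in range(n_queries):
        counts[pick_read_server(weights, rng)] += 1
    return counts


counts = simulate(READ_WEIGHTS, 10_000)
```

With weights like these, roughly four in ten reads hit the master, which frees the slave to spend time applying its backlog of writes. Backing the master's weight down to 0.30 later is a one-number change.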