Brad Fitzpatrick (bradfitz) wrote in lj_maintenance,

Directory up

The directory's back up. It went down most recently because the slave db got screwed when the master was crashing all the time. The master has since been fixed, but the slave is now in an inconsistent state. Fixing the slave requires taking down the master and taking a snapshot. We can't run the directory off the main database on the master, but we can probably run it off a separate database on the master, so that's what I'm trying now. I realized that the directory search only involves 6 or 7 really small tables, so I wrote a script to safely lock everything and copy them over to the other database. This'll eat CPU time on the master, but it won't lock tables on the main database while directory queries are running.
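For the curious, here's a rough sketch of what a lock-and-copy pass like that could look like. This is not the actual script: the database names (main_db, directory_db) and table names (table_a, table_b) are made up, and it assumes MySQL-style LOCK TABLES, where every table touched while the lock is held has to be named in the lock list. It only builds the statements; running them against a real connection is left out.

```python
def build_copy_statements(tables, src_db, dst_db):
    """Build MySQL statements that copy `tables` from src_db into dst_db
    under a single LOCK TABLES, so the copies are mutually consistent.

    All names passed in are hypothetical; nothing here talks to a database.
    """
    if not tables:
        return []

    # Source tables only need READ locks; destination tables need WRITE
    # locks because we empty and refill them.
    locks = []
    for t in tables:
        locks.append(f"`{src_db}`.`{t}` READ")
        locks.append(f"`{dst_db}`.`{t}` WRITE")

    stmts = ["LOCK TABLES " + ", ".join(locks)]
    for t in tables:
        # Assumes the destination tables already exist with the same schema.
        stmts.append(f"DELETE FROM `{dst_db}`.`{t}`")
        stmts.append(
            f"INSERT INTO `{dst_db}`.`{t}` SELECT * FROM `{src_db}`.`{t}`"
        )
    stmts.append("UNLOCK TABLES")
    return stmts


if __name__ == "__main__":
    # Hypothetical names, purely for illustration.
    for stmt in build_copy_statements(
        ["table_a", "table_b"], "main_db", "directory_db"
    ):
        print(stmt + ";")
```

The main database's tables are only read-locked for the duration of the copy, which should be short when the tables are small. After that, directory queries hit the copies and never touch the main database.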

This weekend we'll probably take the master down for a while one night and take another snapshot, since we'll have a second slave machine ready to install by then (it's supposed to arrive today).

P.S. Whoever text messaged me using the emergency form asking when the directory will be back is really annoying. That's not what the form is for. Didn't you even read the disclaimer? If people abuse the form, I take the form down, then I keep sleeping next time the site goes down. In other words, don't cry wolf. I didn't put the directory back up because you complained and woke me up ... I put it back up because I want it up.