Brad Fitzpatrick (bradfitz) wrote in lj_maintenance,

Another crash

The same crash as before happened again, taking down 25% of the memcaches and slowing the site. Prior to these crashes, this machine had been running fine for years. Very sad.

Anyway, things are speeding up as the LJ code finds new homes for all the lost in-memory objects, re-fetching things from the db and putting them into the 75% of memcaches that are still alive.
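
For anyone curious what "finding new homes" looks like, here is a rough Python sketch of the general idea, not the actual LJ code. The `CacheNode` class, `pick_node`, `get`, the four-node setup, and the dict standing in for the db are all made up for illustration, and the plain modulo hashing is an assumption chosen for brevity.

```python
import hashlib


class CacheNode:
    """Hypothetical stand-in for one memcache: a dict plus an alive flag."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.store = {}


def pick_node(key, nodes):
    """Hash the key onto whichever nodes are currently alive."""
    live = [n for n in nodes if n.alive]
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return live[int(digest, 16) % len(live)]


def get(key, nodes, db):
    """Try the cache first; on a miss, re-fetch from the db and refill."""
    node = pick_node(key, nodes)
    if key in node.store:
        return node.store[key], "hit"
    value = db[key]          # slow path: go back to the database
    node.store[key] = value  # the object now has a new home
    return value, "miss"


# Hypothetical setup: four cache nodes and a tiny "database".
nodes = [CacheNode(f"mc{i}") for i in range(4)]
db = {f"user:{i}": f"profile-{i}" for i in range(100)}

for key in db:               # warm the cache
    get(key, nodes, db)

nodes[0].alive = False       # one of four nodes (25%) goes down
nodes[0].store.clear()       # everything it held in memory is lost

first = [get(key, nodes, db)[1] for key in db]   # misses while refilling
second = [get(key, nodes, db)[1] for key in db]  # served from cache again
```

The first pass after the failure is the slow one, since every miss costs a database read; the second pass is served entirely from the surviving nodes. Note that plain modulo hashing, as used in this sketch, also moves many keys that were not on the dead node, so it overstates the number of misses compared with a smarter key-to-server mapping.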

We're going to investigate this crash more. It's no longer an isolated incident and we can't trust this machine again until we know what the problem is.