If you can’t reproduce a bug, sometimes you need to comb through the logs for clues. Some tips:

  • filter out irrelevant lines (for example with grep -v)
  • find one failed request and search for that request’s ID to get all the logs for that request
  • build a timeline: copy and paste log lines (and your interpretations!) into a document
  • if you see a suspicious log line, search to make sure it doesn’t also happen during normal operation
  • if there’s a cascade of errors, find the first error that started the problems
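
A few of these tips can be sketched in a short Python script. This is only an illustration: the log lines, the `request_id=` field, and the helper names (`drop_noise`, `lines_for_request`, `first_error`) are all made up for the example, and real logs will need their own patterns.

```python
import re

# Hypothetical log lines. The format (ISO timestamp, level, request_id=...)
# is an assumption for this sketch, not a standard.
LOG = """\
2024-01-01T10:00:00 INFO request_id=a1 GET /health 200
2024-01-01T10:00:01 INFO request_id=b2 POST /orders started
2024-01-01T10:00:02 DEBUG request_id=b2 cache miss for user 7
2024-01-01T10:00:03 ERROR request_id=b2 database connection timed out
2024-01-01T10:00:04 ERROR request_id=b2 POST /orders 500
2024-01-01T10:00:05 INFO request_id=c3 GET /health 200
""".splitlines()


def drop_noise(lines, noise_patterns):
    """Like `grep -v`: keep only lines that match none of the patterns."""
    compiled = [re.compile(p) for p in noise_patterns]
    return [line for line in lines if not any(c.search(line) for c in compiled)]


def lines_for_request(lines, request_id):
    """Collect every log line that belongs to one request."""
    token = f"request_id={request_id}"
    return [line for line in lines if token in line.split()]


def first_error(lines):
    """Return the earliest ERROR line, or None if there are no errors.

    Relies on each line starting with an ISO 8601 timestamp, which sorts
    correctly as plain text.
    """
    errors = [line for line in lines if " ERROR " in line]
    return min(errors, default=None)


if __name__ == "__main__":
    quiet = drop_noise(LOG, [r"/health", r" DEBUG "])
    print(len(quiet), "lines left after filtering")
    for line in lines_for_request(LOG, "b2"):
        print(line)
    print("first error:", first_error(LOG))
```

The output of `lines_for_request` is a good starting point for a timeline: paste it into a document and add your interpretation next to each line.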

