• Swiggles@lemmy.blahaj.zone
    English
    arrow-up
    91
    arrow-down
    1
    ·
    1 year ago

    This happens. Recently we had a problem in production where our database grew by a factor of 10 in just a few minutes due to a replication glitch. Of course it took down the whole application as we ran out of space.

    Some things just happen, and all the headroom and monitoring in the world cannot save you if things go seriously wrong. You cannot prepare for everything, in life or in IT, I guess. It is part of the job.

    • RidcullyTheBrown@lemmy.world
      English
      arrow-up
      23
      arrow-down
      2
      ·
      1 year ago

      Bad things can happen, but that’s why you build disaster recovery into the infrastructure. Especially with a company as big as Toyota, you can’t have a single point of failure like this. They produce over 13,000 cars per day. This failure cost them close to 300,000,000 dollars just in cars.

      • Swiggles@lemmy.blahaj.zone
        English
        arrow-up
        6
        ·
        1 year ago

        Yeah, fair point regarding the single point of failure. I guess it was one of those scenarios that should just never happen.

        I am sure it won’t happen again though.

        As I said, it can just happen even when you have redundant systems and everything. Sometimes you don’t think about that one unlikely scenario, and boom.