I hope this won’t be counted as some form of self-promotion, even though I am sharing a post from my own blog.

As a tech worker who works in a Cloud shop, I wanted to elaborate the many reasons why I find working with Clouds terrible, from multiple points of view.

I tried to organize my thoughts in a (relatively long) post, in which both technical aspects and political aspects (which are very related) are covered.

I am sure many people will have different perspectives, and this could be potentially also a nice prompt for a discussion.

  • nexusband@lemmy.world
    link
    fedilink
    English
    arrow-up
    5
    arrow-down
    1
    ·
    5 months ago

    the storage is built so it doesn’t break so easily. I trust AWS engineers more than Mike, no matter how cool Mike is to hang out with. Additionally, if the storage breaks while Mike is on vacation we’re screwed, with the cloud you get a whole team 24/7 on it.

    That’s easily mitigated just following established standards. Redundancy is cheaper than anything else in the aftermath and documentation can be done easy with automation.

    you can prevent data loss with backups or multi-region setups with a few clicks/terraform lines. Try telling the PO that you need to rent datacenter space in Helsinki and Singapore for redundancy…

    You don’t, you rent rack space in a location far enough away but close enough to get the data in a few hours.

    It’s neither superior, easier or less risky, it’s just a shift in responsibility. And in most cases, it’s so expensive that a second or third on site engineer is payed for.

    • Tja
      link
      fedilink
      English
      arrow-up
      1
      arrow-down
      1
      ·
      5 months ago

      And what is simpler and faster, renting rack space in another continent (and buying, shipping, racking and initializing) or editing your terraform file?

      • nexusband@lemmy.world
        link
        fedilink
        English
        arrow-up
        3
        ·
        5 months ago

        Why on another continent? Except maybe VDI, some direct calls to some LLM or some insane scales, there’s nothing really that needs those round trip times.

        • ErrorCode@lemmy.world
          link
          fedilink
          English
          arrow-up
          1
          ·
          5 months ago

          Also data rules / data privacy. Some things need to have the original in Europe; China & Russia also need their data separated from others.

        • Tja
          link
          fedilink
          English
          arrow-up
          2
          arrow-down
          2
          ·
          5 months ago

          Because the customer demands it.

      • loudwhisper@infosec.pubOP
        link
        fedilink
        English
        arrow-up
        2
        ·
        5 months ago

        Not OP, but they are comparable efforts, especially since it’s a relatively infrequent activity. You can rent dedicated boxes with off-the-sheld hardware almost instantly, if you don’t want to deal with the hardware procurement, and often you can do that via APIs as well. And of course both options are much, much, much cheaper than the Cloud solution.

        For sure speed in general is something Cloud provide. I would say it’s a very bad metric though in this context.

        • Tja
          link
          fedilink
          English
          arrow-up
          1
          arrow-down
          1
          ·
          5 months ago

          My last customer (global insurance company) provisions several systems a day. Now moving to hundreds via Jenkins. Frequency is environment dependent.

          • loudwhisper@infosec.pubOP
            link
            fedilink
            English
            arrow-up
            1
            ·
            5 months ago

            If your compute needs expand that much everyday, and possibly shrink in others, than your use-case is one that can benefit from Cloud (I covered this in the post).

            That said, if provisioning means recycle, then it’s obviously not a problem.

            This is a very rare requirement. Most companies’ load is fairly stable and relatively predictable, which means that with a proper capacity planning, increasing compute resources is something that happens rarely too. So rarely that even a lead time for hardware is acceptable.

            So if I may ask (and you can tell), what is the purpose of provisioning that many systems each day? Are they continuously expanding?

            • Tja
              link
              fedilink
              English
              arrow-up
              1
              ·
              edit-2
              5 months ago

              Agree to disagree. Banking, telecommunications, insurance, automotive, retail are all industries where I have seen wild load fluctuations. The only applications where I have seen constant load are simulations: weather, oil&gas, scientific. That’s where it makes sense to deploy your own hardware. For all else, server less or elastic provisioning makes economic sense.

              Edit to answer the last question: to test variable loads, in the last one. Imagine a hurricane comes around and they have to recalculate a bunch of risk components. But can be as simple as running CI/CD tests.

              • loudwhisper@infosec.pubOP
                link
                fedilink
                English
                arrow-up
                1
                ·
                5 months ago

                Systems are always overspecced, obviously. Many companies in those industries are dynosaurs which run on very outdated systems (like banks) after all, and they all existed before Cloud was a thing.

                I also can’t talk for other industries, but I work in fintech and banks have a very predictable load, to the point that their numbers are almost fixed (and I am talking about UK big banks, not small ones).

                I imagine retail and automotive are similar, they have so much data that their average load is almost 100% precise, which allows for good capacity planning, and their audience is so wide that it’s very unlikely to have global spikes.

                Industries that have variable load are those who do CPU intensive (or memory) tasks and have very variable customers: media (streaming), AI (training), etc.

                I also worked in the gaming industry, and while there are huge peaks, the jobs are not so resource intensive to need anything else than a good capacity planning.

                I assume however everybody has their own experiences, so I am not aiming to convince you or anything.

                • Tja
                  link
                  fedilink
                  English
                  arrow-up
                  1
                  ·
                  5 months ago

                  Banking is extremely variable. Instant transactions are periodic, I don’t know any bank that runs them globally on one machine to compensate for time zones. Batches happen at a fixed time, are idle most of the day. Sure you can pay MIPS out of the ass, but you’re much more cost effective paying more for peak and idling the rets of the day.

                  My experience are banks (including UK) that are modernizing, and cloud for most apps brings brutal savings if done right, or moderate savings if getting better HA/RTO.

                  Of course if you migrate to the cloud because the cto said so, and you lift and shift your 64 core monstrosity that does 3M operations a day, you’re going to 3nd up more expensive. But that should have been a lambda function that would cost 5 bucks a day tops. That however requires effort, which most people avoid and complain later.

                  • loudwhisper@infosec.pubOP
                    link
                    fedilink
                    English
                    arrow-up
                    1
                    ·
                    5 months ago

                    Instant transactions are periodic, I don’t know any bank that runs them globally on one machine to compensate for time zones.

                    Ofc they don’t run them on one machine. I know that UK banks have only DCs in UK. Also, the daily pattern is almost identical everyday. You spec to handle the peaks, and you are good. Even if you systems are at 20% half the day everyday, you are still saving tons of money.

                    Batches happen at a fixed time, are idle most of the day.

                    Between banks, from customer to bank they are not. Also now most circuits are going toward instant payments, so the payments are settled more frequently between banks.

                    My experience are banks (including UK) that are modernizing, and cloud for most apps brings brutal savings if done right, or moderate savings if getting better HA/RTO.

                    I want to see this happening. I work for one and I see how our company is literally bleeding from cloud costs.

                    But that should have been a lambda function that would cost 5 bucks a day tops

                    One of the most expensive product, for high loads at least. Plus you need to sign things with HSMs etc., and you want a secure environment, perhaps. So I would say…it depends.

                    Obviously I agree with you, you need to design rationally and not just make a dummy translation of the architecture, but you are paying for someone else to do the work + the service, cloud is going to help to delegate some responsibilities, but it can’t be cheaper, especially in the long run since you are not capitalizing anything.