It seems that self-hosting a federated service like Lemmy just for yourself would only increase traffic in the network rather than actually help balance load between servers.

As far as I understand it, federation works by having each server cache content locally and serve it to the people registered on that server. Servers then only have to exchange a minimal amount of data among themselves, which lowers the overhead for small servers: a small server doesn’t get overwhelmed by a ton of people requesting content from it directly. If everyone self-hosts their own server instead, you go right back to a ton of requests hitting small servers and overwhelming them. It seems the network only really benefits if you have, say, hundreds of medium-sized servers rather than thousands of very small ones. There is the resilience factor, but the overhead on the network would be rather overwhelming.
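
To put rough numbers on that concern, here is a minimal sketch. It assumes a simplified model in which the server hosting a community makes one server-to-server delivery per subscribed instance for each new post, comment, or vote; the user and server counts are made up for illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope model; every number here is an illustrative assumption.

def deliveries_per_activity(subscribed_instances: int) -> int:
    """One server-to-server delivery per subscribed instance,
    however many users that instance has."""
    return subscribed_instances


def users_per_cached_copy(total_users: int, instances: int) -> float:
    """Average number of users whose reads are served from each
    instance's local copy instead of from the hosting server."""
    return total_users / instances


TOTAL_USERS = 100_000  # hypothetical number of people following one community

for label, instances in [("hundreds of medium servers", 200),
                         ("everyone self-hosting", TOTAL_USERS)]:
    out = deliveries_per_activity(instances)
    shared = users_per_cached_copy(TOTAL_USERS, instances)
    print(f"{label}: {out} deliveries per activity, "
          f"{shared:.0f} users served per cached copy")
```

Under these assumptions the hosting server's outbound work grows with the number of subscribed instances, not the number of users, which is why many single-user instances look worse than a few hundred shared ones in this model.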

Perhaps one way to fix this is to use something like IPFS to distribute the requests more evenly, but I am nowhere near knowledgeable enough about that to say anything definitive.

  • cstine@lemmy.uncomfortable.business
    30 upvotes · edited · 1 year ago

    ActivityPub is not a distributed network: servers don’t communicate with each other in a mesh. The server that owns a community (e.g. [email protected]) pushes out JSON data to any subscribers.

    Small servers won’t talk directly to each other unless they’re subscribed to communities on each other, so having a lot of small servers doesn’t add load on each other, only on the larger servers that host the more active communities.

    And, even then, the JSON requests are going to be a lower impact than a user actively browsing the site, though probably only marginally and maybe not in all cases.
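
    The push model described above can be sketched in a few lines. This is a toy illustration, not Lemmy's or ActivityPub's actual code: the `subscribe` and `fan_out` helpers, the instance names, and the activity dictionaries are all hypothetical, and real delivery would be an HTTP POST to each subscriber's inbox.

    ```python
    from collections import defaultdict

    # community -> instances with at least one subscriber (hypothetical data)
    subscribers = defaultdict(set)


    def subscribe(instance, community):
        """Record that some user on `instance` follows `community`."""
        subscribers[community].add(instance)


    def fan_out(community, activity):
        """Return the (target_instance, activity) pairs the hosting
        server would push; nothing is sent to unsubscribed instances."""
        return [(inst, activity) for inst in sorted(subscribers[community])]


    subscribe("small-a.example", "news@big.example")
    subscribe("small-b.example", "news@big.example")
    subscribe("small-a.example", "cats@small-b.example")

    news_targets = fan_out("news@big.example", {"type": "Create", "id": 1})
    cats_targets = fan_out("cats@small-b.example", {"type": "Create", "id": 2})
    ```

    In this sketch the two small instances only exchange traffic because one of them follows a community hosted on the other; otherwise all deliveries originate from the larger host.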

    • SJ0@lemmy.fbxl.net
      5 upvotes · 1 year ago

      One big difference between the JSON requests and a user calling the site directly is that your instance pulls all the data all the time, whereas a user only pulls the data they actually use.

      • cstine@lemmy.uncomfortable.business
        8 upvotes · 1 year ago

        Just to be pedantic, it’s not pull, it’s push: the data is POSTed from the server that hosts the community.

        Right now, loading a page makes a bunch of API queries to pull all the related data for the posts, votes, sidebar info, and so on, AND the API is very untuned, sending way more data than the web UI or a client needs to actually generate a page. Hence my ‘it’s less efficient’ comment, though this is certainly something that can be tweaked to improve performance between the backend and frontend.

        I will, however, admit that this is only true if someone is actually reading the content they’re subscribed to. The ‘subscribe to everything’ scripts turn this math on its head because now you are using resources to gather data you don’t care about.

      • redcalcium@lemmy.institute
        2 upvotes · 1 year ago

        Actually, the instance you federate with will push the data, and it does so at its own leisure (configurable by the instance admin). The data itself is already created in the queue (minimal database load), so it definitely has a lower impact than actual users browsing the site.

  • fubo@lemmy.world
    17 upvotes · 1 year ago

    No, it’s not harmful.

    However, if you be a butt, your whole instance will get banned.

  • cerevant@lemmy.world
    16 upvotes · 1 year ago

    Not harmful, but I would agree that the network seems optimized for a small number of user-focused servers, and a large number of community-focused servers.

    • Die4Ever
      3 upvotes · edited · 1 year ago

      Yeah, I feel like keeping communities on instances that have no users can really lighten the load: since they don’t actually ask for anything from other instances, they act like a funnel for all the requests. I do this with my own communities-only instance, and it has extremely low CPU load and storage needs because there are no users pulling stuff in, only posts going to it.

      I think if you load the website for a single post on instance A, that’s probably more strain on instance A than federating the post over to instance B and loading the website for it through instance B. But then you have to think about the federation of each comment and upvote/downvote, so maybe it is a little more strain for instance A. I don’t think this stuff is worth worrying about too much, but the ideal for efficiency is probably an instance with users who all have the same interests.

      • cerevant@lemmy.world
        3 upvotes · 1 year ago

        That’s pretty much my thinking. There is an advantage in that having a large number of users on an instance amplifies its caching effect, though as you say, if their interests are spread too far, that effect is diminished.

  • Eskuero@lemmy.fromshado.ws
    7 upvotes · 1 year ago

    I guess it might increase the load slightly, but I think individual instances actually do more harm when they go down and disappear, because of the timeouts.

    I run my own Lemmy instance where I’m by myself, because I don’t want the extra legal burden of random users signing up here. I also host two communities with around 100 subscribers that aren’t duplicated on other servers, so I guess it does actually contribute to the decentralization of the federation.

  • Spzi@lemm.ee
    1 upvote · 1 year ago

    “not actually serve the purpose of load balancing between servers.”

    That’s probably okay. Not all instances have to be the same in this regard. Each can decide for itself what strategy to follow, if and how much to grow.

    Instances that host communities will have traffic and will either find a way to handle the growth or stop growing at some point. If service quality suffers, people will probably migrate to other instances, which is a kind of long-term load balancing.