• sweng
    6 months ago

    What do you mean “full set if data”?

    Obviously you cannot train on 100% of material ever created, so you pick a subset. There is a lot of permissively licensed content (e.g. Wikipedia) and content you can license (e.g. Reddit). While not sufficient for an advanced LLM, it certainly is for smaller models that do not need wide knowledge.

    • the_doktor@lemmy.zip
      6 months ago

      You can’t even rely on Wikipedia to be right, and how is Reddit “content you can license”? Its articles are owned by the sites they link to, and the original material posted there comes from users and is usually wildly inaccurate or outright wrong (or even downright dangerous). And even when these models do pull in tons of stuff they shouldn’t, the results are frequently laughably wrong.

      You’re not making a good argument for LLM crap here. Just accept the fact that it’s a failed technology that needs to be shut down. Please. How are people so excited and gung-ho over this garbage, failed, laughably bad technology? It’s almost like people WANT chaos.

      • sweng
        6 months ago

        Wikipedia is no less reliable than other content. There’s even academic research about it (no, I will not dig for sources now, so feel free to not believe it). But factual correctness only matters for models that deal with facts: for a translation model, for example, it does not matter.

        Reddit has a massive amount of user-generated content it owns, e.g. comments. Again, the factual correctness only matters in some contexts, not all.

        I’m not sure why you keep mentioning LLMs since that is not what is being discussed. Firefox has no plans to use some LLM to generate content where facts play an important role.

        • the_doktor@lemmy.zip
          6 months ago

          Sure hasn’t helped AI/LLMs with accuracy yet. And never will. Computing doesn’t actually think and reason, it’s just mashing together bits of data it has, and if what it has now isn’t accurate, how is anything going to be?

          You and others continue to harp on how great this new technology is, and meanwhile we have seen it do nothing but absolutely, laughably fail. You keep saying it will get better, but it won’t. It is limited by the fact that computers don’t work that way. I’m sick and tired of the people justifying this garbage “tech” that is stealing art, code, and text, sucking up huge amounts of power, giving wrong information, and telling people to do dangerous things, or even to kill themselves, because computers don’t know the difference.

          Just admit it. AI/LLM is garbage. Please. Stop being a massive fanboy for something that has clearly, evidently, 100% failed miserably and dangerously.

          • sweng
            6 months ago

            I think you are replying to the wrong person?

            I did not say it helps with accuracy. I did not say LLMs will get better. I did not even say we should use LLMs.

            But even if I did, none of your points are relevant for the Firefox use case.