• 13 Posts
  • 42 Comments
Joined 3 months ago
Cake day: August 27, 2025









  • Exactly what an LLM-agent would reply. 😉

    I would say that the LLM-based agent thinks. And thinking is not only “steps of reasoning”, but also using external tools for RAG: searching the internet, querying relational databases, using interpreters and proof assistants.

    You just described your subjective experience of thinking. And maybe a vague definition of what thinking is. We all know this subjective representation of thinking/reasoning/decision-making is not a good representation of some objective reality (countless psychological and cognitive experiments have demonstrated this). That you are not able to make sense of intermediate LLM reasoning steps does not say much (except just that). The important thing is that the agent is able to make use of it.

    The LLM can for sure make abstract models of reality, generalize, create analogies and then extrapolate. One might even claim that’s a fundamental function of the transformer.

    I would classify myself as a rather intuitive person. I have flashes of insight which I later have to “manually” prove/deduce (if acting on the intuition implies risk). My thought process is usually quite fuzzy and chaotic. I may very well follow a lead which turns out to be a dead end, and from that infer something which might seem completely unrelated.

    A likely more accurate organic/brain analogy would be that the LLM is a part of the frontal cortex. The LLM must exist as a component in a larger heterogeneous ecosystem. It doesn’t even have to be an LLM. Some kind of generative or inference engine that produces useful information which can then be modified and corrected by other, more specialized components and also inserted into some feedback loop. The thing which makes people excited is the generating part. And everyone who takes AI or LLMs seriously understands that the LLM is just one, but vital, component of a truly “intelligent” system.

    Defining intelligence is another related subject. My favorite general definition is “lossless compression”. And the only useful definition of general intelligence is: the opposite of narrow/specific intelligence (it does not say anything about how good the system is).
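
    To make the “lossless compression” definition concrete, here is a toy sketch of my own (an illustration, not a formal metric): the better a system can predict data, the fewer bits it needs to encode that data losslessly. Plain zlib stands in for a weak predictive model; a stronger model would compress the same data further.

    ```python
    # Toy illustration of "intelligence as lossless compression":
    # data that is easy to predict compresses far better than data
    # with no learnable structure.
    import os
    import zlib

    structured = b"the cat sat on the mat. " * 100  # highly regular, easy to model
    noise = os.urandom(len(structured))             # incompressible, nothing to model

    for name, data in [("structured", structured), ("noise", noise)]:
        ratio = len(zlib.compress(data, 9)) / len(data)
        print(f"{name}: compressed to {ratio:.2%} of original size")
    ```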



  • How are humans different from LLMs under RL/genetics? To me, they both look like token generators with a fitness. Some are quite good. Some are terrible. Both do fast and slow thinking. Some have access to tools. Some have nothing. And they both survive if they are a good fit for their application.

    I find the technical details quite irrelevant here. That might be relevant if you want to discuss short term politics, priorities and applied ethics. Still, it looks like you’re approaching this with a lot of bias and probably a bunch of false premises.

    BTW, I agree that quantum computing is BS.






  • sniktato to Open Source@lemmy.ml · Servo 0.0.1 released! · 1 month ago

    I’m curious about why they aren’t stable yet. It’s been a while. And there should be a lot of incentives. Like cash, if one wants some.

    My guess is that something is not right with the project or the chosen technology. And that some other project will be the first to deliver the new memory-safe browser engine reference implementation. Maybe I’m just being grumpy.



  • GPT-OSS:120b is really good.

    Tools are powerful and make local inference on cheap hardware good enough for most people.

    DSPy is pretty cool (rough sketch at the end of this comment).

    Intel caught my attention at the beginning of 2025, but they seem to have given up on their software stack. I regret buying cheap Arcs for inference.

    Inference on AMD is good enough for production.
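
    For context, a rough sketch of what the DSPy-plus-local-inference combo looks like. The model name and the local OpenAI-compatible endpoint are assumptions; point it at whatever server you actually run.

    ```python
    import dspy

    # Assumed: a local OpenAI-compatible server (e.g. one serving gpt-oss-120b)
    # listening on localhost:8000.
    lm = dspy.LM(
        "openai/gpt-oss-120b",
        api_base="http://localhost:8000/v1",
        api_key="local",
    )
    dspy.configure(lm=lm)

    # A declarative signature instead of a hand-written prompt.
    qa = dspy.ChainOfThought("question -> answer")
    print(qa(question="Why do kernel backports matter for new GPUs?").answer)
    ```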



  • Oh, that’s quite fancy hardware.

    Hmm… Unless exllama is explicitly recommended by NVIDIA for that particular GPU and setup, it seems “risky”. vLLM seems to be the popular choice for most “production” systems. I’m switching from llama.cpp to vLLM because of better performance and because it’s the engine recommended by most model providers (rough sketch below). I don’t really have the time to benchmark, so I’ll just do what the documentation says. And it’s really hard to do good benchmarks, especially when “qualitative language performance” can vary for the same weights on different hardware/software.

    With that kind of hardware, I would do exactly what NVIDIA and your model provider(s) say. Otherwise you might waste a lot of GPU power.
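
    To be concrete about “just do what the documentation says”, here is a minimal vLLM offline-inference sketch. The model name and tensor_parallel_size are placeholders; use whatever NVIDIA and your model provider recommend for your cards.

    ```python
    from vllm import LLM, SamplingParams

    # Placeholder model; swap in the one your provider recommends for your GPUs,
    # and raise tensor_parallel_size to match the number of cards.
    llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", tensor_parallel_size=1)

    params = SamplingParams(temperature=0.7, max_tokens=256)
    outputs = llm.generate(["Summarize why paged attention helps throughput."], params)
    print(outputs[0].outputs[0].text)
    ```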


  • TLDR: Yes, it matters. Especially when it comes to inference and “new” features and hacks it relies on.

    What GPU and what inference engine are you using?

    On Debian I would use the stable release (not oldstable), enable non-free firmware, and also install the backports versions of the kernel and the non-free firmware packages. Then you’re probably set for a year or two.

    An old kernel with only free firmware likely performs much worse. Look at the release logs of the Linux kernel and any GPU driver.

    If your hardware is very old, it probably doesn’t matter super much. But sometimes it does (like when a manufacturer decides to unlock some sleeping feature in an old forgotten device).