- cross-posted to:
- raspberry_pi
Cool concept and respect to Bing Pham for going through with this. That being said, the LLM on these DIY devices is likely borderline unusable:
This is an interesting interface, but it is built for a very specific use case, so it will not work well for every application. The much bigger issue is the system’s performance. A tiny 15M-parameter model works well enough, processing each token in about 200 milliseconds, but even a 77M-parameter model pushes that to 2.5 seconds per token. Furthermore, these tiny models are not especially capable, which greatly limits their utility for any practical use.
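To put those per-token times in perspective, here is a rough back-of-the-envelope sketch using only the latencies quoted above; the 50-token reply length is an assumed example, not a figure from the article:

```python
# Rough throughput/response-time math from the quoted per-token latencies.
# The 50-token reply length below is an arbitrary assumption for illustration.

MODELS = {
    "15M params": 0.2,  # ~200 ms per token (quoted)
    "77M params": 2.5,  # ~2.5 s per token (quoted)
}

REPLY_TOKENS = 50  # hypothetical short reply

for name, sec_per_token in MODELS.items():
    tokens_per_sec = 1.0 / sec_per_token
    reply_time = REPLY_TOKENS * sec_per_token
    print(f"{name}: {tokens_per_sec:.1f} tok/s, "
          f"~{reply_time:.0f} s for a {REPLY_TOKENS}-token reply")
```

Under those assumptions the 15M model manages about 5 tokens per second (roughly 10 seconds for a short reply), while the 77M model drops to 0.4 tokens per second, i.e. over two minutes for the same reply, which is why the comment above calls it borderline unusable.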