No, you don't need a 'very bespoke AOSP' to turn your phone into a Rabbit R1 — here's proof

AnActOfCreation · 1 year ago

ⓝⓞ🅞🅝🅔@lemmy.ca · 1 year ago

I’ve no skin in the game related to this device or software. I simply don’t care. However, in terms of an AI assistant, I am curious if there is anything on the market, including this APK device, that is worth using. For someone who is privacy centric, advert avoidant, and security focused… does (or even can) such an app/tool exist?

db2@lemmy.world · 1 year ago

You won’t like anything on offer currently except for that which is entirely self hosted.

ⓝⓞ🅞🅝🅔@lemmy.ca · 1 year ago

What are a couple of the best self hosted options? I wouldn’t mind giving it a go on my server. I may not have enough juice with what I currently run, but perhaps as a proof of concept first and then new hardware later.

bassomitron@lemmy.world · edit-2 1 year ago

That’s a deep rabbit hole (no pun intended). I know it’s blasphemy to mention the other site around here, but check out the r/LocalLLaMA subreddit. It covers more models than just LLaMA. There are literally thousands of variations at this point, so preferences are quite subjective and depend on your use case. Your best bet is to start researching on your own for your intended purposes and available resources. Hugging Face is the main model repository, as well.

Balder@lemmy.world · 1 year ago

Yeah, the best way out of it is to get a few of the most recommended ones and test by yourself.

andrew0@lemmy.dbzer0.com · 1 year ago

What db2 already said. Microsoft just released Phi-3 mini, which could, allegedly, run locally on newer smartphones.

If I understood correctly, the Rabbit thingy just captures your information locally and then forwards it to their server. So, if you want more power, you could probably do the same by submitting the same info to a bigger open source model than Phi-3, like Llama 3, hosted on your homelab. I believe you can set it up with huggingface/gradio, which sort of provides an API that you could use.

That way, you don’t need a shitty orange box, and can always get the latest open source models with a few lines of code. There are plenty of open source frameworks in the works at the moment, and I believe that we’re not far off from having multi-modal LLMs running on homelab-level hardware (if you don’t mind a bit of lag).

ⓝⓞ🅞🅝🅔@lemmy.ca · 1 year ago

huggingface! i found it once and never could remember again who hosted all those models. thank you!

i use a different device than homelab, but now I am curious what I may be able to achieve with my syno system. it’s hw weak, so probably not a lot. but i would like to give it a go. if it’s decent, i may consider another device for the purpose.

andrew0@lemmy.dbzer0.com · 1 year ago

Good luck! You can try the huggingface-chat repo, or ollama with this web-ui. Both should be decent, as they have instructions to set up a docker container.
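For the ollama route, talking to the model from your own code can be as small as one HTTP POST. Here is a rough sketch, assuming an Ollama server is already running on its default port (11434) on the same machine and that a model tagged `llama3` has been pulled. The host, model tag, and function names are placeholders for illustration, so adjust them to your setup.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; change if your server lives elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="llama3", url=OLLAMA_URL):
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="llama3", url=OLLAMA_URL, timeout=300):
    """Send the prompt and return the generated text (needs a running server)."""
    req = build_generate_request(prompt, model=model, url=url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the reply is one JSON object with a "response" field.
    return body["response"]


if __name__ == "__main__":
    print(ask("In one sentence, what is a homelab?"))
```

The long timeout is deliberate: on CPU-only hardware a single answer can take a while.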

I believe the Llama 3 models are out there in a torrent somewhere, but I didn’t dig to find it. For the 70B model, you’ll probably need around 64GB of RAM available, but the 8B one should run fine with just 8GB. It will be somewhat slow though, compared to the ChatGPT experience. The self-attention mechanism can be parallelized, which is why you will see much better results on a GPU. According to some others that tested it, if you offload some stuff to RAM, you could see ~10-12 tokens per second on an RTX 3090 for certain 70B models. But more capable ones will be at less than 1 token per second, all depending on the context window you use.
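If you want to sanity-check RAM figures like these yourself, a back-of-the-envelope estimate is parameter count times bits per weight. The sketch below only counts the weights and assumes decimal gigabytes; the context window (KV cache), activations, and runtime overhead all come on top, so treat the results as a lower bound. The function name is made up for illustration.

```python
def weight_memory_gb(params_billion, bits_per_weight):
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    # params_billion * 1e9 weights * bytes each, divided by 1e9 bytes per GB
    return params_billion * bytes_per_weight


if __name__ == "__main__":
    # Llama 3 ships in 8B and 70B sizes.
    for size in (8, 70):
        for bits in (16, 8, 4):
            gb = weight_memory_gb(size, bits)
            print(f"{size}B model at {bits}-bit: about {gb:.0f} GB for weights")
```

By this estimate a 70B model needs roughly 35 GB for weights at 4-bit, which is why it only fits in about 64GB of RAM once quantized, while an 8B model at 4-bit to 8-bit lands in the 4 to 8 GB range.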

If you don’t have a GPU available, just give the Phi-3 model a try :D If you quantize it to 4 bits, it can apparently get 12 tokens per second on an iPhone haha. It should play nice with pulling information from a search engine, or a vector database like milvus, qdrant or chroma.
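To give an idea of what "pulling information from a vector database" means in practice, here is a toy sketch in plain Python. It is not the Milvus, Qdrant, or Chroma API; it just mimics the core step (rank stored snippets by cosine similarity to the query, then paste the best ones into the prompt). The two-dimensional vectors are hand-made stand-ins for real embeddings, and all names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, docs, k=2):
    """Return the k snippet texts whose vectors are closest to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question, snippets):
    """Put the retrieved snippets in front of the question for the model."""
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Fake embeddings: a real setup would get these from an embedding model.
    docs = [
        ("The NAS has 8GB of RAM.", [1.0, 0.0]),
        ("Pizza night is on Friday.", [0.0, 1.0]),
        ("The NAS runs Docker containers.", [0.9, 0.1]),
    ]
    question_vec = [1.0, 0.0]  # pretend embedding of the question
    print(build_prompt("What can my NAS run?", top_k(question_vec, docs)))
```

A small model like Phi-3 benefits a lot from this, since it can lean on the retrieved text instead of its own limited knowledge.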

ⓝⓞ🅞🅝🅔@lemmy.ca · 1 year ago

Thank you for all this! Much appreciated!

abhibeckert@lemmy.world · edit-2 1 year ago

ChatGPT 4 is a great assistant, I find it indispensable… I use it on my phone and computer but would like it in a dedicated device.

Privacy? Yeah it’s not great, but that’s mitigated by OpenAI focusing the product hard on areas that don’t really need privacy.

I do think these tools can be private - but to get there we need more RAM on our computers and phones, and it needs to be expensive high bandwidth RAM, which costs a fortune right now. A lot of research is being done to reduce memory requirements and more manufacturing capacity for memory is being ramped up.

ⓝⓞ🅞🅝🅔@lemmy.ca · 1 year ago

So you are using OpenAI’s app? Do you have it integrated into your phone? What are the main features that you use (beyond asking questions like one does from their app/site)?