- cross-posted to:
- [email protected]
Once we have super fast reliable internet we’ll likely have the whole computer as a service. We’ll just have access terminals basically and a subscription with a login, except for the nerds who want their own physical machine.
Bro just reinvented mainframes.
They’ve been reinvented repeatedly. Citrix, terminal servers, thin clients, cloud desktops, web apps, remote app delivery…
Most people (not necessarily here) need a web browser and an office program. Most people are well suited to terminals or something like a Chromebook.
I need actual hardware for my job and hobbies, but even I have a mini PC set up like a gaming console so that if I want to play games on my bedroom TV I don’t have to hook up my Steam Deck or gaming laptop. I just stream them.
Microsoft: Write it down! Write it down!
& thin clients
No. Just no.
And get off my lawn, ya whippersnapper.
RAM as a service can’t happen. It’s just far too slow. The whole computer can, though. Its RAM can be local so it can access it quickly; then it just needs to stream the video over, which is relatively simple, though it adds some latency to deal with.
Mhmm… Computer as a service. Why does that sound familiar…?
You have to hand it to the French though, that stuff was pretty dope.
I was there, Gandalf. I was there, three thousand years ago.
Given how so many of us communicate, work, and compute using cloud platforms and services, we’re basically already there.
How many apps are basically just a dumb client using a REST API?
you will own nothing and be happy!
Wait, we already had that in the 70s.
You have to know that some dinosaur at IBM is screaming about how they gave up the centralized computer and is salivating over gigabit fiber so he can charge everyone 15 bucks a month to use an IBM mainframe.
Stadia almost didn’t suck, I bet we’re 10 years from phones just being hand terminals that tap into a local server and desktops won’t be far behind.
Like in The Expanse?
Exactly like The Expanse.
Fucking love those books, am listening to one now.
Have you seen the Amazon show at all? How did you feel about it?
For many of us Stadia didn’t suck at all, except for the game library and Google’s lack of commitment.
sweaty gamers and nerds as always unite over having proper physical PCs rather than online services or consoles.
Given the digital literacy of many “regular people” (e.g. my father, and seemingly every other of my friends), the idea is appealing. Especially, as most of them don’t care about privacy. Give them decent availability, and they will throw money at you. And if you also give them support, I will, too.
That’s exactly how it works right now with VDI. I’m using one at work.
Honestly, cloud gaming is very good… when it is good. Sometimes it sucks. But when it’s good it’s incredible how much it feels like gaming locally.
Unsubscribe
It’ll never be fast enough. An SSD is orders of magnitude slower than RAM, which is orders of magnitude slower than cache. Internet speed is orders of magnitude slower than the slowest of hard drives, which is still way too slow to be used for anything that needs memory relatively soon.
Need faster than light travel speeds and we can colocate it on the moon
A SATA SSD has ballpark 500MB/s, a 10g ethernet link 1250MB/s. Which means that it can indeed be faster to swap to the RAM of another box on the LAN than to your local SSD.
A Crucial P5 has a bit over 3GB/s but then there’s 25g ethernet. Let’s not speak of 400g direct attach.
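The link figures in this exchange are just the line rate divided by eight. A minimal sketch of that arithmetic, assuming decimal units and ignoring protocol overhead (real throughput would be somewhat lower); the function name is made up for illustration:

```python
def link_mb_per_s(gbit_per_s: float) -> float:
    """Raw line rate in MB/s for a link given in Gbit/s (decimal units, no protocol overhead)."""
    return gbit_per_s * 1000 / 8


for name, gbit in [("10GbE", 10), ("25GbE", 25), ("400GbE", 400)]:
    print(f"{name}: {link_mb_per_s(gbit):.0f} MB/s")
```

That gives 1250 MB/s for 10GbE, 3125 MB/s for 25GbE, and 50000 MB/s for 400GbE.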
- modern NVMe SSDs have much more bandwidth than that, on the order of > 3GiB/s.
- even an antique SATA SSD from 2009 will probably have much lower access latency than sending commands to a remote device over an ethernet link and waiting for a response
Show me an SSD with 50GB/s, it’d need a PCIe6x8 or PCIe5x16 connection. By the time you RAID your swap you should really be eyeing that SFP+ port. Or muse about PCIe cards with RAM on them.
Speaking of: You can swap to VRAM.
My point was more that the SSD will likely have lower latency than an Ethernet link in any case, as you’ve got the extra delay of data having to traverse both the local and remote network stack, as well as any switches that may be in the way. Additionally, in order to deal with that bandwidth you’ll need to kit out not only the local machine, but also the remote one with expensive 400GbE hardware+transceivers, plus switches, and in order to actually store something the remote machine will also have to have either a ludicrous amount of RAM (resulting in a setup which is vastly more complex and expensive than the original RAIDed SSDs while offering presumably similar performance) or RAIDed SSD storage (which would put us right back at square one, but with extra latency). Maybe there’s something I’m missing here, but I fail to see how this could possibly be set up in a way which outperforms locally attached swap space.
Maybe there’s something I’m missing here
SFP direct attach, you don’t need a switch or transceivers, only two QSFP-DD ports and a cable. Also this is a thought exercise not a budget meeting. Start out with “We have this dual socket EPYC system here with full 12TB memory and need to double that”. You have *rolls dice* 104 free PCIe5 lanes, go.
Bandwidth isn’t really most of the issue. It’s latency. It’s the amount of time from the CPU requesting a segment of memory to receiving it, which bandwidth doesn’t affect.
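To put rough numbers on the latency point: a sketch that models a fetch as a fixed round trip plus transfer time. The 50 microsecond round trip is an illustrative assumption, not a measurement, and the function name is made up here.

```python
def fetch_time_us(size_bytes: int, latency_us: float, bandwidth_mb_per_s: float) -> float:
    """Time to fetch one block in microseconds: fixed latency plus transfer time."""
    # 1 MB/s is 1 byte per microsecond, so bytes / (MB/s) gives microseconds.
    transfer_us = size_bytes / bandwidth_mb_per_s
    return latency_us + transfer_us


PAGE = 4096          # one 4 KiB memory page
ASSUMED_RTT = 50.0   # assumed network round trip in microseconds (illustrative only)

slow_link = fetch_time_us(PAGE, ASSUMED_RTT, 1250.0)   # 10GbE line rate
fast_link = fetch_time_us(PAGE, ASSUMED_RTT, 2500.0)   # twice the bandwidth
print(f"{slow_link:.2f} us vs {fast_link:.2f} us")
```

Under these assumptions, doubling the bandwidth shaves only about 3% off a single page fetch, because the fixed round trip dominates.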
Depends on your workload and access pattern.
…I’m saying can be faster. Not is faster.
Yeah, but the point of RAM is fast random (the R in RAM) access times. There are ways to make slower memory work better for this by predicting what will be needed (grab a chunk of memory because accesses will probably need things with closer locality than pure random), but it can’t be fixed. Cloud memory is good for non-random storage or storage that isn’t time critical.
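The "grab a chunk because nearby accesses are likely" idea can be sketched as a tiny LRU cache that fetches whole chunks from a slow store. Everything here (class name, chunk size, cache size, the two access patterns) is a made-up illustration, and a deterministic strided pattern stands in for random access so the result is reproducible.

```python
from collections import OrderedDict


class ChunkCache:
    """Small LRU cache that pulls whole chunks from a slow backing store."""

    def __init__(self, chunk_size: int, max_chunks: int):
        self.chunk_size = chunk_size
        self.max_chunks = max_chunks
        self.chunks = OrderedDict()  # chunk index -> placeholder for chunk data
        self.slow_fetches = 0

    def read(self, address: int) -> None:
        index = address // self.chunk_size
        if index in self.chunks:
            self.chunks.move_to_end(index)  # mark as most recently used
            return
        self.slow_fetches += 1  # cache miss: one trip to the slow store
        self.chunks[index] = True
        if len(self.chunks) > self.max_chunks:
            self.chunks.popitem(last=False)  # evict the least recently used chunk


def count_fetches(addresses, chunk_size=64, max_chunks=4):
    cache = ChunkCache(chunk_size, max_chunks)
    for address in addresses:
        cache.read(address)
    return cache.slow_fetches


sequential = range(1024)
strided = [(i * 64) % 1024 for i in range(1024)]  # jumps a full chunk on every access
print(count_fetches(sequential), count_fetches(strided))
```

With good locality the 1024 reads cost 16 slow fetches; with the strided pattern every one of the 1024 reads misses, which is the case that can't be fixed by prefetching.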
So I could download more RAM?
You can do it today, just put your swapfile on sshfs and you’re done.
It will crash as soon as it needs to touch the swap due to the relatively insane latency difference.
So use a small area in memory as cache
the infinite memory paradox. quaint. (lol)
It’s just a NUMA architecture. Linux can handle it.
Imagine doing this on a dial-up 56K modem
A:\SPICYMEMES\MODEMSOUND.WAV
Bwa-hahahahhah "A:" 🤣
bro is still using floppies
all the cool kids use iomega
For those too young to remember, the A:\ drive was for the hard 3" floppy disks and B:\ drive was for the soft 5.25" floppy disks. The C:\ drive was for the new HDDs that came out, and for whatever reason the C:\ drive became the standard after that.
A was the first floppy drive and B the second floppy drive (in dos and cp/m). The type of drive was irrelevant.
the A:\ drive was for the hard 3" floppy disks and B:\ drive was for the soft 5.25" floppy disks.
FWIW they were the other way around on my system. The order of A:\ vs B:\ depended on their order on the cable (“first” and “second”), not type.
wait, didn’t some tech youtubers like LTT try using cloud storage as swap/RAM? afaik they failed because of latency
This guy used ICMP data payload as a hard drive. It kinda worked.
I remember using ICMP data to bypass my high school’s firewall. TCP and UDP were very locked down, but they allowed pings. It was slow though - I think I managed to get a few KB per sec. Maybe there’s faster/fancier firewall bypass methods these days. This was back in the 2000s when an entire school would have a single OC-1 fiber connection.
155mbps Telco trunk line for a school? Nicer school than I went to.
Around 50Mbps: https://en.wikipedia.org/wiki/Optical_Carrier_transmission_rates#OC-1
I only had dialup at the time, and the fastest home broadband available was 1.5Mbps ADSL, so it was pretty fancy!
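For reference, optical carrier levels are multiples of a 51.84 Mbit/s base rate, which is where both numbers in this exchange come from: 155 Mbit/s is OC-3, not OC-1. A quick sketch (the function name is just for illustration):

```python
OC1_MBIT_PER_S = 51.84  # base SONET rate; OC-n is n times this


def oc_rate_mbit(n: int) -> float:
    """Line rate of OC-n in Mbit/s."""
    return n * OC1_MBIT_PER_S


for level in (1, 3, 12):
    print(f"OC-{level}: {oc_rate_mbit(level):.2f} Mbit/s")
```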
Afaik they used it as redundant off-site backup
I wonder if there would be a speed boost by setting 2 gdrive as raid 0 for off site backups
The limiting factor is mostly your upload speed. You also need to have good QoS set up, or you’ll have very limited internet usability. Whereas on-site you can get way higher speeds for cheaper.
I feel like this might be a giant gaping security risk.
So are pretty much all of the cloud services the average user already subscribes to. People still use them though.
Agreed. This is especially bad, though, because if it’s compromised they basically have hardware-level access to your machine. Unless you’re using encrypted swap, and I’m not sure how standard that is.
Well, assuming you’ve already gone through the effort to write a custom kernel module to offload your swap pages to Google Drive, it doesn’t seem like that much of a stretch to have it encrypt the data before transmitting it.
Is that what this would take? Then yeah, you’d hope somewhere in the process you consider this.
Obviously you should set up device mapper to encrypt the gdrive device then put the swap on the encrypted mapper device.
If your kernel isn’t using 90% of your CPU resources, are you really even using it to its full potential? /s
Oh wow, I didn’t even know Gdrive offered a 1 petabyte option 😂
They don’t to my knowledge, I believe that’s mounted through rclone which just usually sets the filesystem size to 1PB so that it doesn’t have to try to query what the actual limit is for the various providers (and your specific plan).
Once upon a time, Google offered unlimited drive storage as part of some GSuite tiers. They stopped offering it a while ago and have kicked most/all legacy users off of it in the past few months. It was glorious while it lasted 😢
Guess they ran everyone out of business that they needed to, so now the premium features get yanked and your choice of alternatives is curtailed. Hooray for enshittification.
It’s not that, it’s that people were abusing it by using it for things like Plex with 100TB+ of data, which cost Google more than the revenue they got as a result. Blame the people that abused the policy. They’re not a charity and can’t keep an offer if they lose money as a result. Keep in mind that Google Drive data has several replicas and is also backed up to cold storage on LTO tapes, so people abusing the storage policy is actually pretty expensive for them.
They do still have unlimited data in some cases, for example with custom plans for large companies (like 50k+ employees).
And Google docs/sheets/slides used to not count in your used space.
At one point they offered unlimited storage for Play Music only. You could literally upload your entire collection. They changed it later to consume your Drive storage. Cheap enough plans so I subscribed. Then they killed off Play Music. I’m still salty about that.
Yea where do you get that? I can’t see anything on their pricing page, only goes up to 2tb
Even better:
Free cloud storage that doesn’t require an account and provides no limit to the volume of data stored
The image doesn’t load.
I posted that 10 months ago.
That being said, it seems to still work for me.
Protip: Put swapfile on ramdisk for highest speed
Unironically that’s how zram works
Don’t do my boy zram dirty, it has a ton of utility when you have ample spare compute and limited RAM.
Is that not how it works though? Lol
Doesn’t it compress the contents that it’s storing to help kind of get the best of both worlds?
You get faster storage because it’s in ram still, but with it being compressed there’s also “more” available?
I could be completely mistaken though
You are correct, although zram uses more cpu power since it compresses things. It’s not really an issue if you’re not using a potato :=)
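To see the "more available" effect in miniature, here is a sketch that compresses one page-sized buffer in memory. It uses Python's zlib purely as a stand-in (zram itself uses kernel compressors, not this module), and the repetitive sample data is an assumption chosen to compress well.

```python
import zlib


def compress_page(page: bytes) -> bytes:
    """Compress one page in memory; zlib stands in for the kernel's compressor."""
    return zlib.compress(page)


# One 4 KiB "page" of repetitive data (illustrative; real pages vary a lot).
page = (b"SELECT * FROM orders;" * 200)[:4096]
compressed = compress_page(page)
print(len(page), "->", len(compressed), "bytes")

# The round trip is lossless, at the cost of CPU time spent on each swap in/out.
assert zlib.decompress(compressed) == page
```

Data that is already compressed or random shrinks little or not at all, so the effective gain depends on what is sitting in memory.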
even if you are using a potato it probably doesn’t have much ram so slightly slowing it to make things run smoother is a very popular choice
Today I learned!
Hopefully that swap is on an SSD, otherwise that query may not ever finish lol
Once you’re deep into swap, things can get so slow that there’s no recovering from it.
WHAT FUCKING QUERY ARE YOU RUNNING TO USE UP THAT MUCH MEMORY DAMN
In a database course I took, the teacher told a story about a company that would take three days to insert a single order. Thing was, they were the sort of company that took in one or two orders every year. When it’s your whole revenue on the line, you want to make sure everything is correct. The relations in that database were checked to hell and back, and they didn’t care if it took a week.
Though that would have been in the 90s, so it’d go a lot faster now.
What did they produce? Cruiseships?
No idea, but I imagine it was something big like that, yes. I think it was in northern Wisconsin, so laker ships are a good guess.
We have a company like that here somewhere. When they have one job a year, they have to reduce hours, if they have two, they are doing OK, and if they have three, they have to work overtime like mad. Don’t ask me what they are selling, though. It is big, runs on tracks, and fixes roads.
A very very badly written one no doubt…
Why stop at just one full table scan?
Exactly how I plan to deploy LLMs on my desktop 😹
You should be able to fit a model like LLaMa2 in 64GB RAM, but output will be pretty slow if it’s CPU-only. GPUs are a lot faster but you’d need at least 48GB of VRAM, for example two 3090s.
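A back-of-the-envelope way to check figures like these: weight size is roughly parameter count times bytes per weight. The sketch below assumes the comment means the 70B variant and counts weights only, so the KV cache and runtime overhead come on top; the function name is hypothetical.

```python
def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GB (decimal)."""
    return params_billion * bits_per_weight / 8


for bits in (16, 8, 4):
    print(f"70B at {bits}-bit: {weights_gb(70, bits):.0f} GB")
```

That works out to 140 GB at 16-bit, 70 GB at 8-bit, and 35 GB at 4-bit, so under these assumptions only a 4-bit quantization would fit in 48GB of VRAM or 64GB of RAM.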
Amazon had some promotion in the summer and they had a cheap 3060 so I grabbed that and for Stable Diffusion it was more than enough, so I thought oh… I’ll try out llama as well. After 2 days of dicking around, trying to load a whack of models, I spent a couple bucks and spooled up a runpod instance. It was more affordable than I thought, definitely cheaper than buying another video card.
As far as I know, Stable Diffusion is a far smaller model than Llama. The fact that a model as large as LLaMa can even run on consumer hardware is a big achievement.
Both SD 1.5 and SDXL run on 4g cards, you really want fp16 though.
In principle it should be possible to get decentish performance out of e.g. an RX480 by using the (forced) 32-bit precision to do bigger winograd convolutions (severely reducing the number of fmas needed) but don’t expect AMD to write kernels for that, ROCm is barely working on mid range cards in the first place.
Meanwhile, I actually ended up doubling my swap because 16G RAM are kinda borderline to merge SDXL models. OOM might kick in, it might not, and in any case your system is going to lock without earlyoom.
I had couple 13B models loaded in, it was ok. But I really wanted a 30B so I got a runpod. I’m using it for api, I did spot pricing and it’s like $0.70/hour
I didn’t know what to do with it at first, but when I found Simply Tavern I kinda got hooked.
*laughs in top of the line 2012 hardware 😭
I need it just for the initial load on transformers based models to then run them in 8 bit. It is ideal for that situation
That does make a lot of sense
Same. I’m patient
I’d be lying if I said I hadn’t done something similar before.
Wrote my master’s thesis this way - didn’t have enough RAM or knowledge, but plenty of time on the lab machine, so I let it do its thing overnight.
Sorry, lab machine ssd.
It gave its life for academic achievement, there is no finer death for hardware. o7
You really need to index your tables. This has all the hallmarks of a Cartesian cross product.
I dunno why I didn’t realize you can add more swap to a system while running. Nice trick for a dire emergency.
Even better, you can swapoff swap too!
It’s Linux, it’s made by people with a brain.
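If you add or remove swap on a live system, one way to confirm it took effect is to read the SwapTotal and SwapFree fields from /proc/meminfo. A minimal sketch, assuming a Linux system; the helper name is made up for illustration:

```python
def swap_kib(meminfo_text: str) -> dict:
    """Extract SwapTotal and SwapFree (in KiB) from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("SwapTotal", "SwapFree"):
            values[key] = int(rest.split()[0])  # first token is the number, second is "kB"
    return values


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print(swap_kib(f.read()))
    except OSError:
        print("could not read /proc/meminfo (not Linux?)")
```

Run it before and after swapon or swapoff and SwapTotal should change accordingly.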
Does the OOM killer actually work for anyone? In every linux system I’ve used, if I run out of memory, the system simply freezes.
Absolutely can and will take action. Doesn’t always kill the right process (sometimes it kills big database engines for the crime of existing), but usually gives me enough headroom to SSH back in and fix it myself.
I have limited experience with Linux, but why is it that when my system locks up, SSH still tends to work and let me fix things remotely? Like, if the system isn’t locked up, let me fix it right here and now and give me back control, if it is locked up, how is SSH working to help me?
That’s the nifty thing about Unix: stuff like this works. When you say “locked up”, I’m assuming you refer to logging in to a graphical environment, like Gnome, KDE, XFCE, etc. To an extent, this can even apply to some heavy server processes: just replace most of the references to graphical with application access.
Even lightweight graphical environments can take a decent amount of muscle to run, or else they lag. Plus even at a low level, they have to constantly redraw the cursor as you move it around the screen.
SSH and plain terminals (Ctrl-Alt-F#, what number is which varies by distro) take almost no resources to run: SSH/Getty (which are already running), a quick process call to the password system, then a shell like bash or zsh. A singular GUI application may take more standing RAM at idle than this entire stack. Also, if you’re out of disk space, the graphical stack may not be able to stay alive.
So when you’re limited on resources, be it either by low spec system or a resource exhaustion issue, it takes almost no overhead to have an extra shell running. So it can squeeze into a tiny corner of what’s leftover on your resource-starved computer.
Additionally, from a user experience perspective, if you press a key and it takes a beat to show up, it doesn’t feel as bad as if it had taken the same beat for your cursor redraw to occur (which also burns extra CPU cycles you may not be able to spare)
Thanks, great answer!
Yes, it takes surprisingly long for the OOM killer to take action, but the system unfreezes. Just wait a few minutes and see whether that does the trick.
Yes. If you have swap the system will crawl to a halt before the process is killed though, SSDs are like a thousand times slower than RAM. Swapoff and allocate a ton of memory to see it in action.
NVMe PCIe 4 SSDs are quite fast now tho, you can get between DDR1 and DDR2 speeds from a modern SSD. This is why Apple is using its SSDs as swap quite aggressively. I’m using a MacBook Pro with 16 GB of RAM and my swap usage regularly goes past 20 GB and I didn’t experience any slowdown during work.
Depends if the allocated memory is actively used or not. Some apps do not require a large amount of random access memory, and are totally fine with a small part of random access memory and a large part of not so random access and not so often used memory.
Alternatively I can imagine that MacOS simply has a damn good algorithm to determine what can be moved to swap and what cannot be moved to swap. They may also be using the SSD in SLC mode so that could contribute to the speedup as well.
It never kicks in for me when it should, but I figured out I can force trigger it manually with the magic SysRq key (Alt+SysRq+F, needs to be enabled first), which instantly recovers my system when it starts freezing from memory pressure.
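On the "needs to be enabled first" part: the setting lives in /proc/sys/kernel/sysrq. As I understand the kernel's SysRq documentation, a value of 1 enables everything, and otherwise the value is a bitmask in which bit 64 covers process signalling, including the manual OOM kill. A hedged sketch of checking that (the function name is hypothetical, and the bit meaning is worth verifying against the docs for your kernel):

```python
SYSRQ_SIGNAL_BIT = 64  # assumed from kernel docs: signalling of processes (term, kill, oom-kill)


def oom_kill_key_allowed(sysrq_value: int) -> bool:
    """True if this kernel.sysrq value should permit Alt+SysRq+F."""
    return sysrq_value == 1 or bool(sysrq_value & SYSRQ_SIGNAL_BIT)


if __name__ == "__main__":
    try:
        with open("/proc/sys/kernel/sysrq") as f:
            value = int(f.read().strip())
        print(value, "->", oom_kill_key_allowed(value))
    except OSError:
        print("could not read /proc/sys/kernel/sysrq (not Linux?)")
```

A value such as 176 (128 + 32 + 16) leaves bit 64 unset, so the key combo would do nothing until the setting is changed.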
Alt+SysRq+F, needs to be enabled first
Do note that this opens up a security hole. Since this can kill any app at random and is not interceptable, if you leave your PC in a public place, someone could come up and press this combo a few times. Chances are, it’ll kill whatever the locking app you’re using.
Oh yes. I’ve had massive compiles (well linking) which failed because of the OOM killer, and I did exactly the same, massive swap so it will just keep going. So what if it’s using disk as RAM and unusable for a few hours in the middle of the night, at least it finishes!
Yeah, default Ubuntu LTS webserver kicked the mysqld on a stupid query (but it worked on dev - all developers, someday) not too long ago…
it does for me, usually by killing my session and throwing me back to the login screen
Slow SSD issue. RAM is for chumps.
Poor man’s Optane
Swap thrashing goes brrrrrrrrrrrrr
I’ve actually done something similar with a 2GB ram machine… 2GB ram / 8GB zswap, actually ran way faster lol
Yeah it works surprisingly well. I installed Gentoo on a 2005 era laptop a few years ago and had to keep adding zswap until Rust could compile for Firefox. Iirc it took about 12G of zswap to get it working, but it wasn’t too bad overall.
I actually ended up doing something like this on my main desktop lol, 16GB ram/ 32GB swap… I hate closing programs
Just download more RAM.
Joke’s on you, I used to have a 128GB SSD just for swap in my laptop