My current system is running on an old 2U HP rackmount server with dual 16-core AMD Opteron 6262HE CPUs and two RAID-5 arrays (a fast SSD array and a slow 2.5" HDD array). There are generally 5-6 VMs running under a Linux host at a given time, but none of them are using a whole lot of CPU cycles.
In general, it’s noisy but fairly effective for my needs.
I’m looking at the future and what might be a good replacement that offers a blend of power-efficiency, flexibility, and storage cost.
In particular, I’d like to:
- Ditch the 2.5" HDD array in favor of an efficient separate storage system, preferably an attached NAS with 3.5" disks on RAID5, but probably actually networked and not USB-based (both for reliability and so I can potentially provide storage directly to stuff running on separate SBCs etc). A storage system I could drop in now and still use after I upgrade the compute system would be great.
- Keep the SATA SSD array for stuff that needs faster disk, or possibly move up to RAID’ed M.2/NVMe.
- Move up to a more modern CPU that has a good performance-per-watt balance. 8-16 cores total is probably good if that can be reasonably power-efficient for idle cores etc, but dropping some VMs to run stuff on the aforementioned SBCs is also an option.
- Still be rack-mounted for the main system, but not so freaking loud, and actually fit in a standard 24" deep rack.
- Potentially be able to add a decent GPU or add-on board for processing AI models etc.
Generally what it will be running is a bunch of VMs for stuff like NextCloud, remote-admin software, media servers (Plex/Jellyfin), a fileserver, some virtual desktops and various other fairly low-power VMs, BUT it’d be nice if I could add a dGPU or something with the horsepower for AI processing and periodic rendering/ripping/etc.
I’m sort of debating whether it might make more sense to move all storage to a NAS, then just replace the always-running stuff (NextCloud, Plex, fileserver) with SBCs so that they’re fairly easily swappable if something fails.
You’re asking for a lot of different things that don’t align, so instead of trying to guess what you need, let me just throw a few things out there:
- if you want transcoding that isn’t constantly tied to a GPU, you want an AMD chip of some sort. If you’re trying to be efficient, an APU model.
- you probably want to just merge all your data into a single disk array. Running multiple on one system is pointless and inefficient.
- separate your CPU needs from storage. Like you said, maybe you just need an ultra low power NAS, and then a different machine to handle compute.
- skip SSD for any network storage. Yes it’s more power efficient, but if you’re talking bulk storage, you’re wasting resources for smaller volumes where it won’t matter if transferred over the network. SSD is good for local-only for the most part.
I think you mean an Intel CPU for transcoding. Pretty much every piece of software can use QuickSync for hwaccel. My Celeron NAS can handle 3 simultaneous Plex transcodes as well as a 5-camera Frigate setup without breaking a sweat.
To OP - if you want energy efficiency AND ML, you will want pre-baked models that can be used on an edge compute device like Google Coral.
Good points.
Also, SSD isn’t always necessarily more power efficient than spinning disks. It depends on the specific disks, and the use-case.
I’ve seen a table posted on Lemmy with data on different drives’ power consumption for idle, read, and write. Sometimes the SSD consumed more power.
I’m looking at the future and what might be a good replacement that offers a blend of power-efficiency, flexibility, and storage cost.
Any modern CPU will improve energy efficiency. The AMD AM4 platform and Intel N100 are very cheap. The AMD SP3 platform is also very cheap and has a ton of PCIe lanes and memory expandability for GPUs, NVMe, and running lots of VMs.
For storage cost, used HDDs are currently $8/TB, generic NVMe is currently $45/TB, and used enterprise SSDs are $50/TB. Decide which you want to invest in depending on your needs. Used enterprise drives will be the fastest and most durable for databases and RAID.
https://www.ebay.com/sch/i.html?_nkw=pm983+m.2&_trksid=p4432023.m4084.l1313
SSD prices are expected to decrease faster than HDD prices, and will probably overtake HDDs for value in ~5 years.
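To put those per-terabyte figures in perspective, here is a rough sketch of what a given capacity would cost on each drive type. The prices are the ones quoted in this post (the NVMe figure is assumed to be per TB); the 20 TB target and the `media_cost` function name are made up for illustration, and RAID overhead is not counted.

```python
# Rough drive cost comparison using the per-TB prices quoted in the post.
# Assumption: the NVMe figure is per TB. The capacity below is hypothetical.
PRICE_PER_TB = {
    "used HDD": 8.0,
    "generic NVMe": 45.0,
    "used enterprise SSD": 50.0,
}


def media_cost(capacity_tb: float, prices: dict = PRICE_PER_TB) -> dict:
    """Return the raw drive cost in dollars for capacity_tb on each media type."""
    return {name: capacity_tb * per_tb for name, per_tb in prices.items()}


if __name__ == "__main__":
    # Hypothetical 20 TB of raw capacity, ignoring parity/redundancy overhead.
    for name, cost in media_cost(20).items():
        print(f"{name}: ${cost:,.0f} for 20 TB")
```

At these prices the bulk-storage gap is large, which is why HDDs still make sense for the NAS side even if the fast tier stays on SSD.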
About dGPUs, the Intel A310 is the best transcoding GPU out there. Used Nvidia datacenter GPUs probably have the best VRAM/$ and PyTorch compatibility for AI.
This is from 2023, but it might still be a good read on how to get to a very low-wattage idle build. There are a few more posts on other, older builds on that blog as well: https://mattgadient.com/7-watts-idle-on-intel-12th-13th-gen-the-foundation-for-building-a-low-power-server-nas/
The other responses have so far talked about hardware setup, so I’m not going to do that. Instead I’m looking at your software setup: VMs can be power inefficient compared to containers, especially for always-on services that idle often.
Perhaps my recent NAS/home server build can serve as a bit of an inspiration for you:
- AMD Ryzen 8500G (8 cores, much more powerful than your two CPUs, with iGPU)
- Standard B650 mainboard, 32 GB RAM
- 2 x used 10 TB HDDs in a ZFS pool (mainboard has 4x SATA ports)
- Debian Bookworm with Docker containers for applications (containers should be more efficient than VMs).
- Average power consumption of 19W. Usually cooled passively.
I don’t think it’s more efficient to separate processing and storage so I’d only go for that if you want to play around with a cluster. I would also avoid SD cards as a root FS, as they tend to die early and catastrophically.
Oh hell yeah. I wouldn’t trust an SD card with anything important except maybe a Pi where the actual OS is fairly unimportant and the data is stored elsewhere.
I had been wondering about the G series Ryzen. Is this running in a standard tower or something rackable?
I’m running it in a regular mATX case (Node 804) but I think you can also get AM5 motherboards in rack-mount cases.
My 7700X is 5 times that wattage. Granted, it has 128 GB of RAM, an A380, 4 HDDs, 2 SSDs, a 40Gb NIC, a TPU, and 25 VMs running on it.
The lesson here is that I’ve way over-spec’d my machine.
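For a sense of what that gap means over a year of always-on use, here is a quick sketch. The 19 W figure is from the build above and 95 W is simply five times that; the $0.30/kWh electricity price is an assumption, so plug in your own rate.

```python
# Back-of-the-envelope yearly energy use for an always-on box.
# Assumptions: constant average draw, and a made-up price of $0.30 per kWh.
HOURS_PER_YEAR = 24 * 365


def annual_kwh(avg_watts: float) -> float:
    """Energy used in one year at a constant average draw, in kWh."""
    return avg_watts * HOURS_PER_YEAR / 1000


def annual_cost(avg_watts: float, price_per_kwh: float = 0.30) -> float:
    """Yearly electricity cost at the given (assumed) price per kWh."""
    return annual_kwh(avg_watts) * price_per_kwh


if __name__ == "__main__":
    for label, watts in [("19 W build", 19), ("5x that (95 W)", 95)]:
        print(f"{label}: {annual_kwh(watts):.0f} kWh/year, "
              f"${annual_cost(watts):.2f}/year at $0.30/kWh")
```

That works out to roughly 166 kWh a year for the 19 W build versus about 832 kWh for the 95 W one, before any cooling or UPS losses.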
That system also sounds a lot more capable than mine. How did you end up with 25 VMs?
It’s not 5x as capable, is my point.
About 10 of those VMs are running a single docker image. It runs great but I know better now.
opnsense
home assistant
neolink
NextCloud
Pihole
Frigate
Omada controller
Photoprism
Wireguard server node
Jellyfin
Transmission-daemon
Audiobookshelf
Plex
Arr stack
Caddy
Librespeed
Invidious
Openspeedtest
OpenMediaVault
VaultWarden
Paperless-ngx
Rustdesk
Proxmox Backup Server
3 or 4 desktop images to mess around with