Imo the NPU is still niche for average consumer usage (compared to something like CUDA/Tensor cores). Just like the TPU, the NPU is just another hardware accelerator for specific tasks, most of which the average user can already handle with a CPU/GPU.
I would like to know how well the NPUs would work on more specialized tasks.
AI upscaling of a 5-minute SD video takes about 30 minutes (sometimes longer) on my 3080.
I’ve seen reviews reference the 3080 having ~238 TOPS, which implies that current-gen NPUs (typically in the ~40–50 TOPS range) would take significantly longer to upscale.
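To make that implication concrete, here’s a back-of-envelope sketch. It naively assumes upscaling time scales inversely with TOPS (real throughput also depends on precision, memory bandwidth, and software support, so treat the NPU figure as a rough guess, not a measurement):

```python
# Naive scaling estimate: time_npu = time_gpu * (gpu_tops / npu_tops).
# All numbers are assumptions taken from the discussion above.
gpu_tops = 238      # figure quoted in reviews for an RTX 3080
npu_tops = 45       # assumed current-gen consumer NPU, ~40-50 TOPS class
gpu_minutes = 30    # observed time to upscale a 5-minute SD clip on the 3080

est_npu_minutes = gpu_minutes * gpu_tops / npu_tops
print(f"Estimated NPU upscaling time: {est_npu_minutes:.0f} minutes")
# Roughly 159 minutes under these assumptions, i.e. ~5x slower.
```

So even ignoring software maturity, the raw TOPS gap alone suggests a current-gen NPU would take hours for the same clip.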
I would be willing to get a dedicated NPU if it offered significantly faster AI upscaling relative to high-end GPUs. Having software and an ecosystem provided by the NPU manufacturer would also be nice.
Since you seem fascinated by NPUs: here is one of the most recent publications comparing NPU, GPU, and CPU performance for an edge-computing use case with a YOLOv5 model. And here’s another benchmark provided by Google for its TPU product (Google Coral).
Given the circumstances, it’s unlikely that a current-generation NPU can handle heavy workloads like video upscaling or video generation, but for something like edge computing they offer fairly performant solutions at affordable prices (you can get NPUs on single-board computers built around Rockchip SoCs, especially the RK3588). If you’re using a Raspberry Pi 5 or another SBC, you can add a Google Coral (M.2 or USB version).
Thanks for those links. Will take a look.
I’m honestly surprised we don’t see more “power user” benchmarks measuring AI compute for use cases like video upscaling and local LLMs (my next project: I use ChatGPT as an elaborate spell checker, but I want to move away from corporate cloud solutions).
Cheers!