• @armchair_progamer
    11 months ago

    But aren’t the GPUs used for AI different from the GPUs used by gamers? 8GB of VRAM isn’t enough to run even the smaller LLMs; you need specialized GPUs with 80+GB like A100s and H100s.

    The top-tier consumer models like the 3090 and 4090 have 32GB, and with them you can train and run smaller LLMs locally. But there still isn’t much demand to do that, because you can rent GPUs in the cloud cheaply enough that the point where the cumulative cost of renting exceeds the cost of buying is very far off. For consumers it’s still too expensive to fine-tune your own model, and startups and small businesses have enough money to rent the more expensive, specialized GPUs.
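
    To make the rent-versus-buy point concrete, here is a rough break-even sketch. The card price and hourly rate are hypothetical placeholders, not real quotes, and the calculation ignores electricity, resale value, and idle time.

```python
def break_even_hours(purchase_price: float, hourly_rental_rate: float) -> float:
    """Return the number of rented hours whose total cost equals the purchase price."""
    if hourly_rental_rate <= 0:
        raise ValueError("hourly_rental_rate must be positive")
    return purchase_price / hourly_rental_rate


# Hypothetical placeholder figures (assumptions, not real market prices).
card_price_usd = 1600.0
rental_usd_per_hour = 0.50

hours = break_even_hours(card_price_usd, rental_usd_per_hour)
days_nonstop = hours / 24  # how long that is if the GPU runs around the clock

print(f"Break-even after {hours:.0f} rented hours (~{days_nonstop:.0f} days of nonstop use)")
```

    With occasional use, say a few hours a week, the break-even point moves out by years, which is the sense in which it is "very far off".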

    Right now GPU prices aren’t extremely low, but you can actually buy them from retailers at market price. That wasn’t the case when crypto-mining was popular.

    • Skua
      11 months ago

      Not all applications of it are as demanding as LLMs though. The company referenced in the article as buying likely upwards of twenty million dollars’ worth of graphics cards seems to work mostly on driver assistance (presumably with the goal of full self-driving at some point, but I haven’t looked at them thoroughly). It seems quite possible that whatever their product does is within the capacity of a GPU that would otherwise be popular for gaming, and because it’s installed in cars, the capability has to be in the physical device that’s in the car.

    • acedelgado
      11 months ago

      They’re not that different, really. CUDA cores are the most used processors in AI training, and they are the main processors in both Nvidia’s consumer desktop cards and its machine learning enterprise cards. As “AI” is on the rise, more and more of the supply of CUDA processors and VRAM chips will be diverted to enterprise solutions that fetch a higher price in deals with corporations. That means fewer materials will be available for the consumer-level GPU supply, which will drive prices up for normal consumers. Nvidia has been banking on this for a long time; that’s why they don’t care about overpricing the consumer market and have been trying to push people towards the cloud-based GeForce Now subscription model, where you don’t even own the hardware and basically just rent the processing power to play games.

      Also just to be anal, the 3090 and 4090 have 24GB of VRAM, not 32GB. And unlike gaming nowadays, you can distribute the workload across multiple GPUs in one system, or over a network of machines.
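
      As a rough illustration of why VRAM size and multi-GPU setups matter, here is a back-of-the-envelope sketch of how much memory a model's weights alone take up. The parameter counts are illustrative assumptions, and the estimate covers weights only: activations, KV cache, and (for training) gradients and optimizer state all need extra room on top.

```python
import math

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def gpus_needed(num_params: float, precision: str, vram_gb: float) -> int:
    """Minimum number of cards whose combined VRAM holds the weights (no headroom)."""
    return math.ceil(weight_memory_gb(num_params, precision) / vram_gb)


VRAM_GB = 24.0  # per-card VRAM of a 3090 or 4090

# Illustrative model sizes (assumptions): 7 billion and 70 billion parameters.
for params in (7e9, 70e9):
    for precision in ("fp16", "int4"):
        gb = weight_memory_gb(params, precision)
        n = gpus_needed(params, precision, VRAM_GB)
        print(f"{params / 1e9:.0f}B params @ {precision}: {gb:.1f} GB of weights, at least {n} x 24GB card(s)")
```

      By this estimate a 7B-parameter model in fp16 fits on one 24GB card, while a 70B-parameter model in fp16 has to be split across several cards or quantized.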