Best GPU for running DeepSeek-V3 locally?

Question

Hey everyone! I’m really hyped about the DeepSeek-V3 release, but I’m struggling to figure out the best hardware for a smooth local setup. Since it’s a massive Mixture-of-Experts model, my current RTX 3080 definitely won't cut it for that parameter count. I’m debating whether to save up for a single RTX 4090 or if I should look into a multi-GPU setup with used 3090s to stack enough VRAM for a decent quantization. I mainly want to use it for heavy coding tasks and local RAG. Does anyone have experience running V3 yet? What's the most cost-effective GPU configuration to get usable tokens per second without going into enterprise-grade hardware?

flenhdddxg · Accepted Answer

Quick question: what's your actual budget? I'd really like to help, because I've struggled a lot with VRAM lately!

• NVIDIA GeForce RTX 4090 24GB: amazing speed, but 24 GB of VRAM is very limiting for DeepSeek-V3.
• 2x used NVIDIA GeForce RTX 3090 24GB: 48 GB of VRAM in total, so way more VRAM for way less cash.

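Honestly, VRAM is the main constraint for huge MoE models like this: every expert has to sit in memory, even though only a few of them are active for any given token. Peace! ✌️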

GreenwichMeridian · Answer

Just saw this! One thing I'm curious about: what's your actual budget? Unfortunately, my NVIDIA GeForce RTX 3080 10GB had issues with quants...

• Maybe look for used NVIDIA GeForce RTX 3090 24GB cards? They seem to go for around $700 each, but I'm not sure.
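To turn a budget into a rough VRAM figure, here's a quick sketch. It assumes the ~$700 per used 3090 mentioned above (real prices vary a lot), the budgets in the loop are just made-up examples, and it ignores the cost of the PSU, motherboard, and risers you'd need for a multi-GPU build.

```python
CARD_PRICE_USD = 700   # assumed rough price of one used RTX 3090
CARD_VRAM_GB = 24      # VRAM per RTX 3090

def cards_for_budget(budget_usd: float, price_usd: float = CARD_PRICE_USD) -> int:
    """How many whole cards the budget covers (GPUs only)."""
    return int(budget_usd // price_usd)

def total_vram_gb(budget_usd: float) -> int:
    """Combined VRAM across all the cards the budget covers."""
    return cards_for_budget(budget_usd) * CARD_VRAM_GB

# Hypothetical budgets, purely for illustration.
for budget in (1500, 2100, 3000):
    n = cards_for_budget(budget)
    print(f"${budget}: {n} x RTX 3090 = {total_vram_gb(budget)} GB VRAM")
```

Once you know your budget, you can compare the resulting VRAM total against the size of the quant you want to run.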

xrrdpfsxkd · Answer

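Yo, so DeepSeek-V3 is a huge Mixture-of-Experts model with 671B total parameters (about 37B active per token). Basically, that means VRAM is your biggest hurdle. A single NVIDIA GeForce RTX 4090 24GB just won't cut it for a decent quant, so I'd suggest grabbing two or three used NVIDIA GeForce RTX 3090 24GB cards instead. Stacking them is way more cost-effective for local RAG tasks, and you get a lot more VRAM per dollar than with enterprise cards. Just keep in mind that a 4-bit quant of 671B parameters is still roughly 335 GB of weights, so even three cards (72 GB) will only hold part of the model and the rest has to be offloaded to system RAM. Good luck!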
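If you want to sanity-check the VRAM math yourself, here's a back-of-the-envelope sketch. It assumes 671B total parameters and a uniform bit width per weight, and it only counts the weights: real quant formats mix bit widths, and the KV cache and activations need extra memory on top, so treat the numbers as a lower bound.

```python
import math

TOTAL_PARAMS = 671e9   # DeepSeek-V3 total parameter count (all experts)
GPU_VRAM_GB = 24       # VRAM per RTX 3090 / RTX 4090

def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

def gpus_needed(params: float, bits_per_weight: float,
                vram_gb: float = GPU_VRAM_GB) -> int:
    """Minimum number of cards whose combined VRAM covers the weights."""
    return math.ceil(weight_memory_gb(params, bits_per_weight) / vram_gb)

# Assumed uniform bit widths; actual quant formats differ somewhat.
for bits in (16, 8, 4):
    size = weight_memory_gb(TOTAL_PARAMS, bits)
    cards = gpus_needed(TOTAL_PARAMS, bits)
    print(f"{bits:>2}-bit: ~{size:.0f} GB of weights, at least {cards} x 24 GB cards")
```

The card counts here are for fitting everything in VRAM. With fewer cards, whatever doesn't fit has to live in system RAM, which costs you tokens per second.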