What is the best GPU for running Deepseek locally?

Question

I've been trying to get Deepseek running for my dev work, but my old 2060 just isn't cutting it anymore; it's way too slow. I'm looking to upgrade by this weekend since I have a big project starting Monday. I found a used 3090 for around 700 bucks on Marketplace, which has the 24GB of VRAM everyone says you need, but I'm also looking at a new 4080 Super for the warranty and better speed. Or should I just bite the bullet and go for a 4090? My case is kind of a tight squeeze and my budget is strictly under 1300, so the 4090 might be pushing it, honestly. What do you guys think is the sweet spot for Deepseek specifically?

xlqqyqnodo · Accepted Answer

Adding my two cents: I had some issues with big cards getting way too hot in my small case, and it was a total letdown. If you want reliability without going broke, maybe try the ASUS Dual GeForce RTX 4070 Ti Super 16GB GDDR6X. It fits better and won't kill your budget. Check these for help:

Hugging Face for smaller quantized models

r/LocalLLM for setup guides

Honestly, don't sleep on the 16GB cards for dev work...

NATREGTEGH346935NEYHRTGE · Answer

Honestly, I'd say grab that 3090, but be super careful. I bought a used card once and it literally started smoking after an hour of heavy inference... never again without testing first. For Deepseek, that 24GB of VRAM is kind of non-negotiable if you want the bigger models to fit entirely on the GPU and run smoothly, instead of spilling over into system RAM and dragging.

NVIDIA GeForce RTX 3090 24GB GDDR6X: Best for VRAM but it draws a ton of power. Make sure your PSU can handle the spikes.

NVIDIA GeForce RTX 4080 Super 16GB GDDR6X: Way more efficient, but you'll hit a wall with model size pretty quickly. 16GB feels cramped for large LLMs.

Since your budget is strictly under 1300, a new 4090 is definitely out. If you go with the used 3090, just make sure to run a stress test like FurMark first. Also measure your case twice, because these cards are massive and might not fit if your setup is already a tight squeeze.
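If you want to sanity-check the 16GB vs 24GB question yourself, here's a rough back-of-the-envelope sketch. The only hard fact in it is that weights take roughly parameters × bits / 8 bytes. The 20% overhead for KV cache and activations is my own placeholder guess, and the 7B/14B/32B sizes are just illustrative, so plug in whatever model you actually plan to run.

```python
def estimate_vram_gb(params_billion, bits_per_weight, overhead_fraction=0.2):
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes).

    Weights take params * bits / 8 bytes. overhead_fraction is an ASSUMED
    flat allowance for KV cache and activations, not a measured number.
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead_fraction) / 1e9


def fits(params_billion, bits_per_weight, vram_gb):
    """True if the rough estimate is within the card's VRAM."""
    return estimate_vram_gb(params_billion, bits_per_weight) <= vram_gb


if __name__ == "__main__":
    # Illustrative model sizes (billions of parameters), not a recommendation.
    for size in (7, 14, 32):
        for bits in (16, 4):
            need = estimate_vram_gb(size, bits)
            print(
                f"{size}B @ {bits}-bit: ~{need:.1f} GB | "
                f"fits 16GB: {fits(size, bits, 16)} | "
                f"fits 24GB: {fits(size, bits, 24)}"
            )
```

Under those assumptions, a 32B model at 4-bit comes out around 19 GB, which is the kind of case where 24GB works and 16GB doesn't. Long contexts need more headroom than this flat allowance, so treat the numbers as a floor.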

AngelBoast · Answer

^ This. Also, I went through a similar headache and ended up grabbing the ZOTAC Gaming GeForce RTX 4080 Super Trinity OC 16GB because I needed something reliable that wouldn't catch fire. Honestly, I'm super satisfied with how it handles Deepseek. It runs cool and the drivers are rock solid, which is a huge relief when you've got deadlines.

Use 4-bit quantization to cut memory usage while keeping most of the output quality

Measure your case twice for the cable bend, those new connectors are stiff

It's basically the perfect middle ground for dev work if you don't want the risk of a used 3090 or the massive size of most 4090s.
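For anyone wondering what 4-bit quantization actually does to the weights, here's a toy sketch in plain Python. It is not how any particular loader implements it (real formats typically use per-block scales and smarter rounding); the function names are made up for illustration. It just shows the core idea: store each weight as a small integer plus one shared scale, and pack two values per byte.

```python
def quantize_4bit(weights):
    """Toy symmetric quantization to signed 4-bit codes in [-8, 7].

    One scale is shared by the whole list (a simplification for illustration).
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from the 4-bit codes."""
    return [c * scale for c in codes]


def pack_nibbles(codes):
    """Pack two 4-bit codes into each byte (low nibble first)."""
    if len(codes) % 2:
        codes = codes + [0]  # pad to an even count
    packed = bytearray()
    for lo, hi in zip(codes[0::2], codes[1::2]):
        packed.append((lo & 0xF) | ((hi & 0xF) << 4))
    return bytes(packed)


if __name__ == "__main__":
    weights = [0.5, -0.2, 0.03, 0.41]
    codes, scale = quantize_4bit(weights)
    restored = dequantize(codes, scale)
    print("codes:", codes)
    print("restored:", [round(r, 3) for r in restored])
    print("bytes used:", len(pack_nibbles(codes)), "vs", len(weights) * 2, "at 16-bit")
```

Four weights end up in 2 bytes instead of 8 at 16-bit, which is where the roughly 4x memory saving comes from. The restored values are only approximations, so you trade a bit of accuracy for the space.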