What is the best GPU for running DeepSeek locally?

0
Topic starter

I really gotta get DeepSeek-V2 running for this dev project I'm finishing by next weekend. I've been looking at the 3090 because of the 24GB VRAM, and people say it's the budget king for local LLMs, but then I see folks on Reddit swearing by the 4090 for speed. My logic was that more memory is better for the big models, but if it takes ten seconds per word then what's the point? I only have about $1600 to drop on this right now, so I can't just buy both. If I get a used 3090, will it actually handle the 67B version well enough, or am I gonna regret not getting something newer? Need to decide like today so I can get it shipped...


4 Answers
11

I went through this exact mess last year. Picking up a used RTX 3090 (24GB GDDR6X) was the safer bet for my budget back then.

  • 3090: Solid 24GB, handles the 67B model okay with some CPU offloading; see the sketch below.
  • 4090: Much faster, but way more expensive for the same 24GB. Honestly, just get the 3090. It works fine for dev work without breaking the bank.
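For what it's worth, here's roughly what the offloading looks like in practice. This is a minimal sketch using llama-cpp-python, assuming a 4-bit GGUF build of the 67B model; the filename and layer count are placeholders you'd tune until you stop hitting out-of-memory errors:

```python
# Partial GPU offload with llama-cpp-python: layers that don't fit in the
# 3090's 24GB stay on the CPU. Model path and layer count are illustrative.
from llama_cpp import Llama

llm = Llama(
    model_path="./deepseek-llm-67b-chat.Q4_K_M.gguf",  # assumed local GGUF file
    n_gpu_layers=40,  # how many transformer layers to push onto the GPU
    n_ctx=4096,       # context window; bigger values cost more VRAM
)

out = llm("Write a haiku about VRAM.", max_tokens=64)
print(out["choices"][0]["text"])
```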


11

Just seeing this now... honestly, if you want to run the 67B version properly, a single 24GB card is gonna be a struggle. Even with the raw speed of a 4090, you simply won't have enough VRAM to fit the model weights without heavy quantization or slow system RAM offloading. Since your budget is around $1600, the most methodical route is looking for two used 3090s. That gives you 48GB of total VRAM, which is way more reliable for these larger weights. It's a bit more of a hassle regarding power requirements and cooling, but it ensures you actually have the memory headroom for the task. The 4090 is an amazing piece of tech, but memory capacity is usually the hard limit for these deep learning dev projects.
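For reference, here's a rough sketch of what the dual-card setup looks like in code with Hugging Face transformers. Treat it as a starting point rather than a tested recipe: the model ID and per-card memory caps are assumptions you'd adjust for your own rig.

```python
# Sharding a 4-bit quantized 67B model across two 24GB GPUs.
# device_map="auto" lets accelerate split the layers between cards.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "deepseek-ai/deepseek-llm-67b-chat"  # assumed Hugging Face repo ID

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # bitsandbytes 4-bit weights
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 for speed
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                    # spread layers across both 3090s
    max_memory={0: "22GiB", 1: "22GiB"},  # leave headroom on each 24GB card
)
```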


3

In my experience, VRAM capacity is your absolute bottleneck here. The 4090 is faster, but speed doesn't matter once the model spills into slow system RAM. At $1600, consider these points:

  • Memory bandwidth on the 3090 (~936 GB/s) is still excellent for LLM inference.
  • DeepSeek 67B needs roughly 40GB for usable context at 4-bit quantization; see the back-of-envelope math below. Honestly, grab a used 3090 now and save for a second card later.
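The 40GB figure is just weight math plus overhead. A quick back-of-envelope check (the KV-cache allowance here is my own rough assumption):

```python
# Rough VRAM estimate for a 67B-parameter model at 4-bit quantization.
params = 67e9
bytes_per_param = 0.5                              # 4 bits = half a byte per weight
weights_gib = params * bytes_per_param / 1024**3   # ~31 GiB for weights alone
kv_cache_and_buffers_gib = 8                       # assumed allowance for KV cache + runtime buffers
print(f"~{weights_gib + kv_cache_and_buffers_gib:.0f} GiB total")  # prints ~39 GiB
```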


2

Man, hardware hunting is just a cycle of disappointment these days. Honestly, this whole situation reminds me of when my buddy tried to build a local rig for some heavy dev work last winter. He found what he thought was a steal on some high-end cards from a local liquidator, but unfortunately, it turned into a total disaster. Every time he tried to load up a big model, the whole system would hard crash because the thermal pads were basically disintegrating. We spent three straight nights repasting everything and it still wasn't as good as expected... actually, it was worse. He ended up spending so much on extra cooling and a new PSU that he could have just bought a server-grade unit from the jump. It was such an ordeal that he eventually sold the whole lot for parts and went back to using cloud APIs for his project. Just a complete headache from start to finish.

