What is the best GPU to run DeepSeek v4 Flash locally?


I need to get DeepSeek v4 Flash running locally by Friday for a client demo here in Seattle, and honestly I'm a bit stuck on the VRAM requirements. I figured a single 4090 with 24GB would handle the 4-bit quantization no problem since it's the 'Flash' version, but I'm hitting OOM errors as soon as I push the context window. I've been running LLMs for years, but this v4 architecture seems to eat more memory than I anticipated. Should I just bite the bullet and go for dual 3090s for the extra memory, or is there a way to optimize this? I'm on a tight $1800 budget and need to know what will actually work without lagging...



Dude, I totally feel your pain with those OOM errors. v4 Flash is a beast! Honestly, I love the 4090, but if you need that context window for a demo, VRAM is king.

  • Dual NVIDIA GeForce RTX 3090 24GB GDDR6X: 48GB total (24GB per card, with the model split across both) is a lifesaver for context. Used ones fit your budget!
  • Single NVIDIA GeForce RTX 4090 24GB GDDR6X: Fast, but 24GB is a wall. Go dual 3090s, it's gonna work like a charm!
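
If you want to sanity-check the 24GB vs 48GB call before buying anything, here's a rough back-of-the-envelope sketch. Big caveat: I don't have the real v4 Flash specs, so every number in the example (parameter count, layers, KV heads, head dim, overhead) is a made-up placeholder you'd swap for the values in the model's config. It also assumes a plain grouped-query-attention KV cache stored in fp16; if the model compresses its cache differently, the KV number will be off, so treat it as a ballpark only.

```python
GIB = 2**30

def weights_gib(params_billion, bits_per_weight):
    """Approximate memory for the quantized weights, in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB

def kv_cache_gib(n_layers, n_kv_heads, head_dim, context_len, bytes_per_value=2):
    """Approximate KV cache size in GiB for a standard GQA/MHA cache.

    The factor of 2 covers keys plus values; bytes_per_value=2 means fp16.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value / GIB

def fits(total_vram_gib, params_billion, bits_per_weight,
         n_layers, n_kv_heads, head_dim, context_len, overhead_gib=1.5):
    """True if weights + KV cache + a guessed runtime overhead fit in VRAM."""
    needed = (weights_gib(params_billion, bits_per_weight)
              + kv_cache_gib(n_layers, n_kv_heads, head_dim, context_len)
              + overhead_gib)
    return needed <= total_vram_gib

if __name__ == "__main__":
    # HYPOTHETICAL model shape, NOT the real DeepSeek v4 Flash config.
    shape = dict(params_billion=30, bits_per_weight=4,
                 n_layers=48, n_kv_heads=8, head_dim=128)
    for ctx in (32_768, 131_072):
        for vram in (24, 48):
            print(f"context={ctx:>7}  vram={vram}GB  fits={fits(vram, context_len=ctx, **shape)}")
```

The point is just that the weights are a fixed cost while the KV cache grows linearly with context length, which is why a quant that loads fine can still OOM once you push the window. Also keep in mind that with two cards the 48GB isn't one pool; each card needs room for its own share of the layers and cache.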

