i've been staring at benchmark charts for three hours now and i'm honestly more confused than when i started lol. i need to get a setup running for a work project with a deadline next friday, so i gotta buy something tonight. i'm stuck between grabbing a used rtx 3090 for that sweet 24gb of vram or just going for a brand new 4080 super, since i can pick that up at my local best buy. my logic was that the 3090 is better for the deepseek 67b model, but then i worry about it being a space heater in my tiny apartment and whether it'll even fit in my mid-tower case. budget is around 1300 bucks. is the extra vram on the older card really worth the risk over the faster 4080 for deepseek?
man i feel your pain lol, those charts are a total rabbit hole. honestly if your main goal is running big models like the deepseek 67b, you should absolutely steer clear of the NVIDIA GeForce RTX 4080 Super 16GB GDDR6X. 16gb of vram just isn't gonna cut it for a 67b model unless you compress it so much it starts talking nonsense. even with a 4-bit quant the weights alone are roughly 33.5gb (67 billion params at half a byte each), so on a 16gb card more than half the model spills into system ram and you hit a total wall on speed.

i would definitely go for a used NVIDIA GeForce RTX 3090 24GB GDDR6X. i have one in my rig and yeah, it's a bit of a space heater, but that 24gb buffer is non-negotiable for local llms. just to be clear, one 3090 still won't hold a 4-bit 67b entirely in vram. you'd be offloading some layers to cpu ram, just a lot fewer than on a 16gb card.

you might want to consider the power draw tho. the 3090 is rated at around 350w and those transient spikes are no joke, so make sure your psu can handle it. a cheap power supply can just shut down your whole pc when the model loads up. check if you have something like an EVGA SuperNOVA 1000 G6 1000W Gold or better before you pull the trigger.

also be careful with the dimensions in a mid-tower. some of those triple-fan 3090s are absolute bricks. i had to practically saw a piece off my old case to fit an ASUS ROG Strix GeForce RTX 3090 24GB back in the day, so measure your gpu clearance before you buy.

if you can find one for like 700-800 bucks used, you could actually buy a second one eventually. two cards gets you 48gb total, and that would actually let you run deepseek 67b properly at 4-bit with room left over for context. don't get distracted by the 4080's speed. vram is king for llms and 16gb is basically a trap for this specific use case.
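if you want to sanity check the vram math yourself, here's a rough back-of-envelope sketch in python. it only counts the weights (params times bits per weight) plus a flat overhead number. the 2gb overhead for context and runtime is purely my guess, and the function names are just made up for this post, so treat the output as a ballpark and not a promise.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus a guessed overhead fit fully in VRAM.

    overhead_gb is an assumption standing in for the KV cache and
    runtime buffers; the real number depends on context length and backend.
    """
    return weights_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    need = weights_gb(67, 4)  # 67B params at a flat 4 bits each
    for name, vram in [("4080 Super", 16), ("3090", 24), ("2x 3090", 48)]:
        verdict = "fits" if fits(67, 4, vram) else "needs cpu offload"
        print(f"{name}: {vram} GB vram vs ~{need:.1f} GB weights -> {verdict}")
```

real 4-bit quant files usually come out somewhat bigger than a flat 4 bits per weight, so if anything this is optimistic. either way the 67b at 4-bit lands at about 33.5gb, which is why a single card of either type ends up offloading and the two-card setup is the one that fits.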