Seriously, I'm about to lose my mind with AWS right now. I spent the last three days trying to get DeepSeek-V3 running on an EC2 instance, and it's been one nightmare after another: g5 instances are unavailable in my region and the documentation is a total mess. Why is it so hard to get some decent GPUs without paying a kidney every hour? I have a project for a client in Germany that needs to be live by next Friday, and I'm nowhere near ready because I keep hitting resource limits and capacity issues every time I try to scale.
I tried moving over to a smaller local provider, but their bandwidth is garbage and the latency makes the model feel like it's running on a toaster. Honestly, I'm just fed up with the big players and their hidden fees and complicated setups that require a PhD just to launch a simple container. I need something that won't break the bank (I'm trying to keep the monthly bill under 400 bucks) but actually has the juice to run DeepSeek without lagging out every five seconds.
Is there any cloud provider that actually makes hosting DeepSeek easy and affordable, or am I just chasing a dream here? Who are you guys using that doesn't make you want to throw your monitor out the window?
AWS capacity is a total joke, honestly. I agree that the big players make it way harder than it needs to be. You might want to consider a Lambda Labs GPU Cloud A100 80GB instead, but be careful because they run out of stock constantly. DeepSeek needs serious VRAM and memory bandwidth, so I would suggest looking at RunPod On-Demand GPU Instances with the RTX 6000 Ada for better price-to-performance than those overpriced g5 instances.
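For a rough sense of how much VRAM is involved, here is a back-of-the-envelope sketch. It assumes the published total parameter count for DeepSeek-V3 (671B, mixture-of-experts, so all experts have to sit in memory) and counts only the weights; KV cache, activations, and runtime overhead come on top. The helper names and the bit widths are just illustrative choices, and the 80 GB and 48 GB figures are the per-card VRAM of the A100 80GB and RTX 6000 Ada.

```python
import math

# Published total parameter count for DeepSeek-V3 (MoE: every expert must be resident).
TOTAL_PARAMS = 671e9


def weight_gb(params: float, bits_per_param: float) -> float:
    """Gigabytes (10^9 bytes) needed just to hold the weights."""
    return params * bits_per_param / 8 / 1e9


def gpus_needed(params: float, bits_per_param: float, vram_gb: float) -> int:
    """Minimum number of cards whose combined VRAM can hold the weights alone."""
    return math.ceil(weight_gb(params, bits_per_param) / vram_gb)


# Illustrative precisions: 16-bit, 8-bit, and a 4-bit quantization.
for bits in (16, 8, 4):
    gb = weight_gb(TOTAL_PARAMS, bits)
    a100 = gpus_needed(TOTAL_PARAMS, bits, 80)  # A100 80GB
    ada = gpus_needed(TOTAL_PARAMS, bits, 48)   # RTX 6000 Ada (48 GB)
    print(f"{bits}-bit: ~{gb:.0f} GB of weights, at least {a100} x 80 GB or {ada} x 48 GB")
```

Even at 4 bits the weights alone come to roughly 335 GB, so a single card of either type will not hold the full model; plan for a multi-GPU node or a smaller distilled variant.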
Building on the earlier suggestion, I would be extremely cautious about that 400 dollar budget because DeepSeek-V3 is a massive resource hog. Honestly, trying to run the full model 24/7 for that price is basically impossible unless you're using some very heavy quantization. You might want to consider these two options instead of wasting more time on AWS:
Late to the party, but honestly I think you are making things way too difficult for yourself with that budget. @Reply #3 - good point! Definitely worth saving these threads for later, but I gotta disagree with the crowd here about renting raw boxes. If you want reliability for a client, you should just go with DeepInfra; you can't go wrong. You won't hit those weird capacity limits and you only pay for the tokens you actually use. It's way more stable than hoping an instance doesn't get killed in the middle of a demo or dealing with those g5 shortages on AWS. Or honestly, just get any managed setup from Together AI. It's much better for your sanity because you aren't the one babysitting the servers when they go down at 2am. For under 400 bucks, it's basically the only way to keep the project on track without losing your mind...
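To put numbers on the always-on vs. pay-per-token tradeoff, here is a minimal sketch of the budget math. The 400 dollar figure is from the original post; the 30-day month and especially the per-token price are assumptions (the price is a made-up placeholder, not a quote from DeepInfra or Together AI), so plug in whatever the provider's pricing page actually says.

```python
MONTHLY_BUDGET_USD = 400.0  # from the original post
HOURS_PER_MONTH = 24 * 30   # assumption: 30-day month, instance never shut off


def max_hourly_rate(budget_usd: float, hours: float = HOURS_PER_MONTH) -> float:
    """Highest hourly instance price that still fits the budget when running 24/7."""
    return budget_usd / hours


def tokens_affordable(budget_usd: float, usd_per_million_tokens: float) -> float:
    """Number of tokens the budget buys at a flat per-million-token price."""
    return budget_usd / usd_per_million_tokens * 1_000_000


# HYPOTHETICAL blended price per million tokens; replace with the real rate.
ASSUMED_USD_PER_MILLION = 1.00

hourly = max_hourly_rate(MONTHLY_BUDGET_USD)
tokens = tokens_affordable(MONTHLY_BUDGET_USD, ASSUMED_USD_PER_MILLION)
print(f"Always-on: the whole setup must cost under ${hourly:.2f}/hour")
print(f"Per-token: about {tokens / 1e6:.0f}M tokens/month at the assumed price")
```

The always-on ceiling works out to about 56 cents an hour for everything, which is the real reason a dedicated multi-GPU box does not fit this budget. Whether per-token pricing fits depends entirely on the client's traffic, so estimate monthly tokens first.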
Bookmarked, thanks!
Just wanted to say thanks for everyone chiming in. Super helpful discussion.