I've been going down a rabbit hole trying to find a decent GPU host for a recommendation engine I'm building for a UK-based shop. My logic was to start with Lambda Labs since their training costs are low, but I'm worried about the actual deployment side for 24/7 uptime. I saw some folks mention RunPod for its flexibility, but then I read about issues with cold starts if I go serverless.
My budget is strictly under $400 a month and I need to be live by early August. I'm just kinda stuck deciding if I should split the training and hosting across different providers or if there's one spot that handles both without the insane complexity of AWS. What do people actually recommend for this middle-ground scale?
Late to the party, but I went through this loop last year. I found that Paperspace Core NVIDIA RTX A4000 16GB instances were way more stable for 24/7 inference than serverless.
I've been quite satisfied with Vultr for my production workloads lately. Their predictable billing makes it much easier to stay under that $400 mark without surprises.
I've tried many setups over the years and your logic about splitting the workload is totally spot on. In my experience, trying to find one perfect spot usually leads to overpaying for at least one stage of the pipeline...
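To put rough numbers on the split, here's a minimal sketch of the budget math for a $400/month cap with a 24/7 inference box. The function names and the hourly rates in the example are placeholders made up for illustration, not quotes from any provider, so plug in the real prices you're comparing. It assumes an average month of 730 hours.

```python
# Budget sketch: what does a 24/7 inference instance leave over for training?
# All rates used below are hypothetical placeholders, not real provider prices.

HOURS_PER_MONTH = 730  # average month: 8760 hours per year / 12


def max_hourly_rate(monthly_budget: float, hours: float = HOURS_PER_MONTH) -> float:
    """Highest hourly rate an always-on instance can cost and still fit the budget."""
    return monthly_budget / hours


def training_hours_left(
    monthly_budget: float,
    inference_rate: float,
    training_rate: float,
    hours: float = HOURS_PER_MONTH,
) -> float:
    """Training hours affordable after paying for 24/7 inference.

    Returns 0.0 if the inference instance alone uses up the budget.
    """
    remaining = monthly_budget - inference_rate * hours
    if remaining <= 0:
        return 0.0
    return remaining / training_rate


if __name__ == "__main__":
    budget = 400.0
    print(f"Max always-on rate: ${max_hourly_rate(budget):.3f}/hr")
    # Hypothetical example: $0.40/hr inference box, $1.00/hr training GPU.
    hours_left = training_hours_left(budget, inference_rate=0.40, training_rate=1.00)
    print(f"Training hours left per month: {hours_left:.1f}")
```

The takeaway is that the always-on box has to come in at roughly $0.55/hr or less just to fit the cap, and whatever it costs below that is what you have left for training runs elsewhere.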