Which AI is actually handling complex Python scripts best right now? I've been a dev for years, but the new async microservices I'm building for a fintech client in London are hitting a wall with my usual workflow. I need this prototype done by Friday but GPT-4o keeps looping on logic and hallucinating local dependencies...
Just saw this thread and wanted to mention that if you're worried about the cost of those big API calls while hitting a deadline, you might want to consider some of the open-weight models. Last month I was in a similar spot with a client project where the API costs for the big names were just eating my margin. I ended up using DeepSeek Coder V2 236B via an API provider that charges way less. Honestly it was a lifesaver for complex async stuff. You gotta be careful with the security side though, especially with fintech data. I always make sure to scrub my prompts of any sensitive info before sending them off. If you're really pinching pennies, maybe look into Mistral Large 2 123B too. It's surprisingly robust for logic-heavy Python without the usual GPT overhead. Just be sure to double-check the logic, they all trip up eventually. Let me know if you need any help with the setup!
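On the prompt-scrubbing point, here's a rough sketch of the kind of thing I mean. The patterns and the `scrub` helper are my own illustration, not a vetted redaction list, so treat it as a starting point and extend it for whatever identifiers your client's data actually contains.

```python
import re

# Illustrative patterns only (assumptions, not a complete or audited list).
# Order matters: IBANs are redacted before the looser card-number pattern runs.
PATTERNS = [
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("IBAN", re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")),
    # 13 to 19 digits, optionally separated by single spaces or hyphens.
    ("CARD", re.compile(r"\b\d(?:[ -]?\d){12,18}\b")),
]


def scrub(prompt: str) -> str:
    """Replace anything matching the patterns above with a <LABEL> placeholder."""
    for label, pattern in PATTERNS:
        prompt = pattern.sub(f"<{label}>", prompt)
    return prompt


if __name__ == "__main__":
    print(scrub("Refund jane.doe@example.com, card 4111 1111 1111 1111 declined"))
```

Regexes like these will miss things (names, account IDs in odd formats), so I still skim the prompt by eye before it leaves my machine.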
Caught this thread a bit late. Re: "> GPT-4o keeps looping on logic and hallucinating..." - honestly, I feel that pain. Been building microservices for a minute and GPT has been dropping the ball on complex dependency chains lately. Before I dive in, are you using a specific library like Tortoise ORM or just standard asyncpg? That context usually changes which AI handles the syntax better. I've tried a bunch of these over the years and here is what actually worked for me recently:
Facts.
Bump - same question here
> GPT-4o keeps looping on logic and hallucinating local dependencies...
I've dealt with this on fintech backends before. For complex Python async logic, Anthropic's Claude 3.5 Sonnet is the best bet right now. In my experience it handles reasoning way better than GPT-4o and is much less likely to hallucinate local imports than the OpenAI models. If you want to save some cash while hitting that Friday deadline, don't pay for a full monthly sub. Use it via the OpenRouter API instead. You only pay for the tokens you actually use, which usually ends up being way cheaper for a quick prototype sprint. If you need a solid editor, Cursor's Pro plan has a trial that might get you through the week for free. It lets you swap between models easily, so you can test which one handles your specific microservice architecture best without wasting money.
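Whichever model you land on, a cheap guard against hallucinated imports is to check the generated code before you run it. A minimal sketch, assuming you have the generated source as a string; `unresolved_imports` is a name I made up, and it only checks that top-level modules resolve in your current environment, not that the functions imported from them exist.

```python
import ast
import importlib.util


def unresolved_imports(source: str) -> list[str]:
    """Return top-level module names imported in `source` that can't be found."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        # Relative imports (level > 0) are skipped: they depend on package layout.
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.add(node.module.split(".")[0])
    # find_spec returns None when a top-level module isn't importable from sys.path.
    return sorted(n for n in names if importlib.util.find_spec(n) is None)


if __name__ == "__main__":
    generated = "import asyncio\nfrom my_made_up_pkg_xyz.utils import retry\n"
    print(unresolved_imports(generated))
```

Run it from the project root with your virtualenv active so that real local packages resolve and only the invented ones get flagged.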
> I need this prototype done by Friday but GPT-4o keeps looping on logic and hallucinating local dependencies...
Late to the party, but I'd suggest being careful about jumping to a new model when you're on such a tight deadline. I was in a similar spot with a fintech project a few months back, and switching models mid-sprint actually made things worse because I spent all my time fixing new types of bugs. Honestly, there's a really good breakdown on YouTube of how the different models compare for async logic. Just search for something like "best AI for python coding comparison 2024" and it should be the first result. I saw a 20-minute video about it recently that was way more detailed than anything we could post here. It's worth checking that out, or even just looking at the latest mega-threads on Reddit, before you commit to a new workflow... it might save you more time than trial and error.
Saved for later, ty!
To add to the point above: I've been stuck in that exact same logic loop for days on a project very similar to yours... it's truly frustrating when the AI starts hallucinating local imports that simply don't exist. I'm building a similar async stack right now and was about to pull my hair out before switching tactics. I ended up moving away from the main players for my heavy lifting, and I'm honestly very satisfied with the DeepSeek Coder V2 236B API lately. It's a beast for complex Python reasoning and has been working well for my fintech scripts without the usual loops. No complaints so far.