How can I actually improve an agent's multi-step reasoning so it stops getting stuck in these infinite loops? Honestly, I'm just so fed up with the current performance of my LLM agent, because it starts out great and then completely falls apart after the second or third step. I'm building a custom assistant for my small vintage clothing business in Bristol (trying to automate the whole process of identifying an item from an upload, checking it against my current spreadsheet, and then drafting a listing) and it's been a total nightmare. I have a hard deadline to get this live by the end of the month, and I'm burning through my $200 monthly API budget way too fast because the agent keeps re-running the same search tool over and over like it forgot what it just found.
It's so frustrating, because when it works it's brilliant and I get excited thinking about all the time I'll save, but then it just loses the thread. I tried LangChain, but it feels like it adds a lot of overhead and the reasoning still feels flimsy when things get complex. For example, it will identify the shirt correctly, but then when it goes to check the price it somehow forgets which shirt it was looking at. I've tried breaking the job into smaller tasks, but then passing the context between those steps becomes another headache entirely.
Is there a better way to structure the prompt, or a specific way to handle the memory, so it actually sticks to the plan? I really want this to work because it would be a game changer for my workflow, but right now I feel like I'm throwing money into a void while the agent chases its own tail. Should I try a different model, or is there a specific logic gate I should be using? I've seen people talk about chain-of-thought prompting, but implementing it in a way that doesn't just make the agent ramble is proving way harder than I thought...
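To make that concrete, here's roughly the kind of "logic gate" I've been imagining, with entirely made-up names: a dedupe check that stops the agent from re-running an identical tool call and hands back the cached result instead. Not saying this is the right approach, just what I mean:

```python
import hashlib
import json

def tool_call_key(name, args):
    """Hash a tool name plus its arguments so identical calls collide."""
    payload = json.dumps({"tool": name, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

class LoopGuard:
    """Blocks the agent from executing the same tool call twice."""

    def __init__(self):
        self.seen = {}  # call key -> cached result

    def check(self, name, args):
        """Return (is_duplicate, cached_result) for a proposed call."""
        key = tool_call_key(name, args)
        if key in self.seen:
            return True, self.seen[key]
        return False, None

    def record(self, name, args, result):
        """Remember a completed call so repeats can be short-circuited."""
        self.seen[tool_call_key(name, args)] = result
```

The idea would be: before executing whatever tool call the model asks for, run it through the guard; on a duplicate, feed the cached result back into the prompt with a note like "you already ran this search, here is the result again", rather than spending API budget on a re-run.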
Re: "I love Anthropic Claude 3.5 Sonnet API for..."
I love the Anthropic Claude 3.5 Sonnet API for logic, it's amazing! Swapping from the OpenAI GPT-4o API stopped my inventory loops. Claude is slightly slower but way more reliable for reasoning!
Actually, I kind of disagree about just swapping models... that usually just masks the problem until your budget runs out again. I'd suggest being more deliberate about the logic flow itself. I saw a really good video on YouTube about building state machines for LLM agents that fixed this exact thing. Honestly, just search for "agent loop architecture" on there and it should be one of the top results.
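The gist of the state-machine approach, as I understand it: instead of one free-form agent loop where the model decides what to do next (and can wander), you hard-code the stages of your workflow and only let the LLM fill in each step. A minimal sketch for the identify / check / draft pipeline described above, where `call_llm` and `lookup_spreadsheet` are hypothetical helpers you'd wire up to your own API client and spreadsheet:

```python
from enum import Enum, auto

class Stage(Enum):
    IDENTIFY = auto()
    CHECK_INVENTORY = auto()
    DRAFT_LISTING = auto()
    DONE = auto()

def run_pipeline(image_path, call_llm, lookup_spreadsheet):
    """Drive the workflow through fixed stages instead of a free-form loop.

    `call_llm(prompt)` and `lookup_spreadsheet(item)` are placeholders for
    whatever client code you already have. Each stage gets only the context
    it needs, so the model can't "forget which shirt it was looking at":
    the item name lives in the state dict, not in a long chat history.
    """
    state = {"stage": Stage.IDENTIFY, "item": None, "price": None}
    listing = None
    while state["stage"] is not Stage.DONE:
        if state["stage"] is Stage.IDENTIFY:
            state["item"] = call_llm(f"Identify the garment in {image_path}")
            state["stage"] = Stage.CHECK_INVENTORY
        elif state["stage"] is Stage.CHECK_INVENTORY:
            # Deterministic lookup: no LLM call, so nothing to hallucinate.
            state["price"] = lookup_spreadsheet(state["item"])
            state["stage"] = Stage.DRAFT_LISTING
        elif state["stage"] is Stage.DRAFT_LISTING:
            listing = call_llm(
                f"Draft a listing for {state['item']} priced at {state['price']}"
            )
            state["stage"] = Stage.DONE
    return listing
```

Because each transition is explicit, the pipeline can't loop back to a stage it already finished, and a bad step fails loudly instead of quietly burning tokens.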