When API Wrappers Aren’t Enough
When I first started connecting a large language model (LLM) agent to external services, I did the obvious thing: wrap each service’s API and let the agent call it. It sounds straightforward – just give the agent a function for each API endpoint – but in practice, this approach turned out messy and brittle. I quickly realized that simply wrapping APIs isn’t enough. These AI agents are not deterministic computer programs; they’re stochastic word-predictors that can misinterpret instructions, omit fields, or produce slightly off-kilter inputs. A naïve tool that assumes perfect inputs from the agent will frequently break. In other words, building tools for agents is tricky, and requires a mindset shift. We need what I’d call “agent-native” tools – tools designed from the ground up to play nicely with an LLM’s quirks and capabilities.
There’s an apropos analogy from web development: the old MVC guideline of “fat models, skinny controllers.” In MVC, you push as much business logic as possible into the model, keeping the controller lean. This improves maintainability by localizing complexity. Similarly, I propose “fat tools, skinny agents”. Let the tools handle the heavy lifting – the parsing, decision branches, error handling – and keep the agent’s role simple and high-level. The agent should orchestrate what needs to happen, but the tools determine how to get it done. After all, the LLM agent is like a smart orchestrator, not an expert on every API’s idiosyncrasies. We want our agent to be the director, and the tools to be the seasoned crew making the movie magic happen without bothering the director with minutiae.
The Case for Agent-Native Tools
Why not just feed the API docs to the agent and call it a day? The short answer: LLM agents are unpredictable. They might call the right API but format a date wrong, omit a required parameter, or misunderstand an error message. Traditional API wrappers are unforgiving – a single misplaced comma or misnamed field yields a failure. When an LLM agent encounters a failure, it doesn’t automatically know how to fix it (unless we prompt it to). The result is often a stuck agent or a garbled output. Why take something that can be solved accurately and deterministically…and make it probabilistic, error-prone, and unpredictable with an LLM? In other words, if a task can be handled by straightforward logic, we should not force the poor LLM to guess at the correct API usage or data formatting.
Agent-native tools embrace this principle. Instead of expecting the agent to conform 100% to the tool’s interface, these tools meet the agent halfway. They handle the “dirty work” – messy inputs, partial information, ambiguity – using deterministic code or even their own mini AI logic. The best tools are robust against the agent’s stochastic output. If the agent provides a date like “next Monday” instead of `"2025-07-28"`, a smart tool will cheerfully convert that into the precise date itself, rather than throwing an error about an invalid date format. If a required field is missing, a smart tool won’t immediately blow up; it might use a default, attempt to look it up, or at least return a helpful prompt telling the agent what’s needed. The goal is to smooth out the rough edges of the agent’s behavior so that the overall system succeeds more often.
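To make that concrete, here is a minimal sketch of the kind of input normalization a tool can do before calling the real API. The `normalize_date` helper and the handful of phrases it understands are purely illustrative – a production tool might lean on a full date-parsing library – but the point is that the tool, not the agent, absorbs the fuzziness:

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

def normalize_date(text: str, today: date | None = None) -> str:
    """Accept an ISO date or a fuzzy phrase like 'next Monday' and return an ISO date string."""
    today = today or date.today()
    text = text.strip().lower()

    # Already a clean ISO date? Pass it through untouched.
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return text

    if text == "today":
        return today.isoformat()
    if text == "tomorrow":
        return (today + timedelta(days=1)).isoformat()

    # Handle "next <weekday>" by rolling forward to the next occurrence of that day.
    match = re.fullmatch(r"next (\w+)", text)
    if match and match.group(1) in WEEKDAYS:
        target = WEEKDAYS.index(match.group(1))
        days_ahead = (target - today.weekday()) % 7 or 7
        return (today + timedelta(days=days_ahead)).isoformat()

    raise ValueError(f"Could not interpret date: {text!r}")
```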
Critically, this doesn’t mean letting the agent do anything and cleaning up after it like a toddler – it means designing tools with clear contracts that are also resilient. The emerging Model Context Protocol (MCP) standard is a step in this direction, formalizing how AI agents discover and invoke tools. Until now, integrating tools with LLMs was “messy,” requiring one-off glue code for each API – an approach that is “hard to maintain and scale.” MCP aims to solve part of that by standardizing tool interfaces for AI, so agents know what tools are available and how to call them. But even with a standard interface, the implementation of each tool still needs to be smart. In fact, the MCP documentation itself urges developers to include robust error handling and validation in every tool (modelcontextprotocol.io). It’s not just about exposing functionality; it’s about making sure the tool can handle the wild and woolly inputs an LLM might throw at it.
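To show what “validation inside the tool” can look like in MCP terms, here is a small sketch using the FastMCP helper from the official Python SDK. The `update_record` tool is hypothetical, and the server setup follows my reading of the SDK’s README, so treat the details as an assumption rather than gospel:

```python
import re

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("task-tracker")

@mcp.tool()
def update_record(instruction: str) -> str:
    """Update a task from a plain-language instruction, e.g. 'Mark task 123 as complete'."""
    # Validate and parse inside the tool instead of rejecting imperfect agent output.
    match = re.search(r"task\s+(\d+)", instruction, re.IGNORECASE)
    if not match:
        return "I couldn't find a task ID in that request – which task should I update?"
    task_id = int(match.group(1))
    status = "Complete" if "complete" in instruction.lower() else "Open"
    # ... call the real task-tracking API here ...
    return f"Task {task_id} marked as {status}."

if __name__ == "__main__":
    mcp.run()
```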
Fat Tools, Skinny Agents – What It Means
“Fat tools, skinny agents” means shifting complexity out of the agent and into the tools. The agent should be a skinny layer that decides when to use a tool and interprets the result, but it shouldn’t have to know the nitty-gritty of how to use it. The tools, on the other hand, become “fat” in the sense that they encapsulate more logic: input parsing, multi-step workflows, error recovery, etc.
Think of the agent as a project manager and the tool as a specialist on the team. A good project manager doesn’t micromanage how the database expert does their job – they just say “we need this data updated.” The specialist (tool) figures out the details. If the request is slightly unclear, the specialist can ask for clarification or make an educated guess based on context. In practice, an agent using a fat tool might issue a high-level command, and the tool’s code takes care of executing sub-steps, handling edge cases, and returning the result.
For example, suppose the agent needs to schedule a meeting. A skinny tool might provide a function `schedule_meeting(date, time, participants)` and expect the agent to fill all parameters perfectly. But a fat tool could simply take a natural-language instruction like “schedule a meeting next week with Alice and Bob” and handle the rest. It might parse “next week” into an actual date range, find a suitable time by checking participants’ calendars, and then call the calendar API. The agent itself doesn’t have to break down those steps – the tool does it. The result: the agent’s prompt logic remains simple (“I should schedule a meeting, I’ll call the schedule_meeting tool with the user’s request”), and the tool handles the complex subroutine of interpreting and executing that request.
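Here is a rough sketch of such a fat `schedule_meeting` tool. The parsing and calendar helpers are hypothetical stubs standing in for your real parser and calendar API – the thing to notice is that the multi-step workflow lives inside the tool, and the agent only ever sees one call and one plain-language result:

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class MeetingRequest:
    participants: list[str]
    window_start: date
    window_end: date

# --- Hypothetical helpers: stand-ins for your real parser and calendar API ---
def parse_request(instruction: str) -> MeetingRequest:
    """Placeholder: extract participants and a date window from the instruction."""
    today = date.today()
    names = [w.strip(",.") for w in instruction.split() if w.istitle()]
    return MeetingRequest(names, today, today + timedelta(days=7))

def find_free_slot(participants: list[str], start: date, end: date) -> date | None:
    """Placeholder: query participants' calendars and return a free date (or None)."""
    return start + timedelta(days=2)

def book_event(slot: date, participants: list[str]) -> str:
    """Placeholder: call the calendar API and return an event ID."""
    return "evt_123"
# -----------------------------------------------------------------------------

def schedule_meeting(instruction: str) -> str:
    """Fat tool: take a natural-language request and run the whole scheduling workflow."""
    try:
        request = parse_request(instruction)
        slot = find_free_slot(request.participants, request.window_start, request.window_end)
        if slot is None:
            return "No common free slot was found in that window. Would a different week work?"
        event_id = book_event(slot, request.participants)
        return f"Meeting booked for {slot.isoformat()} with {', '.join(request.participants)} (event {event_id})."
    except Exception as exc:
        # Translate low-level failures into something the agent can relay or act on.
        return f"I couldn't schedule that meeting: {exc}"
```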
Crucially, fat tools also feed results back to the agent in a friendly way. Instead of dumping an inscrutable stack trace or low-level API error, a well-designed tool returns something meaningful or even conversational. For an error it can’t fix, it might return a message like, “The meeting couldn’t be scheduled because Alice’s calendar is unavailable. Here is a list of available times.” – phrased in a way the agent can relay to the user or use to adjust its plan. This makes the whole system more resilient and user-friendly.
Designing Smarter Tools (So Your Agent Stays Skinny)
How do we build these “fat” tools in practice? Here are some key approaches:
1. Let tools accept natural, “dirty” input: Don’t demand perfectly structured JSON if plain text will do. LLMs excel at producing natural language, so leverage that. For instance, instead of requiring an agent to call `updateRecord({"id": 123, "status": "Complete"})`, allow `updateRecord("Mark task 123 as complete")`. The tool can parse that string internally (using regex, a small parser, or even a localized LLM) to extract the `id` and new status. This way, minor formatting variations from the agent won’t derail the operation. As a design guide puts it, “the more clearly you communicate a tool’s purpose and usage to the model, the more likely it is to use it correctly”. Allowing flexible input is part of communicating that usage in model-friendly terms. The tool’s job is to speak the model’s language, not vice versa.
2. Internalize logic and decisions: A tool can be more than a single API call – it can encapsulate a whole workflow. Consider a “CRUD” scenario (Create, Read, Update, Delete). A traditional approach might expose four separate tools: `create_item`, `get_item`, `update_item`, `delete_item`. A skinny agent then has to figure out when to use which (e.g. call `get_item`, check if it exists, then decide to call `update_item` or `create_item`). In a fat tool approach, you might have one `save_item` tool that handles all that. The agent just says “save this record with XYZ data.” The tool can internally check if the item exists (read), then create or update accordingly – the first sketch after this list shows the pattern. This unified tool approach spares the agent from intricate if/then branching in its prompt. In coding terms, we’ve moved the `if item exists then ... else ...` logic out of the LLM’s head and into deterministic code. We’re effectively giving the agent a Swiss Army knife for that domain, rather than a drawer full of single-purpose knives.
3. Anticipate and handle errors within tools: Don’t blindly pass on exceptions from an API or library. If an external API returns an error (rate limit hit, item not found, etc.), the tool should catch it and decide: can we recover or clarify? Sometimes the tool can automatically retry or use a fallback. Other times, it might translate the error into a helpful response. For example, if an email-sending tool is given an invalid recipient address, the tool can return a message like “The email address looks invalid” instead of a raw 400 error. In the LlamaHub Tools library, the developers even handle missing parameters by proactively prompting the agent for them (medium.com). For instance, if an `update_draft` email tool is called without a draft ID, the tool can notice this and return a gentle nudge like “Which draft did you want to update? (I need an ID)” – the second sketch after this list shows that pattern. This approach is far better than the agent receiving a cryptic null-pointer exception and having to deduce the issue. By handling errors and edge cases, tools make the agent’s life easier – it can focus on what to do next rather than why didn’t that work.
4. Use AI inside the tool (when necessary): This is an advanced strategy – essentially creating a mini-agent or using an LLM within the tool for specialized tasks. Sometimes a tool might need to interpret complex instructions or data that’s hard to parse with simple code. For example, imagine a “summarize and chart data” tool: it might retrieve data from a database and then have to generate a natural-language summary. Instead of making the main agent do multiple steps (retrieve, then summarize, then chart), the tool itself could call an internal LLM to produce the summary, then generate a chart image, and return the results. From the main agent’s perspective, it was one tool call that accomplished a high-level task. Essentially, the tool contains an embedded workflow (even an embedded LLM) that isn’t exposed as multiple external actions. This should be used judiciously – you don’t want to turn every tool call into a hidden multi-LLM operation needlessly – but for certain complex workflows it can simplify the top-level agent logic dramatically. It’s akin to having a specialist on call: your agent delegates a sub-problem to a smaller AI or script that’s dedicated to that job. Just be sure to also handle the failure modes of that internal AI (just as you would any other function). The third sketch after this list gives a rough shape of this pattern.
5. Keep tools transparent and well-documented: Ironically, making tools “fat” with internal logic doesn’t mean they should be black boxes. An agent (or the developer building prompts for it) still needs to know in general what a tool does and when to use it. Clear naming and documentation are key. If a tool is high-level (like our `save_item` or `schedule_meeting` examples), its description should reflect that broader capability. This way the agent won’t hesitate to use it for an appropriate task. In the Model Context Protocol world, this equates to providing a good natural-language description and examples for each tool, and using the JSON schema to define inputs where applicable (modelcontextprotocol.io). The agent doesn’t actually read the JSON schema like a human would, but these schemas and descriptions help constrain and guide the model’s output when it’s formulating a tool call. In short, make the tool intuitive – if it’s doing more under the hood, communicate that at a high level so the agent knows it can rely on the tool for that domain of problems.
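Below is a minimal sketch of the unified `save_item` tool from point 2. The in-memory dict is a stand-in for whatever API or database a real tool would call; the branching the agent no longer has to reason about is the `if item_id in _STORE` check:

```python
# Fake backing store so the sketch runs standalone; in practice these would be API calls.
_STORE: dict[int, dict] = {}

def save_item(item_id: int, data: dict) -> str:
    """Fat tool: create or update in one call, so the agent never has to branch."""
    if item_id in _STORE:                 # "read" step: does the item already exist?
        _STORE[item_id].update(data)      # yes -> update in place
        return f"Item {item_id} updated."
    _STORE[item_id] = dict(data)          # no -> create it
    return f"Item {item_id} created."

print(save_item(123, {"status": "Complete"}))   # -> "Item 123 created."
print(save_item(123, {"owner": "Alice"}))       # -> "Item 123 updated."
```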
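And here is the missing-parameter nudge from point 3, again with a toy draft store in place of the real email API. The tool returns a question instead of raising, so the agent can simply pass it along or supply the missing ID:

```python
_DRAFTS = {"draft_42": {"to": "alice@example.com", "body": "Hi Alice,"}}  # stand-in store

def update_draft(draft_id: str | None = None, body: str | None = None) -> str:
    """Update an email draft; ask for whatever is missing instead of failing."""
    if draft_id is None:
        return "Which draft did you want to update? (I need a draft ID.)"
    if draft_id not in _DRAFTS:
        return f"I couldn't find draft '{draft_id}'. Known drafts: {', '.join(_DRAFTS)}."
    if body is None:
        return f"What should the new body of '{draft_id}' be?"
    _DRAFTS[draft_id]["body"] = body
    return f"Draft '{draft_id}' updated."
```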
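Finally, a sketch of point 4’s “mini-agent inside a tool.” Both `fetch_sales` and `llm_complete` are hypothetical placeholders (swap in your database query and your LLM client of choice); what matters is that retrieve-then-summarize is a single tool call from the main agent’s point of view, and the deterministic arithmetic stays in plain code:

```python
def fetch_sales(region: str) -> list[dict]:
    """Placeholder: pretend database query for the given region."""
    return [{"month": "Jan", "revenue": 120}, {"month": "Feb", "revenue": 150}]

def llm_complete(prompt: str) -> str:
    """Placeholder for an internal LLM call (e.g. your provider's chat-completion client)."""
    return "Revenue grew steadily from January to February."

def summarize_sales(region: str) -> dict:
    """Fat tool: fetch data, summarize it with an embedded LLM, and return both."""
    rows = fetch_sales(region)
    total = sum(r["revenue"] for r in rows)          # deterministic math stays in code
    summary = llm_complete(f"Summarize this sales data in two sentences: {rows}")
    return {"region": region, "total_revenue": total, "summary": summary, "rows": rows}
```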
Example: Rethinking a Notion Integration with Fat Tools
Let’s bring these ideas to life with a concrete example. Notion, the popular workspace app, has an API and recently an MCP integration that exposes a suite of tools. If you look at Notion’s official MCP tools, there’s a separate tool for each action: `search` the workspace, `fetch` a page, `create-pages`, `update-page`, `move-pages`, etc. (developers.notion.com). These are powerful building blocks – in fact, the documentation proudly says “with a single prompt, you can search your workspace, create new pages from the results, and update properties… by combining multiple tools.” That’s impressive, but also highlights the burden on the agent: a single complex user request might require the agent to orchestrate three or four tool calls in sequence. Each of those calls needs the correct parameters (often IDs or URLs of pages, property names, etc.), which the agent has to extract and keep track of. In practice, making all those calls reliably in the right order can be quite hard for an LLM agent. It’s easy to see how things could go wrong – maybe the agent searches and finds two results and isn’t sure which one to fetch, or it tries to update a page that doesn’t exist yet because it forgot to create it first, and so on. Using the raw toolbox requires very careful promptcraft and a fair bit of luck.
Now imagine a “fat tool” approach to the same problem. Instead of exposing five narrowly-scoped Notion tools and asking the agent to be a master craftsman with them, we could provide one NotionAssistant tool that wraps common workflows. For example, an agent could call `NotionAssistant("Add a section about risks to the 'Project Plan' page")`. Under the hood, the tool would perform the steps: search for the “Project Plan” page, if found fetch its content, then update the page by adding the section text. If the page isn’t found, perhaps the tool could create a new page with that title and then add the section. All of that logic stays inside the tool implementation. The agent’s job is simply to express the user’s intent (“add this section to that page”) and let the tool worry about the details.
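A rough sketch of that wrapper is below, assuming the official `notion-client` Python SDK. The method names and payload shapes reflect my reading of the Notion API docs, and the parent page ID is a placeholder, so verify the details against the current API before building on this:

```python
from notion_client import Client

notion = Client(auth="NOTION_API_KEY")  # assumes an integration token with access to the workspace

def notion_assistant_add_section(page_title: str, heading: str, body: str) -> str:
    """Fat tool: find (or create) a page by title, then append a new section to it."""
    try:
        results = notion.search(query=page_title,
                                filter={"property": "object", "value": "page"})["results"]
        if results:
            page_id = results[0]["id"]   # a fuller version might disambiguate multiple matches
            action = "updated"
        else:
            page = notion.pages.create(
                parent={"page_id": "PARENT_PAGE_ID"},  # placeholder: depends on your workspace
                properties={"title": [{"text": {"content": page_title}}]},
            )
            page_id, action = page["id"], "created"

        notion.blocks.children.append(
            block_id=page_id,
            children=[
                {"object": "block", "type": "heading_2",
                 "heading_2": {"rich_text": [{"type": "text", "text": {"content": heading}}]}},
                {"object": "block", "type": "paragraph",
                 "paragraph": {"rich_text": [{"type": "text", "text": {"content": body}}]}},
            ],
        )
        return f"Section '{heading}' added; page '{page_title}' was {action}."
    except Exception as exc:
        # Surface failures (permissions, rate limits) in plain language for the agent.
        return f"I couldn't update Notion: {exc}"
```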
With this approach, the agent is far less likely to get tripped up on intermediate steps because there are no intermediate steps exposed to it – it’s one tool call. The tool can even handle ambiguous cases: if the search finds multiple “Project Plan” pages, maybe it picks the most relevant or asks the agent (via a clarifying message) which one to use. If the Notion API returns an error (say the user doesn’t have permission to edit that page), the tool can catch it and inform the agent in plain language. The net effect is a more reliable agent that still accomplishes the same end result, but with a lot less prompt gymnastics.
To be clear, this doesn’t make the underlying Notion APIs go away – it just wraps them in a smarter layer. We’ve essentially built an agent-native Notion tool that aligns with how an LLM thinks (“I want to do X with that page”) rather than forcing the LLM to become a step-by-step Notion API expert. This is exactly the kind of enhancement that turns an okay agent into a great one. It’s also a strategy endorsed by best practices: tools should focus on use-cases, not just raw technical actions. By combining “search, then update” into one use-case-driven tool, we’ve matched the tool to the user’s intent, which is ultimately what the agent is trying to fulfill.
Tools Matter – Let’s Build Better Ones
In the rush to build ever smarter AI agents, it’s easy to overlook the humble tools we give them. We assume if the agent is super-intelligent, it can figure out how to use any tool we throw at it. Reality proves otherwise – the agent is only as effective as the tools it can reliably use. Every failure I’ve seen in an agent system (aside from pure hallucination) ultimately came down to a tool mismatch or misuse. That’s why “fat tools, skinny agents” is more than a catchy phrase; it’s a design philosophy that can drastically improve an agent’s performance in real-world tasks.
By investing time to build smarter, more resilient tools, we make our agents simpler, safer, and more capable. It’s the classic software engineering trade-off: spend effort to build robust abstractions (in this case, agent-ready tools) so that the higher-level logic (the agent’s reasoning) can be clean and straightforward.
If you’re building an AI agent or the tools it uses, give “fat tools, skinny agents” a try. Don’t hesitate to add that extra logic in the tool, or unify a couple of steps, or include a parsing module – you’ll likely find your agent becomes more robust overnight. Encourage your tools to be forgiving of the agent’s creativity (or occasional sloppiness). And if you’re adopting frameworks or standards, look for those that embrace this philosophy. Let’s stop expecting our AI agents to do all the tedious determinism that our regular code can handle better. Free up the agent to focus on high-level reasoning, and let the tools do the heavy lifting in the trenches.
In the end, building AI systems is still software engineering – just with a funky new component (LLMs) in the mix. The old wisdom of abstraction and separation of concerns still applies. We wouldn’t build a web app with a massive monolithic servlet doing everything; we shouldn’t build an AI agent that tries to handle every low-level API call decision itself. Fat tools, skinny agents. Your agent will thank you, your users will get more reliable results, and you’ll have a saner time debugging it all. It’s a win-win – and as this approach becomes more common, we’ll all trust our AI co-workers a bit more to get the job done right. Now go forth and build some awesome fat tools!