Building Blocks of Coding Agents

Someone at work recently shared this workshop and repo by Geoffrey Huntley. I set a reminder to walk through the workshop when I had some time over a weekend.

When I finally sat down with it, I was surprised by how easy the instructions were to follow; the whole walkthrough took around 30 minutes. It turns out that coding agents are pretty simple: at their core, they're just LLMs running in a loop. They share a few core tools as common building blocks, and differentiate themselves in other ways.
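To make the "LLM in a loop" idea concrete, here's a minimal sketch in Python. Everything here is illustrative, not the workshop's actual code: `call_model` stands in for a real LLM API call (stubbed so the skeleton runs on its own), and the message format and tool registry are assumptions.

```python
# A stand-in for a real LLM API call. This stub asks to read a file once,
# then returns a final answer, so the loop below can run end to end.
def call_model(messages):
    if any(m["role"] == "tool" for m in messages):
        return {"type": "answer", "text": "done"}
    return {"type": "tool_call", "tool": "read_file", "args": {"path": "README.md"}}

# Stubbed tool registry; a real agent would map names to actual tools.
TOOLS = {"read_file": lambda path: f"<contents of {path}>"}

def run_agent(prompt):
    messages = [{"role": "user", "content": prompt}]
    while True:  # the core loop: model -> tool -> result fed back to model
        reply = call_model(messages)
        if reply["type"] == "answer":
            return reply["text"]
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": result})
```

That's really all a coding agent is at its core: call the model, run whatever tool it asks for, append the result to the conversation, and repeat until the model produces an answer.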

This framing really helped demystify coding agents for me. I wanted to write down some notes on these core building blocks, as well as some reflections on how products like Cursor actually differentiate themselves.


The foundational tools

Every coding agent starts with the same handful of basics:

  • Reading tool: The most fundamental piece: a tool to open and read files. This tool lets the model inspect source code, configs, or documentation in addition to whatever prompt you gave it.

  • List-files tool: This tool maps out the files in your project. It returns the file structure as JSON for easy parsing by the LLM, knows to skip hidden files, and so on. With that map, the model can plan where to navigate next.

  • Bash tool: Gives the agent shell access, so it can compile code, run unit tests, check versions, or automate routine shell tasks. This is what lets the LLM actually get stuff done: it can catch failures, react to them, and feed that information back into the loop.

  • Code search tool: Provides the ability to grep your project. Geoffrey claims that most coding agents use the open source ripgrep tool for searching through files.
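Here's a minimal sketch of what these four tools might look like in Python. The function names and return shapes are my own assumptions, not the workshop's implementation, and the code search simply shells out to ripgrep (which must be installed) as the post describes.

```python
import json
import os
import subprocess

def read_file(path):
    # Reading tool: return file contents for the model to inspect.
    with open(path) as f:
        return f.read()

def list_files(root="."):
    # List-files tool: map the project as JSON, skipping hidden files/dirs.
    tree = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        tree += [os.path.join(dirpath, f) for f in filenames if not f.startswith(".")]
    return json.dumps(tree)

def run_bash(command):
    # Bash tool: run a shell command; include the exit code and stderr
    # so the model can catch failures and react to them.
    proc = subprocess.run(command, shell=True, capture_output=True, text=True)
    return {"stdout": proc.stdout, "stderr": proc.stderr, "exit_code": proc.returncode}

def code_search(pattern, root="."):
    # Code search tool: delegate to ripgrep rather than reimplement search.
    return run_bash(f"rg --line-number {pattern} {root}")["stdout"]
```

Each tool returns plain text or JSON, which matters: the output goes straight back into the model's context, so it has to be something an LLM can parse and reason about.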

The workshop shows that you can build a basic coding agent pretty quickly with just these four tools. To be fair, the workshop actually walks you through running a pre-built coding agent, but it does a great job of illustrating how simple one can be.


How do coding agents differentiate?

Going through the workshop left me asking a few questions. How do products like Cursor, Claude Code, and Windsurf differentiate themselves? Why might developers choose one over another? Will any of these tools "win" in this market?

After some reflection and research, the simplest answer for me is that they don't differentiate all that much. I came away thinking that coding agents are a lot like IDEs: they offer the same core functionality with minor differentiation, and the reason users choose one over another may simply boil down to personal preference.

Here are what I see as the key dimensions of that (minor) differentiation:

  • Assistant vs. Agentic: A key distinction is whether the tool focuses on assisting the developer with autocomplete (Copilot is a good example here) or actually acts like an agent that can write to multiple files and run tests (e.g. Claude Code or Cursor).

  • Integration Depth: Extensions like Continue.dev add AI into existing IDEs. Cursor is a VS Code fork that can give more control to the LLM. AI-native IDEs like Windsurf rebuild their own experience around agents.

  • Memory & Persistence: Most tools are stateless, but newer ones experiment with session or long-term memory so the agent remembers your project across days, not just each prompt.

  • Safety & Guardrails: Agentic tools differ in how much autonomy they allow: some always show diffs and ask approval, while others have “YOLO” modes that apply and verify changes automatically.
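The guardrails point can be sketched in a few lines. This is a hypothetical approval gate, not any particular product's implementation: the agent's proposed edit is rendered as a diff and only applied if the user approves, while an `auto_apply` flag stands in for the "YOLO" mode described above.

```python
import difflib

def propose_edit(path_label, old_text, new_text, approve, auto_apply=False):
    # Render the proposed change as a unified diff for the user to review.
    diff = "".join(difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=path_label, tofile=path_label))
    # "YOLO" mode skips the gate; otherwise apply only on approval.
    if auto_apply or approve(diff):
        return new_text  # change applied
    return old_text      # change rejected; keep the original
```

In a real tool, `approve` would prompt the user in the editor or terminal; here it's just a callback, which also makes the gate easy to test.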

I can understand if some developers see these as larger features (not minor points of differentiation), but to me they are simply a matter of preference, boiling down to "this is how I want to pair with an LLM to write code".

For me, the right tool sometimes differs depending on the task. Regardless of my assessment here, if coding agents feel like a black box to you, I highly recommend going through the workshop and reading more of Geoffrey's writing!
