Localizing design issues with LLM agents
Most tooling for code quality is good at detection — it tells you a smell or a design problem exists. The harder, more useful question is localization: where in the code does the problem actually live, and what should I change?
Design issues make this especially tricky. Unlike a null-pointer bug, a poor design decision is diffuse: it’s spread across several files, it depends on intent, and the “right” answer often requires understanding why the code is shaped the way it is. That context is exactly what classical static analysis throws away.
Grounding reasoning in program facts
The approach I’ve been working on treats this as a job for an agent that can do two things at once:
- Look at concrete program facts — structure, dependencies, call relationships — the way a static analysis would.
- Reason in natural language over those facts, the way a senior engineer reads a diff and says “this responsibility doesn’t belong here.”
Neither half is enough alone. Pure analysis can’t weigh intent, and an LLM on its own tends to hallucinate about code it can’t actually inspect. Putting the analysis in the loop keeps the agent honest and lets it narrow a vague “something’s off in this module” down to the specific locations a developer should look at first.
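To make "analysis in the loop" a little more concrete, here is a minimal sketch in Python. It is not LocalizeAgent's implementation, just an illustration under my own assumptions: the sample module and the helper names (`extract_facts`, `render_facts`, `is_grounded`) are all hypothetical. It uses the standard `ast` module to collect a few program facts (imports, function locations, call relationships), formats them as text an agent could reason over, and rejects any claimed location that the analysis never saw.

```python
import ast
from collections import defaultdict

# Hypothetical module under review: one function mixes persistence and email.
SAMPLE = '''
import smtplib
import sqlite3

def save_order(order):
    conn = sqlite3.connect("orders.db")
    conn.execute("INSERT INTO orders VALUES (?)", (order,))
    send_receipt(order)

def send_receipt(order):
    server = smtplib.SMTP("localhost")
    server.sendmail("shop@example.com", "customer@example.com", str(order))
'''


def extract_facts(source: str) -> dict:
    """Collect imports, function start lines, and the names each function calls."""
    tree = ast.parse(source)
    imports = set()
    functions = {}
    calls = defaultdict(set)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imports.add(node.module)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions[node.name] = node.lineno
            # Calls inside nested functions are also attributed to the outer one.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call):
                    target = inner.func
                    if isinstance(target, ast.Name):
                        calls[node.name].add(target.id)
                    elif isinstance(target, ast.Attribute):
                        calls[node.name].add(target.attr)
    return {
        "imports": sorted(imports),
        "functions": functions,
        "calls": {name: sorted(callees) for name, callees in calls.items()},
    }


def render_facts(facts: dict) -> str:
    """Format the facts as plain text that could be placed in an agent's prompt."""
    lines = ["imports: " + ", ".join(facts["imports"])]
    for name, lineno in sorted(facts["functions"].items()):
        callees = ", ".join(facts["calls"].get(name, [])) or "(none)"
        lines.append(f"function {name} (line {lineno}) calls: {callees}")
    return "\n".join(lines)


def is_grounded(claimed_function: str, facts: dict) -> bool:
    """Accept a claimed location only if it names a function the analysis found."""
    return claimed_function in facts["functions"]


if __name__ == "__main__":
    facts = extract_facts(SAMPLE)
    print(render_facts(facts))
    print(is_grounded("save_order", facts))   # location exists in the facts
    print(is_grounded("charge_card", facts))  # location the analysis never saw
```

The facts here are deliberately shallow. The point is the shape of the loop: the language model only reasons over what the analysis reports, and anything it concludes can be checked against the same facts before a developer sees it.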
Why localization is the unlock
Localization is the bottleneck for everything downstream. If an agent can reliably say where the problem is, then automated refactoring, design repair, and review assistance become far more tractable, because they finally have a target to act on.
This is the idea behind LocalizeAgent; the full method and evaluation are in the paper on my publications page. I’ll use future posts to dig into specific design decisions — how much analysis context to feed the agent, and how to keep its conclusions verifiable.