Inside Popsa

How we automated our technical documentation with agentic AI

What looks simple to customers hides complexity under the surface. As teams move fast, technical documentation can quickly fall out of date, so we built an approach that lets the docs maintain themselves

Article at a glance

  • Documentation decays because software changes non-linearly, not because teams stop caring.

  • Treating docs as infrastructure, not a task, keeps them alive for both humans and AI agents.

  • Agentic AI can handle the repetitive maintenance, leaving engineers to review, steer and build.

Most people take thousands of photos a year and never look at the majority of them. Popsa helps those people find the ones that matter and turns them into something tangible they’ll actually keep and look back on time and time again.

Our mobile apps run machine learning on customers’ devices to analyse photos (without uploading them to the cloud); pick out the people and moments that mean something; and design a layout that tells the story. From the outside it looks simple – and that’s the point.

Behind that simplicity is a lot of machinery: iOS, Android and web apps, backend services handling payments and order fulfilment, infrastructure managing image processing and delivery across 50 countries. These systems hold people’s most personal photos and process their payments. They need to stay reliable and secure, and the team’s understanding of how everything connects can’t be allowed to go stale.

That understanding lives in documentation that explains how the systems connect, what the conventions are and where the sensitive boundaries sit. It’s what lets a team maintain, secure and extend complex systems, without depending on one person who happened to build each piece. Keeping that documentation alive, though, is a problem every engineering team has struggled with at one time or another – including ours.

I’ve been on enough engineering teams to know the cycle. One of us decides that “this time, the docs are going to be properly maintained”. There’s a burst of effort. Pages get created, conventions recorded, architecture diagrams drawn. Sometimes, teams switch entire systems (GitHub, Confluence, Coda), creating fresh momentum. For a few weeks, maybe a couple of months, things look great.

Then it rots.

Not because anyone stopped caring but because software change isn’t linear. You can absorb dozens of incremental changes without the docs drifting much, but then you refactor the authentication layer, or migrate to a new framework, or restructure an API. The documentation was right when it was written, but the codebase moved on. For example, we’ve just refactored our iOS app from a linear navigation flow into a tab-based flow; after a change that large, whole sections of documentation can describe a system that no longer exists.

Most engineers know the problem. Recently, our team set out to solve it by treating documentation as a system rather than a task.

The gap that was never closed

People have tried to solve this before, of course. Tools like Javadoc, Swagger and Doxygen generate documentation directly from code. They’re useful for what they do – API references, function signatures, endpoint schemas – but the output is procedural and dry. It tells you what a function takes and returns, not why the system is designed that way, what the conventions are or how the pieces fit together. The documentation that rots is the human-written stuff – architecture overviews, convention docs. That’s the gap that was never closed.

For years, the effort of building something better would have outweighed the effort of updating the docs by hand, but AI has changed that equation. The gap between “I wish this maintained itself” and “I can actually build something that does” has closed.

Our team operates quickly, efficiently exchanging information, onboarding new engineers without bottlenecks, keeping everyone aligned on how the systems work. High-level documentation is what makes that possible. When we looked at ours recently, the picture was familiar to any engineering team: some of it was good, some of it described codebases from months ago, and in places there were gaps where documentation had never existed at all – the knowledge just lived in people’s heads.

Now, like many teams, we are using more and more AI coding agents across our repositories, and those agents need the same thing. With high-level documentation we can onboard agents on to the team the same way we onboard engineers: here’s how the systems connect, here are the conventions, here’s what matters. But both audiences, human and machine, can be let down by the same problem – documentation that was written once and never updated.

Humans and machines

That second audience (AI agents) is what recently changed how we thought about all of this. Documentation used to be something we maintained for human convenience: a nice-to-have that teams feel guilty about neglecting. Once AI agents entered the picture, it became infrastructure.

As we found ourselves repeating conventions in multiple places, we decided to centralise this knowledge. Now, we are rolling out files across all of our main repositories (AGENTS.md, CLAUDE.md, .cursorrules), each generated from templates, each containing a header that points back to our central documentation repository. 
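To make the pointer concrete, here’s a sketch of the kind of header those generated files carry (the wording and repository name are illustrative, not our exact template):

```markdown
<!-- Generated from a shared template. Do not edit by hand. -->
> **Start here.** Popsa's living documentation is maintained in our
> central docs repository. Read the conventions, glossary and
> architecture overviews there before working in this codebase.
```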

When an AI agent starts working on any Popsa codebase, the first thing it sees is a pointer to our living documentation. The conventions, the patterns, the glossary, the architectural decisions, and – critically – the product knowledge. Without it, how would an agent know that a PBK-7 is a Medium Landscape Hardback Photo Book or that TIL-2b is a Black Photo Tile with a Gloss Finish?

This creates a really nice cycle:

  • engineers do the work

  • agents reviewing that work consume the documentation to give better, more domain-aware feedback

  • that feedback triggers code changes

  • those code changes open documentation update PRs, ready to review and merge. 

The updated documentation is then available for the next person or agent that picks up a task. The system feeds itself and it starts to compound across repositories. As product names, feature flags and analytics events get surfaced into central documentation, that knowledge becomes available to agents working in entirely different codebases.

Say the Android team ships a new feature. When an agent reviews the equivalent iOS pull request weeks later, it has enough context to spot that the feature name doesn’t match or that the analytics events use different conventions. Central docs become a shared memory across the whole organisation, not just a reference for the repo they were written about.

There’s a cost angle, too. Without good documentation, an agent starting work on a repository has to discover the codebase from scratch: searching files, reading source code, grepping for patterns. That exploration burns tokens and time on every single interaction. With a curated, current doc, the agent reads one file and gets the full picture. Keeping docs accurate reduces the per-interaction cost of every agent working across your organisation.

How it works

We built a pipeline. The concept is straightforward, even if the details required careful thought.

When an engineer merges code to our primary codebases (backend services; web, iOS or Android apps; IaC repo), a lightweight GitHub Action fires. It collects the changed file paths and commit reference, and dispatches a repository_dispatch event to our central documentation repository.
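As a sketch, the notification side can be as small as this (the secret name, repository names and payload shape are illustrative; the open-sourced pipeline is the reference):

```yaml
# .github/workflows/notify-docs.yml (illustrative sketch)
name: Notify docs repo
on:
  push:
    branches: [main]

jobs:
  dispatch:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 2  # fetch the previous commit so we can diff

      - name: Collect changed file paths
        id: diff
        run: |
          files=$(git diff --name-only HEAD^ HEAD | paste -sd "," -)
          echo "files=$files" >> "$GITHUB_OUTPUT"

      - name: Dispatch an event to the central docs repository
        run: |
          curl -sf -X POST \
            -H "Authorization: Bearer ${{ secrets.DOCS_DISPATCH_TOKEN }}" \
            -H "Accept: application/vnd.github+json" \
            https://api.github.com/repos/popsa-hq/docs/dispatches \
            -d '{
              "event_type": "source-code-changed",
              "client_payload": {
                "repo": "${{ github.repository }}",
                "commit": "${{ github.sha }}",
                "changed_files": "${{ steps.diff.outputs.files }}"
              }
            }'
```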

On the receiving end, another workflow picks up the event and does something quite targeted.

The key is that it doesn’t regenerate everything. A YAML configuration file maps each source repository to the specific documentation files it can affect, along with glob patterns defining which file paths to watch. A change to Terraform modules won’t trigger an update to the Go standards doc, just as a change to Swift files won’t touch the TypeScript conventions. Only the documentation that’s actually relevant to the code change gets examined.
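For illustration, the mapping might look something like this (repository names, paths and globs invented for the example):

```yaml
# docs-config.yml (illustrative)
repositories:
  popsa-hq/ios-app:
    watch:
      - "Sources/**/*.swift"
    affects:
      - docs/ios-architecture.md
      - docs/swift-conventions.md
  popsa-hq/infrastructure:
    watch:
      - "modules/**/*.tf"
    affects:
      - docs/infrastructure.md  # never docs/go-standards.md
```

A Terraform change matches only the infrastructure globs, so only the infrastructure doc gets re-examined.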

The workflow spins up Claude Code with a carefully constructed prompt. Claude reads the existing documentation, explores the source repository at the exact commit that triggered the update, compares what the docs say against what the code actually does, and proposes changes. Crucially, it can also report that no changes are needed. The prompt itself is deliberate about what Claude should and shouldn’t do: preserve the existing document structure, use British English, include file path references when citing examples, don’t remove content that’s still accurate, don’t add speculative content. We ask for “surgical edits”, not rewrites.
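Paraphrased, the shape of that prompt looks something like the sketch below (condensed; the placeholders in braces stand for values the workflow fills in, and this is not our literal prompt):

```text
You are updating technical documentation after a code change.

Documentation file: {doc_path}
Source repository: {repo} at commit {sha}
Changed files: {changed_files}

Rules:
- Compare what the documentation says with what the code now does.
- Make surgical edits, not rewrites; preserve the document structure.
- Use British English; include file path references when citing examples.
- Do not remove content that is still accurate.
- Do not add speculative content.
- If nothing needs updating, output NO_CHANGES_NEEDED.
```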

A few things about how this is put together. The docs repo doesn’t poll or run on a schedule; source repos push events when changes happen, which means any new repository can participate just by adding a small notification workflow and an entry in the config file. Only one documentation update per source repository runs at a time, so if two PRs merge in quick succession, the second update waits for the first to finish rather than being discarded. And not every code change affects the documented conventions: Claude can output NO_CHANGES_NEEDED and the pipeline simply closes cleanly. The system recognises no-op updates gracefully.
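That queuing behaviour falls out of GitHub Actions’ concurrency controls. A sketch of the relevant fragment of the docs-repo workflow (the group name is illustrative):

```yaml
# In the docs repository's update workflow (illustrative)
on:
  repository_dispatch:
    types: [source-code-changed]

concurrency:
  # One update per source repository at a time; new runs queue
  # behind the in-progress one instead of cancelling it.
  group: docs-update-${{ github.event.client_payload.repo }}
  cancel-in-progress: false
```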

One detail we particularly like is that during development, Cursor’s Bugbot (an AI code reviewer) caught six bugs in the initial implementation of the pipeline. An AI reviewing the code that would be used to run another AI. Turtles all the way down.

We’ve open-sourced the pipeline so you can adapt it for your own repositories.

Review, don’t rubber-stamp

Although we get excited about anything that helps us move faster, we monitor and test these things carefully. The auto-docs workflow opens a pull request, meaning engineers can review it and check whether the proposed changes actually reflect what happened in the code: “Is this what the code really does now?” Our team can approve, request changes or close it entirely. The PR review step keeps human engineers steering: we stay aware of how the documentation is evolving and stay familiar with the conventions. We maintain ownership of the knowledge base even though we’re not doing the grunt work of maintaining it.

In the AI age, this is how we’re increasingly thinking about the division of labour:

Building the system is the engineer’s job; reviewing the output is the engineer’s job. The repetitive middle bit is the AI’s job. Reading through a diff of source code changes, working out which documentation sections are affected, writing the updated paragraphs, running the linter, opening the PR – that’s the “niggly work”, and the AI is now genuinely good at it.

The pattern underneath

Although we built this to improve documentation quality, we stumbled into a pattern that extends to a whole category of engineering work. Think about the tasks that every team agrees matter but nobody wants to own:

  • Keeping test coverage meaningful as code evolves.

  • Maintaining changelog entries that actually describe what changed.

  • Validating that API contracts still match their implementations.

  • Auditing dependency versions.

  • Updating architecture decision records.

These all share the same structure – there’s a current state and a desired state, and the work is comparing the two and proposing updates. Read the current state, read the source of truth, diff them, produce a targeted change for human review. That’s a pattern AI handles well, and once the pipeline is set up, the marginal cost is near zero.

This brings it back to where we started. People trust Popsa with their most personal photos, and their memories of the people and places that matter to them. The systems behind that experience need to be understood, maintained and kept secure, not just by the people who built them, but by everyone who works on them next. Documentation is what makes that possible, and for the first time we have a way to keep it alive without it becoming someone’s second job.

The result is a team that spends less time maintaining knowledge and more time improving the experience: better curation, better privacy, better products. Engineers focus on the work that actually matters to people, not the upkeep underneath.

That’s what the system was for. Not the documentation itself, but what it frees us to do – build something worthy of the memories people trust us with.