From Research Paper to Working Toolkit in One Session
I had a 46-citation research paper about autonomous documentation. Academic frameworks rarely survive contact with a real codebase. So I asked an agent to turn theory into working code—and watched what happened.
Nino Chavez
Product Architect at commerce.com
There’s a graveyard somewhere. Every engineering team has one.
It’s full of documentation initiatives. Strategic frameworks. Best practices that lived in slide decks but never made it into the codebase. Research papers bookmarked with good intentions.
I had just added to the pile. A 46-citation research paper titled “Autonomous Knowledge Synthesis: A Strategic Framework for Comprehensive Codebase Documentation Using Claude Code.”
The paper was thorough. Seven documentation layers. The Diátaxis framework for user docs. Context engineering strategies. Prompt architectures for everything from architecture analysis to technical debt audits.
It was also, functionally, shelf-ware. An artifact that described how documentation should work without being something anyone could actually use.
The Gap
The research laid out a clean theory:
- Claude Code can autonomously analyze codebases using an OODA loop
- Context windows need careful management via incremental ingestion
- CLAUDE.md files anchor agent behavior across sessions
- Different documentation layers require different personas and prompts
- Custom skills turn ad-hoc prompting into reproducible workflows
All true. All validated by sources. All useless without implementation.
The gap between “here’s the framework” and “here’s how to use it on your project Monday morning” is where most documentation initiatives die. The research describes the destination. It doesn’t provide transportation.
The Experiment
Instead of letting the paper join the graveyard, I tried something different. I gave the research to Claude and asked a simple question:
How do we turn this into a reusable approach for any existing project?
Not “summarize this paper.” Not “explain the key concepts.” A direct request to operationalize theory into working code.
What followed was a two-hour session that produced a complete toolkit: 9 custom skills, a scaffold generator, templates, and a public repository. The paper became software.
What the Agent Did
The agent’s first move wasn’t to start writing code. It was to synthesize.
It read the 46-citation paper and extracted the operational patterns—the parts that could become executable. The OODA loop description became a detection protocol. The seven documentation layers became seven corresponding skills. The Diátaxis quadrants became a parameterized user documentation generator.
Then it proposed two implementation paths:
| Approach | Description | Tradeoff |
|---|---|---|
| Scaffold Generator | Single /init-docs command bootstraps everything | All-or-nothing adoption |
| Modular Toolkit | Separate skills adopted incrementally | More flexible, slower to full value |
I chose the scaffold generator. The agent started building.
The Scaffold Generator
The core skill—/init-docs—does four things:
- Detects project type and tech stack from config files
- Generates a project-specific CLAUDE.md configuration
- Creates a complete docs/ directory structure
- Installs all documentation skills for ongoing use
The detection phase reads package.json, go.mod, pyproject.toml, Dockerfiles, CI configs—whatever exists—and infers the stack. This goes into CLAUDE.md so subsequent skills don’t need re-prompting.
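For a concrete sense of what that config looks like, here is roughly the kind of CLAUDE.md the detection phase might emit for a TypeScript project. The project name, stack details, and section headings are illustrative, not the toolkit's literal output:

```
# Project: acme-storefront (hypothetical example)

## Stack
- Language: TypeScript (Node 20), package manager: pnpm
- Framework: Next.js; tests: Vitest; CI: GitHub Actions

## Documentation conventions
- Generated docs live under docs/, one subdirectory per layer
- Exclude from analysis: node_modules/, dist/, coverage/
- Run /doc-audit before regenerating any layer
```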
The structure phase creates the skeleton:
docs/
├── architecture/ # System design, ADRs, diagrams
├── developer/ # Onboarding, setup, contributing
├── ops/ # Infrastructure, CI/CD, deployment
├── testing/ # Strategy, coverage, patterns
├── functional/ # Business logic extraction
├── strategic/ # Tech debt, roadmap
└── user/ # Tutorials, guides, reference
    └── (Diátaxis quadrants)
Each subdirectory gets placeholder READMEs that point to the skill that populates them.
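A placeholder is little more than a pointer. Something along these lines, in illustrative wording rather than the toolkit's exact template:

```
# Architecture Documentation

Not yet generated. Run /doc-architecture to populate this layer
with the system overview, ADRs, and component diagrams.
```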
The Layer Skills
Each of the seven documentation layers became a dedicated skill:
| Skill | Persona | Output |
|---|---|---|
| /doc-architecture | Principal Architect | Patterns, decisions, Mermaid diagrams |
| /doc-developer | DevEx Engineer | Setup, onboarding, contribution guides |
| /doc-ops | SRE | Infrastructure topology, CI/CD, runbooks |
| /doc-testing | QA Lead | Test strategy, coverage gaps, patterns |
| /doc-functional | Business Analyst | Business rules in stakeholder language |
| /doc-strategic | CTO | Tech debt audit, remediation roadmap |
| /doc-user | Technical Writer | Diátaxis-compliant user documentation |
The persona assignment isn’t decorative. It shapes output. When /doc-ops runs as a “Cloud Security Auditor,” it surfaces risk and access patterns. When /doc-functional runs as a “Business Analyst,” it translates code into policy language without mentioning arrays or iteration.
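To make that concrete: a skill is essentially a markdown prompt with the persona baked into its opening instruction. A minimal sketch of what /doc-functional might contain, assuming Claude Code's markdown-plus-frontmatter skill format; the field names and wording here are my illustration, not the toolkit's actual file:

```
---
name: doc-functional
description: Extract business rules from the codebase in stakeholder language
---

You are a Business Analyst documenting this system for non-technical stakeholders.
Read the domain logic under the paths listed in CLAUDE.md and describe each rule
as policy ("orders over $500 require manager approval"), never as implementation
("the loop iterates over the lineItems array"). Write output to docs/functional/.
```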
The Audit Skill
The eighth skill—/doc-audit—closes the loop.
It compares existing documentation against the codebase to identify:
- Missing documentation: Components with no corresponding docs
- Outdated documentation: Docs older than their source files
- Incomplete documentation: Placeholder content never populated
The output is a coverage matrix with prioritized recommendations. “Run /doc-architecture first—no system overview exists.” “The payment module was modified last week but docs/functional/payments.md is six months old.”
This turns documentation from a one-time project into an ongoing audit cycle.
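The outdated check is, at its core, a timestamp comparison. A rough shell equivalent of what the audit is looking for, with hypothetical paths; the real skill reasons over the repository in context rather than running a script like this:

```
# Flag the doc if any source file in the module is newer than it (illustrative paths)
stale=$(find src/payments -type f -newer docs/functional/payments.md | head -n 1)
if [ -n "$stale" ]; then
  echo "docs/functional/payments.md lags behind $stale; re-run /doc-functional"
fi
```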
What Made It Work
Research as specification, not inspiration. The paper wasn’t vague guidance. It included specific prompt architectures, structural patterns, and quality criteria. The agent had enough detail to implement, not just enough to understand.
Persona-driven generation. Each skill embeds a professional persona that shapes the output. The agent doesn’t just “document the tests”—it analyzes them as a QA Lead would, identifying coverage gaps and strategic risks.
Bounded scope per skill. Instead of one massive “document everything” command, each skill handles one layer. This keeps context focused and outputs coherent. You can run /doc-architecture without generating developer docs.
The CLAUDE.md anchor. Project-specific configuration persists across sessions. The agent doesn’t need re-prompting about tech stack, exclusion directories, or documentation standards. It reads the config and adapts.
The Division of Labor
What I did:
- Provided the research as input material
- Chose scaffold generator over modular toolkit
- Named the repository
- Requested documentation of the research itself
What the agent did:
- Synthesized operational patterns from academic framework
- Designed the skill architecture
- Wrote all 9 skill files (~3,000 lines of prompt engineering)
- Created directory structures and templates
- Initialized git, created GitHub repo, pushed code
- Formatted the research paper as repository documentation
Total time: About two hours. Most of that was generation and my review of outputs.
The research paper described what should happen. The agent made it happen.
The Output
The toolkit is public: github.com/nino-chavez/claude-docs-toolkit
To use it on any project:
# Install the skills
cp -r claude-docs-toolkit/.claude /path/to/your/project/
# Bootstrap documentation infrastructure
claude /init-docs
# Generate what you need
claude /doc-architecture
claude /doc-developer
claude /doc-audit
The /init-docs command auto-detects your stack. The individual skills generate their respective layers. The audit skill tells you what’s missing.
What Didn’t Work
The first pass at /doc-user was too generic. Diátaxis has four quadrants (tutorials, how-to guides, reference, explanation), and the initial skill tried to handle all four in one invocation. The output was muddy.
The fix was parameterization: /doc-user type=tutorial feature=onboarding generates one specific document in one specific quadrant. Bounded scope again.
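Under the hood, the parameterized version is still a single prompt that narrows its own scope. A sketch of how the arguments might be threaded through, assuming Claude Code's argument substitution for custom commands; the placeholder and wording are assumptions, not the toolkit's exact file:

```
---
name: doc-user
description: Generate one Diátaxis document for a single feature
---

Arguments: $ARGUMENTS   (expected: type=<tutorial|guide|reference|explanation> feature=<name>)

Generate exactly one document in the requested quadrant for the requested feature.
Do not touch the other three quadrants. Write the result to docs/user/<type>/<feature>.md.
```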
I also initially forgot to include the research paper in the repository. Documentation about documentation tooling should probably include its theoretical basis. The agent added a docs/research/ directory with the full paper after I mentioned the gap.
The Meta Layer
There’s something recursive about using an AI agent to implement a framework for AI-assisted documentation.
The paper describes how Claude Code can analyze codebases and generate documentation through an OODA loop. The session that produced the toolkit was itself an OODA loop: the agent observed the research, oriented around operational patterns, decided on the scaffold approach, and acted by writing code.
The framework validated itself in its own implementation.
What’s Next
The toolkit exists. It works on any project with detectable config files. But it’s a starting point.
What I’m watching:
- Drift detection in CI. The audit skill could run on every PR, flagging when code changes outpace documentation (a rough sketch follows this list).
- Cross-project patterns. Running this on multiple codebases might surface common architectural patterns worth extracting.
- Voice calibration per project. Right now, documentation voice is generic. Projects have personality. The skills could learn it.
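For the CI idea, the smallest version is a pipeline step that runs the audit and fails the build when drift shows up. A hypothetical sketch, assuming the audit can be invoked non-interactively and that its report mentions the categories it flags on; both are assumptions about the toolkit, not confirmed behavior:

```
# Hypothetical PR check: fail when the audit reports drift
claude /doc-audit > audit-report.md
if grep -qE "Outdated|Missing" audit-report.md; then
  echo "Documentation drift detected; see audit-report.md"
  exit 1
fi
```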
The research paper described a destination: continuous documentation synchronized with the source of truth. The toolkit is transportation. Whether it gets there depends on what happens after the initial scaffold.
This is the fifth entry in the Agentic Workflows in Practice series. Not demos. Not theory. Real work, documented as it happens.