Building an AI-Driven E2E Testing Framework from Scratch
Testing was my bottleneck. So I built a framework where AI writes, runs, and validates the tests.
Nino Chavez
Product Architect at commerce.com
Shipping used to mean stress.
Especially when you’re the only person standing between the deploy button and a swarm of live users. But what if you could guarantee full regression coverage—with zero human oversight?
I just shipped a production-grade, autonomous end-to-end testing framework that touches every admin feature in my app, from the browser’s perspective, using nothing but AI tools and system-level architecture.
What I Built
I didn’t just write tests—I built a self-sustaining test architecture powered by Kilo AI that:
- Writes its own Playwright tests
- Heals itself when the UI or schema changes
- Executes multiple test tiers automatically (smoke → full regression)
- Validates every button, input, route, and display from a real user’s point of view
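For a sense of what the framework produces, here's roughly the shape of one generated smoke test. This is a minimal sketch: the route and test IDs are invented stand-ins, not entries from the real test-ids.ts.

```typescript
import { test, expect } from '@playwright/test';

// Illustrative only: the real suite pulls its IDs from test-ids.ts.
// '/admin/products' and the getByTestId arguments are hypothetical.
test.describe('admin products @smoke', () => {
  test('page renders its core controls', async ({ page }) => {
    await page.goto('/admin/products');
    await expect(page.getByTestId('admin-products-table')).toBeVisible();
    await expect(page.getByTestId('admin-products-create-button')).toBeEnabled();
  });
});
```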
This isn’t about code coverage percentages. It’s about real confidence in every release.
How It Works
- AI templates for test generation: Kilo analyzes components and writes E2E tests using reusable prompt blueprints.
- Self-healing maintenance: tests detect schema or UI changes and auto-regenerate with updated selectors and assertions.
- Zero-touch execution: with one command, the system spins up the environment, runs all test suites, and outputs full reports.
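In pseudocode-ish TypeScript, the heal-and-retry loop looks something like this. The `regenerate` callback stands in for the AI call that rewrites selectors and assertions from the failure report; it is not Kilo's real API.

```typescript
import { execSync } from 'node:child_process';

// Sketch of the heal-and-retry loop. `regenerate` is a stand-in for the
// AI call, not Kilo's actual interface.
async function runWithHealing(regenerate: (report: string) => Promise<void>) {
  try {
    // Fast tier first; full regression only runs if this passes.
    execSync('npx playwright test --grep @smoke', { stdio: 'pipe' });
  } catch (err) {
    // On failure, let the AI regenerate the affected tests, then retry
    // once before escalating to a human.
    await regenerate(String(err));
    execSync('npx playwright test --grep @smoke', { stdio: 'inherit' });
  }
  execSync('npx playwright test --grep @regression', { stdio: 'inherit' });
}
```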
Under the hood: modular utilities (test-ids.ts with 318 IDs for stable selectors, TestModeContext for visual test states, admin-page-validators.ts with 378 lines of reusable logic). All admin routes are modeled and abstracted for DRY test writing. Coverage spans functional, visual, accessibility, performance, cross-browser, and mobile testing.
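For anyone curious what those utilities amount to, here's the gist. The entries and the validator signature are my own invented examples, not excerpts from the real files (which hold 318 IDs and 378 lines respectively).

```typescript
import { expect, type Page } from '@playwright/test';

// test-ids.ts (sketch): one typed map of stable selectors so no test ever
// hardcodes CSS. These three entries are invented examples.
export const TEST_IDS = {
  adminNav: 'admin-nav',
  productsTable: 'admin-products-table',
  saveButton: 'admin-save-button',
} as const;

// admin-page-validators.ts (sketch): reusable assertion logic that any
// route spec can call instead of re-declaring its own checks.
export async function validateAdminPage(page: Page, route: string, ids: string[]) {
  await page.goto(route);
  for (const id of ids) {
    await expect(page.getByTestId(id)).toBeVisible();
  }
}
```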
The Shift
I used to treat QA as a cost center—something I had to do before shipping. Now I think about it differently. QA can be a strategic asset—powered by AI, maintained by code, enforced by design.
The framework is fully autonomous, AI-maintained, fast, reliable, and designed for scale. And most importantly, it’s running right now—in production—guarding every admin feature.
What I’m Still Figuring Out
The self-healing works for incremental changes, but big refactors still break things in ways the AI doesn’t anticipate. I’m experimenting with having the AI analyze git diffs before regenerating tests, so it understands the scope of what changed.
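The experiment, roughly, looks like this. `promptAI` is a placeholder for the AI call, and the paths and the HEAD~1 baseline are illustrative choices, not the framework's actual wiring.

```typescript
import { execSync } from 'node:child_process';

// Diff-aware regeneration sketch. `promptAI` is a placeholder; the paths
// and the HEAD~1 baseline are illustrative, not real configuration.
async function regenerateFromDiff(promptAI: (prompt: string) => Promise<string>) {
  // Give the AI only what changed since the last known-good commit, so it
  // can judge the blast radius before rewriting any tests.
  const diff = execSync('git diff HEAD~1 -- src/ tests/').toString();
  return promptAI(
    `Here is the diff since the last green build:\n${diff}\n` +
      `Regenerate only the E2E tests affected by these changes.`
  );
}
```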
And there’s a trust question I haven’t fully answered: how do I know the AI-generated tests are actually testing what matters? I’ve added assertions to verify the tests themselves, which feels recursive in an uncomfortable way. But it’s working for now.
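Concretely, those "tests for the tests" are cheap static checks along these lines; the tests/generated directory is a hypothetical layout, not my real one.

```typescript
import { readdirSync, readFileSync } from 'node:fs';
import { test, expect } from '@playwright/test';

// Meta-check sketch: refuse any generated spec that never asserts anything.
// The tests/generated path is hypothetical.
test('every generated spec contains at least one assertion', () => {
  for (const file of readdirSync('tests/generated')) {
    const src = readFileSync(`tests/generated/${file}`, 'utf8');
    expect(src).toContain('expect(');
  }
});
```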
Originally Published on LinkedIn
This article was first published on my LinkedIn profile, where the original post and discussion live.