Vizipy Launch: Our 14-Day MVP Journey

Atanas Ivanov · May 27, 2025 · 10 min read

Day 0: The Problem Statement

We'd been on both sides of the visual bug equation: shipping broken UIs to clients and catching them too late. After yet another "the button disappeared" incident, we decided to build what we wished existed — visual regression testing that lives natively in GitHub PRs.

The constraint: 14 days from first commit to public MVP.

The Tech Stack Decision

We needed speed without sacrificing reliability. Here's what we chose:

  • TypeScript — end to end, no context switching
  • Playwright — for deterministic browser screenshots (not Puppeteer — Playwright's auto-wait and multi-browser support won us over)
  • Pixelmatch — pixel-level image comparison at sub-millisecond speed
  • OpenAI API — for generating human-readable summaries of visual changes ("The button shifted 12px left, which may affect the click area on mobile")
  • GitHub Actions — the runtime for everything

Days 1-3: The Screenshot Engine

The first challenge was deterministic screenshots. Browsers render slightly differently depending on fonts, animations, anti-aliasing, and timing. A naive screenshot would produce false positives on every single run.

Our approach:

  • Disable animations — inject CSS that sets * { animation: none !important; transition: none !important; }
  • Wait for network idle — no pending requests means no loading spinners
  • Font loading — explicitly wait for document.fonts.ready
  • Consistent viewport — lock width and pixel ratio across all runs
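
The four steps above can be sketched as a small Playwright-style helper. This is illustrative TypeScript, not Vizipy's actual code: the `PageLike` interface mirrors just the slice of Playwright's `Page` API the helper needs, and the viewport values are examples.

```typescript
// CSS injected into every page to freeze animations and transitions.
const FREEZE_CSS = "* { animation: none !important; transition: none !important; }";

// The slice of Playwright's Page API this helper needs (illustrative).
interface PageLike {
  setViewportSize(size: { width: number; height: number }): Promise<void>;
  addStyleTag(opts: { content: string }): Promise<void>;
  waitForLoadState(state: "networkidle"): Promise<void>;
  evaluate<T>(fn: () => T): Promise<T>;
}

// Apply the four stabilization steps before screenshotting.
// (Device pixel ratio is fixed earlier, at browser-context creation.)
async function stabilize(page: PageLike): Promise<void> {
  await page.setViewportSize({ width: 1280, height: 800 }); // consistent viewport
  await page.addStyleTag({ content: FREEZE_CSS });          // disable animations
  await page.waitForLoadState("networkidle");               // no pending requests
  await page.evaluate(() => (globalThis as any).document.fonts.ready); // fonts loaded
}
```

In real Playwright, `addStyleTag` and `waitForLoadState("networkidle")` exist with these shapes, which is why the stabilization can stay this compact.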

By day 3, we had a screenshot engine that produced byte-identical images across consecutive runs of the same page. Zero false positives.

Days 4-6: The Baseline Strategy

The key insight: your baseline is the main branch. When a PR opens, we compare screenshots of the PR branch against the same pages on main. This means:

  • No manual baseline management
  • Baselines automatically update when PRs merge
  • Branch-specific changes are isolated correctly
  • No "approve all" fatigue from cascading baseline drift

We store baseline images as artifacts attached to the workflow run on the main branch's HEAD commit.
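
A minimal sketch of what the main-branch half of that looks like in a workflow file. The `vizipy capture` command and the artifact name are hypothetical placeholders, not Vizipy's published interface; `actions/upload-artifact` is the standard GitHub-provided action:

```yaml
# Illustrative sketch — command and artifact name are our placeholders.
on:
  push:
    branches: [main]
jobs:
  baseline:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx vizipy capture --out screenshots/   # hypothetical CLI invocation
      - uses: actions/upload-artifact@v4
        with:
          name: vizipy-baseline
          path: screenshots/
```

PR runs then locate the artifact from the workflow run on main's HEAD and diff against it.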

Days 7-9: The Diff Engine

Pixelmatch gives us a pixel-level diff, but raw pixel counts aren't useful for developers. "1,247 pixels changed" doesn't tell you anything actionable.

So we built a layer on top:

  • Cluster changed pixels into bounding boxes (connected components analysis)
  • Classify each region by size, position, and type of change
  • Generate a severity score — a 2px anti-aliasing difference scores low; a missing button scores critical
  • Pipe the diff regions into GPT-4 with the page context for natural language summaries
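
The clustering and scoring steps can be illustrated with a toy TypeScript version: flood-fill connected changed pixels into bounding boxes, then score each region by area. The pixel thresholds and severity labels here are made up for illustration, not Vizipy's tuned values:

```typescript
// A toy version of the diff post-processing layer: cluster changed pixels
// (a 0/1 mask from the pixel diff) into bounding boxes via 4-connected
// flood fill, then score each region by area.
interface Region { x0: number; y0: number; x1: number; y1: number; pixels: number }

function clusterDiff(mask: Uint8Array, width: number, height: number): Region[] {
  const seen = new Uint8Array(mask.length);
  const regions: Region[] = [];
  for (let i = 0; i < mask.length; i++) {
    if (!mask[i] || seen[i]) continue;
    const region: Region = { x0: width, y0: height, x1: 0, y1: 0, pixels: 0 };
    const stack = [i];
    seen[i] = 1;
    while (stack.length) {
      const p = stack.pop()!;
      const x = p % width, y = (p / width) | 0;
      region.x0 = Math.min(region.x0, x); region.x1 = Math.max(region.x1, x);
      region.y0 = Math.min(region.y0, y); region.y1 = Math.max(region.y1, y);
      region.pixels++;
      for (const n of [p - 1, p + 1, p - width, p + width]) {
        if (n < 0 || n >= mask.length || seen[n] || !mask[n]) continue;
        // Horizontal neighbors must stay on the same row (no edge wrap-around).
        if ((n === p - 1 || n === p + 1) && ((n / width) | 0) !== y) continue;
        seen[n] = 1;
        stack.push(n);
      }
    }
    regions.push(region);
  }
  return regions;
}

// Naive severity: tiny regions read as anti-aliasing noise, big ones as real
// changes. (Illustrative thresholds only.)
function severity(r: Region): "low" | "medium" | "critical" {
  if (r.pixels < 16) return "low";
  if (r.pixels < 500) return "medium";
  return "critical";
}
```

The real classifier also weighs position and change type, but the shape is the same: mask → regions → scores, and only then does anything get handed to the language model.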

Days 10-12: The GitHub Integration

This was the "it has to feel native" phase. We wanted the experience to be: open a PR, see visual changes inline, never leave GitHub.

The bot posts a PR comment with:

  • A summary of all visual changes across all tested pages
  • Before/after screenshot pairs for each changed page
  • AI-generated explanation of what changed and why it matters
  • A status check that blocks merge if regressions exceed the threshold

We chose to ship as a GitHub Action first (not a GitHub App) because:

  • Zero authentication complexity for users
  • Runs in the user's own CI environment
  • No server infrastructure to manage on our end
  • Users can see exactly what runs in their workflow file

Days 13-14: Polish and Launch

The last two days were all about developer experience:

  • Clear error messages when screenshots fail
  • Helpful PR comments even when there are zero visual changes ("All 5 pages match baseline")
  • A vizipy.config.ts file for customizing routes, viewports, and thresholds
  • Documentation and a 5-minute quick start guide
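
For a sense of the shape, here's a hypothetical `vizipy.config.ts` — the field names are our guess at a routes/viewports/thresholds layout, not the documented schema:

```typescript
// vizipy.config.ts — illustrative shape only; field names are guesses,
// not the documented config schema.
const config = {
  routes: ["/", "/pricing", "/docs"],   // pages to screenshot on each run
  viewports: [
    { width: 1280, height: 800 },       // desktop
    { width: 375, height: 667 },        // mobile
  ],
  threshold: 0.01,                      // fail the check above 1% changed pixels
};

export default config;
```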

What We Shipped

On day 14, we had:

  • A working GitHub Action that screenshots pages and compares against main
  • AI-powered visual diff summaries posted as PR comments
  • Support for multiple viewports (desktop + mobile)
  • Baseline management that "just works" with the baseline=main strategy
  • Sub-2-minute run times for typical 5-page setups

What's Next

The MVP validates the core loop: screenshot → compare → comment → block. Now we're building toward:

  • Ignore regions — mask headers, timestamps, and dynamic content
  • Component-level snapshots — test individual components, not just full pages
  • Smarter baselines — per-branch baseline management for long-lived feature branches
  • A dashboard — historical trends, flakiness tracking, and team-wide visibility

The Takeaway

14 days is tight, but constraint breeds focus. We didn't build a platform — we built a sharp tool that does one thing well: catch visual bugs before they ship.

If you're building something similar, our advice: start with the narrowest possible use case and make it feel magical before expanding scope.


*Want to try what we built? Get started with Vizipy for free — it takes less than 5 minutes to set up.*
