Vizipy Launch: Our 14-Day MVP Journey
Day 0: The Problem Statement
We'd been on both sides of the visual bug equation: shipping broken UIs to clients and catching them too late. After yet another "the button disappeared" incident, we decided to build what we wished existed — visual regression testing that lives natively in GitHub PRs.
The constraint: 14 days from first commit to public MVP.
The Tech Stack Decision
We needed speed without sacrificing reliability. Here's what we chose:
- TypeScript — end to end, no context switching
- Playwright — for deterministic browser screenshots (not Puppeteer — Playwright's auto-wait and multi-browser support won us over)
- Pixelmatch — pixel-level image comparison at sub-millisecond speed
- OpenAI API — for generating human-readable summaries of visual changes ("The button shifted 12px left, which may affect the click area on mobile")
- GitHub Actions — the runtime for everything
Days 1-3: The Screenshot Engine
The first challenge was deterministic screenshots. Browsers render slightly differently depending on fonts, animations, anti-aliasing, and timing. A naive screenshot would produce false positives on every single run.
Our approach:
- Disable animations — inject CSS that sets `* { animation: none !important; transition: none !important; }`
- Wait for network idle — no pending requests means no loading spinners
- Font loading — explicitly wait for `document.fonts.ready`
- Consistent viewport — lock width and pixel ratio across all runs
By day 3, we had a screenshot engine that produced byte-identical images across consecutive runs of the same page. Zero false positives.
Days 4-6: The Baseline Strategy
The key insight: your baseline is the main branch. When a PR opens, we compare screenshots of the PR branch against the same pages on main. This means:
- No manual baseline management
- Baselines automatically update when PRs merge
- Branch-specific changes are isolated correctly
- No "approve all" fatigue from cascading baseline drift
We store baseline images as artifacts attached to the workflow run on the main branch's HEAD commit.
Days 7-9: The Diff Engine
Pixelmatch gives us a pixel-level diff, but raw pixel counts aren't useful for developers. "1,247 pixels changed" doesn't tell you anything actionable.
So we built a layer on top:
- Cluster changed pixels into bounding boxes (connected components analysis)
- Classify each region by size, position, and type of change
- Generate a severity score — a 2px anti-aliasing difference scores low; a missing button scores critical
- Pipe the diff regions into GPT-4 with the page context for natural language summaries
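The clustering and scoring steps can be sketched as below; the flood fill and the severity thresholds are illustrative, not Vizipy's actual tuning:

```typescript
type Box = { x0: number; y0: number; x1: number; y1: number; pixels: number };

// Cluster changed pixels (a flat boolean mask derived from pixelmatch's
// diff output) into bounding boxes via 4-connected flood fill.
function clusterDiff(mask: boolean[], width: number, height: number): Box[] {
  const seen = new Array(mask.length).fill(false);
  const boxes: Box[] = [];

  for (let i = 0; i < mask.length; i++) {
    if (!mask[i] || seen[i]) continue;
    const x0 = i % width, y0 = Math.floor(i / width);
    const box: Box = { x0, y0, x1: x0, y1: y0, pixels: 0 };
    const stack = [i];
    seen[i] = true;
    while (stack.length) {
      const p = stack.pop()!;
      const x = p % width, y = Math.floor(p / width);
      box.x0 = Math.min(box.x0, x); box.x1 = Math.max(box.x1, x);
      box.y0 = Math.min(box.y0, y); box.y1 = Math.max(box.y1, y);
      box.pixels++;
      // Visit the four orthogonal neighbors still inside the image.
      for (const [dx, dy] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
        const nx = x + dx, ny = y + dy;
        const n = ny * width + nx;
        if (nx >= 0 && nx < width && ny >= 0 && ny < height && mask[n] && !seen[n]) {
          seen[n] = true;
          stack.push(n);
        }
      }
    }
    boxes.push(box);
  }
  return boxes;
}

// Tiny regions (likely anti-aliasing) score low; large regions score high.
function severity(box: Box): "low" | "medium" | "critical" {
  if (box.pixels < 16) return "low";
  if (box.pixels < 2000) return "medium";
  return "critical";
}
```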
Days 10-12: The GitHub Integration
This was the "it has to feel native" phase. We wanted the experience to be: open a PR, see visual changes inline, never leave GitHub.
The bot posts a PR comment with:
- A summary of all visual changes across all tested pages
- Before/after screenshot pairs for each changed page
- AI-generated explanation of what changed and why it matters
- A status check that blocks merge if regressions exceed the threshold
We chose to ship as a GitHub Action first (not a GitHub App) because:
- Zero authentication complexity for users
- Runs in the user's own CI environment
- No server infrastructure to manage on our end
- Users can see exactly what runs in their workflow file
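A user-side workflow for this model might look roughly like the following; the action reference and its inputs are hypothetical, shown only for shape:

```yaml
name: Visual regression
on: pull_request

jobs:
  vizipy:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write  # post the diff comment
      statuses: write       # set the blocking status check
    steps:
      - uses: actions/checkout@v4
      # Hypothetical action reference, for illustration only.
      - uses: vizipy/vizipy-action@v1
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
```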
Days 13-14: Polish and Launch
The last two days were all about developer experience:
- Clear error messages when screenshots fail
- Helpful PR comments even when there are zero visual changes ("All 5 pages match baseline")
- A `vizipy.config.ts` file for customizing routes, viewports, and thresholds
- Documentation and a 5-minute quick start guide
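A config in that spirit might look like the following; the field names are illustrative, not the documented schema:

```typescript
// vizipy.config.ts — illustrative shape, not the documented schema.
const config = {
  // Routes to screenshot on both the PR branch and main.
  routes: ["/", "/pricing", "/dashboard"],
  // Each route is captured at every viewport.
  viewports: [
    { width: 1280, height: 720 }, // desktop
    { width: 390, height: 844 },  // mobile
  ],
  // Fail the status check if more than 1% of pixels change on any page.
  threshold: 0.01,
};

export default config;
```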
What We Shipped
On day 14, we had:
- A working GitHub Action that screenshots pages and compares against main
- AI-powered visual diff summaries posted as PR comments
- Support for multiple viewports (desktop + mobile)
- Baseline management that "just works" with the baseline=main strategy
- Sub-2-minute run times for typical 5-page setups
What's Next
The MVP validates the core loop: screenshot → compare → comment → block. Now we're building toward:
- Ignore regions — mask headers, timestamps, and dynamic content
- Component-level snapshots — test individual components, not just full pages
- Smarter baselines — per-branch baseline management for long-lived feature branches
- A dashboard — historical trends, flakiness tracking, and team-wide visibility
The Takeaway
14 days is tight, but constraint breeds focus. We didn't build a platform — we built a sharp tool that does one thing well: catch visual bugs before they ship.
If you're building something similar, our advice: start with the narrowest possible use case and make it feel magical before expanding scope.
*Want to try what we built? Get started with Vizipy for free — it takes less than 5 minutes to set up.*