Open-source MCP server · MIT License · by Mathew Graham

AI wrote the code.
Does it actually match the design?

AI coding agents ship UI fast — but fast isn't correct. Wrong tokens, missing states, off-by-4px padding. DesignDiff verifies that AI-generated interfaces match your design system before they reach production.

Token drift detected — when agents hardcode values instead of referencing design tokens
Missing states flagged — hover, focus, disabled, error variants compared against Figma
Patch-ready diffs returned — for your agent to apply and re-check in the same session
Figma — design spec
Button / Primary · Default
Get started
16px padding ✓
token reference ✓
States defined: hover, focus, disabled
Property | Figma value | Status
padding | 16px 24px | correct
background | token(brand-blue) | token ✓
border-radius | 4px | correct
hover:bg | token(brand-blue-700) | defined
focus:ring | 2px offset-2 | defined
Code output · Parity 52/100
src/components/Button.tsx · rendered
Get started
12px padding ✗
hardcoded #2563EB ✗
States (all missing): no hover, no focus, no disabled
Property | Code (computed) | Issue
padding | 12px 24px | off by 4px
background | #2563EB | hardcoded
border-radius | 4px |
hover:bg | missing | no feedback
focus:ring | missing | WCAG 2.4.7
52 / 100 — below threshold
Patch-ready fix generated — agent applies and re-checks
💡 Pattern: AI-generated code signature — correct structure, hardcoded colors, missing states. Agent generated from Figma without access to the token file.
Button.tsx · Patch-ready
- padding: 12px 24px;
+ padding: 16px 24px;
- background: #2563EB;
+ background: var(--brand-blue-600);
✓ Agent re-checks: 94/100 · parity restored
Why it exists

AI ships UI fast.
Fast isn't correct.

What goes wrong when AI builds UI

When you ask Claude Code or Cursor to build a component from Figma, the agent reads the spec and writes code that looks correct. But it typically hardcodes a color value (#2563EB) instead of referencing the token (var(--brand-blue)), gets padding off by a few pixels, and skips interactive states defined in Figma variants it didn't fully parse.

The component looks fine visually. That's what makes this dangerous. Errors only surface when a brand refresh doesn't propagate, a keyboard user can't navigate, or an accessibility audit fails.

The same drift happens when developers implement manually, when designs change after code ships, and when tokens get renamed. It compounds silently across every PR.
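
The hardcoded-color case described above is easy to detect mechanically. The following is a minimal sketch, assuming the component's CSS declarations are available as plain text. `findHardcodedColors` is a hypothetical helper written for illustration, not DesignDiff's actual implementation.

```typescript
// Hypothetical helper for illustration; not part of DesignDiff's API.
// Flags declarations whose value contains a raw hex color instead of a var(--token) reference.
interface HardcodedColor {
  property: string;
  value: string;
}

// Matches #rgb, #rgba, #rrggbb and #rrggbbaa literals.
const HEX_COLOR = /#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b/;

function findHardcodedColors(css: string): HardcodedColor[] {
  const findings: HardcodedColor[] = [];
  for (const rawDecl of css.split(";")) {
    const colon = rawDecl.indexOf(":");
    if (colon === -1) continue; // skip empty fragments
    const property = rawDecl.slice(0, colon).trim();
    const value = rawDecl.slice(colon + 1).trim();
    // A value that starts with var( is treated as a token reference, even with a fallback.
    const usesToken = value.indexOf("var(") === 0;
    if (!usesToken && HEX_COLOR.test(value)) {
      findings.push({ property, value });
    }
  }
  return findings;
}

const drift = findHardcodedColors(
  "padding: 12px 24px; background: #2563EB; color: var(--brand-blue-600);"
);
console.log(drift);
```

Only the `background` declaration is reported here, because it is the one value that bypasses the token system.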

What DesignDiff does about it

DesignDiff is an MCP server — a verification tool that runs inside your AI agent. After the agent generates a component, it calls DesignDiff to check the output before the code lands in a PR.

1. Reads Figma precisely: extracts exact pixel values, token references, and every variant state. Not a screenshot, but raw design data including every Figma variable.
2. Renders in a real browser: Playwright captures computed CSS after the full cascade. Source code is intent. Computed styles are what users actually see.
3. Scores, explains, and suggests fixes: every mismatch gets a consequence, a severity, and a patch-ready diff the agent can apply in the same response.
When the gap happens
🤖 AI builds a component from Figma
The most common case. The agent generates code that looks right but has token, spacing, or state errors. DesignDiff catches them before the PR merges.
✏️ Developer implements manually
The developer eyeballs spacing, guesses token names, or skips a Figma variant. Subtle drift accumulates across every component they touch.
🎨 Design changes after code ships
The designer updates spacing in Figma. Nobody tells the developer. The component in production quietly diverges, and nobody notices for months.
🔄 Tokens get renamed
The system renames brand-blue to color-primary. Components using the old name fall back to hardcoded values, until the next brand refresh.
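
The renamed-token case can be illustrated with a small check. This is a sketch under assumptions: the current token names are available as a list, and `findStaleTokenRefs` is a hypothetical helper, not a DesignDiff tool.

```typescript
// Hypothetical helper for illustration; not part of DesignDiff's API.
// Lists var(--token) references whose token name is not in the current token set.
function findStaleTokenRefs(css: string, knownTokens: string[]): string[] {
  const varRef = /var\(\s*--([A-Za-z0-9_-]+)/g;
  const stale: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = varRef.exec(css)) !== null) {
    const name = match[1];
    // Report each unknown token once.
    if (knownTokens.indexOf(name) === -1 && stale.indexOf(name) === -1) {
      stale.push(name);
    }
  }
  return stale;
}

// After a rename from brand-blue to color-primary, the old reference only
// keeps working through its hardcoded fallback value.
const staleRefs = findStaleTokenRefs(
  "background: var(--brand-blue, #2563EB); color: var(--color-primary);",
  ["color-primary"]
);
console.log(staleRefs);
```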
Quality Score — where this is heading

One score.
Safe to ship.

Today DesignDiff scores rendered parity. The goal is a single score across five verification dimensions — one number CI gates, managers, and developers all understand without translation.

Button/Primary · rendered parity check · v0.1
94/100
Rendered Parity Score
Design Fidelity · ✓ PASS
Token Compliance · ✓ PASS
Spacing · ✓ PASS
Typography · ✓ PASS
Borders · ✓ PASS
State Coverage · v0.2
Accessibility · v0.2
Responsive Layout · v0.2
hover:bg state missing — no interaction feedback on desktop pointer. Fix: &:hover { background: var(--brand-blue-700) }
What v0.1 scores today
Design Fidelity — spacing, color, and typography match the Figma spec within configurable tolerances
Token Compliance — no hardcoded values bypassing your design system
Spacing — padding, margin, and gap values match the spec
Typography — font size, weight, line-height
Borders — radius, width, color
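
A tolerance check of the kind mentioned under Design Fidelity might look like the sketch below. The pixel-based tolerance and the `withinTolerance` helper are assumptions for illustration; the page does not document how DesignDiff configures or applies tolerances.

```typescript
// Hypothetical helpers for illustration; the real tolerance format is not documented here.
// Parses a CSS shorthand such as "16px 24px" into its numeric parts.
function parsePx(value: string): number[] {
  return value.trim().split(/\s+/).map((part) => parseFloat(part));
}

// True when every part of the computed value is within tolerancePx of the spec.
function withinTolerance(spec: string, computed: string, tolerancePx: number): boolean {
  const expected = parsePx(spec);
  const actual = parsePx(computed);
  if (expected.length !== actual.length) return false;
  return expected.every((v, i) => Math.abs(v - actual[i]) <= tolerancePx);
}

console.log(withinTolerance("16px 24px", "12px 24px", 1)); // 4px off, outside a 1px tolerance
console.log(withinTolerance("16px 24px", "15px 24px", 1)); // 1px off, inside a 1px tolerance
```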
CI gate — where it's going
✓ Button/Primary: 94/100
✓ Input/Text: 98/100
✓ Modal/Confirm: 91/100
✕ Card/Product: 61/100
PR blocked when any component drops below threshold
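
The gate logic above reduces to a threshold comparison. A minimal sketch, using the scores from the panel and an assumed threshold of 80 (the page does not state the actual threshold); `evaluateGate` is a hypothetical name:

```typescript
// Hypothetical gate for illustration; the threshold value is an assumption.
interface ComponentScore {
  component: string;
  score: number;
}

interface GateResult {
  blocked: boolean;
  failing: string[];
}

// Blocks when any component scores below the threshold.
function evaluateGate(scores: ComponentScore[], threshold: number): GateResult {
  const failing = scores
    .filter((s) => s.score < threshold)
    .map((s) => s.component);
  return { blocked: failing.length > 0, failing };
}

const gateResult = evaluateGate(
  [
    { component: "Button/Primary", score: 94 },
    { component: "Input/Text", score: 98 },
    { component: "Modal/Confirm", score: 91 },
    { component: "Card/Product", score: 61 },
  ],
  80
);
console.log(gateResult);
```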
How it works

Verify. Score.
Patch. Ship.

Runs inside your existing AI agent. No new dashboard, no browser extension. Human review focuses on product decisions, not token drift.

01 — Fetch
Pull design intent from Figma
Calls the Figma REST API with your node_id. Extracts exact pixel values, token references, and every interactive state variant. URL-encodes node IDs correctly (123:456 becomes 123%3A456); agents often skip this step, which causes silent 404s.
Figma REST API
GET /v1/files/{id}/nodes?ids=123%3A456
component "Button / Primary"
padding 16px 24px
background token(brand-blue-600)
states hover · focus · disabled
tokens 3 variables resolved
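
The encoding step is small but easy to miss. A sketch of building the request URL, assuming the public Figma REST endpoint for file nodes; `buildFigmaNodesUrl` is a hypothetical helper name:

```typescript
// Hypothetical helper for illustration; not DesignDiff's actual code.
// Builds the Figma "file nodes" URL, percent-encoding the node ID so that
// the colon in "123:456" becomes "%3A".
function buildFigmaNodesUrl(fileKey: string, nodeId: string): string {
  const key = encodeURIComponent(fileKey);
  const ids = encodeURIComponent(nodeId);
  return `https://api.figma.com/v1/files/${key}/nodes?ids=${ids}`;
}

console.log(buildFigmaNodesUrl("abc123", "123:456"));
```

The request itself would also need an access token header, which is omitted here.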
02 — Render
Capture what the browser actually shows
Launches a headless Playwright browser and renders the component at its real URL — Storybook, localhost, or staging. Captures computed CSS after the full cascade. Source code is intent. Computed styles are truth.
Property | Figma says | Browser shows
padding | 16px 24px | 12px 24px
background | token(brand-blue) | #2563EB
border-radius | 4px | 4px
hover state | defined | missing
focus ring | 2px offset-2 | missing
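
The comparison in the table above can be sketched as a property-by-property diff. This assumes both sides have already been collected into plain key/value maps (the Figma side from the API, the browser side from computed styles). `diffStyles` and its types are hypothetical names for illustration.

```typescript
// Hypothetical types and helper for illustration; not DesignDiff's actual code.
type StyleMap = Record<string, string>;

interface Mismatch {
  property: string;
  expected: string;
  actual: string;
}

// Reports every spec property whose computed value differs or is absent.
function diffStyles(spec: StyleMap, computed: StyleMap): Mismatch[] {
  const mismatches: Mismatch[] = [];
  for (const property of Object.keys(spec)) {
    const expected = spec[property];
    const actual = computed[property] !== undefined ? computed[property] : "missing";
    if (expected !== actual) {
      mismatches.push({ property, expected, actual });
    }
  }
  return mismatches;
}

const mismatches = diffStyles(
  { padding: "16px 24px", "border-radius": "4px", "focus ring": "2px offset-2" },
  { padding: "12px 24px", "border-radius": "4px" }
);
console.log(mismatches);
```

Token checks are left out of this sketch, since comparing a token reference against a resolved hex value needs the token table as well.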
03 — Score
Consequence-first parity scoring
Token violations score highest — they bypass your design system and break on every brand refresh. Missing focus rings are a WCAG 2.4.7 risk. Every mismatch includes the real consequence of not fixing it.
52 / 100
Below threshold — patch-ready fix generated
Color tokens · FAIL · hardcoded
States · FAIL · 3 missing
Spacing · WARN
Typography · PASS
Borders · PASS
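
One way to weight mismatches so that token violations cost the most is a simple penalty table. The weights below are invented for illustration and are not DesignDiff's scoring formula, so the result is not meant to reproduce the scores shown on this page.

```typescript
// Hypothetical scoring sketch; the categories and weights are assumptions.
type Category = "token" | "state" | "spacing" | "typography" | "border";

// Token violations carry the largest penalty, as the text above describes.
const PENALTY: Record<Category, number> = {
  token: 25,
  state: 10,
  spacing: 5,
  typography: 5,
  border: 5,
};

// Starts at 100 and subtracts one penalty per mismatch, never going below 0.
function parityScore(mismatchCategories: Category[]): number {
  const total = mismatchCategories.reduce((sum, c) => sum + PENALTY[c], 0);
  return Math.max(0, 100 - total);
}

// One hardcoded color, three missing states, one spacing error.
console.log(parityScore(["token", "state", "state", "state", "spacing"]));
```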
04 — Fix
Patch-ready fix in the same response
When score is below threshold, DesignDiff generates a patch-ready diff and returns it immediately — no second tool call. Your agent applies it, then re-verifies to confirm the score improved.
generate_sync_patch → agent applies → re-check
- padding: 12px 24px;
+ padding: 16px 24px;
- background: #2563EB;
+ background: var(--brand-blue-600);
✓ Re-check: 94/100 · parity restored
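
The patch lines shown above can be derived directly from a list of mismatches. A minimal sketch, assuming each fix carries the property name with its current and expected values; `toPatchLines` is a hypothetical helper, not the implementation of generate_sync_patch.

```typescript
// Hypothetical helper for illustration; not the actual generate_sync_patch output format.
interface Fix {
  property: string;
  actual: string;
  expected: string;
}

// Emits one removed line and one added line per fix, in diff style.
function toPatchLines(fixes: Fix[]): string[] {
  const lines: string[] = [];
  for (const fix of fixes) {
    lines.push(`- ${fix.property}: ${fix.actual};`);
    lines.push(`+ ${fix.property}: ${fix.expected};`);
  }
  return lines;
}

const patch = toPatchLines([
  { property: "padding", actual: "12px 24px", expected: "16px 24px" },
  { property: "background", actual: "#2563EB", expected: "var(--brand-blue-600)" },
]);
console.log(patch.join("\n"));
```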
Installation

Running in five minutes.

One config entry. Your agent verifies every component it generates or modifies. Works best with Storybook, local preview routes, or any stable component URL your agent can reach.

~/.config/claude/mcp.json — Claude Code
{
  "mcpServers": {
    "designdiff": {
      "command": "npx",
      "args": ["-y", "designdiff-mcp"],
      "env": { "FIGMA_API_KEY": "your-token" }
    }
  }
}
Cursor · Windsurf · VS Code Copilot
Same config format. Works with any MCP-compatible agent.
Say to your agent:
// Agent calls DesignDiff automatically
"Build the Button from Figma file abc123, node 123:456, at localhost:6006 — and verify it matches the spec"
→ Builds · Verifies · Suggests fix · Agent re-checks: 94/100
Claude Code · Cursor · Windsurf · VS Code Copilot · Zed · Any MCP client
MCP Tools

Parity today.
Full verification tomorrow.

The core engine ships now. Each new tool adds another dimension to the quality score — another check AI-generated code passes before it ships.

Core · ships now
check_component_parity
(node_id, component_url, code_path)
Diffs Figma spec against computed CSS. Scores spacing, tokens, typography, borders, states. Returns a patch-ready fix when score drops below threshold.
Core · ships now
flag_stale_mappings
(file_id, repo_root)
Cross-references Code Connect mappings against git history. Surfaces components that have silently drifted, ranked by blast radius and days since divergence.
Core · ships now
generate_sync_patch
(node_id, code_path, format?)
Generates a surgical patch — not a rewrite. Token substitutions, prop corrections, missing state stubs. Git diff, JSX, or CSS delta output.
v0.2 · coming Q3 2026
audit_state_coverage
(node_id, component_url)
Checks whether every Figma-defined state exists in code. Flags WCAG 2.4.7 (Focus Visible) risk for missing focus rings, and reports missing disabled states.
v0.2 · coming Q3 2026
check_responsive_parity
(node_id, component_url, breakpoints?)
Renders at 375/768/1280/1920px. Catches overflow, collapsed containers, wrong flex direction. The most common mobile implementation gap.
v0.2 · coming Q3 2026
check_theme_parity
(node_id, component_url, themes?)
Parity across Figma variable modes — light/dark, brand variants. Catches hardcoded colors that are invisible in light but break in dark mode.
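
For orientation, this is roughly what an agent sends when it invokes one of these tools. MCP tool calls are JSON-RPC requests with the method `tools/call`. The argument names follow the check_component_parity signature listed above, while the argument values and the `buildToolCall` helper are assumptions for illustration.

```typescript
// Sketch of an MCP tools/call request for check_component_parity.
// Argument names come from the signature above; the values are example data.
interface CheckComponentParityArgs {
  node_id: string;
  component_url: string;
  code_path: string;
}

// Hypothetical helper that wraps the arguments in a JSON-RPC 2.0 request.
function buildToolCall(id: number, args: CheckComponentParityArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: {
      name: "check_component_parity",
      arguments: args,
    },
  };
}

const request = buildToolCall(1, {
  node_id: "123:456",
  component_url: "http://localhost:6006",
  code_path: "src/components/Button.tsx",
});
console.log(JSON.stringify(request, null, 2));
```

In practice the MCP client inside the agent builds this request, so it rarely needs to be written by hand.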
Roadmap

What works today.
Where it's going.

Now — v0.1
Rendered parity check
Point DesignDiff at a component URL and a Figma node. Get a scored diff of spacing, color tokens, typography, and borders — plus a patch-ready fix for everything that's off.

check_component_parity · flag_stale_mappings · generate_sync_patch · React/CSS · MIT License
Next — v0.2
State, theme, and responsive coverage
Verify that every Figma-defined state exists in code. Check rendering across breakpoints and light/dark themes.

audit_state_coverage · check_responsive_parity · check_theme_parity · Storybook adapter
Later — v1.0
Team quality gates
A unified Implementation Quality Score across all dimensions. Block PRs below threshold in CI. Track score history over time. The verification layer between AI-generated UI and production.

DesignDiff verifies that AI-generated UI matches your design system before it ships.

It reads Figma, renders the live component, compares computed browser output, and returns patch-ready fixes inside your coding agent.
