
Build Pipeline v2 — Design Document

Date: 2026-04-07
Status: Proposed
Context: Lessons learned from the Waterford County Painters build. The v1 approach of independent per-page generation produces inconsistent nav, footer, colours, and image placement. Patching after the fact is fragile and expensive.


Architecture Overview

┌─────────────────────────────────────────────────────────┐
│  PHASE 1: SCRAPE (existing, working)                    │
│  content-map.json + screenshots + assets + design-tokens│
└──────────────────────┬──────────────────────────────────┘
┌──────────────────────▼──────────────────────────────────┐
│  PHASE 2: SHARED FOUNDATION (2-3 Claude calls)          │
│  Nav.astro + Footer.astro + global.css + BaseLayout     │
│  ► Human gate: review before proceeding                 │
└──────────────────────┬──────────────────────────────────┘
┌──────────────────────▼──────────────────────────────────┐
│  PHASE 3: PER-PAGE BODY BUILD (1 call per page)         │
│  Each page = body content only, imports shared components│
│  Constrained: correct images from content map only       │
└──────────────────────┬──────────────────────────────────┘
┌──────────────────────▼──────────────────────────────────┐
│  PHASE 4: VERIFY (automated, 0 Claude calls)            │
│  Astro build → screenshot each page → compare to orig   │
│  Output: QA report with pass/fail per page               │
└──────────────────────┬──────────────────────────────────┘
┌──────────────────────▼──────────────────────────────────┐
│  PHASE 5: FIX (targeted, only failing pages)            │
│  Input: original + built screenshots + failure reason    │
│  Loop Phase 4→5 until all pages pass (max 3 iterations)  │
└─────────────────────────────────────────────────────────┘

Phase 1: Scrape

Status: Complete, working well after scraper rebuild.

Input: URL

Output:

  • content-map.json: hierarchical content tree per page (headings, text, images, cards, icons, social links, nav tree, footer, beforeAfter, carousels, reviews)
  • screenshots/desktop/{slug}.png: full-page screenshots
  • screenshots/mobile/{slug}.png: mobile screenshots
  • assets/: downloaded images, logos, icons
  • asset-manifest.json: maps original CDN URLs → local paths
  • design-tokens.json: colours, fonts, font sizes extracted from stylesheets

No changes needed. This phase is solid.


Phase 2: Shared Foundation

Goal: Generate the consistent pieces that every page shares. Get them right ONCE.

Step 2.1: Extract Colour Palette → global.css

Input: design-tokens.json
Method: Deterministic (no Claude call needed)
Output: src/styles/global.css with CSS custom properties

:root {
  --color-primary: #2b5672;
  --color-accent: #1e73be;
  --color-text: #333333;
  --color-text-light: #666666;
  --color-bg: #ffffff;
  --color-bg-alt: #f9f9f9;
  --color-border: #e5e5e5;
  --font-heading: "Montserrat", sans-serif;
  --font-body: "Open Sans", sans-serif;
}

How: Take the top 2 non-white/non-transparent colours as primary/accent. Top fonts as heading/body. No AI needed — just math.
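The selection rule above could be sketched roughly as follows. This is a minimal illustration, not the actual extractPalette() implementation: the token shape ({ colours: [{ value, count }], fonts: [{ family, count }] }) is an assumption about design-tokens.json, and "top" is interpreted here as "most frequently used".

```javascript
// ASSUMPTION: design-tokens.json looks like
//   { colours: [{ value, count }], fonts: [{ family, count }] }
// The real file may be shaped differently.
const IGNORED_COLOURS = new Set(["#fff", "#ffffff", "transparent", "rgba(0,0,0,0)"]);

// Lower-case and strip whitespace so "#FFFFFF" and "rgba(0, 0, 0, 0)" are recognised.
function normaliseColour(colour) {
  return colour.toLowerCase().replace(/\s+/g, "");
}

function extractPalette(tokens) {
  // Drop white/transparent, then rank the rest by usage count (highest first).
  const colours = tokens.colours
    .filter((c) => !IGNORED_COLOURS.has(normaliseColour(c.value)))
    .sort((a, b) => b.count - a.count)
    .map((c) => c.value);
  const fonts = [...tokens.fonts]
    .sort((a, b) => b.count - a.count)
    .map((f) => f.family);

  return {
    "--color-primary": colours[0],
    "--color-accent": colours[1] ?? colours[0],
    "--font-heading": `"${fonts[0]}", sans-serif`,
    "--font-body": `"${fonts[1] ?? fonts[0]}", sans-serif`,
  };
}

// Serialise the palette as a :root block for global.css.
function toCss(palette) {
  const lines = Object.entries(palette).map(([name, value]) => `  ${name}: ${value};`);
  return `:root {\n${lines.join("\n")}\n}\n`;
}
```

The remaining variables in the example (text, background, border) are not covered by the rule as stated, so the sketch leaves them out.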

Step 2.2: Generate Nav Component

Input: Homepage screenshot + content-map.homepage.nav + content-map.homepage.socialLinks + logo path
Method: 1 Claude call
Prompt constraints:

  • Must use CSS variables from global.css (not hardcoded hex values)
  • Must include contact bar (phone, email, areas on the left; social icons on the right)
  • Must match the nav layout from the screenshot (logo position, link grouping, dropdowns)
  • Must include mobile hamburger menu
  • Must be a self-contained Astro component with <style> and <script>

Output: src/components/Nav.astro

Step 2.3: Generate Footer Component

Input: Homepage screenshot + content-map.homepage.footer + social links + copyright
Method: 1 Claude call
Prompt constraints:

  • Must use CSS variables from global.css
  • Must match footer layout from screenshot
  • Self-contained Astro component

Output: src/components/Footer.astro

Step 2.4: Generate BaseLayout

Method: Deterministic (template, no Claude call)
Output: src/layouts/BaseLayout.astro

---
import Nav from "../components/Nav.astro";
import Footer from "../components/Footer.astro";
import "../styles/global.css";

const { title, description } = Astro.props;
---
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>{title}</title>
  <meta name="description" content={description}>
</head>
<body>
  <Nav />
  <main>
    <slot />
  </main>
  <Footer />
</body>
</html>

Human Gate

After Phase 2, deploy JUST the homepage with placeholder body content so the human can verify:

  • Nav layout is correct (logo position, link grouping, dropdowns)
  • Footer looks right
  • Colours match
  • Mobile menu works

Only proceed to Phase 3 after human approval. This prevents wasting 24 Claude calls on pages that will all have the wrong nav.


Phase 3: Per-Page Body Build

Goal: Generate the unique body content for each page. Each page gets its own Claude call but shares the foundation from Phase 2.

Input per page:

  1. screenshots/desktop/{slug}.png — the original page screenshot
  2. content-map[slug] — the page's content tree (sections, images, text)
  3. Correct image list — extracted from content-map, only images that belong on THIS page
  4. global.css colour variables — page must use var(--color-primary) not #2b5672

Prompt structure:

SYSTEM: You are generating the BODY CONTENT ONLY for an Astro page.
The page uses BaseLayout which provides Nav, Footer, and global styles.
Output ONLY the content that goes inside the <main> slot.
Use CSS custom properties (var(--color-primary), etc.) — never hardcode colours.

PAGE: {slug}
SCREENSHOT: [attached]
SECTIONS: [from content-map]
IMAGES FOR THIS PAGE:
  Hero: /assets/images/farm-painting-hero.jpg (alt: Farm painting)
  Section 2 (cards):
    /assets/images/card1.jpg (alt: Barn painting)
    /assets/images/card2.jpg (alt: Shed painting)
  Section 3 (beforeAfter):
    /assets/images/before.jpg (before)
    /assets/images/after.jpg (after)
  ...

TEXT CONTENT: [from content-map text nodes]

Generate the page body. Use ONLY the images listed above in their correct sections.

Output per page:

---
import BaseLayout from "../layouts/BaseLayout.astro";
---
<BaseLayout title="Farm Painting" description="...">
  <section class="hero">...</section>
  <section class="services">...</section>
  ...
</BaseLayout>

Key constraint: Image mapping is EXPLICIT

The prompt doesn't give Claude a list of 150 images to pick from. It gives the exact images for each section, extracted from the content map. Claude's only job is to place them in the right HTML structure. This eliminates the "wrong image in wrong section" problem.
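A minimal sketch of how the explicit mapping might be produced is shown below. The content-map shape (contentMap[slug].sections[i] = { type, images: [{ src, alt }] }) and the helper formatImagePrompt() are assumptions for illustration; only the name buildImageMap(slug, contentMap) comes from this document.

```javascript
// ASSUMPTION: contentMap[slug].sections[i] = { type, images: [{ src, alt }] }.
// The real content-map.json structure may differ.
function buildImageMap(slug, contentMap) {
  const page = contentMap[slug];
  if (!page) throw new Error(`No content-map entry for "${slug}"`);
  return page.sections
    .map((section, index) => ({
      label: `Section ${index + 1} (${section.type})`,
      images: section.images ?? [],
    }))
    .filter((entry) => entry.images.length > 0); // skip sections with no images
}

// Hypothetical helper: renders the map as the "IMAGES FOR THIS PAGE" prompt block.
function formatImagePrompt(imageMap) {
  const lines = ["IMAGES FOR THIS PAGE:"];
  for (const { label, images } of imageMap) {
    lines.push(`  ${label}:`);
    for (const { src, alt } of images) {
      lines.push(`    ${src} (alt: ${alt || "none"})`);
    }
  }
  return lines.join("\n");
}
```

Because section numbering is taken from the content map, the labels in the prompt stay aligned with the SECTIONS block even when some sections carry no images.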


Phase 4: Verify

Goal: Automated QA with zero Claude calls. Catch issues before human review.

Step 4.1: Compile Check

npx astro build
Binary pass/fail. If any page fails to compile, log the error.

Step 4.2: Screenshot Built Pages

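import { chromium } from "playwright";

// Assumes the built site is already being served on port 4321 and `slugs` lists the page slugs
const browser = await chromium.launch();
const page = await browser.newPage();
for (const slug of slugs) {
  await page.goto(`http://localhost:4321/${slug}`);
  await page.screenshot({ path: `qa/built/${slug}.png`, fullPage: true });
}
await browser.close();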

Step 4.3: Compare Screenshots

For each page, compare screenshots/desktop/{slug}.png (original) vs qa/built/{slug}.png (built):

  1. Structural similarity — are the section counts the same? Are they in the same order? (Compare section heights/positions)
  2. Image verification — are the correct images present in the built HTML? (Parse HTML, check src attributes against content-map)
  3. Colour check — extract dominant colours from built screenshot, compare to design tokens
  4. Content check — extract text from built HTML, compare word count to content-map text (catch missing content)
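Checks 2 and 4 work on the built HTML rather than on pixels, so they can be sketched without any image tooling. The functions below are one possible reading, not the pipeline's actual code: the names are hypothetical, the regex-based parsing is a simplification (a real HTML parser would be more robust), and the sketch assumes image paths in the built HTML are unchanged from the content map.

```javascript
// Collect every <img src="..."> value from an HTML string.
function extractImageSrcs(html) {
  const srcs = [];
  const pattern = /<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/gi;
  for (const match of html.matchAll(pattern)) srcs.push(match[1]);
  return srcs;
}

// Fraction of expected images (from the content map) that appear in the built HTML.
function imageAccuracy(html, expectedSrcs) {
  if (expectedSrcs.length === 0) return 1;
  const built = new Set(extractImageSrcs(html));
  const found = expectedSrcs.filter((src) => built.has(src)).length;
  return found / expectedSrcs.length;
}

function countWords(text) {
  return text.split(/\s+/).filter(Boolean).length;
}

// Ratio of built word count to expected word count, capped at 1.
function contentCoverage(html, expectedText) {
  const expected = countWords(expectedText);
  if (expected === 0) return 1;
  const visible = html
    .replace(/<(script|style)\b[\s\S]*?<\/\1>/gi, " ") // drop inline scripts and styles
    .replace(/<[^>]+>/g, " "); // drop remaining tags
  return Math.min(1, countWords(visible) / expected);
}
```

A word-count ratio only catches missing content, not wrong content, which matches the stated purpose of the content check.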

Output: QA Report

{
  "homepage": { "compile": true, "imageAccuracy": 0.95, "contentCoverage": 0.88, "colourMatch": true, "pass": true },
  "farm-painting": { "compile": true, "imageAccuracy": 0.60, "contentCoverage": 0.92, "colourMatch": true, "pass": false },
  ...
}

Pages scoring below threshold (e.g. 80% on any metric) go to Phase 5.
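The pass/fail rule could be applied to each report entry along these lines. This is a sketch under assumptions: the 0.8 threshold is the example value given above, and the reasons array is a hypothetical addition to the report shape, included so Phase 5 has failure reasons to work from.

```javascript
const THRESHOLD = 0.8; // example value from this document; tune per project

// Decide pass/fail for one page's metrics and record why it failed.
function evaluatePage(metrics, threshold = THRESHOLD) {
  const reasons = [];
  if (!metrics.compile) reasons.push("compile failed");
  for (const key of ["imageAccuracy", "contentCoverage"]) {
    if (metrics[key] < threshold) {
      reasons.push(`${key}: ${Math.round(metrics[key] * 100)}%`);
    }
  }
  if (!metrics.colourMatch) reasons.push("colour mismatch");
  // `reasons` is a hypothetical extra field, not part of the report format above.
  return { ...metrics, pass: reasons.length === 0, reasons };
}

// Slugs that should be sent to Phase 5.
function failingPages(report) {
  return Object.entries(report)
    .filter(([, result]) => !result.pass)
    .map(([slug]) => slug);
}
```

Run against the two example entries above, the homepage metrics pass and the farm-painting metrics fail on imageAccuracy alone.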


Phase 5: Fix

Goal: Targeted fixes for failing pages only. Uses Claude Vision to compare original vs built.

Input per failing page:

  1. Original screenshot
  2. Built screenshot
  3. Current .astro code
  4. Specific failure reasons from QA report (e.g. "imageAccuracy: 60% — missing 3 images", "contentCoverage: 70% — missing FAQ section")

Prompt:

Here are TWO screenshots:
1. ORIGINAL (target) — what the page should look like
2. BUILT (current) — what we generated

The current code is below. Fix ONLY the specific issues listed:
- Missing images: [list]
- Missing section: FAQ accordion
- Image in wrong position: farm-repairs.jpg is in hero, should be in section 3

Output the COMPLETE fixed file.

Loop:

Phase 4 → Phase 5 → Phase 4 → Phase 5 → ...
Max 3 iterations. If still failing after 3, flag for human review.
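One way to structure the loop controller is sketched below. The verify and fix callbacks are hypothetical stand-ins for Phase 4 and Phase 5; the sketch also assumes that only the pages just fixed need re-verifying, which is a design choice rather than something this document specifies.

```javascript
const MAX_ITERATIONS = 3;

// verify(slugs) -> Promise<string[]>  slugs that still fail (Phase 4)
// fix(slug)     -> Promise<void>      rewrites one failing page (Phase 5)
// Both callbacks are hypothetical and would be supplied by the orchestrator.
async function runFixLoop(slugs, verify, fix, maxIterations = MAX_ITERATIONS) {
  let failing = await verify(slugs);
  let iterations = 0;
  while (failing.length > 0 && iterations < maxIterations) {
    for (const slug of failing) {
      await fix(slug);
    }
    iterations += 1;
    failing = await verify(failing); // re-check only the pages that were just fixed
  }
  // Anything still failing after the cap is flagged for human review.
  return { iterations, needsHumanReview: failing };
}
```

Passing pages are never touched again, so a page that needs one fix costs one Phase 5 call regardless of how many iterations other pages require.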

Cost Model

Phase                  Claude Calls   Estimated Cost
Phase 1: Scrape        0              $0
Phase 2: Foundation    2-3            ~$0.05
Phase 3: Page builds   N pages        ~$0.02/page
Phase 4: Verify        0              $0
Phase 5: Fixes         ~20% of N      ~$0.02/fix

For a 24-page site: 3 + 24 + ~5 fixes = ~32 Claude calls ≈ $0.70

At scale (1800 sites, avg 15 pages):

  • Per site: ~$0.50
  • Total: ~$900 over 12 months
  • Per month (150 sites): ~$75/mo
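The arithmetic behind these estimates can be reproduced with a small calculator. The per-call figures are the estimates from the cost table; the function name and the choice to round the fix count to the nearest whole page are assumptions.

```javascript
// Per-call estimates taken from the cost table; fixRate is the ~20% failure rate.
const COST = { foundation: 0.05, perPage: 0.02, perFix: 0.02, fixRate: 0.2 };

function estimateSiteCost(pages, cost = COST) {
  const fixes = Math.round(pages * cost.fixRate);
  const calls = 3 + pages + fixes; // 3 = upper bound on Phase 2 calls
  const dollars = cost.foundation + pages * cost.perPage + fixes * cost.perFix;
  return { calls, fixes, dollars: Number(dollars.toFixed(2)) };
}
```

With the table's figures, a 24-page site comes to 32 calls and about $0.63, and a 15-page site to about $0.41, so the rounded estimates of $0.70 and $0.50 per site include some headroom.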


Implementation Order

Sprint 1: Pipeline Skeleton

  • build-v2.js — orchestrator that runs Phases 2-5 in sequence
  • Phase 2.1: extractPalette() — deterministic colour extraction
  • Phase 2.4: generateBaseLayout() — template generation (no AI)
  • Phase 3: Restructure page prompt to use BaseLayout + body-only output

Sprint 2: Shared Components

  • Phase 2.2: generateNav() — Claude call with strict constraints
  • Phase 2.3: generateFooter() — Claude call with strict constraints
  • Human gate: deploy foundation for review before page builds

Sprint 3: Image Mapping

  • buildImageMap(slug, contentMap) — extracts per-section image list from content map
  • Integrate into Phase 3 prompt so each page gets explicit image instructions

Sprint 4: Automated QA

  • Phase 4.1: Compile check
  • Phase 4.2: Screenshot built pages with Playwright
  • Phase 4.3: Compare original vs built (structural + image + colour + content)
  • Phase 4 output: QA report JSON

Sprint 5: Fix Loop

  • Phase 5: Claude Vision comparison with dual screenshots
  • Loop controller: Phase 4 → 5 → 4 → 5 (max 3)
  • Fallback: flag for human review

Key Lessons from v1

  1. Never generate nav/footer per page. Generate once, share everywhere.
  2. Never give Claude 150 images to pick from. Map images to sections explicitly.
  3. Never hardcode colours in page prompts. Use CSS variables from a single source.
  4. Always verify before deploying. Screenshot comparison catches issues cheaply.
  5. The scraper is not the bottleneck anymore. The build agent prompt engineering is.
  6. One-shot generation doesn't work. An iterative loop with verification is necessary.
  7. Most failing pages need 1 fix pass, not 3. Target the ~20% that fail instead of re-running everything.

Scraper Gap: Multi-Nav Detection

The Waterford site has TWO <nav> elements (left + right of logo). Our scraper only captures the first <nav>. This needs fixing:

  • Update content.js to capture ALL <nav> elements
  • Tag them with position info (left/right) or merge into a single nav tree
  • Also capture nav items from <header> that aren't inside <nav> (Wix sometimes puts links outside nav)

This is a Phase 1 fix that will improve all downstream phases.
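The tagging and merging step might look something like the sketch below. It is not the content.js implementation: it assumes each <nav> has already been collected in the page context (for example via document.querySelectorAll("nav") and getBoundingClientRect()) into a plain descriptor of the form { centreX, links: [{ text, href }] }, and the function names are hypothetical.

```javascript
// ASSUMPTION: each nav descriptor is { centreX, links: [{ text, href }] },
// gathered in the browser before this code runs.

// Tag each nav as left or right of the logo and order them left to right.
function tagNavs(navs, logoCentreX) {
  return navs
    .map((nav) => ({ ...nav, position: nav.centreX < logoCentreX ? "left" : "right" }))
    .sort((a, b) => a.centreX - b.centreX);
}

// Flatten tagged navs into a single link list, keeping position info per link.
function mergeNavs(taggedNavs) {
  const seen = new Set();
  const merged = [];
  for (const nav of taggedNavs) {
    for (const link of nav.links) {
      if (seen.has(link.href)) continue; // drop links repeated across navs
      seen.add(link.href);
      merged.push({ ...link, position: nav.position });
    }
  }
  return merged;
}
```

Keeping the position on every link lets the Nav prompt reproduce the split layout while still working from one nav tree.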