Build Pipeline v2: Design Document

Date: 2026-04-07
Status: Proposed
Context: Lessons learned from the Waterford County Painters build. The v1 approach of independent per-page generation produces inconsistent nav, footer, colours, and image placement. Patching after the fact is fragile and expensive.

Architecture Overview

┌─────────────────────────────────────────────────────────┐
│  PHASE 1: SCRAPE (existing, working)                    │
│  content-map.json + screenshots + assets + design-tokens│
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│  PHASE 2: SHARED FOUNDATION (2-3 Claude calls)          │
│  Nav.astro + Footer.astro + global.css + BaseLayout     │
│  ► Human gate: review before proceeding                 │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│  PHASE 3: PER-PAGE BODY BUILD (1 call per page)         │
│  Each page = body content only, imports shared components│
│  Constrained: correct images from content map only       │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│  PHASE 4: VERIFY (automated, 0 Claude calls)            │
│  Astro build → screenshot each page → compare to orig   │
│  Output: QA report with pass/fail per page               │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│  PHASE 5: FIX (targeted, only failing pages)            │
│  Input: original + built screenshots + failure reason    │
│  Loop Phase 4→5 until all pages pass (max 3 iterations)  │
└─────────────────────────────────────────────────────────┘

Phase 1: Scrape

Status: Complete, working well after scraper rebuild.

Input: URL

Output:
- content-map.json: hierarchical content tree per page (headings, text, images, cards, icons, social links, nav tree, footer, beforeAfter, carousels, reviews)
- screenshots/desktop/{slug}.png: full-page desktop screenshots
- screenshots/mobile/{slug}.png: mobile screenshots
- assets/: downloaded images, logos, icons
- asset-manifest.json: maps original CDN URLs → local paths
- design-tokens.json: colours, fonts, font sizes extracted from stylesheets

No changes needed beyond the multi-nav fix described under "Scraper Gap: Multi-Nav Detection". Otherwise this phase is solid.

Phase 2: Shared Foundation

Goal: Generate the consistent pieces that every page shares. Get them right ONCE.

Step 2.1: Extract Colour Palette → `global.css`

Input: design-tokens.json
Method: Deterministic (no Claude call needed)
Output: src/styles/global.css with CSS custom properties

:root {
  --color-primary: #2b5672;
  --color-accent: #1e73be;
  --color-text: #333333;
  --color-text-light: #666666;
  --color-bg: #ffffff;
  --color-bg-alt: #f9f9f9;
  --color-border: #e5e5e5;
  --font-heading: "Montserrat", sans-serif;
  --font-body: "Open Sans", sans-serif;
}

How: Take the top two non-white, non-transparent colours as primary and accent, and the top two fonts as heading and body. No AI needed, just arithmetic.
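
The selection rule above could be sketched roughly as follows. This is only an illustration: the shape of design-tokens.json shown in the comment is an assumption (the real file may differ), "top" is interpreted here as "most frequently used", and `toCss` is a hypothetical helper that emits only the four derived variables.

```javascript
// Assumed (hypothetical) token shape:
// { colours: [{ value: "#2b5672", count: 41 }], fonts: [{ family: "Montserrat", count: 12 }] }
const IGNORED_COLOURS = new Set(["#fff", "#ffffff", "transparent"]);

function extractPalette(tokens) {
  const byCountDesc = (a, b) => b.count - a.count;
  // filter() returns a new array, so sorting does not mutate the input tokens
  const colours = tokens.colours
    .filter((c) => !IGNORED_COLOURS.has(c.value.toLowerCase()))
    .sort(byCountDesc);
  const fonts = [...tokens.fonts].sort(byCountDesc);
  return {
    primary: colours[0]?.value,
    accent: colours[1]?.value,
    fontHeading: fonts[0]?.family,
    // Fall back to the heading font when only one font was extracted
    fontBody: fonts[1]?.family ?? fonts[0]?.family,
  };
}

// Hypothetical helper: renders the derived values as CSS custom properties
function toCss(palette) {
  return [
    ":root {",
    `  --color-primary: ${palette.primary};`,
    `  --color-accent: ${palette.accent};`,
    `  --font-heading: "${palette.fontHeading}", sans-serif;`,
    `  --font-body: "${palette.fontBody}", sans-serif;`,
    "}",
  ].join("\n");
}
```

The remaining variables in the sample above (text, background, border) would need their own rules, which this sketch does not attempt to guess.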

Step 2.2: Generate Nav Component

Input: Homepage screenshot + content-map.homepage.nav + content-map.homepage.socialLinks + logo path
Method: 1 Claude call
Prompt constraints:
- Must use CSS variables from global.css (not hardcoded hex values)
- Must include contact bar (phone, email, areas on the left; social icons on the right)
- Must match the nav layout from the screenshot (logo position, link grouping, dropdowns)
- Must include mobile hamburger menu
- Must be a self-contained Astro component with <style> and <script>

Output: src/components/Nav.astro

Step 2.3: Generate Footer Component

Input: Homepage screenshot + content-map.homepage.footer + social links + copyright
Method: 1 Claude call
Prompt constraints:
- Must use CSS variables from global.css
- Must match footer layout from screenshot
- Self-contained Astro component

Output: src/components/Footer.astro
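
Both prompts require CSS variables rather than hardcoded hex values, and that constraint can be checked mechanically before the human gate. The function below is a rough heuristic sketch, not part of the existing pipeline; `findHardcodedHex` is a hypothetical name, and the regex can produce false positives (for example an `#add` id selector or a numeric HTML entity).

```javascript
// Returns the distinct hex colour literals found in a generated component's source.
// An empty array means the "no hardcoded hex values" constraint appears to hold.
function findHardcodedHex(source) {
  const hexPattern = /#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b/g;
  const matches = source.match(hexPattern) ?? [];
  return [...new Set(matches)];
}
```

A non-empty result could be fed back into a retry of the same Claude call, or simply surfaced to the reviewer.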

Step 2.4: Generate BaseLayout

Method: Deterministic (template, no Claude call)
Output: src/layouts/BaseLayout.astro

---
import Nav from "../components/Nav.astro";
import Footer from "../components/Footer.astro";
import "../styles/global.css";

const { title, description } = Astro.props;
---
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>{title}</title>
  <meta name="description" content={description}>
</head>
<body>
  <Nav />
  <main>
    <slot />
  </main>
  <Footer />
</body>
</html>

Human Gate

After Phase 2, deploy JUST the homepage with placeholder body content so the human can verify:
- Nav layout is correct (logo position, link grouping, dropdowns)
- Footer looks right
- Colours match
- Mobile menu works

Only proceed to Phase 3 after human approval. This prevents wasting 24 Claude calls on pages that will all have the wrong nav.

Phase 3: Per-Page Body Build

Goal: Generate the unique body content for each page. Each page gets its own Claude call but shares the foundation from Phase 2.

Input per page:

- screenshots/desktop/{slug}.png: the original page screenshot
- content-map[slug]: the page's content tree (sections, images, text)
- Correct image list: extracted from content-map, only images that belong on THIS page
- global.css colour variables: the page must use var(--color-primary), not #2b5672

Prompt structure:

SYSTEM: You are generating the BODY CONTENT ONLY for an Astro page.
The page uses BaseLayout which provides Nav, Footer, and global styles.
Output ONLY the content that goes inside the <main> slot.
Use CSS custom properties (var(--color-primary), etc.) — never hardcode colours.

PAGE: {slug}
SCREENSHOT: [attached]
SECTIONS: [from content-map]
IMAGES FOR THIS PAGE:
  Hero: /assets/images/farm-painting-hero.jpg (alt: Farm painting)
  Section 2 (cards):
    /assets/images/card1.jpg (alt: Barn painting)
    /assets/images/card2.jpg (alt: Shed painting)
  Section 3 (beforeAfter):
    /assets/images/before.jpg (before)
    /assets/images/after.jpg (after)
  ...

TEXT CONTENT: [from content-map text nodes]

Generate the page body. Use ONLY the images listed above in their correct sections.

Output per page:

---
import BaseLayout from "../layouts/BaseLayout.astro";
---
<BaseLayout title="Farm Painting" description="...">
  <section class="hero">...</section>
  <section class="services">...</section>
  ...
</BaseLayout>

Key constraint: Image mapping is EXPLICIT

The prompt doesn't give Claude a list of 150 images to pick from. It gives the exact images for each section, extracted from the content map. Claude's only job is to place them in the right HTML structure. This eliminates the "wrong image in wrong section" problem.
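
A minimal sketch of how the per-section image list might be derived is shown below. The content-map shape used here (`contentMap[slug].sections[].images[]` with `src` and `alt`) and the use of the asset manifest as a plain URL-to-path object are assumptions for illustration; `renderImageBlock` is a hypothetical helper whose labels are simplified compared with the prompt example above.

```javascript
// Builds an explicit section → images list for one page.
// Sections without images are dropped so the prompt stays short.
function buildImageMap(slug, contentMap, assetManifest = {}) {
  const sections = contentMap[slug]?.sections ?? [];
  return sections
    .map((section, index) => ({
      label: `Section ${index + 1} (${section.type})`,
      images: (section.images ?? []).map((img) => ({
        // Prefer the downloaded local path; fall back to the original URL
        path: assetManifest[img.src] ?? img.src,
        alt: img.alt ?? "",
      })),
    }))
    .filter((section) => section.images.length > 0);
}

// Renders the image map as the "IMAGES FOR THIS PAGE" block of the prompt
function renderImageBlock(imageMap) {
  const lines = ["IMAGES FOR THIS PAGE:"];
  for (const { label, images } of imageMap) {
    lines.push(`  ${label}:`);
    for (const img of images) {
      lines.push(`    ${img.path} (alt: ${img.alt})`);
    }
  }
  return lines.join("\n");
}
```

Because the model only ever sees the images returned for its own slug, an image from another page cannot be selected by mistake.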

Phase 4: Verify

Goal: Automated QA with zero Claude calls. Catch issues before human review.

Step 4.1: Compile Check

npx astro build

Binary pass/fail. If any page fails to compile, log the error.
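
One way the orchestrator might wrap this step is sketched below. `compileCheck` is a hypothetical name, and the command is a parameter so that something other than `npx astro build` can be substituted; on Windows, spawning `npx` may additionally need the `shell: true` option.

```javascript
const { spawnSync } = require("node:child_process");

// Runs the build command and reduces the result to a pass/fail flag plus an error message.
function compileCheck(command = "npx", args = ["astro", "build"], cwd = process.cwd()) {
  const result = spawnSync(command, args, { cwd, encoding: "utf8" });
  const ok = result.status === 0;
  return {
    compile: ok,
    // result.error is set when the process could not be spawned at all
    error: ok ? null : (result.stderr || result.error?.message || "unknown error").trim(),
  };
}
```

The returned `compile` flag maps directly onto the `compile` field of the QA report.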

Step 4.2: Screenshot Built Pages

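const { chromium } = require("playwright");
// For each page, navigate to the built page and take a full-page screenshot.
// Assumes the built site is already being served on port 4321.
async function screenshotBuiltPages(slugs) {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  for (const slug of slugs) {
    await page.goto(`http://localhost:4321/${slug}`);
    await page.screenshot({ path: `qa/built/${slug}.png`, fullPage: true });
  }
  await browser.close();
}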

Step 4.3: Compare Screenshots

For each page, compare screenshots/desktop/{slug}.png (original) vs qa/built/{slug}.png (built):

- Structural similarity: are the section counts the same? Are they in the same order? (Compare section heights/positions)
- Image verification: are the correct images present in the built HTML? (Parse HTML, check src attributes against content-map)
- Colour check: extract dominant colours from built screenshot, compare to design tokens
- Content check: extract text from built HTML, compare word count to content-map text (catch missing content)
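
The image and content checks need only the built HTML, so they can be sketched without any screenshot tooling. The functions below are a simplified illustration: the names are hypothetical, the HTML is scanned with regular expressions rather than a real parser, and expected image paths are assumed to appear verbatim in `src` attributes (which may not hold if the build rewrites asset URLs).

```javascript
// Fraction of expected image paths that appear as <img src="..."> in the built HTML
function imageAccuracy(html, expectedPaths) {
  if (expectedPaths.length === 0) return 1;
  const imgSrc = /<img\b[^>]*?\bsrc=["']([^"']+)["']/gi;
  const found = new Set([...html.matchAll(imgSrc)].map((m) => m[1]));
  const hits = expectedPaths.filter((p) => found.has(p)).length;
  return hits / expectedPaths.length;
}

function wordCount(text) {
  return text.split(/\s+/).filter(Boolean).length;
}

// Ratio of words in the built page to words in the content map, capped at 1
function contentCoverage(html, expectedText) {
  const expected = wordCount(expectedText);
  if (expected === 0) return 1;
  const visibleText = html
    .replace(/<(script|style)\b[\s\S]*?<\/\1>/gi, " ") // drop inline scripts and styles
    .replace(/<[^>]+>/g, " "); // drop remaining tags
  return Math.min(1, wordCount(visibleText) / expected);
}
```

Word count is a coarse signal: it catches a missing section but not reordered or paraphrased text, which is consistent with its stated purpose of catching missing content.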

Output: QA Report

{
  "homepage": { "compile": true, "imageAccuracy": 0.95, "contentCoverage": 0.88, "colourMatch": true, "pass": true },
  "farm-painting": { "compile": true, "imageAccuracy": 0.60, "contentCoverage": 0.92, "colourMatch": true, "pass": false },
  ...
}

Pages that fail a boolean check or score below the threshold (e.g. 80%) on any numeric metric go to Phase 5.
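
The pass/fail decision could be computed from the report entries along these lines. This is a sketch under assumptions: the 0.8 threshold is the document's example value, and the `failures` array and both function names are illustrative additions rather than part of the report format above.

```javascript
const THRESHOLD = 0.8; // example value; one threshold for every numeric metric
const SCORED_METRICS = ["imageAccuracy", "contentCoverage"];

// Adds a pass flag and human-readable failure reasons to one report entry
function evaluatePage(result, threshold = THRESHOLD) {
  const failures = [];
  if (!result.compile) failures.push("compile: build failed");
  if (!result.colourMatch) failures.push("colourMatch: palette differs from design tokens");
  for (const metric of SCORED_METRICS) {
    if (result[metric] < threshold) {
      failures.push(`${metric}: ${Math.round(result[metric] * 100)}%`);
    }
  }
  return { ...result, pass: failures.length === 0, failures };
}

// Slugs that should be handed to Phase 5
function failingSlugs(report) {
  return Object.keys(report).filter((slug) => !evaluatePage(report[slug]).pass);
}
```

The failure strings double as the "specific failure reasons" that Phase 5 expects as input.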

Phase 5: Fix

Goal: Targeted fixes for failing pages only. Uses Claude Vision to compare original vs built.

Input per failing page:

- Original screenshot
- Built screenshot
- Current .astro code
- Specific failure reasons from QA report (e.g. "imageAccuracy: 60%, missing 3 images", "contentCoverage: 70%, missing FAQ section")

Prompt:

Here are TWO screenshots:
1. ORIGINAL (target) — what the page should look like
2. BUILT (current) — what we generated

The current code is below. Fix ONLY the specific issues listed:
- Missing images: [list]
- Missing section: FAQ accordion
- Image in wrong position: farm-repairs.jpg is in hero, should be in section 3

Output the COMPLETE fixed file.

Loop:

Phase 4 → Phase 5 → Phase 4 → Phase 5 → ...
Max 3 iterations. If still failing after 3, flag for human review.
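
A loop controller for this could look roughly like the sketch below. `verify` and `fix` are hypothetical callbacks standing in for Phase 4 and Phase 5: `verify(slugs)` is assumed to resolve to an array of `{ slug, reasons }` objects for the pages that still fail, and `fix(failure)` to rewrite one page.

```javascript
// Alternates Phase 4 (verify) and Phase 5 (fix) until everything passes
// or the iteration budget is spent.
async function runFixLoop(slugs, { verify, fix, maxIterations = 3 }) {
  let failing = await verify(slugs); // initial Phase 4 run over all pages
  for (let i = 1; i <= maxIterations && failing.length > 0; i++) {
    for (const failure of failing) {
      await fix(failure); // Phase 5: targeted fix for one failing page
    }
    // Phase 4 again, but only for the pages that were just fixed
    failing = await verify(failing.map((f) => f.slug));
  }
  // Anything still failing after the last iteration is flagged for human review
  return { needsHumanReview: failing.map((f) => f.slug) };
}
```

Re-verifying only the previously failing pages keeps each iteration cheap; a stricter variant could re-verify every page to catch regressions.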

Cost Model

| Phase | Claude Calls | Estimated Cost |
|---|---|---|
| Phase 1: Scrape | 0 | $0 |
| Phase 2: Foundation | 2-3 | ~$0.05 |
| Phase 3: Page builds | N pages | ~$0.02/page |
| Phase 4: Verify | 0 | $0 |
| Phase 5: Fixes | ~20% of N | ~$0.02/fix |

For a 24-page site: 3 + 24 + ~5 fixes = ~32 Claude calls ≈ $0.70

At scale (1800 sites, avg 15 pages):
- Per site: ~$0.50
- Total: ~$900 over 12 months
- Per month (150 sites): ~$75/mo
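
The arithmetic behind these figures can be reproduced with a small calculator. The rates are taken from the table; the function name, the use of 3 foundation calls (the upper end of 2-3), and rounding the fix count to the nearest page are assumptions. With the table's rates, a 24-page site comes to about $0.63 and a 15-page site to about $0.41, which the estimates above round up.

```javascript
const RATES = { foundation: 0.05, perPage: 0.02, perFix: 0.02, fixRate: 0.2 };
const FOUNDATION_CALLS = 3; // upper end of the 2-3 calls listed for Phase 2

// Estimates Claude calls and cost for one site with the given page count
function estimateSiteCost(pages, rates = RATES) {
  const fixes = Math.round(pages * rates.fixRate);
  const calls = FOUNDATION_CALLS + pages + fixes;
  const cost = rates.foundation + pages * rates.perPage + fixes * rates.perFix;
  return { calls, fixes, cost: Number(cost.toFixed(2)) };
}
```

Calling `estimateSiteCost(24)` yields 32 calls, matching the 3 + 24 + ~5 breakdown.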

Implementation Order

Sprint 1: Pipeline Skeleton

- build-v2.js: orchestrator that runs Phases 2-5 in sequence
- Phase 2.1: extractPalette() (deterministic colour extraction)
- Phase 2.4: generateBaseLayout() (template generation, no AI)
- Phase 3: Restructure page prompt to use BaseLayout + body-only output

Sprint 2: Shared Components

- Phase 2.2: generateNav() (Claude call with strict constraints)
- Phase 2.3: generateFooter() (Claude call with strict constraints)
- Human gate: deploy foundation for review before page builds

Sprint 3: Image Mapping

- buildImageMap(slug, contentMap): extracts per-section image list from content map
- Integrate into Phase 3 prompt so each page gets explicit image instructions

Sprint 4: Automated QA

- Phase 4.1: Compile check
- Phase 4.2: Screenshot built pages with Playwright
- Phase 4.3: Compare original vs built (structural + image + colour + content)
- Phase 4 output: QA report JSON

Sprint 5: Fix Loop

- Phase 5: Claude Vision comparison with dual screenshots
- Loop controller: Phase 4 → 5 → 4 → 5 (max 3)
- Fallback: flag for human review

Key Lessons from v1

- Never generate nav/footer per page. Generate once, share everywhere.
- Never give Claude 150 images to pick from. Map images to sections explicitly.
- Never hardcode colours in page prompts. Use CSS variables from a single source.
- Always verify before deploying. Screenshot comparison catches issues cheaply.
- The scraper is not the bottleneck anymore. The build agent's prompt engineering is.
- One-shot generation doesn't work. An iterative loop with verification is necessary.
- Most pages need 1 fix pass, not 3. Target the 20% that fail; don't re-run everything.

Scraper Gap: Multi-Nav Detection

The Waterford site has TWO <nav> elements (left + right of logo). Our scraper only captures the first <nav>. This needs fixing:

- Update content.js to capture ALL <nav> elements
- Tag them with position info (left/right) or merge into a single nav tree
- Also capture nav items from <header> that aren't inside <nav> (Wix sometimes puts links outside nav)

This is a Phase 1 fix that will improve all downstream phases.
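
The three changes listed above might be sketched as a single helper for content.js. This is illustrative only: `collectNavs` and the `"header img"` logo selector are assumptions, the function is meant to run in the page context against a DOM `document`, and left/right is decided by comparing each nav's horizontal position with the logo's.

```javascript
// Collects every <nav>, tags it as left/right of the logo, and also returns
// header links that sit outside any <nav>.
function collectNavs(doc, logoSelector = "header img") {
  const logo = doc.querySelector(logoSelector);
  const logoLeft = logo ? logo.getBoundingClientRect().left : null;
  const toLink = (a) => ({ text: a.textContent.trim(), href: a.getAttribute("href") });

  const navs = Array.from(doc.querySelectorAll("nav")).map((nav) => {
    const left = nav.getBoundingClientRect().left;
    // Without a logo to compare against, position cannot be determined
    const position = logoLeft === null ? "unknown" : left < logoLeft ? "left" : "right";
    return { position, links: Array.from(nav.querySelectorAll("a")).map(toLink) };
  });

  // Links in <header> that are not descendants of a <nav>
  const looseLinks = Array.from(doc.querySelectorAll("header a"))
    .filter((a) => !a.closest("nav"))
    .map(toLink);

  return { navs, looseLinks };
}
```

Keeping the navs separate and tagged, rather than merging them, leaves the choice of a merged or split nav tree to the Phase 2 prompt.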