Build Pipeline v2: Design Document
Date: 2026-04-07
Status: Proposed
Context: Lessons learned from the Waterford County Painters build. The v1 approach of independent per-page generation produces inconsistent nav, footer, colours, and image placement, and patching after the fact is fragile and expensive.
Architecture Overview
```
┌─────────────────────────────────────────────────────────┐
│ PHASE 1: SCRAPE (existing, working)                     │
│ content-map.json + screenshots + assets + design-tokens │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│ PHASE 2: SHARED FOUNDATION (2-3 Claude calls)           │
│ Nav.astro + Footer.astro + global.css + BaseLayout      │
│ ► Human gate: review before proceeding                  │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│ PHASE 3: PER-PAGE BODY BUILD (1 call per page)          │
│ Each page = body content only, imports shared components│
│ Constrained: correct images from content map only       │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│ PHASE 4: VERIFY (automated, 0 Claude calls)             │
│ Astro build → screenshot each page → compare to orig    │
│ Output: QA report with pass/fail per page               │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│ PHASE 5: FIX (targeted, only failing pages)             │
│ Input: original + built screenshots + failure reason    │
│ Loop Phase 4→5 until all pages pass (max 3 iterations)  │
└─────────────────────────────────────────────────────────┘
```
Phase 1: Scrape
Status: Complete, working well after scraper rebuild.
Input: URL
Output:
- content-map.json — hierarchical content tree per page (headings, text, images, cards, icons, social links, nav tree, footer, beforeAfter, carousels, reviews)
- screenshots/desktop/{slug}.png — full-page screenshots
- screenshots/mobile/{slug}.png — mobile screenshots
- assets/ — downloaded images, logos, icons
- asset-manifest.json — maps original CDN URLs → local paths
- design-tokens.json — colours, fonts, font sizes extracted from stylesheets
No changes needed. This phase is solid.
Phase 2: Shared Foundation
Goal: Generate the consistent pieces that every page shares. Get them right ONCE.
Step 2.1: Extract Colour Palette → global.css
Input: design-tokens.json
Method: Deterministic (no Claude call needed)
Output: src/styles/global.css with CSS custom properties
```css
:root {
  --color-primary: #2b5672;
  --color-accent: #1e73be;
  --color-text: #333333;
  --color-text-light: #666666;
  --color-bg: #ffffff;
  --color-bg-alt: #f9f9f9;
  --color-border: #e5e5e5;
  --font-heading: "Montserrat", sans-serif;
  --font-body: "Open Sans", sans-serif;
}
```
How: Take the two highest-ranked colours in design-tokens.json that are neither white nor transparent as primary and accent, and the top-ranked fonts as heading and body. No AI is needed; the selection is purely mechanical.
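The selection rule above can be sketched in a few lines. This is a minimal sketch, not the pipeline's actual code: it assumes design-tokens.json exposes `colours` and `fonts` as arrays of strings already ordered most prominent first (that shape is hypothetical), and it only fills the four variables the rule covers. The helper `toCss` is likewise illustrative.

```javascript
// Hypothetical token shape: { colours: string[], fonts: string[] }, most prominent first.
const IGNORED = new Set(["#fff", "#ffffff", "transparent", "rgba(0,0,0,0)"]);

// Lower-case and strip whitespace so "#FFFFFF" and "rgba(0, 0, 0, 0)" are recognised.
function normaliseColour(value) {
  return value.toLowerCase().replace(/\s+/g, "");
}

const fontStack = (family) => (family ? `"${family}", sans-serif` : undefined);

function extractPalette(tokens) {
  // Drop white and transparent, keep the original ranking.
  const colours = tokens.colours.filter((c) => !IGNORED.has(normaliseColour(c)));
  // If only one font was scraped, reuse it for body text.
  const [heading, body = heading] = tokens.fonts;
  return {
    "--color-primary": colours[0],
    "--color-accent": colours[1],
    "--font-heading": fontStack(heading),
    "--font-body": fontStack(body),
  };
}

// Render the palette as a :root block, skipping variables that could not be filled.
function toCss(palette) {
  const lines = Object.entries(palette)
    .filter(([, value]) => value !== undefined)
    .map(([name, value]) => `  ${name}: ${value};`);
  return `:root {\n${lines.join("\n")}\n}\n`;
}
```

The remaining variables in the example (text, background, border) would need their own rules or fixed defaults, which this sketch leaves out.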
Step 2.2: Generate Nav Component
Input: Homepage screenshot + content-map.homepage.nav + content-map.homepage.socialLinks + logo path
Method: 1 Claude call
Prompt constraints:
- Must use CSS variables from global.css (not hardcoded hex values)
- Must include contact bar (phone, email, areas — left; social icons — right)
- Must match the nav layout from the screenshot (logo position, link grouping, dropdowns)
- Must include mobile hamburger menu
- Must be a self-contained Astro component with <style> and <script>
Output: src/components/Nav.astro
Step 2.3: Generate Footer Component
Input: Homepage screenshot + content-map.homepage.footer + social links + copyright
Method: 1 Claude call
Prompt constraints:
- Must use CSS variables from global.css
- Must match footer layout from screenshot
- Self-contained Astro component
Output: src/components/Footer.astro
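The "CSS variables only" constraint for Nav and Footer could be checked mechanically before the human gate. The sketch below is one possible check, not part of the documented pipeline: `findHardcodedColours` is a hypothetical helper that scans the `<style>` blocks of a generated component for hex literals in declaration values. It uses regular expressions rather than a CSS parser, so treat it as a rough lint.

```javascript
// Matches #rgb, #rgba, #rrggbb and #rrggbbaa colour literals.
const HEX_COLOUR = /#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b/g;

function findHardcodedColours(componentSource) {
  const styleBlocks = componentSource.match(/<style[^>]*>[\s\S]*?<\/style>/gi) ?? [];
  const found = new Set();
  for (const block of styleBlocks) {
    // Inspect only declaration values (the part after "property:"),
    // so id selectors such as "#add" are not mistaken for colours.
    for (const [, value] of block.matchAll(/[\w-]+\s*:\s*([^;{}]+)/g)) {
      for (const hex of value.match(HEX_COLOUR) ?? []) {
        found.add(hex.toLowerCase());
      }
    }
  }
  return [...found];
}
```

An empty result means the component passed; any returned values can be quoted back to Claude in a retry prompt.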
Step 2.4: Generate BaseLayout
Method: Deterministic (template, no Claude call)
Output: src/layouts/BaseLayout.astro
```astro
---
import Nav from "../components/Nav.astro";
import Footer from "../components/Footer.astro";
import "../styles/global.css";
const { title, description } = Astro.props;
---
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>{title}</title>
    <meta name="description" content={description}>
  </head>
  <body>
    <Nav />
    <main>
      <slot />
    </main>
    <Footer />
  </body>
</html>
```
Human Gate
After Phase 2, deploy JUST the homepage with placeholder body content so the human can verify:
- Nav layout is correct (logo position, link grouping, dropdowns)
- Footer looks right
- Colours match
- Mobile menu works
Only proceed to Phase 3 after human approval. This prevents wasting 24 Claude calls on pages that will all have the wrong nav.
Phase 3: Per-Page Body Build
Goal: Generate the unique body content for each page. Each page gets its own Claude call but shares the foundation from Phase 2.
Input per page:
- screenshots/desktop/{slug}.png: the original page screenshot
- content-map[slug]: the page's content tree (sections, images, text)
- Correct image list: extracted from content-map, only images that belong on THIS page
- global.css colour variables: page must use var(--color-primary), not #2b5672
Prompt structure:
```
SYSTEM: You are generating the BODY CONTENT ONLY for an Astro page.
The page uses BaseLayout which provides Nav, Footer, and global styles.
Output ONLY the content that goes inside the <main> slot.
Use CSS custom properties (var(--color-primary), etc.); never hardcode colours.

PAGE: {slug}
SCREENSHOT: [attached]
SECTIONS: [from content-map]

IMAGES FOR THIS PAGE:
  Hero: /assets/images/farm-painting-hero.jpg (alt: Farm painting)
  Section 2 (cards):
    /assets/images/card1.jpg (alt: Barn painting)
    /assets/images/card2.jpg (alt: Shed painting)
  Section 3 (beforeAfter):
    /assets/images/before.jpg (before)
    /assets/images/after.jpg (after)
  ...

TEXT CONTENT: [from content-map text nodes]

Generate the page body. Use ONLY the images listed above in their correct sections.
```
Output per page:
```astro
---
import BaseLayout from "../layouts/BaseLayout.astro";
---
<BaseLayout title="Farm Painting" description="...">
  <section class="hero">...</section>
  <section class="services">...</section>
  ...
</BaseLayout>
```
Key constraint: Image mapping is EXPLICIT
The prompt doesn't give Claude a list of 150 images to pick from. It gives the exact images for each section, extracted from the content map. Claude's only job is to place them in the right HTML structure. This eliminates the "wrong image in wrong section" problem.
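A minimal sketch of how the per-section image list might be derived. The content-map shape used here (`contentMap[slug].sections[]`, each with a `type` and an optional `images` array of `{ src, alt }`) is an assumption for illustration, as is the `formatImageBlock` helper; only the name `buildImageMap(slug, contentMap)` comes from this document.

```javascript
// Assumed shape: contentMap[slug] = { sections: [{ type, images?: [{ src, alt }] }] }
function buildImageMap(slug, contentMap) {
  const page = contentMap[slug];
  if (!page) throw new Error(`No content-map entry for "${slug}"`);
  return page.sections
    .map((section, index) => ({
      // Number sections by their position on the page so the labels match the screenshot.
      label: section.type === "hero" ? "Hero" : `Section ${index + 1} (${section.type})`,
      images: (section.images ?? []).map(({ src, alt = "" }) => ({ src, alt })),
    }))
    .filter((entry) => entry.images.length > 0); // sections without images are omitted
}

// Render the map as the "IMAGES FOR THIS PAGE" block of the page prompt.
function formatImageBlock(imageMap) {
  const lines = ["IMAGES FOR THIS PAGE:"];
  for (const { label, images } of imageMap) {
    lines.push(`${label}:`);
    for (const { src, alt } of images) lines.push(`  ${src} (alt: ${alt})`);
  }
  return lines.join("\n");
}
```

Because the list is built per slug, an image that belongs to another page never reaches the prompt.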
Phase 4: Verify
Goal: Automated QA with zero Claude calls. Catch issues before human review.
Step 4.1: Compile Check
Binary pass/fail. If any page fails to compile, log the error.
Step 4.2: Screenshot Built Pages
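```javascript
// Screenshot every built page; assumes the built site is served on localhost:4321
// and `slugs` holds the page slugs from content-map.json.
import { chromium } from "playwright";
const browser = await chromium.launch();
const page = await browser.newPage();
for (const slug of slugs) {
  await page.goto(`http://localhost:4321/${slug}`);
  await page.screenshot({ path: `qa/built/${slug}.png`, fullPage: true });
}
await browser.close();
```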
Step 4.3: Compare Screenshots
For each page, compare screenshots/desktop/{slug}.png (original) vs qa/built/{slug}.png (built):
- Structural similarity — are the section counts the same? Are they in the same order? (Compare section heights/positions)
- Image verification — are the correct images present in the built HTML? (Parse HTML, check src attributes against content-map)
- Colour check — extract dominant colours from built screenshot, compare to design tokens
- Content check — extract text from built HTML, compare word count to content-map text (catch missing content)
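Two of these checks work on the built HTML rather than on pixels, so they can be sketched as pure functions. The version below is a rough illustration under stated assumptions: `imageAccuracy` and `contentCoverage` are hypothetical names chosen to match the QA report fields, the expected image paths and text come from the content map, and the HTML is scanned with regular expressions where a real implementation would more likely use an HTML parser.

```javascript
// Fraction of expected image paths that appear as <img src="..."> in the built HTML.
function imageAccuracy(html, expectedSrcs) {
  if (expectedSrcs.length === 0) return 1;
  const builtSrcs = new Set(
    [...html.matchAll(/<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/gi)].map((m) => m[1])
  );
  const present = expectedSrcs.filter((src) => builtSrcs.has(src)).length;
  return present / expectedSrcs.length;
}

function countWords(text) {
  return text.split(/\s+/).filter(Boolean).length;
}

// Ratio of visible words in the built page to words in the content map, capped at 1.
function contentCoverage(html, expectedText) {
  const expectedWords = countWords(expectedText);
  if (expectedWords === 0) return 1;
  const visibleText = html
    .replace(/<(script|style)\b[\s\S]*?<\/\1>/gi, " ") // drop inline scripts and styles
    .replace(/<[^>]+>/g, " "); // drop remaining tags
  return Math.min(1, countWords(visibleText) / expectedWords);
}
```

Word count is a coarse proxy: it catches a dropped section but not reworded text, which is why the report treats it as one metric among several.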
Output: QA Report
```json
{
  "homepage": { "compile": true, "imageAccuracy": 0.95, "contentCoverage": 0.88, "colourMatch": true, "pass": true },
  "farm-painting": { "compile": true, "imageAccuracy": 0.60, "contentCoverage": 0.92, "colourMatch": true, "pass": false },
  ...
}
```
Pages scoring below threshold (e.g. 80% on any metric) go to Phase 5.
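The pass/fail rule can be written down directly. This sketch assumes the metric names from the QA report above and the example threshold of 80%; `evaluatePage` and `failingSlugs` are hypothetical helper names, and the `failures` array is an added field meant to feed the failure reasons used in Phase 5.

```javascript
const THRESHOLD = 0.8; // the example threshold from this document

function evaluatePage(metrics, threshold = THRESHOLD) {
  const failures = [];
  if (!metrics.compile) failures.push("compile: build failed");
  for (const key of ["imageAccuracy", "contentCoverage"]) {
    if (metrics[key] < threshold) {
      const got = Math.round(metrics[key] * 100);
      const want = Math.round(threshold * 100);
      failures.push(`${key}: ${got}% is below ${want}%`);
    }
  }
  if (!metrics.colourMatch) failures.push("colourMatch: palette differs from design tokens");
  // A page passes only when every check passes.
  return { ...metrics, pass: failures.length === 0, failures };
}

// Slugs that should be handed to Phase 5.
function failingSlugs(report) {
  return Object.keys(report).filter((slug) => !report[slug].pass);
}
```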
Phase 5: Fix
Goal: Targeted fixes for failing pages only. Uses Claude Vision to compare original vs built.
Input per failing page:
- Original screenshot
- Built screenshot
- Current .astro code
- Specific failure reasons from QA report (e.g. "imageAccuracy: 60% — missing 3 images", "contentCoverage: 70% — missing FAQ section")
Prompt:
```
Here are TWO screenshots:
1. ORIGINAL (target): what the page should look like
2. BUILT (current): what we generated

The current code is below. Fix ONLY the specific issues listed:
- Missing images: [list]
- Missing section: FAQ accordion
- Image in wrong position: farm-repairs.jpg is in hero, should be in section 3

Output the COMPLETE fixed file.
```
Loop:
Phase 4 → Phase 5 → Phase 4 → Phase 5 → ...
Max 3 iterations. If still failing after 3, flag for human review.
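A sketch of what the loop controller might look like. The `verify` and `fix` callbacks are hypothetical stand-ins for Phase 4 and Phase 5 and are passed in rather than implemented here; `verify(slugs)` is assumed to resolve to a report keyed by slug with a boolean `pass` field.

```javascript
const MAX_ITERATIONS = 3;

async function runFixLoop(slugs, { verify, fix }, maxIterations = MAX_ITERATIONS) {
  let report = await verify(slugs); // initial Phase 4 run over every page
  let failing = slugs.filter((slug) => !report[slug].pass);
  let iterations = 0;

  while (failing.length > 0 && iterations < maxIterations) {
    iterations += 1;
    for (const slug of failing) {
      await fix(slug, report[slug]); // Phase 5: one targeted fix per failing page
    }
    const recheck = await verify(failing); // Phase 4 again, failing pages only
    report = { ...report, ...recheck };
    failing = failing.filter((slug) => !report[slug].pass);
  }

  // Anything still failing after the last iteration is flagged for a human.
  return { report, iterations, needsHumanReview: failing };
}
```

Re-verifying only the failing pages keeps each iteration cheap and matches the aim of not re-running pages that already pass.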
Cost Model
| Phase | Claude Calls | Estimated Cost |
|---|---|---|
| Phase 1: Scrape | 0 | $0 |
| Phase 2: Foundation | 2-3 | ~$0.05 |
| Phase 3: Page builds | N pages | ~$0.02/page |
| Phase 4: Verify | 0 | $0 |
| Phase 5: Fixes | ~20% of N | ~$0.02/fix |
For a 24-page site: 3 + 24 + ~5 fixes = ~32 Claude calls ≈ $0.70
At scale (1800 sites, avg 15 pages):
- Per site: ~$0.50
- Total: ~$900 over 12 months
- Per month (150 sites): ~$75/mo
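The arithmetic behind these figures can be reproduced from the table. The sketch below uses the table's estimates as constants and assumes three foundation calls and a fix count rounded to the nearest whole page; `estimateSite` is a hypothetical helper. With these rates a 24-page site comes to 32 calls and about $0.63, and a 15-page site to about $0.41, so the quoted $0.70 and $0.50 read as rounded-up estimates.

```javascript
// Estimates taken from the cost table; all values are approximate.
const COST = { foundationTotal: 0.05, perPage: 0.02, perFix: 0.02, fixRate: 0.2 };
const FOUNDATION_CALLS = 3; // upper end of the "2-3" range

function estimateSite(pages, cost = COST) {
  const fixes = Math.round(pages * cost.fixRate); // ~20% of pages need a fix pass
  const calls = FOUNDATION_CALLS + pages + fixes;
  const dollars = cost.foundationTotal + pages * cost.perPage + fixes * cost.perFix;
  return { calls, fixes, dollars: Math.round(dollars * 100) / 100 }; // round to cents
}

const site24 = estimateSite(24); // { calls: 32, fixes: 5, dollars: 0.63 }
const site15 = estimateSite(15); // { calls: 21, fixes: 3, dollars: 0.41 }
```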
Implementation Order
Sprint 1: Pipeline Skeleton
- build-v2.js: orchestrator that runs Phases 2-5 in sequence
- Phase 2.1: extractPalette(), deterministic colour extraction
- Phase 2.4: generateBaseLayout(), template generation (no AI)
- Phase 3: Restructure page prompt to use BaseLayout + body-only output
Sprint 2: Shared Components
- Phase 2.2: generateNav(), Claude call with strict constraints
- Phase 2.3: generateFooter(), Claude call with strict constraints
- Human gate: deploy foundation for review before page builds
Sprint 3: Image Mapping
- buildImageMap(slug, contentMap): extracts per-section image list from content map
- Integrate into Phase 3 prompt so each page gets explicit image instructions
Sprint 4: Automated QA
- Phase 4.1: Compile check
- Phase 4.2: Screenshot built pages with Playwright
- Phase 4.3: Compare original vs built (structural + image + colour + content)
- Phase 4 output: QA report JSON
Sprint 5: Fix Loop
- Phase 5: Claude Vision comparison with dual screenshots
- Loop controller: Phase 4 → 5 → 4 → 5 (max 3)
- Fallback: flag for human review
Key Lessons from v1
- Never generate nav/footer per page. Generate once, share everywhere.
- Never give Claude 150 images to pick from. Map images to sections explicitly.
- Never hardcode colours in page prompts. Use CSS variables from a single source.
- Always verify before deploying. Screenshot comparison catches issues cheaply.
- The scraper is not the bottleneck anymore. The build agent prompt engineering is.
- One-shot generation doesn't work. An iterative loop with verification is necessary.
- Most pages need 1 fix pass, not 3. Target the 20% that fail, don't re-run everything.
Scraper Gap: Multi-Nav Detection
The Waterford site has TWO <nav> elements (left + right of logo). Our scraper only captures the first <nav>. This needs fixing:
- Update content.js to capture ALL <nav> elements
- Tag them with position info (left/right) or merge into a single nav tree
- Also capture nav items from <header> that aren't inside <nav> (Wix sometimes puts links outside nav)
This is a Phase 1 fix that will improve all downstream phases.
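The tagging and merging step could look roughly like the sketch below. It assumes the scraper has already collected, for each `<nav>`, its horizontal position and links (for example via `document.querySelectorAll("nav")` and `getBoundingClientRect()` in the page context) along with the logo's position. The record shapes and the names `tagNavPosition` and `mergeNavs` are hypothetical.

```javascript
// Assumed capture shape per <nav>: { x, width, items: [{ label, href }] }; logo: { x, width }.
function tagNavPosition(nav, logo) {
  const navCentre = nav.x + nav.width / 2;
  const logoCentre = logo.x + logo.width / 2;
  return { ...nav, position: navCentre < logoCentre ? "left" : "right" };
}

function mergeNavs(navs, logo) {
  const tagged = navs.map((nav) => tagNavPosition(nav, logo));
  // Order left to right so the merged tree keeps the visual reading order.
  const ordered = [...tagged].sort((a, b) => a.x - b.x);
  return {
    navs: tagged, // per-nav view, in capture order, each tagged left/right
    items: ordered.flatMap((nav) =>
      nav.items.map((item) => ({ ...item, position: nav.position }))
    ), // single merged nav tree
  };
}
```

Returning both views leaves the choice between tagging and merging open: the Nav prompt can use whichever form matches the layout in the screenshot.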