Investigation Plan: AWS Hosting + BizSite Replacement

Date: 2026-04-06
Author: Cathal Dempsey + Claude
Status: Proposed, awaiting Phase 1 kickoff


Background

Current State

  • Replatform pipeline: Scrapes Wix/Mono/WP sites → Claude builds Astro pages → deploys to Cloudflare Workers/Pages
  • BizSite: Internal Node.js tool that pulls content from Yext/Saymore → renders JS templates to flat HTML → deploys to S3 bucket
  • Hosting: Split across Cloudflare (replatform) + AWS (BizSite + API server)

Problems

  1. Two separate systems doing similar things (generate static sites from content)
  2. Hosting split across two providers — two failure points
  3. Cloudflare has a hard cap of 100 projects per account, so 1800 sites would need 18 accounts
  4. BizSite is a separate codebase to maintain

Proposed End State

One unified pipeline that can source content from either scraping (Wix migrations) or Yext API (existing BizSite clients), builds Astro sites, deploys to S3+CloudFront. BizSite retired.

Scale

  • 1800 sites total
  • 40 staff
  • 150 sites/month migration pace (12-month Wix contract cycle)
  • Build cost: ~$0.30/site (Claude API)

Phase 1: Audit BizSite + Yext Integration

Goal: Understand what BizSite does so we know what to replicate.

  • Get access to BizSite repo, read the Node.js codebase
  • Map the Yext data model — what fields are pulled? (name, address, hours, services, photos, reviews, social links, etc.)
  • Document the Yext/Saymore API endpoints and auth (API key? OAuth?)
  • Catalogue the BizSite JS templates — how many, what page types?
  • Identify which of the 1800 clients currently have Yext data vs which are Wix-only
  • Understand the current S3 deploy — bucket structure, CloudFront setup, custom domains, SSL

Output: A mapping doc showing BizSite field → Yext field → equivalent in our content-map.json schema


Phase 2: Add Yext as a Content Source

Goal: The replatform pipeline currently only scrapes. Add Yext API as an alternative content source that produces the same content-map.json format.

  • Build lib/yext-source.js — pulls a client's Yext listing, transforms to our content-map schema
  • Map Yext fields to our node types:
    | Yext Field     | Content Map Node       |
    |----------------|------------------------|
    | name           | heading                |
    | address, phone | text / contact section |
    | hours          | structured data / text |
    | description    | text nodes             |
    | photos         | image nodes            |
    | logo           | header image           |
    | socialProfiles | socialLinks array      |
    | reviews        | reviews array          |
    | services       | cards / list           |
    | categories     | nav items              |
  • Add a --source yext --yext-id <entity-id> flag to crawl.js so the pipeline can be triggered either way
  • Test: generate content-map.json from Yext for a client that currently has a BizSite, compare output

Output: Same content-map.json format whether content comes from scraping or Yext. Build agent doesn't need to know the difference.
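
A minimal sketch of the transform step inside lib/yext-source.js, assuming the Yext listing has already been fetched and arrives as a plain object using the field names from the mapping table. The function name `yextToContentMap` and every node shape (`type`, `text`, `src`, and so on) are assumptions for illustration; the real content-map.json schema should come from the Phase 1 mapping doc.

```javascript
// Hypothetical transform: Yext entity object -> content-map structure.
// Field names follow the mapping table; node shapes are assumed, not the real schema.
function yextToContentMap(entity) {
  const nodes = [];

  if (entity.name) {
    nodes.push({ type: 'heading', level: 1, text: entity.name });
  }
  if (entity.description) {
    nodes.push({ type: 'text', text: entity.description });
  }
  if (entity.address || entity.phone) {
    nodes.push({
      type: 'contact',
      address: entity.address ?? null,
      phone: entity.phone ?? null,
    });
  }
  if (entity.hours) {
    nodes.push({ type: 'hours', data: entity.hours });
  }
  for (const photo of entity.photos ?? []) {
    nodes.push({ type: 'image', src: photo.url, alt: photo.alt ?? '' });
  }
  if ((entity.services ?? []).length > 0) {
    nodes.push({ type: 'list', items: entity.services });
  }

  return {
    source: 'yext', // lets the dashboard show where the content came from
    headerImage: entity.logo ?? null,
    navItems: entity.categories ?? [],
    socialLinks: entity.socialProfiles ?? [],
    reviews: entity.reviews ?? [],
    nodes,
  };
}
```

Keeping the transform a pure function (no network calls) means it can be unit-tested against saved Yext responses before any API credentials are wired in.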


Phase 3: Migrate Hosting to S3 + CloudFront

Goal: Replace Cloudflare Workers/Pages with AWS. Single account, no project cap.

Architecture

[Astro Build Output]
        |
        v
  S3 Bucket: sites.fcrweb.ie
    /waterfordcountypainters.ie/
      index.html
      assets/
      farm-painting/index.html
      ...
    /trimtech.ie/
      index.html
      ...
        |
        v
  CloudFront Distribution
    - Custom domain per client (CNAME)
    - ACM SSL cert (free)
    - Origin: S3 bucket with path prefix
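
If one distribution serves many client domains from a single bucket, something has to translate the requested hostname into the `/{domain}/` key prefix. One option, sketched here as an assumption and not a decided design, is a CloudFront Function on the viewer-request event. The `www.` stripping and the `index.html` rewrite rules are illustrative choices.

```javascript
// Viewer-request CloudFront Function (sketch): map the Host header to an S3 key prefix.
// Assumes objects are stored as /{domain}/... in the bucket.
function handler(event) {
  var request = event.request;
  var host = request.headers.host.value.toLowerCase();

  // Treat www.example.ie and example.ie as the same site (assumption).
  if (host.indexOf('www.') === 0) {
    host = host.slice(4);
  }

  var uri = request.uri;
  if (uri.charAt(uri.length - 1) === '/') {
    // Directory-style URL: serve its index.html.
    uri += 'index.html';
  } else if (uri.slice(uri.lastIndexOf('/') + 1).indexOf('.') === -1) {
    // Last path segment has no file extension: treat it as a directory.
    uri += '/index.html';
  }

  request.uri = '/' + host + uri;
  return request;
}
```

With this in place, a request for `https://www.waterfordcountypainters.ie/farm-painting/` would be fetched from the key `waterfordcountypainters.ie/farm-painting/index.html`. The alternative is one distribution per client with a fixed origin path, which avoids the function but multiplies the number of distributions to manage.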

Tasks

  • Audit BizSite's existing S3 bucket structure: can we reuse it, or do we need a new setup?
  • Design bucket layout: single bucket with /{domain}/ prefix per client
  • Set up CloudFront distribution with per-client custom domains
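  • SSL via ACM (free): wildcard cert for *.fcrweb.ie plus per-client custom domain certs. Note that certificates used with CloudFront must be issued in us-east-1, and each distribution attaches a single certificate, so per-client certs imply either multi-domain certs or multiple distributions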
  • Build lib/deploy-s3.js to replace wrangler pages deploy:
    aws s3 sync build/ s3://sites.fcrweb.ie/{domain}/ --delete
    aws cloudfront create-invalidation --distribution-id $DIST_ID --paths "/*"
    
  • DNS: clients CNAME their domain to CloudFront distribution
  • Test: deploy one Astro build to S3+CloudFront, verify it serves correctly

Output: node deploy.js waterfordcountypainters.ie deploys to S3+CloudFront instead of Cloudflare
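
A rough sketch of what lib/deploy-s3.js could look like if it simply shells out to the AWS CLI, as the two commands in the task list suggest. The function names and the `bucket`, `distributionId`, and `buildDir` parameters are assumptions; using the AWS SDK instead of the CLI would be an equally valid choice.

```javascript
const { execFileSync } = require('node:child_process');

// Build the two AWS CLI invocations for one site. Pure function, so it can be
// tested without AWS credentials. Parameter names are illustrative.
function buildDeployCommands({ domain, bucket, distributionId, buildDir = 'build' }) {
  // Guard against anything that is not a plain lowercase hostname.
  if (!/^[a-z0-9.-]+$/.test(domain)) {
    throw new Error(`Unexpected domain: ${domain}`);
  }
  return [
    ['aws', ['s3', 'sync', `${buildDir}/`, `s3://${bucket}/${domain}/`, '--delete']],
    ['aws', ['cloudfront', 'create-invalidation',
      '--distribution-id', distributionId, '--paths', '/*']],
  ];
}

// Run the commands in order; execFileSync throws on a non-zero exit code,
// so a failed sync stops before the invalidation is requested.
function deploySite(options) {
  for (const [cmd, args] of buildDeployCommands(options)) {
    execFileSync(cmd, args, { stdio: 'inherit' });
  }
}
```

A call such as `deploySite({ domain: 'waterfordcountypainters.ie', bucket: 'sites.fcrweb.ie', distributionId: process.env.DIST_ID })` would then stand in for the current wrangler step. If all sites share one distribution, the `/*` invalidation clears the cache for every client on each deploy, so a narrower path may be worth evaluating during Phase 3.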


Phase 4: Unified Pipeline

Goal: One command, two content sources, one hosting target.

CLI Interface

# Wix migration (scrape + build + deploy)
node orchestrate.js --url https://www.example.ie --deploy s3

# Yext client (API + build + deploy)
node orchestrate.js --source yext --yext-id 12345 --deploy s3

# BizSite replacement (bulk migrate existing BizSite clients)
node migrate-bizsite.js --all
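
The flag handling for orchestrate.js could be sketched with Node's built-in `util.parseArgs`, as below. The flag names come from the examples above; the default of `scrape` when `--source` is omitted, the validation rules, and the returned object shape are assumptions.

```javascript
const { parseArgs } = require('node:util');

// Parse orchestrate.js flags into a small options object (shape is illustrative).
function parseCli(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      url: { type: 'string' },
      source: { type: 'string' },
      'yext-id': { type: 'string' },
      deploy: { type: 'string' },
    },
  });

  // Assumed default: no --source flag means the existing scrape path.
  const source = values.source ?? 'scrape';
  if (!['scrape', 'yext'].includes(source)) {
    throw new Error(`Unknown --source: ${source}`);
  }
  if (source === 'yext' && !values['yext-id']) {
    throw new Error('--source yext requires --yext-id');
  }
  if (source === 'scrape' && !values.url) {
    throw new Error('Scrape mode requires --url');
  }

  return {
    source,
    url: values.url ?? null,
    yextId: values['yext-id'] ?? null,
    deploy: values.deploy ?? null,
  };
}
```

In orchestrate.js this would be called as `parseCli(process.argv.slice(2))`. Because `parseArgs` is strict by default, a mistyped flag fails immediately instead of being silently ignored.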

Tasks

  • Update orchestrate.js to support --source yext flag
  • Build migrate-bizsite.js — iterates existing BizSite clients, pulls from Yext, builds Astro, deploys to same S3 bucket (in-place replacement)
  • Update dashboard to show content source (scraped vs Yext) per project
  • QA: side-by-side compare BizSite output vs Astro output for 5 clients
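
The bulk loop in migrate-bizsite.js might look roughly like this. The three pipeline stages are passed in as functions because their real signatures are not defined yet; `fetchContent`, `build`, `deploy`, the client record shape, and the batch size are all hypothetical.

```javascript
// Sketch of the migrate-bizsite.js loop. `steps` bundles the pipeline stages
// (fetch from Yext, build Astro, deploy) as injected async functions.
async function migrateAll(clients, steps, { batchSize = 5 } = {}) {
  const results = { migrated: [], failed: [] };

  for (let i = 0; i < clients.length; i += batchSize) {
    const batch = clients.slice(i, i + batchSize);

    // Run one batch concurrently; allSettled keeps a single failure
    // from aborting the rest of the batch.
    const settled = await Promise.allSettled(
      batch.map(async (client) => {
        const contentMap = await steps.fetchContent(client.yextId);
        const buildDir = await steps.build(client.domain, contentMap);
        await steps.deploy(client.domain, buildDir);
      })
    );

    settled.forEach((outcome, j) => {
      const domain = batch[j].domain;
      if (outcome.status === 'fulfilled') {
        results.migrated.push(domain);
      } else {
        results.failed.push({ domain, reason: String(outcome.reason) });
      }
    });
  }

  return results;
}
```

Collecting failures instead of stopping on the first error gives a retry list at the end of a run, which suits a migration spread over several weeks.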

Phase 5: Retire BizSite

  • Migrate all BizSite clients to Astro builds (batched over weeks)
  • Verify no regressions — QA agent compares old BizSite vs new Astro for each client
  • Decommission BizSite codebase
  • Remove Cloudflare account(s) if no longer needed
  • Update staff training/docs for new pipeline

Cost Analysis

Hosting

|                  | Cloudflare (current) | AWS S3+CloudFront (proposed) |
|------------------|----------------------|------------------------------|
| 1800 sites       | $0 but 18 accounts   | ~$200-400/mo                 |
| Per-site cost    | $0                   | ~$0.15/mo                    |
| Project cap      | 100/account          | Unlimited                    |
| Accounts needed  | 18                   | 1                            |
| Static bandwidth | Unlimited            | ~$0.085/GB (first 10TB)      |
| SSL              | Free, auto           | Free via ACM                 |
| Admin overhead   | High (18 accounts)   | Low (1 account)              |

Build Pipeline

| Item                | Cost                                |
|---------------------|-------------------------------------|
| Claude API (builds) | ~$0.30/site × 150/month = ~$45/mo   |
| EC2 (API + scraper) | Already running                     |
| Yext API            | Already paying                      |

Total at Scale

| Item                       | Monthly  |
|----------------------------|----------|
| AWS hosting (1800 sites)   | ~$300    |
| Claude API (150 builds/mo) | ~$45     |
| EC2                        | Existing |
| Total                      | ~$345/mo |

Compare this with the ongoing cost of maintaining two systems (BizSite + replatform) across two providers, which is not quantified here.
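
The arithmetic behind these figures can be checked with a short script. All inputs are the plan's own estimates; amounts are held in cents to avoid floating-point rounding. The per-site estimate gives about $270/mo for hosting, which sits inside the ~$200-400 range; the totals table rounds hosting to ~$300, which is where ~$345 comes from.

```javascript
// Inputs taken from the Scale and Cost Analysis sections (estimates, not quotes).
const SITES = 1800;
const BUILDS_PER_MONTH = 150;
const BUILD_COST_CENTS = 30;          // ~$0.30 per Claude build
const HOSTING_CENTS_PER_SITE = 15;    // ~$0.15 per site per month
const CLOUDFLARE_PROJECT_CAP = 100;   // projects per Cloudflare account

function monthlyCosts() {
  const hostingCents = SITES * HOSTING_CENTS_PER_SITE;
  const buildCents = BUILDS_PER_MONTH * BUILD_COST_CENTS;
  return {
    hosting: hostingCents / 100,                      // USD per month
    builds: buildCents / 100,                         // USD per month
    total: (hostingCents + buildCents) / 100,         // USD per month
    cloudflareAccounts: Math.ceil(SITES / CLOUDFLARE_PROJECT_CAP),
    monthsToMigrate: Math.ceil(SITES / BUILDS_PER_MONTH),
  };
}

console.log(monthlyCosts());
// { hosting: 270, builds: 45, total: 315, cloudflareAccounts: 18, monthsToMigrate: 12 }
```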


Key Questions to Resolve

  1. How many clients are on Yext today? This determines the size of the Phase 2 effort.
  2. Is the BizSite S3 bucket + CloudFront setup reusable? If so, we could skip most of Phase 3.
  3. Does Yext have rate limits? This matters for the bulk migration.
  4. Do BizSite clients already have custom domains on CloudFront? If yes, migration is just swapping HTML files, with zero DNS changes.
  5. Can we get read access to the BizSite repo? This is needed for Phase 1.
  6. Are there BizSite features beyond content rendering (analytics, tracking, integrations) that we would need to replicate?
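
If the answer to the rate-limit question is yes, the bulk migration will need some form of retry. A minimal sketch of exponential backoff follows, assuming the API signals throttling with HTTP 429 (to be confirmed against the Yext documentation in Phase 1). The `request` callback and the option names are hypothetical.

```javascript
// Retry an async call while it reports rate limiting. `request` is any function
// resolving to an object with a numeric `status`, as a fetch Response does.
async function withBackoff(request, { retries = 5, baseDelayMs = 500, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    const response = await request();
    if (response.status !== 429) {
      return response; // success or a non-throttling error: let the caller decide
    }
    if (attempt >= retries) {
      throw new Error(`Still rate limited after ${retries} retries`);
    }
    // Delay doubles each time: 500 ms, 1 s, 2 s, ... with the default base.
    await sleep(baseDelayMs * 2 ** attempt);
  }
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```

The `sleep` option exists only so the delay can be replaced in tests; production callers would leave it at the default.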

Timeline Estimate

| Phase                       | Duration  | Dependencies                |
|-----------------------------|-----------|-----------------------------|
| Phase 1: Audit              | 1 week    | BizSite repo access         |
| Phase 2: Yext source        | 1-2 weeks | Phase 1 complete            |
| Phase 3: S3+CF hosting      | 1 week    | Can run parallel to Phase 2 |
| Phase 4: Unified pipeline   | 1 week    | Phases 2+3 complete         |
| Phase 5: BizSite retirement | 4-8 weeks | Phased migration            |
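
Adding the durations along the critical path gives a rough end-to-end range. The sketch below assumes Phase 3 starts once Phase 1 is done (it reuses the S3 audit) and that Phases 2 and 3 overlap fully; both are assumptions, not commitments in the plan.

```javascript
// Durations in weeks as [min, max], taken from the timeline estimate.
const phases = {
  audit: [1, 1],
  yextSource: [1, 2],
  hosting: [1, 1],
  pipeline: [1, 1],
  retirement: [4, 8],
};

// Phases 2 and 3 run in parallel, so only the longer of the two counts.
function totalWeeks(p) {
  return [0, 1].map((i) =>
    p.audit[i] + Math.max(p.yextSource[i], p.hosting[i]) + p.pipeline[i] + p.retirement[i]
  );
}

console.log(totalWeeks(phases)); // [ 7, 12 ]
```

Under those assumptions the whole plan runs 7 to 12 weeks, with Phase 5 accounting for more than half of either figure.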