SEO that is measured on every deploy.
Storefront SEO is not a one-time checklist. It is a technical strategy integrated into the CI/CD pipeline: every deploy validates schemas, runs Lighthouse, and reports metrics. Here is everything we have implemented and why.
Canonical URL structure.
URLs follow a predictable, SEO-optimized pattern. The default language (Spanish) uses a clean URL without an explicit locale segment. Other languages include the language code in the URL.
/es/product/apple-iphone-15-1000012
/gb/en/product/apple-iphone-15-1000012
/es/category/smartphones-smartphones
/es/collection/new-arrivals
Slug + SKU as identifier
Product URLs include the readable slug plus the variant SKU: /product/apple-iphone-15-1000012. This combines human readability with technical uniqueness: each variant has its own SKU, so every purchasable option gets a distinct, stable URL.
6 canonical country codes
Each language is mapped to a canonical country code: ES→ES, EN→GB, DE→DE, FR→FR, IT→IT, PT→PT. The country code always appears as the first URL segment.
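The URL construction described above can be sketched as a small pure function. The identifiers here (LANG_TO_COUNTRY, buildProductUrl) are illustrative, not the actual names in the storefront codebase:

```typescript
// Language → canonical country code, as listed above.
const LANG_TO_COUNTRY: Record<string, string> = {
  es: "es", en: "gb", de: "de", fr: "fr", it: "it", pt: "pt",
};

const DEFAULT_LANG = "es";

function buildProductUrl(lang: string, slug: string, sku: string): string {
  const cc = LANG_TO_COUNTRY[lang] ?? LANG_TO_COUNTRY[DEFAULT_LANG];
  // Spanish is the default: country code only, no explicit locale segment.
  const prefix = lang === DEFAULT_LANG ? `/${cc}` : `/${cc}/${lang}`;
  return `${prefix}/product/${slug}-${sku}`;
}

// buildProductUrl("es", "apple-iphone-15", "1000012")
//   → "/es/product/apple-iphone-15-1000012"
// buildProductUrl("en", "apple-iphone-15", "1000012")
//   → "/gb/en/product/apple-iphone-15-1000012"
```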
Private pages excluded
The routes /account, /cart, /checkout, /order and /search carry noindex, nofollow. They add no SEO value and would only generate duplicate or thin content.
Heading hierarchy.
Each page type has a clear, semantic heading structure. A single <h1> per page, followed by <h2> for main sections and <h3> for subsections. No skipped levels.
7 JSON-LD schemas.
Each page type injects specific structured data. The schemas are generated with pure functions in src/lib/seo/schema-builders.ts, unit tested and rendered as Server Components. This ensures Google and other engines understand the content unambiguously.
Product
Complete schema with name, description, images, brand and multiple Offer entries (one per variant). Each Offer includes SKU, price, currency and stock status. The brand falls back to "CuevasLab Shop" when no metadata is present.
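A minimal sketch of what such a builder looks like — the real one lives in src/lib/seo/schema-builders.ts; the type and function names here, and the EUR currency, are assumptions for illustration:

```typescript
interface Variant { sku: string; price: number; inStock: boolean; }

function buildProductSchema(p: {
  name: string; description: string; images: string[];
  brand?: string; url: string; variants: Variant[];
}) {
  return {
    "@context": "https://schema.org",
    "@type": "Product",
    name: p.name,
    description: p.description,
    image: p.images,
    // Brand falls back to the store entity when metadata is missing.
    brand: { "@type": "Brand", name: p.brand ?? "CuevasLab Shop" },
    // One Offer per variant, each with its own SKU and stock status.
    offers: p.variants.map((v) => ({
      "@type": "Offer",
      sku: v.sku,
      price: v.price,
      priceCurrency: "EUR", // assumed currency for this sketch
      availability: v.inStock
        ? "https://schema.org/InStock"
        : "https://schema.org/OutOfStock",
      url: p.url,
    })),
  };
}
```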
BreadcrumbList
Hierarchical navigation: Home → Category → Product. Numeric positions. Appears on product, category and collection pages. Helps Google understand the site structure.
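The breadcrumb builder reduces to mapping a crumb trail to 1-based positions; a sketch with illustrative names:

```typescript
function buildBreadcrumbSchema(crumbs: { name: string; url: string }[]) {
  return {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: crumbs.map((c, i) => ({
      "@type": "ListItem",
      position: i + 1, // numeric, 1-based positions
      name: c.name,
      item: c.url,
    })),
  };
}
```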
CollectionPage + ItemList
On category and collection pages. CollectionPage with ItemList that includes each product as a ListItem with position, name and URL.
FAQPage
On static pages that include FAQ blocks from the CMS. Each question/answer is marked as Question + acceptedAnswer. Powers rich snippets in search results.
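The Question + acceptedAnswer mapping is mechanical; a hedged sketch (names are illustrative):

```typescript
function buildFaqSchema(faqs: { question: string; answer: string }[]) {
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: faqs.map((f) => ({
      "@type": "Question",
      name: f.question,
      acceptedAnswer: { "@type": "Answer", text: f.answer },
    })),
  };
}
```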
Organization
Brand data: name, logo, URL. Injected in the root layout so Google associates the entire site with the "CuevasLab Shop" entity.
WebSite + SearchAction
Enables the Google Sitelinks Searchbox: a search field directly in Google results. The target points to /es/search/{search_term_string}.
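The resulting JSON-LD looks roughly like this (the literal {search_term_string} placeholder is part of the schema, substituted by Google at query time):

```typescript
const webSiteSchema = {
  "@context": "https://schema.org",
  "@type": "WebSite",
  url: "https://shop.cuevaslab.es",
  potentialAction: {
    "@type": "SearchAction",
    // Google replaces the placeholder with the user's query.
    target: "https://shop.cuevaslab.es/es/search/{search_term_string}",
    "query-input": "required name=search_term_string",
  },
};
```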
AggregateRating ✓
Implemented in v1.22–v1.23 alongside the product review system. Each PDP includes AggregateRating with ratingValue, reviewCount and bestRating. Google shows stars in search results when the product has reviews.
6 languages, perfect signaling.
Every public storefront page includes <link rel="alternate" hreflang="..."> tags for all 6 supported languages plus an x-default. This tells Google which version to show based on the user's language.
<link rel="alternate" hreflang="es" href="https://shop.cuevaslab.es/es/product/..." />
<link rel="alternate" hreflang="en" href="https://shop.cuevaslab.es/gb/en/product/..." />
<link rel="alternate" hreflang="de" href="https://shop.cuevaslab.es/de/de/product/..." />
<link rel="alternate" hreflang="fr" href="https://shop.cuevaslab.es/fr/fr/product/..." />
<link rel="alternate" hreflang="it" href="https://shop.cuevaslab.es/it/it/product/..." />
<link rel="alternate" hreflang="pt" href="https://shop.cuevaslab.es/pt/pt/product/..." />
<link rel="alternate" hreflang="x-default" href="https://shop.cuevaslab.es/es/product/..." />
Server Component
The hreflang tags are rendered as a Server Component (hreflang-links.tsx), not as Next.js metadata. This gives full control over the output and makes debugging easier.
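A pure helper that such a component could call to compute the alternates — a sketch, assuming this locale-to-prefix mapping matches the URL patterns shown above (function and constant names are illustrative):

```typescript
const BASE = "https://shop.cuevaslab.es";

// Locale → URL prefix, per the canonical URL structure.
const LOCALES: [string, string][] = [
  ["es", "/es"], ["en", "/gb/en"], ["de", "/de/de"],
  ["fr", "/fr/fr"], ["it", "/it/it"], ["pt", "/pt/pt"],
];

function buildHreflangAlternates(path: string) {
  const links = LOCALES.map(([hreflang, prefix]) => ({
    hreflang,
    href: `${BASE}${prefix}${path}`,
  }));
  // x-default always points to the Spanish version.
  links.push({ hreflang: "x-default", href: `${BASE}/es${path}` });
  return links;
}
```

The Server Component then just maps this array to `<link rel="alternate" ... />` elements.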
x-default points to Spanish
The x-default fallback always points to the Spanish version, which is the storefront's primary language and has the cleanest URLs (no locale segment).
Sitemap Index per language.
Instead of a single massive sitemap, we use a Sitemap Index that generates a sub-sitemap per language. This respects the 50,000 URL limit per sitemap and allows Google to crawl each language independently.
/sitemap.xml ← Sitemap Index (auto-generated by Next.js)
/sitemap/es.xml ← Spanish: pages + products + categories + collections
/sitemap/gb.xml ← English
/sitemap/de.xml ← German
/sitemap/fr.xml ← French
/sitemap/it.xml ← Italian
/sitemap/pt.xml ← Portuguese
ISR with 24h revalidation
Sitemaps are regenerated via ISR every 24 hours. Enough for a catalog that does not change every minute. On-demand revalidation via Medusa webhook is in the backlog.
Deduplication by handle
If Medusa returns duplicate products (same handle, different ID), the sitemap deduplicates them automatically with a Set. Only the first occurrence enters the XML.
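The deduplication is a one-pass filter over a Set; a sketch with illustrative type and function names:

```typescript
interface SitemapProduct { handle: string; id: string; updated_at: string; }

function dedupeByHandle(products: SitemapProduct[]): SitemapProduct[] {
  const seen = new Set<string>();
  return products.filter((p) => {
    // Only the first occurrence of each handle enters the XML.
    if (seen.has(p.handle)) return false;
    seen.add(p.handle);
    return true;
  });
}
```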
Real lastModified
Products and collections use their real updated_at from Medusa. Static pages use deterministic dates (not new Date()) to avoid unnecessary regenerations.
Dynamic metadata per page.
Each page type generates its own metadata with Next.js generateMetadata(): title, description, canonical, Open Graph and Twitter Cards. OG images are served from Cloudinary to optimize LCP.
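The object a product page's generateMetadata() returns has roughly this shape — a sketch, not the storefront's actual implementation; field values and the helper name are illustrative:

```typescript
function buildProductMetadata(p: {
  name: string; description: string; url: string; ogImage: string;
}) {
  return {
    title: p.name,
    description: p.description,
    alternates: { canonical: p.url },
    openGraph: {
      title: p.name,
      description: p.description,
      images: [p.ogImage], // served from Cloudinary
      url: p.url,
    },
    twitter: { card: "summary_large_image", images: [p.ogImage] },
  };
}
```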
Bots see the page, not the modal.
The storefront has a GeoModal that appears on first visit to select country and language. But Google, Bing, Facebook and other bots must see the content directly, without interference. The solution: bot detection in middleware.
15+ user agents detected
Googlebot, Google-InspectionTool, Bingbot, DuckDuckBot, Baiduspider, Yandexbot, Slurp, FacebookExternalHit, Twitterbot, LinkedInBot, SemrushBot, AhrefsBot, AppleBot, MJ12Bot, PetalBot, ByteSpider.
Middleware → header → layout
The middleware analyzes the User-Agent and sets a header x-is-bot: 1. The root layout reads that header and conditionally renders the GeoModal: {!isBotRequest && <GeoModal />}.
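The check itself is a case-insensitive substring match against the known crawler list; a minimal sketch (the middleware wiring is omitted, and the function name is illustrative):

```typescript
const BOT_PATTERNS = [
  "googlebot", "google-inspectiontool", "bingbot", "duckduckbot",
  "baiduspider", "yandex", "slurp", "facebookexternalhit",
  "twitterbot", "linkedinbot", "semrushbot", "ahrefsbot",
  "applebot", "mj12bot", "petalbot", "bytespider",
];

function isBot(userAgent: string | null | undefined): boolean {
  if (!userAgent) return false; // null / undefined / empty → not a bot
  const ua = userAgent.toLowerCase();
  return BOT_PATTERNS.some((p) => ua.includes(p));
}
```

In the middleware, a truthy result sets the x-is-bot: 1 header that the root layout reads.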
Unit tested
7 tests cover: normal browsers (Chrome, Safari, Firefox) return false, all known bots return true, and edge cases (null, undefined, empty string) are handled correctly.
What gets indexed, what does not.
The storefront robots.txt is generated dynamically via Next.js (src/app/robots.ts). Private routes and APIs are excluded from crawling.
User-Agent: *
Allow: /
Disallow: /api/
Disallow: /account/
Disallow: /cart
Disallow: /checkout
Disallow: /page/design-system
Sitemap: https://shop.cuevaslab.es/sitemap.xml
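The robots.ts that produces the output above is roughly this shape (typed loosely here to stay self-contained; the real file returns Next.js's MetadataRoute.Robots):

```typescript
function robots() {
  return {
    rules: {
      userAgent: "*",
      allow: "/",
      disallow: ["/api/", "/account/", "/cart", "/checkout", "/page/design-system"],
    },
    sitemap: "https://shop.cuevaslab.es/sitemap.xml",
  };
}
```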
Metrics on every deploy.
Lighthouse runs in CI after every deployment, both staging and production. Results are stored in a GitHub Gist and visualized in the deployment dashboard with historical trends per release.
The thresholds are set to "warn" instead of "error" deliberately. Lighthouse CI on GitHub Actions runners does not reflect real user conditions — the values serve to detect regressions, not as absolute gates. The only exception is CLS, which is an error because it measures layout shifts independently of runner speed.
How we got here.
SEO was not implemented all at once. It was an iterative process in three waves, each building on the previous one.
v1.6.0 — Foundations
First implementation: dynamic sitemap, robots.txt, basic hreflang, canonical URLs, Open Graph tags. The foundation to build upon.
v1.10.0 — SEO Wave 1: Schemas
JSON-LD schemas (Product, BreadcrumbList), noindex on private pages, dynamic OG images, per-page meta descriptions. Lighthouse CI integrated into the pipeline.
v1.18.0 — Consolidation
Consolidation sprint: 7 security fixes, WCAG AA, CSP headers, LCP hero optimized with Cloudinary preconnect. SEO benefits from the overall quality improvement.
v1.19.0 — SEO Audit v2
Sitemap refactored to Sitemap Index per language. URLs with variant SKU. GeoModal hidden for bots (15+ user agents). WebSite + SearchAction schema. CollectionPage + FAQPage schemas. 14 new unit tests.
From SEO to GSO. The future is here.
Traditional SEO optimizes for Google Search: it ranks your page in the 10 blue links. GSO (Generative Search Optimization) is the evolution: optimizing so your content appears inside AI-generated answers — ChatGPT, Gemini, Copilot, Perplexity. It is no longer enough to rank: you need the AI to cite you.
The paradigm shift
In classic SEO you compete for clicks. In GSO you compete for citations. AI engines crawl your content, digest it, and include it (or not) in their answers. If your content is not citable — clear, structured, authoritative — the AI ignores it and cites someone else.
Structured data as the AI's language
The 7 JSON-LD schemas we already have are not just for Google — they are the language LLMs use to understand your content. Product, FAQ, BreadcrumbList, AggregateRating... each schema is a hint telling the AI what your page is and why it is relevant.
Citable content
Short paragraphs with clear assertions. Lists with concrete data. Explicit questions and answers (FAQ). Demonstrable authority (real experience, not generic content). CuevasLab content is written this way on purpose — not just for humans, but also for AIs.
Backlinks as authority signal
LLMs still use backlinks as a trust signal. A link from a high-authority site tells the AI your content is reliable. We are working on an organic backlink strategy: community contributions, technical articles, and presence in relevant forums.
Since this is a demo store without real products indexed by Google, we can't show actual GSO results. These screenshots illustrate how a conversational AI would reference our structured data in practice.
What gets measured, gets improved.
There is no point implementing technical SEO if you do not measure the impact. These are the KPIs we monitor and the tools we use for each.
The goal is not 100 in Lighthouse
The goal is to detect regressions. If the Performance score drops 10 points after a release, something went wrong. Absolute numbers matter less than the trend. That is why Lighthouse CI thresholds are "warn" and not "error" — they alert, they do not block.
In progress: SEO dashboard
I am working on integrating Search Console and GA4 data into a dashboard similar to the deployment one — with historical trends of impressions, clicks, CTR, and average position per page. The goal is to have the same visibility over SEO that we already have over performance.
SEO tested, not assumed.
All SEO code has unit tests. The schemas, sitemap and bot detection do not rely on "it seems to work" — they rely on tests that run in CI on every push.
Schema builders (7 tests)
Validate that Product generates correct Offers with SKU/price/currency, that BreadcrumbList has correct positions, that CollectionPage includes ItemList, and that the brand fallback works.
Sitemap (7 tests)
Verify ID generation per language, CC-to-lang mapping, canonical URLs with slug-SKU, localized slugs from metadata, category URLs with slug-handle, and deduplication by handle.
Bot detection (7 tests)
Confirm that normal browsers are not detected as bots, that all known crawlers are detected, and that null or empty values are handled without errors.