SEO that is measured on every deploy.
Storefront SEO is not a one-time checklist. It is a technical strategy integrated into the CI/CD pipeline: every deploy validates schemas, runs Lighthouse, and reports metrics. Here is everything we have implemented and why.
Canonical URL structure.
URLs follow a predictable, SEO-optimized pattern. The default language (Spanish) uses a clean URL without an explicit locale segment. Other languages include the language code in the URL.
/es/product/apple-iphone-15-1000012
/gb/en/product/apple-iphone-15-1000012
/es/category/smartphones-smartphones
/es/collection/new-arrivals
Slug + SKU as identifier
Product URLs include the readable slug plus the variant SKU: /product/apple-iphone-15-1000012. This combines human readability with technical uniqueness: each variant has its own SKU, so every purchasable option gets a distinct, stable URL.
6 canonical country codes
Each language is mapped to a canonical country code: ES→ES, EN→GB, DE→DE, FR→FR, IT→IT, PT→PT. The country code always appears as the first URL segment.
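The URL construction described above can be sketched as a small pure function. The identifiers here (LANG_TO_COUNTRY, buildProductUrl) are illustrative, not the actual names in the storefront codebase:

```typescript
// Language → canonical country code, as listed above.
const LANG_TO_COUNTRY: Record<string, string> = {
  es: "es", en: "gb", de: "de", fr: "fr", it: "it", pt: "pt",
};

const DEFAULT_LANG = "es";

function buildProductUrl(lang: string, slug: string, sku: string): string {
  const cc = LANG_TO_COUNTRY[lang] ?? LANG_TO_COUNTRY[DEFAULT_LANG];
  // Spanish is the default: country code only, no explicit locale segment.
  const prefix = lang === DEFAULT_LANG ? `/${cc}` : `/${cc}/${lang}`;
  return `${prefix}/product/${slug}-${sku}`;
}

// buildProductUrl("es", "apple-iphone-15", "1000012")
//   → "/es/product/apple-iphone-15-1000012"
// buildProductUrl("en", "apple-iphone-15", "1000012")
//   → "/gb/en/product/apple-iphone-15-1000012"
```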
Private pages excluded
The routes /account, /cart, /checkout, /order and /search carry noindex, nofollow. They add no SEO value and would only generate duplicate or thin content.
Heading hierarchy.
Each page type has a clear, semantic heading structure. A single <h1> per page, followed by <h2> for main sections and <h3> for subsections. No skipped levels.
7 JSON-LD schemas.
Each page type injects specific structured data. The schemas are generated with pure functions in src/lib/seo/schema-builders.ts, unit tested and rendered as Server Components. This ensures Google and other engines understand the content unambiguously.
Product
Complete schema with name, description, images, brand and multiple Offer entries (one per variant). Each Offer includes SKU, price, currency and stock status. The brand falls back to "CuevasLab Shop" when no metadata is present.
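A minimal sketch of what such a builder looks like — the real one lives in src/lib/seo/schema-builders.ts; the type and function names here, and the EUR currency, are assumptions for illustration:

```typescript
interface Variant { sku: string; price: number; inStock: boolean; }

function buildProductSchema(p: {
  name: string; description: string; images: string[];
  brand?: string; url: string; variants: Variant[];
}) {
  return {
    "@context": "https://schema.org",
    "@type": "Product",
    name: p.name,
    description: p.description,
    image: p.images,
    // Brand falls back to the store entity when metadata is missing.
    brand: { "@type": "Brand", name: p.brand ?? "CuevasLab Shop" },
    // One Offer per variant, each with its own SKU and stock status.
    offers: p.variants.map((v) => ({
      "@type": "Offer",
      sku: v.sku,
      price: v.price,
      priceCurrency: "EUR", // assumed currency for this sketch
      availability: v.inStock
        ? "https://schema.org/InStock"
        : "https://schema.org/OutOfStock",
      url: p.url,
    })),
  };
}
```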
BreadcrumbList
Hierarchical navigation: Home → Category → Product. Numeric positions. Appears on product, category and collection pages. Helps Google understand the site structure.
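The breadcrumb builder reduces to mapping a crumb trail to 1-based positions; a sketch with illustrative names:

```typescript
function buildBreadcrumbSchema(crumbs: { name: string; url: string }[]) {
  return {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: crumbs.map((c, i) => ({
      "@type": "ListItem",
      position: i + 1, // numeric, 1-based positions
      name: c.name,
      item: c.url,
    })),
  };
}
```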
CollectionPage + ItemList
On category and collection pages. CollectionPage with ItemList that includes each product as a ListItem with position, name and URL.
FAQPage
On static pages that include FAQ blocks from the CMS. Each question/answer is marked as Question + acceptedAnswer. Powers rich snippets in search results.
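The Question + acceptedAnswer mapping is mechanical; a hedged sketch (names are illustrative):

```typescript
function buildFaqSchema(faqs: { question: string; answer: string }[]) {
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: faqs.map((f) => ({
      "@type": "Question",
      name: f.question,
      acceptedAnswer: { "@type": "Answer", text: f.answer },
    })),
  };
}
```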
Organization
Brand data: name, logo, URL. Injected in the root layout so Google associates the entire site with the "CuevasLab Shop" entity.
WebSite + SearchAction
Enables the Google Sitelinks Searchbox: a search field directly in Google results. The target points to /es/search/{search_term_string}.
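The resulting JSON-LD looks roughly like this (the literal {search_term_string} placeholder is part of the schema, substituted by Google at query time):

```typescript
const webSiteSchema = {
  "@context": "https://schema.org",
  "@type": "WebSite",
  url: "https://shop.cuevaslab.es",
  potentialAction: {
    "@type": "SearchAction",
    // Google replaces the placeholder with the user's query.
    target: "https://shop.cuevaslab.es/es/search/{search_term_string}",
    "query-input": "required name=search_term_string",
  },
};
```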
AggregateRating ✓
Implemented in v1.22–v1.23 alongside the product review system. Each PDP includes AggregateRating with ratingValue, reviewCount and bestRating. Google shows stars in search results when the product has reviews.
6 languages, perfect signaling.
Every public storefront page includes <link rel="alternate" hreflang="..."> tags for all 6 supported languages plus an x-default. This tells Google which version to show based on the user's language.
<link rel="alternate" hreflang="es" href="https://shop.cuevaslab.es/es/product/..." />
<link rel="alternate" hreflang="en" href="https://shop.cuevaslab.es/gb/en/product/..." />
<link rel="alternate" hreflang="de" href="https://shop.cuevaslab.es/de/de/product/..." />
<link rel="alternate" hreflang="fr" href="https://shop.cuevaslab.es/fr/fr/product/..." />
<link rel="alternate" hreflang="it" href="https://shop.cuevaslab.es/it/it/product/..." />
<link rel="alternate" hreflang="pt" href="https://shop.cuevaslab.es/pt/pt/product/..." />
<link rel="alternate" hreflang="x-default" href="https://shop.cuevaslab.es/es/product/..." />
Server Component
The hreflang tags are rendered as a Server Component (hreflang-links.tsx), not as Next.js metadata. This gives full control over the output and makes debugging easier.
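A pure helper that such a component could call to compute the alternates — a sketch, assuming this locale-to-prefix mapping matches the URL patterns shown above (function and constant names are illustrative):

```typescript
const BASE = "https://shop.cuevaslab.es";

// Locale → URL prefix, per the canonical URL structure.
const LOCALES: [string, string][] = [
  ["es", "/es"], ["en", "/gb/en"], ["de", "/de/de"],
  ["fr", "/fr/fr"], ["it", "/it/it"], ["pt", "/pt/pt"],
];

function buildHreflangAlternates(path: string) {
  const links = LOCALES.map(([hreflang, prefix]) => ({
    hreflang,
    href: `${BASE}${prefix}${path}`,
  }));
  // x-default always points to the Spanish version.
  links.push({ hreflang: "x-default", href: `${BASE}/es${path}` });
  return links;
}
```

The Server Component then just maps this array to `<link rel="alternate" ... />` elements.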
x-default points to Spanish
The x-default fallback always points to the Spanish version, which is the storefront's primary language and has the cleanest URLs (no locale segment).
Sitemap Index per language.
Instead of a single massive sitemap, we use a Sitemap Index that generates a sub-sitemap per language. This respects the 50,000 URL limit per sitemap and allows Google to crawl each language independently.
/sitemap.xml ← Sitemap Index (auto-generated by Next.js)
/sitemap/es.xml ← Spanish: pages + products + categories + collections
/sitemap/gb.xml ← English
/sitemap/de.xml ← German
/sitemap/fr.xml ← French
/sitemap/it.xml ← Italian
/sitemap/pt.xml ← Portuguese
ISR with 24h revalidation
Sitemaps are regenerated via ISR every 24 hours. Enough for a catalog that does not change every minute. On-demand revalidation via Medusa webhook is in the backlog.
Deduplication by handle
If Medusa returns duplicate products (same handle, different ID), the sitemap deduplicates them automatically with a Set. Only the first occurrence enters the XML.
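The deduplication is a one-pass filter over a Set; a sketch with illustrative type and function names:

```typescript
interface SitemapProduct { handle: string; id: string; updated_at: string; }

function dedupeByHandle(products: SitemapProduct[]): SitemapProduct[] {
  const seen = new Set<string>();
  return products.filter((p) => {
    // Only the first occurrence of each handle enters the XML.
    if (seen.has(p.handle)) return false;
    seen.add(p.handle);
    return true;
  });
}
```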
Real lastModified
Products and collections use their real updated_at from Medusa. Static pages use deterministic dates (not new Date()) to avoid unnecessary regenerations.
Dynamic metadata per page.
Each page type generates its own metadata with Next.js generateMetadata(): title, description, canonical, Open Graph and Twitter Cards. OG images are served from Cloudinary to optimize LCP.
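The object a product page's generateMetadata() returns has roughly this shape — a sketch, not the storefront's actual implementation; field values and the helper name are illustrative:

```typescript
function buildProductMetadata(p: {
  name: string; description: string; url: string; ogImage: string;
}) {
  return {
    title: p.name,
    description: p.description,
    alternates: { canonical: p.url },
    openGraph: {
      title: p.name,
      description: p.description,
      images: [p.ogImage], // served from Cloudinary
      url: p.url,
    },
    twitter: { card: "summary_large_image", images: [p.ogImage] },
  };
}
```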
Bots see the page, not the modal.
The storefront has a GeoModal that appears on first visit to select country and language. But Google, Bing, Facebook and other bots must see the content directly, without interference. The solution: bot detection in middleware.
15+ user agents detected
Googlebot, Google-InspectionTool, Bingbot, DuckDuckBot, Baiduspider, Yandexbot, Slurp, FacebookExternalHit, Twitterbot, LinkedInBot, SemrushBot, AhrefsBot, AppleBot, MJ12Bot, PetalBot, ByteSpider.
Middleware → header → layout
The middleware analyzes the User-Agent and sets a header x-is-bot: 1. The root layout reads that header and conditionally renders the GeoModal: {!isBotRequest && <GeoModal />}.
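The check itself is a case-insensitive substring match against the known crawler list; a minimal sketch (the middleware wiring is omitted, and the function name is illustrative):

```typescript
const BOT_PATTERNS = [
  "googlebot", "google-inspectiontool", "bingbot", "duckduckbot",
  "baiduspider", "yandex", "slurp", "facebookexternalhit",
  "twitterbot", "linkedinbot", "semrushbot", "ahrefsbot",
  "applebot", "mj12bot", "petalbot", "bytespider",
];

function isBot(userAgent: string | null | undefined): boolean {
  if (!userAgent) return false; // null / undefined / empty → not a bot
  const ua = userAgent.toLowerCase();
  return BOT_PATTERNS.some((p) => ua.includes(p));
}
```

In the middleware, a truthy result sets the x-is-bot: 1 header that the root layout reads.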
Unit tested
7 tests cover: normal browsers (Chrome, Safari, Firefox) return false, all known bots return true, and edge cases (null, undefined, empty string) are handled correctly.
What gets indexed, what does not.
The storefront robots.txt is generated dynamically via Next.js (src/app/robots.ts). Private routes and APIs are excluded from crawling.
User-Agent: *
Allow: /
Disallow: /api/
Disallow: /account/
Disallow: /cart
Disallow: /checkout
Disallow: /page/design-system
Sitemap: https://shop.cuevaslab.es/sitemap.xml
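The robots.ts that produces the output above is roughly this shape (typed loosely here to stay self-contained; the real file returns Next.js's MetadataRoute.Robots):

```typescript
function robots() {
  return {
    rules: {
      userAgent: "*",
      allow: "/",
      disallow: ["/api/", "/account/", "/cart", "/checkout", "/page/design-system"],
    },
    sitemap: "https://shop.cuevaslab.es/sitemap.xml",
  };
}
```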
Metrics on every deploy.
Lighthouse runs in CI after every deployment, both staging and production. Results are stored in a GitHub Gist and visualized in the deployment dashboard with historical trends per release.
The thresholds are set to "warn" instead of "error" deliberately. Lighthouse CI on GitHub Actions runners does not reflect real user conditions — the values serve to detect regressions, not as absolute gates. The only exception is CLS, which is an error because it measures layout shifts independently of runner speed.
How we got here.
SEO was not implemented all at once. It was an iterative process in three waves, each building on the previous one.
v1.6.0 — Foundations
First implementation: dynamic sitemap, robots.txt, basic hreflang, canonical URLs, Open Graph tags. The foundation to build upon.
v1.10.0 — SEO Wave 1: Schemas
JSON-LD schemas (Product, BreadcrumbList), noindex on private pages, dynamic OG images, per-page meta descriptions. Lighthouse CI integrated into the pipeline.
v1.18.0 — Consolidation
Consolidation sprint: 7 security fixes, WCAG AA, CSP headers, LCP hero optimized with Cloudinary preconnect. SEO benefits from the overall quality improvement.
v1.19.0 — SEO Audit v2
Sitemap refactored to Sitemap Index per language. URLs with variant SKU. GeoModal hidden for bots (15+ user agents). WebSite + SearchAction schema. CollectionPage + FAQPage schemas. 14 new unit tests.
From SEO to GSO. The future is here.
Traditional SEO optimizes for Google Search: it ranks your page in the 10 blue links. GSO (Generative Search Optimization) is the evolution: optimizing so your content appears inside AI-generated answers — ChatGPT, Gemini, Copilot, Perplexity. It is no longer enough to rank: you need the AI to cite you.
The paradigm shift
In classic SEO you compete for clicks. In GSO you compete for citations. AI engines crawl your content, digest it, and include it (or not) in their answers. If your content is not citable — clear, structured, authoritative — the AI ignores it and cites someone else.
Structured data as the AI's language
The 7 JSON-LD schemas we already have are not just for Google — they are the language LLMs use to understand your content. Product, FAQ, BreadcrumbList, AggregateRating... each schema is a hint telling the AI what your page is and why it is relevant.
Citable content
Short paragraphs with clear assertions. Lists with concrete data. Explicit questions and answers (FAQ). Demonstrable authority (real experience, not generic content). CuevasLab content is written this way on purpose — not just for humans, but also for AIs.
Backlinks as authority signal
LLMs still use backlinks as a trust signal. A link from a high-authority site tells the AI your content is reliable. We are working on an organic backlink strategy: community contributions, technical articles, and presence in relevant forums.
Since this is a demo store without real products indexed by Google, we can't show actual GSO results. These screenshots illustrate how a conversational AI would reference our structured data in practice.
What gets measured, gets improved.
There is no point implementing technical SEO if you do not measure the impact. These are the KPIs we monitor and the tools we use for each.
The goal is not 100 in Lighthouse
The goal is to detect regressions. If the Performance score drops 10 points after a release, something went wrong. Absolute numbers matter less than the trend. That is why Lighthouse CI thresholds are "warn" and not "error" — they alert, they do not block.
In progress: SEO dashboard
I am working on integrating Search Console and GA4 data into a dashboard similar to the deployment one — with historical trends of impressions, clicks, CTR, and average position per page. The goal is to have the same visibility over SEO that we already have over performance.
SEO tested, not assumed.
All SEO code has unit tests. The schemas, sitemap and bot detection do not rely on "it seems to work" — they rely on tests that run in CI on every push.
Schema builders (7 tests)
Validate that Product generates correct Offers with SKU/price/currency, that BreadcrumbList has correct positions, that CollectionPage includes ItemList, and that the brand fallback works.
Sitemap (7 tests)
Verify ID generation per language, CC-to-lang mapping, canonical URLs with slug-SKU, localized slugs from metadata, category URLs with slug-handle, and deduplication by handle.
Bot detection (7 tests)
Confirm that normal browsers are not detected as bots, that all known crawlers are detected, and that null or empty values are handled without errors.