
Building Technical SEO for Web Apps with open-source tools

This guide provides a technical implementation path for optimizing modern web applications for search engines, focusing on Next.js and React-based architectures. It covers the transition from client-side rendering to server-side rendering (SSR) or incremental static regeneration (ISR), ensuring that crawlers receive fully rendered content and structured metadata without depending on client-side JavaScript execution.

4-6 hours · 5 steps
1

Implement Dynamic Metadata via generateMetadata

Replace static meta tags with the Next.js Metadata API so that every dynamic route gets a unique title, description, and canonical URL. This prevents duplicate-content issues and delivers the correct signals to crawlers in the initial HTTP response.

app/products/[id]/page.tsx
import { Metadata } from 'next';

type Props = { params: { id: string } };

export async function generateMetadata({ params }: Props): Promise<Metadata> {
  // Fetched server-side, so the metadata ships in the initial HTML response
  const res = await fetch(`https://api.example.com/products/${params.id}`);
  const product = await res.json();

  return {
    title: product.name,
    description: product.description,
    alternates: {
      canonical: `https://example.com/products/${params.id}`,
    },
    openGraph: {
      images: [product.image],
    },
  };
}

⚠ Common Pitfalls

  • Hardcoding absolute URLs instead of using environment variables for different stages
  • Forgetting to include a canonical tag, leading to URL parameter-based duplication
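
To avoid the hardcoded-URL pitfall, derive absolute URLs from a per-stage environment variable. A minimal sketch — `NEXT_PUBLIC_SITE_URL` is an assumed variable name, and Next.js's built-in `metadataBase` option serves the same purpose:

```typescript
// lib/site-url.ts (sketch)
// Resolve the canonical origin from the environment so staging and
// production builds emit different absolute URLs from the same code.
export function getBaseUrl(): string {
  const raw = process.env.NEXT_PUBLIC_SITE_URL ?? 'http://localhost:3000';
  return raw.replace(/\/+$/, ''); // strip trailing slashes
}

export function canonicalFor(path: string): string {
  return `${getBaseUrl()}${path.startsWith('/') ? path : `/${path}`}`;
}
```

`canonicalFor('/products/42')` can then feed `alternates.canonical` in `generateMetadata` without any hardcoded hostname.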
2

Inject JSON-LD Structured Data

Embed Schema.org markup directly into the page using a script tag with type 'application/ld+json'. This allows search engines to parse product details, reviews, or organizational info without executing complex JavaScript logic.

components/ProductJsonLd.tsx
type Product = {
  name: string;
  image: string;
  description: string;
  price: number;
};

export default function ProductJsonLd({ product }: { product: Product }) {
  const jsonLd = {
    '@context': 'https://schema.org',
    '@type': 'Product',
    name: product.name,
    image: product.image,
    description: product.description,
    offers: {
      '@type': 'Offer',
      price: product.price,
      priceCurrency: 'USD',
      availability: 'https://schema.org/InStock',
    },
  };

  return (
    <section>
      <script
        type="application/ld+json"
        dangerouslySetInnerHTML={{ __html: JSON.stringify(jsonLd) }}
      />
      <h1>{product.name}</h1>
    </section>
  );
}

⚠ Common Pitfalls

  • Inconsistent data between the visible UI and the JSON-LD payload, which can trigger search quality flags
  • Invalid nesting of schema objects causing parsing errors in Google Search Console
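
Beyond data consistency, any user-supplied strings in the payload should be escaped so they cannot terminate the inline script tag. One common precaution — a sketch, not a Next.js built-in:

```typescript
// Escape '<' so attacker-controlled text (e.g. a product description
// containing '</script>') cannot break out of the inline script tag.
// JSON.parse still round-trips the escaped string to the original value.
export function serializeJsonLd(data: object): string {
  return JSON.stringify(data).replace(/</g, '\\u003c');
}
```

Pass the result to `dangerouslySetInnerHTML` instead of a raw `JSON.stringify(jsonLd)`.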
3

Optimize Cumulative Layout Shift (CLS) for Web Vitals

Core Web Vitals are ranking factors. Use the 'next/image' component to ensure images have pre-calculated aspect ratios, preventing layout shifts as images load. Set explicit dimensions or use 'fill' with a defined aspect-ratio container.

components/Hero.tsx
import Image from 'next/image';

export default function Hero() {
  return (
    <div className="aspect-video relative w-full">
      <Image
        src="/hero.jpg"
        alt="Product Hero"
        fill
        priority
        sizes="(max-width: 768px) 100vw, 50vw"
        className="object-cover"
      />
    </div>
  );
}

⚠ Common Pitfalls

  • Using 'priority' on every image instead of only those above the fold
  • Neglecting to set a fallback height for dynamic ad slots or third-party widgets
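
The ad-slot pitfall generalizes: when an element's final size is known only by its aspect ratio, reserve its height before the content arrives. A small helper illustrating the arithmetic (names are illustrative, not a Next.js API):

```typescript
// Sketch: pre-compute the height to reserve for a slot of known aspect
// ratio, so a late-loading image, embed, or ad iframe cannot shift the
// layout. Apply the result as a min-height on the container.
export function reservedHeight(containerWidth: number, ratioW: number, ratioH: number): number {
  if (ratioW <= 0 || ratioH <= 0) throw new Error('aspect ratio must be positive');
  return Math.round(containerWidth * (ratioH / ratioW));
}
```

For example, a full-width 16:9 slot in a 1280px container reserves 720px; the CSS `aspect-ratio` property achieves the same effect declaratively, as in the Hero example above.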
4

Automate Sitemap Generation for Dynamic Routes

Create a sitemap.ts file to dynamically generate your sitemap.xml. This ensures that new content is discoverable by crawlers immediately after creation without manual updates.

app/sitemap.ts
import { MetadataRoute } from 'next';

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const products: { id: string; updatedAt: string }[] = await fetch('https://api.example.com/products').then((res) => res.json());

  // Annotate the array so 'daily' narrows to the literal union Next.js
  // expects, instead of widening to string and failing type-checking
  const productEntries: MetadataRoute.Sitemap = products.map((product) => ({
    url: `https://example.com/products/${product.id}`,
    lastModified: new Date(product.updatedAt),
    changeFrequency: 'daily',
    priority: 0.7,
  }));

  return [
    {
      url: 'https://example.com',
      lastModified: new Date(),
      changeFrequency: 'yearly',
      priority: 1,
    },
    ...productEntries,
  ];
}

⚠ Common Pitfalls

  • Exceeding the 50,000 URL limit per sitemap file without implementing a sitemap index
  • Including private, noindex, or broken URLs in the sitemap
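
For catalogs past the 50,000-URL cap, Next.js supports `generateSitemaps`: each returned id produces a separate sitemap file, and the framework emits an index. A sketch assuming hypothetical count and pagination endpoints (the type import from 'next' is omitted to keep it self-contained):

```typescript
// app/products/sitemap.ts (sketch)
const PER_SITEMAP = 50000;

// Pure helper so the shard math is easy to verify in isolation
export function shardIds(totalUrls: number, perSitemap = PER_SITEMAP): { id: number }[] {
  return Array.from({ length: Math.ceil(totalUrls / perSitemap) }, (_, id) => ({ id }));
}

// Next.js calls this at build time; one sitemap file is generated per id
export async function generateSitemaps() {
  const { total } = await fetch('https://api.example.com/products/count').then((r) => r.json());
  return shardIds(total);
}

export default async function sitemap({ id }: { id: number }) {
  const products: { id: string; updatedAt: string }[] = await fetch(
    `https://api.example.com/products?offset=${id * PER_SITEMAP}&limit=${PER_SITEMAP}`,
  ).then((r) => r.json());

  return products.map((product) => ({
    url: `https://example.com/products/${product.id}`,
    lastModified: new Date(product.updatedAt),
  }));
}
```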
5

Configure robots.txt and Crawl Directives

Explicitly define which paths should be ignored by crawlers to save crawl budget. Use the robots.ts file to manage access to internal search pages, admin panels, or API endpoints.

app/robots.ts
import { MetadataRoute } from 'next';

export default function robots(): MetadataRoute.Robots {
  return {
    rules: {
      userAgent: '*',
      allow: '/',
      disallow: ['/admin/', '/api/', '/search?'],
    },
    sitemap: 'https://example.com/sitemap.xml',
  };
}

⚠ Common Pitfalls

  • Accidentally blocking CSS or JS assets required for rendering, leading to partial rendering issues
  • Using robots.txt to try to remove a page from the index (use a 'noindex' meta tag instead; a blocked URL can remain indexed)
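
For pages that must stay crawlable but drop out of the index, the Metadata API can emit the noindex directive per route. A minimal fragment:

```typescript
// app/internal-search/page.tsx (sketch) — emits <meta name="robots"
// content="noindex, follow"> so the page is crawled but not indexed
import type { Metadata } from 'next';

export const metadata: Metadata = {
  robots: { index: false, follow: true },
};
```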

What you built

Following these steps ensures that your web application provides a clean, fast, and structured interface for search engine crawlers. Success should be verified by monitoring the 'Indexing' and 'Experience' reports in Google Search Console to ensure all dynamic pages are correctly discovered and meet the Core Web Vitals thresholds.
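
Discovery can also be spot-checked programmatically before Search Console catches up: fetch the deployed robots.txt and confirm it advertises the sitemap. The parser below is a sketch; run it against the body returned by fetching your own `/robots.txt`:

```typescript
// Sketch: extract the Sitemap: lines from a robots.txt body so a CI step
// can assert the sitemap is advertised after every deploy.
export function sitemapUrlsFrom(robotsTxt: string): string[] {
  return robotsTxt
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.toLowerCase().startsWith('sitemap:'))
    .map((line) => line.slice('sitemap:'.length).trim());
}
```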