Guides

Building SEO for Single-Page Apps with open-source tools

This guide provides a technical roadmap for making single-page applications (SPAs) fully indexable by search engines. It addresses the 'empty shell' problem, where crawlers receive only an empty root div before JavaScript executes, and ensures that content, metadata, and status codes are correctly processed by Googlebot and other search engines.

4-6 hours · 7 steps
1

Audit Initial Indexing and Rendering

Before implementing fixes, use Google Search Console's URL Inspection tool to see how Google currently renders your SPA. Compare the 'Crawled Page' (HTML source) with the 'Live Test' screenshot. If the screenshot is blank or missing key content, your client-side hydration is failing or timing out for the crawler.
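
To complement the GSC check, you can also fetch the raw, pre-hydration HTML yourself to see roughly what a crawler gets before rendering. Below is a minimal sketch in Node; the URL and the <h1> check are placeholder assumptions.

quick-check.js
const https = require('https');

// Request the page with a Googlebot User-Agent; no JavaScript runs here,
// so the response is the server-delivered shell.
https.get(
  'https://example.com/products/123', // hypothetical route
  { headers: { 'User-Agent': 'Googlebot/2.1 (+http://www.google.com/bot.html)' } },
  (res) => {
    let html = '';
    res.on('data', (chunk) => (html += chunk));
    res.on('end', () => {
      console.log('Status:', res.statusCode);
      // An empty shell typically contains no headings in the raw source
      console.log('Contains <h1>:', html.includes('<h1'));
    });
  }
);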

⚠ Common Pitfalls

  • Relying solely on 'view source' in the browser, which doesn't execute JS.
  • Ignoring the 'View Tested Page' tab in GSC, which shows console errors encountered by Googlebot.
2

Implement Dynamic Meta Tags

Inject unique titles and meta descriptions for every route. In React, use react-helmet-async; in Vue, use vue-meta. Ensure these tags update as soon as the route changes, ideally before the new view finishes rendering, so crawlers that snapshot the page early still capture correct metadata.

ProductPage.jsx
import { Helmet } from 'react-helmet-async';

const ProductPage = ({ product }) => (
  <>
    <Helmet>
      <title>{product.name} | My Store</title>
      <meta name="description" content={product.description} />
      <link rel="canonical" href={`https://example.com/p/${product.id}`} />
    </Helmet>
    <h1>{product.name}</h1>
  </>
);

⚠ Common Pitfalls

  • Forgetting to wrap the app in HelmetProvider (see the sketch after this list).
  • Using document.title directly, which may not be picked up by older crawlers.
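
For reference, the provider wrapper looks like this (a minimal sketch; App.jsx and the product prop are assumptions):

App.jsx
import { HelmetProvider } from 'react-helmet-async';
import ProductPage from './ProductPage';

// Helmet collects head tags through this provider; without it,
// react-helmet-async throws at render time.
const App = ({ product }) => (
  <HelmetProvider>
    <ProductPage product={product} />
  </HelmetProvider>
);

export default App;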
3

Configure Dynamic Rendering (Prerendering)

For existing SPAs where migrating to Next.js or Nuxt is too costly, implement dynamic rendering. Configure your web server to detect bot User-Agents (Googlebot, Bingbot) and proxy those requests to a service like Prerender.io or a self-hosted Rendertron instance.

nginx.conf
location / {
  set $prerender 0;
  # Flag known crawler and link-preview User-Agents
  if ($http_user_agent ~* "googlebot|bingbot|baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest\/0\.|developers.google.com\/\+\/web\/snippet") {
    set $prerender 1;
  }
  # Never prerender static assets, even for bots
  if ($uri ~* "\.(js|css|xml|json|png|jpg|jpeg|gif|ico|svg|woff2?|ttf|txt|pdf)$") {
    set $prerender 0;
  }

  if ($prerender = 1) {
    # Prerender.io expects the full target URL appended to the path
    rewrite .* /https://$host$request_uri break;
    proxy_pass http://service.prerender.io;
  }

  # Standard SPA fallback for human visitors
  try_files $uri $uri/ /index.html;
}

⚠ Common Pitfalls

  • Caching the prerendered HTML for too long, leading to stale content in SERPs.
  • Failing to whitelist your own API calls within the prerenderer.
4

Convert Hash Routing to History API

Search engines traditionally ignore anything after the '#' in a URL. Ensure your SPA uses the HTML5 History API (BrowserRouter in React Router) so each view has a unique, clean URL path (e.g., /products/123 instead of /#/products/123).
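
A minimal React Router v6 setup using the History API (a sketch; the paths and view components are assumptions):

index.jsx
import { createRoot } from 'react-dom/client';
import { BrowserRouter, Routes, Route } from 'react-router-dom';
import Home from './Home';       // hypothetical view components
import Product from './Product'; // e.g., reads :id via useParams()

// BrowserRouter uses the History API, so each view gets a clean,
// crawlable path such as /products/123 rather than /#/products/123.
createRoot(document.getElementById('root')).render(
  <BrowserRouter>
    <Routes>
      <Route path="/" element={<Home />} />
      <Route path="/products/:id" element={<Product />} />
    </Routes>
  </BrowserRouter>
);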

⚠ Common Pitfalls

  • Forgetting to configure the server to fall back to index.html for all sub-routes, causing 404s on page refresh (the try_files directive in Step 3 handles this for nginx).
5

Handle 404 Status Codes Programmatically

SPAs usually return a 200 OK status even for non-existent routes, because the server serves index.html for every path. To fix this, either use server-side middleware that validates the route before serving the shell, or, if you use dynamic rendering (Step 3), inject a meta tag that tells the prerender service to return a 404 status code.

NotFound.html
<!-- In your 404 component template -->
<meta name="prerender-status-code" content="404">
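
The middleware variant looks roughly like this in Express (a sketch; the allow-list patterns are placeholder assumptions, and in practice you would validate against your router manifest or API):

server.js
const express = require('express');
const path = require('path');

const app = express();

// Hypothetical allow-list; replace with your real route definitions
const knownRoutes = [/^\/$/, /^\/products\/\d+$/, /^\/blog\/[\w-]+$/];

app.use(express.static('public'));

app.get('*', (req, res) => {
  const isKnown = knownRoutes.some((re) => re.test(req.path));
  // Serve the SPA shell either way, but with an honest status code
  res.status(isKnown ? 200 : 404)
     .sendFile(path.join(__dirname, 'public', 'index.html'));
});

app.listen(3000);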

⚠ Common Pitfalls

  • Allowing 'Soft 404s' where a 'Page Not Found' message is displayed but the HTTP status code remains 200.
6

Generate and Automate XML Sitemaps

Since crawlers may struggle to discover all links in a complex JS-driven UI, provide a static XML sitemap. Create a script that fetches your product/post slugs from your API and writes a sitemap.xml to your public folder during the build process.

scripts/sitemap-gen.js
const fs = require('fs');
const axios = require('axios');

async function generateSitemap() {
  // Pull all published slugs from the CMS or API
  const resp = await axios.get('https://api.example.com/posts');
  const xml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
${resp.data.map(post => `  <url><loc>https://example.com/blog/${post.slug}</loc></url>`).join('\n')}
</urlset>`;
  fs.writeFileSync('./public/sitemap.xml', xml);
}

// Fail the build loudly if the sitemap can't be generated
generateSitemap().catch((err) => {
  console.error('Sitemap generation failed:', err);
  process.exit(1);
});
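
To keep the sitemap current, wire the script into the build, for example via an npm prebuild hook (the script names and build command here depend on your toolchain and are assumptions):

package.json
{
  "scripts": {
    "prebuild": "node scripts/sitemap-gen.js",
    "build": "react-scripts build"
  }
}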
7

Validate with Schema.org Structured Data

Help crawlers understand your content without relying on text parsing. Inject JSON-LD structured data into the head of your SPA. This is especially critical for SPAs to ensure Google identifies products, articles, or breadcrumbs correctly despite the client-side rendering delay.

StructuredData.jsx
import { Helmet } from 'react-helmet-async';

// Renders nothing visible; Helmet lifts the JSON-LD script into <head>,
// alongside the meta tags from Step 2.
const StructuredData = ({ product }) => {
  const structuredData = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": product.name,
    "description": product.description
  };

  return (
    <Helmet>
      <script type="application/ld+json">{JSON.stringify(structuredData)}</script>
    </Helmet>
  );
};
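
Once deployed, run a representative URL through Google's Rich Results Test to confirm the JSON-LD survives client-side rendering and is parsed correctly.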

What you built

Optimizing an SPA for SEO requires moving beyond client-side logic and managing how the server and crawlers interact. By implementing dynamic rendering, clean URLs, and proper metadata management, you can achieve the performance benefits of an SPA without sacrificing organic search visibility. Regularly monitor the 'Coverage' report in Google Search Console to catch rendering regressions early.