Building Nuxt with open-source tools

This guide outlines the technical implementation of a Nuxt 3 application optimized for edge runtimes, focusing on streaming AI responses and minimizing cold-start latency through efficient server-side configuration and bundle management.

45 minutes · 5 steps
Step 1: Configure Nitro for Edge Runtimes

To achieve low latency globally, you must explicitly configure Nitro to use an edge preset. This ensures the generated build targets the correct environment and runtime APIs (such as Web Streams) available on providers like Vercel Edge or Cloudflare Workers.

nuxt.config.ts
export default defineNuxtConfig({
  nitro: {
    preset: 'vercel-edge',
  },
  runtimeConfig: {
    openaiApiKey: process.env.OPENAI_API_KEY
  }
})

⚠ Common Pitfalls

  • Using Node.js specific modules (e.g., 'fs' or 'path') will cause build failures on edge runtimes.
  • Nuxt only overrides runtimeConfig values at runtime from environment variables with the NUXT_ prefix (e.g., NUXT_OPENAI_API_KEY); set them under those exact names in your deployment dashboard.
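The override convention can be sketched in plain TypeScript: Nuxt maps a camelCase runtimeConfig key to a NUXT_-prefixed, snake-cased environment variable at runtime. `resolveRuntimeValue` below is a hypothetical helper illustrating that mapping, not part of Nuxt's API:

```typescript
// Sketch of Nuxt's runtime override convention: a runtimeConfig key like
// `openaiApiKey` is overridden at runtime by the env var NUXT_OPENAI_API_KEY.
// `resolveRuntimeValue` is a hypothetical helper for illustration only.
function resolveRuntimeValue(key: string, buildTimeDefault: string): string {
  // camelCase -> SNAKE_CASE, then prefix with NUXT_
  const envKey = 'NUXT_' + key.replace(/([A-Z])/g, '_$1').toUpperCase()
  return process.env[envKey] ?? buildTimeDefault
}
```

Because the env var wins over the build-time default, you can ship one build and vary the key per environment.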
Step 2: Implement Streaming AI Server Routes

Create a server-side route using the Vercel AI SDK to handle streaming responses. This prevents the request from timing out on long-running AI generations and provides a better user experience through partial data delivery.

server/api/chat.ts
import { OpenAIStream, StreamingTextResponse } from 'ai'
import OpenAI from 'openai'

export default defineEventHandler(async (event) => {
  // Instantiate inside the handler so useRuntimeConfig can read the event context
  const openai = new OpenAI({ apiKey: useRuntimeConfig(event).openaiApiKey })
  const { messages } = await readBody(event)
  const response = await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    stream: true,
    messages,
  })
  const stream = OpenAIStream(response)
  return new StreamingTextResponse(stream)
})

⚠ Common Pitfalls

  • Forgetting to return a StreamingTextResponse will result in the client waiting for the full payload, negating streaming benefits.
  • Edge functions often have a 10-30 second execution limit; ensure your prompt logic fits within these bounds.
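Underneath the SDK, streaming is just the Web Streams API, which edge runtimes support natively. A framework-free sketch of the pattern follows; `mockTokenStream` stands in for model output and is not part of any SDK:

```typescript
// Framework-free sketch of the streaming pattern the AI SDK wraps.
// `mockTokenStream` stands in for model output; it is not SDK code.
function mockTokenStream(tokens: string[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder()
  return new ReadableStream({
    start(controller) {
      // Each token is enqueued as its own chunk, so a client can
      // render partial output before the stream closes.
      for (const t of tokens) controller.enqueue(encoder.encode(t))
      controller.close()
    },
  })
}

async function readAll(stream: ReadableStream<Uint8Array>): Promise<string> {
  const reader = stream.getReader()
  const decoder = new TextDecoder()
  let text = ''
  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    text += decoder.decode(value, { stream: true })
  }
  return text
}
```

An edge handler can return such a stream directly as the response body; StreamingTextResponse is essentially a Response wrapping a stream like this with the appropriate headers.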
Step 3: Integrate useChat for Client-Side Consumption

Use the Vercel AI SDK's Vue composables to manage chat state. This avoids manual handling of readable streams and automatically updates the UI as chunks arrive from the server.

pages/index.vue
<script setup>
import { useChat } from 'ai/vue'
const { messages, input, handleSubmit } = useChat()
</script>

<template>
  <div v-for="m in messages" :key="m.id">{{ m.content }}</div>
  <form @submit="handleSubmit">
    <input v-model="input" placeholder="Ask something..." />
  </form>
</template>

⚠ Common Pitfalls

  • Ensure you are using the '/vue' export from the 'ai' package, not the React default.
  • Hydration mismatches can occur if server-rendered messages don't match client-side initial state.
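What useChat abstracts away can be sketched as manual stream consumption: read the response body chunk by chunk and surface each partial to the UI. `consumeIntoMessage` is a hypothetical name for illustration, not SDK code:

```typescript
// Sketch of the manual stream handling that useChat performs internally:
// decode chunks as they arrive and surface each partial to the UI.
// `consumeIntoMessage` is a hypothetical name for illustration.
async function consumeIntoMessage(
  body: ReadableStream<Uint8Array>,
  onUpdate: (partial: string) => void,
): Promise<string> {
  const reader = body.getReader()
  const decoder = new TextDecoder()
  let content = ''
  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    content += decoder.decode(value, { stream: true })
    onUpdate(content) // a reactive binding would re-render here
  }
  return content
}
```

The composable does this loop for you and writes each partial into the reactive `messages` array, which is why the template above needs no stream handling at all.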
Step 4: Optimize Shared State with Pinia

When handling AI responses, you may need to persist data across routes or components. Use Pinia for state management, but ensure it is properly initialized to prevent cross-request state pollution on the server.

stores/chat.ts
import { defineStore } from 'pinia'

interface ChatMessage {
  id: string
  role: 'user' | 'assistant'
  content: string
}

export const useChatStore = defineStore('chat', {
  // Typing the array avoids the `never[]` inference from a bare `[]` in TS
  state: () => ({ history: [] as ChatMessage[] }),
  actions: {
    addMessage(msg: ChatMessage) {
      this.history.push(msg)
    }
  }
})

⚠ Common Pitfalls

  • Do not define state outside the state function, as it will be shared across all users on a warm server instance.
  • Avoid storing large binary blobs in Pinia as it increases the hydration payload sent to the client.
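The first pitfall can be demonstrated without Pinia at all: a module-level object lives for the lifetime of a warm server instance, while a factory (the shape of Pinia's `state: () => ({ ... })`) yields fresh state per instance. The names below are illustrative only:

```typescript
// Illustration of cross-request state pollution (names are hypothetical).
const sharedHistory: string[] = [] // BAD: lives as long as the warm instance

function createChatState() {
  // GOOD: a factory returns fresh state each time,
  // mirroring Pinia's `state: () => ({ ... })`
  return { history: [] as string[] }
}

function handleRequest(state: { history: string[] }, msg: string): number {
  state.history.push(msg)
  return state.history.length
}
```

Two "requests" against the shared array see each other's messages, while two requests given factory-created state each start from an empty history.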
Step 5: Bundle Analysis and Tree Shaking

Edge runtimes have strict bundle size limits (e.g., 1MB - 4MB). Use the Nuxt build analyzer to identify and remove heavy dependencies that are not required for the edge execution path.

terminal
npx nuxi analyze

⚠ Common Pitfalls

  • Heavy ORMs or legacy libraries can easily push the bundle over the edge limit.
  • If you use Prisma or Drizzle, ensure they are configured with edge-compatible drivers (e.g., @neondatabase/serverless).

What you built

By configuring Nitro for edge presets, implementing streaming server routes, and monitoring bundle sizes, you ensure a highly responsive Nuxt 3 application. This architecture minimizes TTFB (Time to First Byte) and leverages the global distribution of modern cloud providers.