Resources

100 Embeddings & Vector Search resources for developers

This resource provides a technical map for implementing embeddings and vector search, focusing on model selection, indexing strategies, and retrieval optimization. It targets developers moving beyond basic prototypes into production-grade semantic search and recommendation systems.

Embedding Models and Inference

  1. OpenAI text-embedding-3-small

     beginner · high

     Recommended starting point for general-purpose embeddings. Produces 1536-dimension vectors and accepts a 'dimensions' parameter to shorten them without retraining.

  2. Voyage AI voyage-2

     intermediate · high

     High-performance model optimized for retrieval tasks. Consistently ranks near the top of the MTEB retrieval leaderboard for document search.

  3. Cohere Embed v3

     intermediate · high

     Features an 'embedding_types' parameter for binary or int8 quantization, significantly reducing storage costs in vector databases.

  4. Hugging Face sentence-transformers

     intermediate · standard

     Standard Python library for running open-source models like BGE-large or all-MiniLM-L6-v2 locally or on private infrastructure.

  5. Jina Embeddings v2

     intermediate · medium

     Supports an 8k-token context window, making it suitable for long-form document embedding where chunking might lose context.

  6. MTEB Benchmark

     beginner · standard

     Massive Text Embedding Benchmark. Use it to compare model performance across specific tasks like clustering, retrieval, and summarization.

  7. Local Inference with ONNX Runtime

     advanced · medium

     Convert Hugging Face models to ONNX format to run embeddings in Node.js or C# environments with minimal latency.

  8. BAAI BGE-M3

     intermediate · medium

     A versatile model supporting multilingual, multi-functional (dense/sparse), and multi-granularity retrieval within a single embedding.

  9. Infinity Embedding Server

     advanced · standard

     A high-throughput, MIT-licensed inference server for deploying open-source embedding models via a REST API.

  10. Google Vertex AI Embeddings

      beginner · medium

      Enterprise-grade embedding models integrated with GCP, offering high rate limits and managed infrastructure.
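Whichever provider you pick, downstream use of these vectors almost always reduces to cosine similarity, and Matryoshka-style models (e.g. the 'dimensions' parameter above) shorten a vector by truncating it and re-normalizing to unit length. A stdlib-only sketch of both operations, using toy vectors rather than real API calls:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def truncate_embedding(vec: list[float], dims: int) -> list[float]:
    """Matryoshka-style shortening: keep the first `dims` values, then
    re-normalize to unit length so cosine scores stay comparable."""
    short = vec[:dims]
    norm = math.sqrt(sum(x * x for x in short))
    return [x / norm for x in short]

# Toy 4-d "embeddings"; real models return hundreds to thousands of dimensions.
doc = [0.40, 0.30, 0.20, 0.10]
query = [0.35, 0.35, 0.15, 0.15]
print(cosine_similarity(doc, query))  # close to 1.0 for similar directions
print(cosine_similarity(truncate_embedding(doc, 2), truncate_embedding(query, 2)))
```

Truncation trades a little recall for a large cut in storage and search cost, which is why the 'dimensions' and quantization knobs above matter in practice.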

Vector Databases and Storage

  1. pgvector for PostgreSQL

     beginner · high

     Extension enabling vector similarity search in Postgres. Best for teams already on RDS or Supabase who want to avoid adding new infrastructure.

  2. Qdrant

     intermediate · high

     Rust-based vector database. Offers advanced payload filtering and high-performance HNSW indexing at production scale.

  3. Pinecone Serverless

     beginner · high

     Managed vector database that scales with usage. Ideal for applications with unpredictable traffic and large datasets.

  4. Weaviate

     intermediate · medium

     Open-source vector DB with a GraphQL interface and native support for multi-modal data (images, video, text).

  5. ChromaDB

     beginner · standard

     Lightweight, developer-centric database often used for local RAG development and small-scale deployments.

  6. Milvus

     advanced · medium

     Cloud-native vector database designed for billion-scale vector search with decoupled storage and compute.

  7. LanceDB

     intermediate · high

     Serverless, disk-based vector database that stores data in the Lance format, optimized for random access and large-scale AI data.

  8. FAISS (Facebook AI Similarity Search)

     advanced · standard

     Library for efficient similarity search. Essential for building custom indexing pipelines or running in-memory searches.

  9. Elasticsearch Vector Search

     intermediate · medium

     Native kNN search over dense_vector fields (the separate k-NN plugin belongs to OpenSearch). Best for organizations already invested in the ELK stack for logging and search.

  10. RedisVL

      intermediate · medium

      The vector library for Redis, enabling low-latency vector indexing and search directly within a Redis instance.

Retrieval and Optimization Patterns

  1. Hybrid Search (BM25 + Vector)

     intermediate · high

     Combines traditional keyword search with semantic vector search, fusing the two result lists with Reciprocal Rank Fusion (RRF) for better accuracy.

  2. Cohere Rerank

     beginner · high

     A cross-encoder model used as a second stage to re-score the top results from a vector search for higher precision.

  3. HNSW (Hierarchical Navigable Small World)

     advanced · standard

     The industry-standard algorithm for approximate nearest neighbor (ANN) search, balancing speed and recall.

  4. Product Quantization (PQ)

     advanced · medium

     A compression technique that splits vectors into sub-vectors and quantizes each, reducing memory usage by up to 90% at a slight cost to accuracy.

  5. Maximal Marginal Relevance (MMR)

     intermediate · medium

     A retrieval strategy that re-ranks results to reduce redundancy and increase the diversity of the returned items.

  6. Parent-Document Retrieval

     intermediate · high

     Embed small chunks for search, but return the full parent document to the LLM for better context.

  7. Query Expansion via LLM

     intermediate · medium

     Use an LLM to generate multiple rephrasings of a user's query so retrieval catches more relevant vectors.

  8. Metadata Pre-filtering

     beginner · standard

     Apply hard filters (e.g., date > 2023) before the vector similarity search to narrow the search space.

  9. Contextual Compression

     intermediate · medium

     Filter out irrelevant parts of retrieved documents before passing them to an LLM to save on token costs.

  10. Multi-Vector Indexing

      advanced · medium

      Assign multiple vectors to a single document (e.g., a summary vector and a full-text vector) to improve retrieval hits.
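The hybrid-search entry above leans on Reciprocal Rank Fusion, which is simple enough to sketch in full: each ranked list contributes 1 / (k + rank) per document, so documents that appear in both the BM25 and vector lists float to the top (k = 60 is the conventional constant):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of doc ids into one list, best first.

    score(doc) = sum over lists of 1 / (k + rank), with rank starting at 1.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# BM25 and vector search disagree; RRF rewards docs both retrievers liked.
bm25_hits = ["d1", "d2", "d3"]
vector_hits = ["d3", "d1", "d4"]
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))  # d1 and d3 lead
```

Because RRF uses only ranks, not raw scores, it sidesteps the problem of BM25 scores and cosine similarities living on incompatible scales, which is why most of the databases above expose it as their default fusion method.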