Tech · April 05, 2025 · 2 min read

Vector Databases in Practice: Choosing and Using Them

A practical guide to selecting and operating vector databases for production RAG systems.

Why Vector Databases Matter

Every RAG system needs a vector database. It’s where your embedded documents live, and the quality of your retrieval depends heavily on choosing the right one and configuring it properly.

After building RAG systems on multiple vector database backends, here’s my practical guide to choosing and using them.

The Options

Pinecone

Best for: Getting started quickly, managed infrastructure

Pinecone is the easiest to set up and operate. The managed service means zero infrastructure overhead. The downsides: vendor lock-in, cost at scale, and limited query flexibility.

Weaviate

Best for: Hybrid search, complex queries

Weaviate natively supports hybrid (vector + keyword) search, which is crucial for production RAG systems. It’s open-source with a managed cloud option. The GraphQL API is powerful but has a learning curve.

ChromaDB

Best for: Prototyping, local development

ChromaDB is the simplest to get running. Perfect for prototyping and small-scale applications. I wouldn’t use it for production systems handling more than a few hundred thousand documents.

pgvector

Best for: Teams already using PostgreSQL, cost optimization

pgvector adds vector search to PostgreSQL. If you’re already running Postgres, this eliminates the need for a separate database. Performance is surprisingly good for most use cases, and you get full SQL capabilities alongside vector search.

Practical Recommendations

Start with pgvector

Unless you have specific requirements that demand a specialized vector database, start with pgvector. It’s the simplest to operate (it’s just Postgres), the cheapest to run, and sufficient for most RAG systems.
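As a minimal sketch of what that looks like in practice (table name, embedding dimension, and parameter values here are illustrative, and HNSW support assumes pgvector 0.5 or later):

```sql
-- Enable the extension (pgvector must be installed on the server)
CREATE EXTENSION IF NOT EXISTS vector;

-- A documents table with an embedding column
CREATE TABLE documents (
    id        bigserial PRIMARY KEY,
    content   text NOT NULL,
    embedding vector(1536)
);

-- HNSW index using cosine distance; m and ef_construction trade
-- build cost against recall
CREATE INDEX ON documents
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- Raise ef_search for better recall at query time (per session)
SET hnsw.ef_search = 100;

-- Nearest-neighbour query: <=> is cosine distance
SELECT id, content
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'::vector  -- placeholder query vector
LIMIT 5;
```

Because it is just Postgres, the same table supports joins, filters, and transactions alongside the vector search.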

Index Configuration Matters

The default indexing settings in most vector databases are tuned for accuracy over speed. For production systems with latency requirements, you’ll need to tune:

  • HNSW parameters: M and ef_construction trade index build cost against recall, while ef_search tunes the same trade-off at query time
  • Distance metric: Cosine similarity for normalized embeddings, inner product when vector magnitudes carry meaning (for unit-length vectors the two produce identical rankings)
  • Index rebuild frequency: Batch updates are much more efficient than single inserts
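To make the distance-metric point concrete, a small pure-Python check (the vectors are made up) shows that once embeddings are unit-normalized, inner product and cosine similarity agree:

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product divided by the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def inner(a, b):
    """Plain inner (dot) product."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

a, b = [3.0, 4.0], [1.0, 2.0]
an, bn = normalize(a), normalize(b)

# For unit vectors, inner product equals cosine similarity of the originals
assert abs(inner(an, bn) - cosine(a, b)) < 1e-9
```

This is why embedding providers that ship pre-normalized vectors let you use the (cheaper) inner-product metric without changing rankings.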

Monitor Your Recall

Set up evaluation pipelines that test retrieval quality regularly. Embedding model updates, new content, and configuration changes can all degrade recall silently.
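A recall check can be as simple as comparing retrieved document IDs against a hand-labelled gold set per query. A minimal sketch (the evaluation set and IDs here are invented):

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant docs that appear in the top-k retrieved."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

# Hypothetical evaluation set: query -> (retrieved ids in rank order, gold ids)
eval_set = {
    "q1": (["d3", "d1", "d9"], ["d1", "d2"]),
    "q2": (["d5", "d2", "d4"], ["d5"]),
}

scores = [recall_at_k(r, g, k=3) for r, g in eval_set.values()]
mean_recall = sum(scores) / len(scores)
```

Run this on a fixed query set after every embedding model update, reindex, or configuration change, and alert when the mean drops below a threshold you trust.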

Conclusion

The vector database is a critical component, but it’s not where you should spend most of your engineering effort. Choose the simplest option that meets your requirements, configure it properly, monitor recall, and spend your time on chunking, embedding, and retrieval strategies instead.
