Vector databases explained for business — what they are, when you need one, how to pick between the major options, and what they actually cost.
Vector databases are one of those infrastructure topics that have moved from "AI research curiosity" to "thing your CTO needs to understand" surprisingly quickly. In 2026, any business building custom AI that reasons over its own content is making a vector database decision — whether they realise it or not. This guide explains vector databases for business decision-makers, walks through the main options, and gives you a framework for choosing.
A vector database stores embeddings. An embedding is a numerical representation of a piece of content, usually a list of a few hundred to a few thousand numbers, generated by an embedding model. Two pieces of content with similar meaning have similar embeddings.
That property is the whole point. It lets you search by meaning instead of by keyword. Ask "how do we onboard new tenants in NSW" and the database can return documents about "tenant setup process for New South Wales" even though no exact word matches.
A vector database is, in essence, a database optimised for one operation: "find me the N items closest to this query vector". It does that fast, at scale, and with the metadata filtering and operational tooling that production systems need.
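To make that one operation concrete, here is a minimal sketch of "find the N closest items" done by brute force with cosine similarity. The document IDs and three-number vectors are invented for illustration; real embeddings come from an embedding model, and a production vector database would typically use an approximate index rather than scanning every item.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_n(query, items, n):
    """Return the n (id, score) pairs most similar to the query, best first."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]

# Toy 3-dimensional "embeddings" (hypothetical values, not model output).
documents = {
    "tenant-setup-nsw": [0.9, 0.1, 0.0],
    "tenant-offboarding": [0.6, 0.6, 0.1],
    "office-parking": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.0, 0.0]

results = top_n(query_vector, documents, 2)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

The scan above touches every vector on every query, which is fine for a few thousand items and is exactly the cost a dedicated index is designed to avoid at larger scale.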
You do not need a vector database to use ChatGPT, Claude, Microsoft Copilot, or Notion AI. Those tools have their own retrieval layers built in. See our guide to choosing AI tools for business: most businesses get a long way without ever touching a vector database directly.
You do need a vector database when you build a custom AI system over your own content. Typical triggers:
If none of those describe you, skip this article. If they do, the rest of this is for you.
The vector database market has consolidated somewhat. The serious options for business use are:
Pinecone is the most popular dedicated vector database. Fully managed, easy to start, strong performance at scale. Expensive once you grow, because pricing is based on storage, indexing, and query volume. Best for teams that want a managed service and do not mind paying for it.
Weaviate is open-source, with a managed cloud option. Strong hybrid search (vector + keyword) out of the box. Good Australia-region support. Slightly steeper learning curve than Pinecone but more capable.
Qdrant is open-source, performant, and written in Rust. Increasingly popular for self-hosting. Clean API, good documentation, supports advanced filtering. A favourite for technical teams.
pgvector is the "right answer" for many mid-market businesses. It adds vector support to Postgres, which most teams are already running. Performance has improved dramatically with HNSW indexes since 2023. Hard to beat for cost and operational simplicity if you already run Postgres.
Azure AI Search and Amazon OpenSearch Service are the cloud-native options. If you are already deep in Azure or AWS, these are pragmatic choices, because vector search is built into search products you may already be paying for. Less specialised but well-supported by your cloud provider.
A simple framework. Walk through these questions in order.
If you already run Postgres or are committed to a single cloud, start with pgvector or your cloud's native vector option. The operational simplicity is enormous. You do not need a separate database for embeddings unless you have a clear reason.
Hybrid search combines vector similarity with traditional keyword search. It consistently improves retrieval quality. Weaviate, Qdrant, Azure AI Search and OpenSearch have strong native hybrid support. Pinecone and pgvector require more orchestration.
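One common way to combine the two result lists is reciprocal rank fusion (RRF). The sketch below is a generic illustration rather than the method any particular product uses; the document IDs are hypothetical, and the constant `k=60` is a conventional default, not a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list, best first.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both searches rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical results from a vector search and a keyword search.
vector_hits = ["doc-a", "doc-b", "doc-c"]
keyword_hits = ["doc-b", "doc-d", "doc-a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

RRF only needs ranks, not raw scores, which is why it is a convenient choice when the two searches score results on incomparable scales.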
For Australian businesses with strict data residency:
If you have a strong platform team that can operate stateful infrastructure, self-hosted Qdrant or Weaviate is excellent value. If you do not, a managed service is worth the premium. Operating a poorly tuned vector database in production is its own special kind of pain.
A rough sketch of monthly costs for a mid-sized business RAG with around 1 million vectors and moderate query volume:
These numbers move around. Run the actual cost calculator for your real workload before committing.
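One input you can estimate yourself before opening a pricing page is raw vector storage. The sketch below assumes 1,536-dimension embeddings stored as 32-bit floats; both figures are assumptions for illustration, and index overhead, metadata and replicas all add to the total.

```python
def raw_vector_storage_gb(num_vectors, dimensions, bytes_per_value=4):
    """Size of the vectors alone, in decimal gigabytes (index and metadata excluded)."""
    return num_vectors * dimensions * bytes_per_value / 1e9

# 1 million vectors, 1,536 dimensions (assumed), float32 values (assumed).
size_gb = raw_vector_storage_gb(1_000_000, 1536)
print(f"{size_gb:.3f} GB")
```

At this scale the raw vectors are only a few gigabytes, which suggests that storage alone is rarely what drives the bill for a workload of this size.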
A few mistakes we see often:
Most teams agonise over vector database selection, then discover that retrieval quality is dominated by chunking strategy, embedding model choice, and re-ranking. The database matters less than you think. Pick a reasonable option and move on.
Real retrieval almost always involves metadata filters — by date, source, user permissions, document type. Pick a database that handles your filtering needs well. Pure similarity search is the textbook example; metadata-filtered search is the actual product.
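As a rough illustration of what metadata-filtered search means, the sketch below applies the filters first and ranks only the survivors by similarity. The field names (`doc_type`, `published`) and the records are hypothetical, and real databases differ in whether they filter before, during or after the vector search.

```python
from datetime import date

def dot(a, b):
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))

def filtered_search(query, records, doc_type, published_after, n):
    """Apply metadata filters first, then rank the remaining records by similarity."""
    candidates = [
        r for r in records
        if r["doc_type"] == doc_type and r["published"] > published_after
    ]
    candidates.sort(key=lambda r: dot(query, r["vector"]), reverse=True)
    return [r["id"] for r in candidates[:n]]

# Hypothetical records with toy 2-dimensional vectors.
records = [
    {"id": "policy-2021", "doc_type": "policy", "published": date(2021, 3, 1), "vector": [0.9, 0.1]},
    {"id": "policy-2025", "doc_type": "policy", "published": date(2025, 6, 1), "vector": [0.8, 0.2]},
    {"id": "memo-2025", "doc_type": "memo", "published": date(2025, 7, 1), "vector": [1.0, 0.0]},
]

hits = filtered_search([1.0, 0.0], records, "policy", date(2024, 1, 1), n=5)
print(hits)
```

Note that the most similar vector overall (the memo) never appears, because it fails the filter. That is the behaviour to test for when you evaluate a database against your own filtering needs.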
Vector databases are mostly designed for content discovery, not access control. You need a separate layer that filters retrieved chunks by what the asking user can see. This is the single most common source of "the AI showed me something I should not have seen" incidents.
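A minimal sketch of such a layer, assuming each chunk carries an `allowed_groups` set in its metadata and each user has a set of group memberships. Both names are hypothetical, and a real system would source them from your identity provider and document permissions.

```python
def visible_chunks(chunks, user_groups):
    """Keep only chunks whose allowed groups overlap the asking user's groups."""
    return [c for c in chunks if c["allowed_groups"] & user_groups]

# Hypothetical retrieval results, before any permission check.
retrieved = [
    {"id": "handbook-1", "allowed_groups": {"all-staff"}},
    {"id": "payroll-7", "allowed_groups": {"finance", "hr"}},
    {"id": "board-3", "allowed_groups": {"executive"}},
]

safe = visible_chunks(retrieved, user_groups={"all-staff", "hr"})
print([c["id"] for c in safe])
```

The important design point is that this check runs before any chunk reaches the language model, so restricted content never enters the prompt in the first place.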
Abstracting your retrieval layer so you can swap vector databases later costs little and buys real flexibility. Do not write Pinecone-specific code throughout your application.
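One way to sketch that abstraction is a small interface that the application depends on, with one adapter per database behind it. The `VectorStore` interface, the `InMemoryStore` adapter and the `answer_sources` function are all hypothetical names for illustration; a real adapter would wrap the client library of whichever database you choose.

```python
from typing import Protocol

class VectorStore(Protocol):
    """The only surface the rest of the application is allowed to use."""
    def upsert(self, item_id: str, vector: list[float]) -> None: ...
    def query(self, vector: list[float], n: int) -> list[str]: ...

class InMemoryStore:
    """Stand-in adapter; a Pinecone, Qdrant or pgvector adapter would match the same shape."""
    def __init__(self) -> None:
        self._items: dict[str, list[float]] = {}

    def upsert(self, item_id: str, vector: list[float]) -> None:
        self._items[item_id] = vector

    def query(self, vector: list[float], n: int) -> list[str]:
        def score(item_id: str) -> float:
            return sum(x * y for x, y in zip(vector, self._items[item_id]))
        return sorted(self._items, key=score, reverse=True)[:n]

def answer_sources(store: VectorStore, query_vector: list[float], n: int = 2) -> list[str]:
    """Application code depends on the interface, never on a vendor client."""
    return store.query(query_vector, n)

store = InMemoryStore()
store.upsert("a", [1.0, 0.0])
store.upsert("b", [0.0, 1.0])
store.upsert("c", [0.7, 0.7])

sources = answer_sources(store, [1.0, 0.0])
print(sources)
```

Swapping databases then means writing one new adapter and re-ingesting content, rather than hunting for vendor calls across the codebase.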
The vector database is just one component of a wider retrieval stack — embeddings, chunking, retrieval, re-ranking, generation. The whole stack matters more than any one piece. For a fuller picture, our building internal RAG systems overview covers the end-to-end architecture, and the LLM API cost management guide covers the inference economics.
If you already run Postgres and your scale is under 10 million vectors, install pgvector and start there. Otherwise, pick between Qdrant (self-host), Weaviate (hybrid search) or Pinecone (managed) based on your team's preferences. The choice is reversible — the workflow you build on top matters more.
FAQ
A vector database stores numerical representations (embeddings) of your content so you can search by meaning, not just keywords. It is what lets an AI find the right document among thousands when answering a question.
You do not need a vector database for general use of ChatGPT, Claude or Copilot. You need one when you build a custom AI system that searches over your own content, typically as part of a RAG implementation.
For most mid-market use cases, pgvector is good enough. It matured significantly between 2024 and 2026 and handles millions of vectors well. Specialised vector databases pull ahead at very high scale or when you need exotic features.
Managed vector databases typically run AUD 100–3,000 per month for mid-sized business use. Self-hosted options on existing infrastructure can be near-zero marginal cost if you are already running Postgres.
You can switch vector databases later, but expect rework. Embeddings can usually be reused if you stay on the same embedding model. Re-ingestion takes anywhere from a day to several weeks depending on volume. Avoid lock-in by abstracting your retrieval layer.
Waymouth Tech · Melbourne, Australia