In this guide, we'll explore how to store, index, and query embeddings generated by OpenAI's text-embedding-3-small model using PostgreSQL's pgvector extension. This approach is particularly well suited to building semantic search and similarity-matching systems.
Why PostgreSQL with pgvector?
- Native Vector Operations: pgvector lets PostgreSQL handle vector operations natively
- Efficient Indexing: supports fast approximate similarity searches using IVFFlat indexes
- Production Ready: scales well to large datasets
- Cost-Effective: often cheaper than a specialized vector database
Setting Up PostgreSQL with pgvector
First, let's set up our database with the necessary extension and schema:

```sql
-- Enable the vector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create a table for our embeddings
CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  content TEXT NOT NULL,
  embedding vector(1536), -- dimension size for text-embedding-3-small
  metadata JSONB,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Create an IVFFlat index for faster similarity searches
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
```
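One detail worth calling out before we write any application code: the `pg` driver serializes a JavaScript `number[]` as a Postgres array literal (`{1,2,3}`), while the `vector` type expects its own text format (`[1,2,3]`). A minimal sketch of the conversion (the `pgvector` npm package provides an equivalent `toSql()` helper):

```typescript
// Serialize a JS number[] into pgvector's text format "[0.1,0.2,...]".
// Without this, the pg driver would send a Postgres array literal,
// which the vector type rejects.
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}

// The resulting string can be passed as a query parameter and cast
// with ::vector on the SQL side.
const literal = toVectorLiteral([0.25, -0.5, 1]);
console.log(literal); // "[0.25,-0.5,1]"
```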
TypeScript Implementation
Here’s a complete implementation showing how to store and query embeddings:
```typescript
import { Pool } from "pg";
import { OpenAIEmbeddings } from "@langchain/openai";

interface Document {
  id?: number;
  content: string;
  embedding?: number[];
  metadata?: Record<string, any>;
}

class VectorStore {
  private pool: Pool;
  private embeddings: OpenAIEmbeddings;

  constructor() {
    this.pool = new Pool({
      host: "localhost",
      database: "your_database",
      user: "your_user",
      password: "your_password",
    });
    this.embeddings = new OpenAIEmbeddings({
      model: "text-embedding-3-small",
    });
  }

  async addDocuments(documents: Document[]) {
    const client = await this.pool.connect();
    try {
      await client.query("BEGIN");
      for (const doc of documents) {
        // Generate the embedding for the document
        const [embedding] = await this.embeddings.embedDocuments([doc.content]);
        // Insert document and embedding; the embedding is serialized to
        // pgvector's text format ("[0.1,0.2,...]") because the pg driver
        // would otherwise send a Postgres array literal
        await client.query(
          `INSERT INTO documents (content, embedding, metadata)
           VALUES ($1, $2::vector, $3)`,
          [doc.content, `[${embedding.join(",")}]`, doc.metadata || {}]
        );
      }
      await client.query("COMMIT");
    } catch (error) {
      await client.query("ROLLBACK");
      throw error;
    } finally {
      client.release();
    }
  }

  async similaritySearch(query: string, k: number = 5) {
    // Generate the embedding for the query
    const queryEmbedding = await this.embeddings.embedQuery(query);
    // Perform the similarity search (<=> is cosine distance)
    const result = await this.pool.query(
      `SELECT
         content,
         metadata,
         1 - (embedding <=> $1::vector) AS similarity
       FROM documents
       ORDER BY embedding <=> $1::vector
       LIMIT $2`,
      [`[${queryEmbedding.join(",")}]`, k]
    );
    return result.rows;
  }
}
```
```typescript
// Usage Example
async function main() {
  const vectorStore = new VectorStore();

  // Adding documents
  await vectorStore.addDocuments([
    {
      content: "TypeScript is a strongly typed programming language.",
      metadata: { category: "programming", language: "typescript" },
    },
    {
      content: "PostgreSQL is a powerful open-source database.",
      metadata: { category: "database", type: "relational" },
    },
  ]);

  // Performing similarity search
  const results = await vectorStore.similaritySearch(
    "What programming languages are typed?"
  );
  console.log(results);
}

main().catch(console.error);
```
Understanding the Implementation
1. Table Structure
```sql
CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  content TEXT NOT NULL,
  embedding vector(1536),
  metadata JSONB,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
```
- `content`: stores the original text
- `embedding`: the vector representation from OpenAI (1536 dimensions for text-embedding-3-small)
- `metadata`: a flexible JSONB field for additional information
- `created_at`: a timestamp for tracking
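Because `metadata` is JSONB, a similarity search can also be restricted by structured filters. As a sketch, a hypothetical helper (not part of the class below) that turns a flat filter object into a parameterized JSONB `WHERE` clause:

```typescript
// Hypothetical helper: build a parameterized WHERE clause from a flat
// metadata filter, e.g. { category: "programming" }. Keys are interpolated
// into the SQL, so they must come from trusted code, never user input;
// the values go through query parameters.
function buildMetadataFilter(
  filter: Record<string, string>,
  startIndex: number
): { clause: string; values: string[] } {
  const keys = Object.keys(filter);
  if (keys.length === 0) return { clause: "", values: [] };
  const conditions = keys.map(
    // metadata->>'key' extracts a JSONB field as text
    (key, i) => `metadata->>'${key}' = $${startIndex + i}`
  );
  return {
    clause: `WHERE ${conditions.join(" AND ")}`,
    values: keys.map((key) => filter[key]),
  };
}

const { clause, values } = buildMetadataFilter(
  { category: "programming", language: "typescript" },
  3 // $1 and $2 are already taken by the embedding and LIMIT
);
console.log(clause);
// WHERE metadata->>'category' = $3 AND metadata->>'language' = $4
console.log(values); // ["programming", "typescript"]
```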
2. Adding Documents
```typescript
async addDocuments(documents: Document[]) {
  // ... transaction handling ...
  for (const doc of documents) {
    // Generate the embedding for the document
    const [embedding] = await this.embeddings.embedDocuments([doc.content]);
    // Insert document and embedding (serialized to pgvector's text format)
    await client.query(
      `INSERT INTO documents (content, embedding, metadata)
       VALUES ($1, $2::vector, $3)`,
      [doc.content, `[${embedding.join(",")}]`, doc.metadata || {}]
    );
  }
}
```
This method:
- wraps all inserts in a single transaction, rolling back on any failure for data integrity
- generates an embedding for each document with the OpenAI API
- stores the content, embedding, and metadata together
3. Similarity Search
```typescript
async similaritySearch(query: string, k: number = 5) {
  const queryEmbedding = await this.embeddings.embedQuery(query);
  const result = await this.pool.query(
    `SELECT
       content,
       metadata,
       1 - (embedding <=> $1::vector) AS similarity
     FROM documents
     ORDER BY embedding <=> $1::vector
     LIMIT $2`,
    [`[${queryEmbedding.join(",")}]`, k]
  );
  return result.rows;
}
```
Key points:
- `<=>` is the cosine distance operator
- `1 - (embedding <=> $1)` converts the distance into a similarity score
- the query can use the IVFFlat index for efficient approximate searching
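To make the similarity score concrete, here is the same math in plain TypeScript. Cosine distance, as computed by `<=>`, is one minus the cosine of the angle between the two vectors:

```typescript
// Cosine distance, as computed by pgvector's <=> operator:
// 1 - (a · b) / (|a| * |b|). The SQL above turns it back into a
// similarity score with 1 - distance.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(1 - cosineDistance([1, 0], [1, 0])); // 1 (same direction)
console.log(1 - cosineDistance([1, 0], [0, 1])); // 0 (orthogonal)
```

So a similarity of 1 means the embeddings point in the same direction, and values near 0 mean they are unrelated.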
Performance Optimization Tips
- Batch Insertions: when adding many documents, use `COPY` or batched inserts
- Index Tuning: adjust the `lists` parameter based on your dataset size, and build the index after bulk loading, since IVFFlat clusters the rows that already exist:
```sql
-- Rule of thumb: lists = sqrt(number_of_rows)
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
```
- Connection Pooling: use connection pooling for better performance
- Regular VACUUM: schedule regular VACUUM operations to maintain index efficiency
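The batch-insertion tip can be sketched as a single multi-row `INSERT` instead of one round trip per document. This helper is hypothetical (not part of the `VectorStore` class above); it only builds the query text and parameter list:

```typescript
// Hypothetical sketch of the batch-insert tip: one multi-row INSERT with
// placeholders numbered $1..$3n for n documents.
interface Doc {
  content: string;
  embedding: number[];
  metadata?: Record<string, any>;
}

function buildBatchInsert(docs: Doc[]): { text: string; values: any[] } {
  // Each row contributes three placeholders: content, embedding, metadata
  const rows = docs.map(
    (_, i) => `($${3 * i + 1}, $${3 * i + 2}::vector, $${3 * i + 3})`
  );
  const values = docs.flatMap((d) => [
    d.content,
    `[${d.embedding.join(",")}]`, // pgvector text format
    d.metadata || {},
  ]);
  return {
    text: `INSERT INTO documents (content, embedding, metadata) VALUES ${rows.join(", ")}`,
    values,
  };
}

const { text, values } = buildBatchInsert([
  { content: "a", embedding: [0, 1] },
  { content: "b", embedding: [1, 0] },
]);
console.log(text);
// INSERT INTO documents (content, embedding, metadata) VALUES ($1, $2::vector, $3), ($4, $5::vector, $6)
console.log(values.length); // 6
```

The result can be passed straight to `pool.query(text, values)`; for very large loads, `COPY` remains the faster option.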
Example Usage
```typescript
// Initialize the vector store
const vectorStore = new VectorStore();

// Add documents
await vectorStore.addDocuments([
  {
    content: "TypeScript adds optional static types to JavaScript",
    metadata: { topic: "programming" },
  },
]);

// Search for similar content
const results = await vectorStore.similaritySearch("What is static typing?", 5);
console.log(results);
```
Conclusion
Using PostgreSQL with pgvector provides a robust, scalable solution for storing and querying embeddings. The implementation above gives you a foundation for building semantic search systems, recommendation engines, or any application that requires vector similarity search.
Remember to:
- Monitor index performance
- Tune parameters based on your dataset
- Keep your PostgreSQL and pgvector versions updated
- Consider batch operations for large datasets
This setup provides an excellent balance of performance, cost, and maintainability for production systems working with embeddings.