Storing and Querying OpenAI Embeddings in PostgreSQL with pg_vector

Published at 02:12 AM

In this guide, we’ll explore how to effectively store, index, and query embeddings generated from OpenAI’s text-embedding-3-small model using PostgreSQL’s pg_vector extension. This approach is particularly powerful for building semantic search and similarity matching systems.

Why PostgreSQL with pg_vector?

  1. Native Vector Operations: pg_vector adds a vector column type and distance operators directly to PostgreSQL
  2. Efficient Indexing: Supports fast approximate similarity search using IVFFlat indexes
  3. Production Ready: Runs inside a mature database with transactions, backups, and replication built in
  4. Cost-Effective: Often cheaper than a dedicated vector database, since it reuses infrastructure you already operate

Setting Up PostgreSQL with pg_vector

First, let’s set up our database with the necessary extension and schema:

-- Enable the vector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create a table for our embeddings
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    embedding vector(1536), -- dimension size for text-embedding-3-small
    metadata JSONB,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Create an IVFFlat index for faster similarity searches
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
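
One detail worth knowing up front: pg_vector parses vector input from bracketed string literals such as '[0.1, 0.2, 0.3]', which is why the TypeScript code below serializes embeddings with JSON.stringify before binding them as query parameters. A quick sanity check from Node (a minimal sketch, assuming a local database with the extension enabled):

import { Pool } from "pg";

const pool = new Pool({ host: "localhost", database: "your_database" });

async function checkVector() {
  // A JSON-serialized number[] produces exactly the bracketed
  // literal that the vector type parses.
  const { rows } = await pool.query(
    "SELECT $1::vector(3) <=> $2::vector(3) AS cosine_distance",
    [JSON.stringify([1, 0, 0]), JSON.stringify([0, 1, 0])]
  );
  console.log(rows[0].cosine_distance); // orthogonal vectors: distance 1
  await pool.end();
}

checkVector().catch(console.error);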

TypeScript Implementation

Here’s a complete implementation showing how to store and query embeddings:

import { Pool } from "pg";
// OpenAIEmbeddings now lives in the @langchain/openai package
import { OpenAIEmbeddings } from "@langchain/openai";

interface Document {
  id?: number;
  content: string;
  embedding?: number[];
  metadata?: Record<string, any>;
}

class VectorStore {
  private pool: Pool;
  private embeddings: OpenAIEmbeddings;

  constructor() {
    this.pool = new Pool({
      host: "localhost",
      database: "your_database",
      user: "your_user",
      password: "your_password",
    });

    this.embeddings = new OpenAIEmbeddings({
      // reads OPENAI_API_KEY from the environment by default
      modelName: "text-embedding-3-small",
    });
  }

  async addDocuments(documents: Document[]) {
    const client = await this.pool.connect();

    try {
      await client.query("BEGIN");

      for (const doc of documents) {
        // Generate embedding for the document
        const [embedding] = await this.embeddings.embedDocuments([doc.content]);

        // Insert document and embedding (pg_vector expects a
        // bracketed string literal, hence JSON.stringify)
        await client.query(
          `
          INSERT INTO documents (content, embedding, metadata)
          VALUES ($1, $2::vector, $3)
          `,
          [doc.content, JSON.stringify(embedding), doc.metadata || {}]
        );
      }

      await client.query("COMMIT");
    } catch (error) {
      await client.query("ROLLBACK");
      throw error;
    } finally {
      client.release();
    }
  }

  async similaritySearch(query: string, k: number = 5) {
    // Generate embedding for the query (embedQuery embeds a single text)
    const queryEmbedding = await this.embeddings.embedQuery(query);

    // Perform similarity search: <=> is cosine distance,
    // so 1 - distance gives cosine similarity
    const result = await this.pool.query(
      `
      SELECT
        content,
        metadata,
        1 - (embedding <=> $1::vector) as similarity
      FROM documents
      ORDER BY embedding <=> $1::vector
      LIMIT $2
      `,
      [JSON.stringify(queryEmbedding), k]
    );

    return result.rows;
  }
}

// Usage Example
async function main() {
  const vectorStore = new VectorStore();

  // Adding documents
  await vectorStore.addDocuments([
    {
      content: "TypeScript is a strongly typed programming language.",
      metadata: { category: "programming", language: "typescript" },
    },
    {
      content: "PostgreSQL is a powerful open-source database.",
      metadata: { category: "database", type: "relational" },
    },
  ]);

  // Performing similarity search
  const results = await vectorStore.similaritySearch(
    "What programming languages are typed?"
  );
  console.log(results);
}

main().catch(console.error);

Understanding the Implementation

1. Table Structure

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    embedding vector(1536),
    metadata JSONB,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
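
The vector(1536) column must exactly match the embedding model's output dimension; pg_vector rejects inserts of any other length. A small guard can fail fast in application code (a hypothetical helper, not part of the implementation above):

function assertDimension(embedding: number[], expected = 1536): void {
  // pg_vector raises a dimension-mismatch error at insert time;
  // checking up front gives a clearer failure at the call site.
  if (embedding.length !== expected) {
    throw new Error(
      `Expected a ${expected}-dimension embedding, got ${embedding.length}`
    );
  }
}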

2. Adding Documents

async addDocuments(documents: Document[]) {
  // ... transaction handling ...

  // Generate embedding for the document
  const [embedding] = await this.embeddings.embedDocuments([doc.content]);

  // Insert document and embedding (serialized for pg_vector)
  await client.query(
    `INSERT INTO documents (content, embedding, metadata)
     VALUES ($1, $2::vector, $3)`,
    [doc.content, JSON.stringify(embedding), doc.metadata || {}]
  );
}

This method:

  1. Opens a transaction so that a failure rolls back all pending inserts
  2. Generates embeddings using OpenAI
  3. Stores the content, embedding, and optional metadata in one row
  4. Commits on success and rolls back on any error

3. Similarity Search

async similaritySearch(query: string, k: number = 5) {
  const queryEmbedding = await this.embeddings.embedQuery(query);

  const result = await this.pool.query(
    `SELECT
      content,
      metadata,
      1 - (embedding <=> $1::vector) as similarity
    FROM documents
    ORDER BY embedding <=> $1::vector
    LIMIT $2`,
    [JSON.stringify(queryEmbedding), k]
  );

  return result.rows;
}

Key points:

  1. <=> is pg_vector's cosine distance operator, which is what the vector_cosine_ops index accelerates
  2. 1 - (embedding <=> $1) converts cosine distance into a similarity score, so higher means more similar
  3. Ordering by the distance expression lets the IVFFlat index drive the nearest-neighbor scan
  4. LIMIT $2 caps the result at the k closest documents
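
To narrow results by metadata, JSONB filters compose naturally with the vector ordering. Here is a sketch against the schema above (the filteredSearch helper and the category key are illustrative, not part of the class):

import { Pool } from "pg";

// Sketch: metadata-filtered similarity search. The ->> operator
// extracts a JSONB field as text; the vector ordering is unchanged.
async function filteredSearch(
  pool: Pool,
  queryEmbedding: number[],
  category: string,
  k = 5
) {
  const result = await pool.query(
    `SELECT content, metadata, 1 - (embedding <=> $1::vector) as similarity
     FROM documents
     WHERE metadata->>'category' = $2
     ORDER BY embedding <=> $1::vector
     LIMIT $3`,
    [JSON.stringify(queryEmbedding), category, k]
  );
  return result.rows;
}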

Performance Optimization Tips

  1. Batch Insertions: When adding many documents, use COPY or multi-row inserts (see the sketch after this list)

  2. Index Tuning: Adjust the lists parameter based on your dataset size:

    -- Rule of thumb: lists = rows / 1000 (up to ~1M rows),
    -- or sqrt(rows) for larger tables
    CREATE INDEX ON documents
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
    
  3. Connection Pooling: Use connection pooling for better performance

  4. Regular VACUUM: Schedule regular VACUUM operations to maintain index efficiency
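
As referenced in tip 1, here is a minimal batched-insert sketch (assuming the documents table above; the addDocumentsBatch helper is illustrative). A single INSERT with many VALUES tuples avoids one network round trip per document:

import { Pool } from "pg";

// Build one multi-row INSERT with two parameters per document
// (content, embedding), numbered $1, $2, $3, $4, ...
async function addDocumentsBatch(
  pool: Pool,
  docs: { content: string; embedding: number[] }[]
) {
  const tuples = docs.map((_, i) => `($${i * 2 + 1}, $${i * 2 + 2}::vector)`);
  const params = docs.flatMap((d) => [d.content, JSON.stringify(d.embedding)]);

  await pool.query(
    `INSERT INTO documents (content, embedding) VALUES ${tuples.join(", ")}`,
    params
  );
}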

Example Usage

// Initialize the vector store
const vectorStore = new VectorStore();

// Add documents
await vectorStore.addDocuments([
  {
    content: "TypeScript adds optional static types to JavaScript",
    metadata: { topic: "programming" }
  }
]);

// Search for similar content
const results = await vectorStore.similaritySearch(
  "What is static typing?",
  5
);

console.log(results);
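
For stronger typing, the rows returned by similaritySearch can be described with an interface that mirrors the SELECT list (a suggested addition, not part of the class above):

interface SearchResult {
  content: string;
  metadata: Record<string, any>;
  similarity: number; // 1 - cosine distance, so higher is closer
}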

Conclusion

Using PostgreSQL with pg_vector provides a robust, scalable solution for storing and querying embeddings. The implementation above gives you a foundation for building semantic search systems, recommendation engines, or any application requiring vector similarity search.

Remember to:

  1. Match the vector column dimension to your embedding model (1536 for text-embedding-3-small)
  2. Tune the IVFFlat lists parameter as your dataset grows
  3. Batch insertions and use connection pooling under load
  4. Schedule regular VACUUM operations to keep the index efficient

This setup provides an excellent balance of performance, cost, and maintainability for production systems working with embeddings.

