Tags: ai-development · openai · embeddings · semantic search · vector embeddings

OpenAI Embeddings API: Complete Semantic Search Guide

Master semantic search with OpenAI embeddings. Complete implementation guide with code examples, best practices, and real-world patterns for developers.

📖 20 min read 📅 April 7, 2026 ✍ By PropTechUSA AI

The evolution from keyword-based to semantic search represents one of the most significant advances in information retrieval technology. While traditional search relies on exact matches and keyword frequency, semantic search understands context, intent, and meaning. OpenAI's Embeddings API has democratized access to this powerful capability, enabling developers to build sophisticated search experiences that truly understand what users are looking for.

Understanding Vector Embeddings and Semantic Similarity

Vector embeddings transform text into high-dimensional numerical representations that capture semantic meaning. Unlike traditional keyword matching, these dense vectors encode relationships between concepts, allowing machines to understand that "apartment" and "residence" are semantically similar, even without shared characters.

The Mathematics Behind Semantic Understanding

OpenAI embeddings convert text into 1536-dimensional vectors using transformer-based neural networks. Each dimension represents learned features about language patterns, context, and meaning. The key insight is that semantically similar texts produce vectors that are close together in this high-dimensional space.

Semantic similarity is typically measured using cosine similarity, which compares the angle between two vectors. Values range from -1 (opposite direction) to 1 (identical direction), with higher scores indicating greater semantic similarity; in practice, OpenAI embeddings almost always produce positive values.

```python
from scipy.spatial.distance import cosine

def calculate_similarity(embedding1, embedding2):
    """Calculate cosine similarity between two embeddings."""
    return 1 - cosine(embedding1, embedding2)

# Simplified 4-dimensional vectors for illustration;
# real OpenAI embeddings have 1536 dimensions
vector_apartment = [0.2, 0.8, 0.3, 0.1]
vector_residence = [0.25, 0.75, 0.35, 0.12]  # similar semantic meaning
vector_automobile = [0.7, 0.1, 0.9, 0.4]     # different semantic domain

print(f"Apartment vs Residence: {calculate_similarity(vector_apartment, vector_residence):.3f}")
print(f"Apartment vs Automobile: {calculate_similarity(vector_apartment, vector_automobile):.3f}")
```

Why Traditional Search Falls Short

Traditional keyword-based search struggles with several fundamental limitations: it cannot match synonyms ("apartment" never matches "residence"), it ignores context and user intent, and it is brittle against typos, plurals, and other morphological variation.

Semantic search addresses these challenges by understanding meaning rather than matching characters.
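To make the vocabulary-mismatch problem concrete, here is a minimal sketch (the `keywordOverlap` helper is illustrative, not part of any library) showing how literal term matching scores a synonymous phrasing at zero:

```typescript
// Naive keyword matching: a document only scores if it shares literal
// terms with the query. Synonymous phrasings score zero.
function keywordOverlap(query: string, doc: string): number {
  const queryTerms = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
  const docTerms = new Set(doc.toLowerCase().split(/\W+/).filter(Boolean));
  let shared = 0;
  for (const term of queryTerms) {
    if (docTerms.has(term)) shared++;
  }
  return shared / queryTerms.size;
}

const query = "pet-friendly downtown loft";
// Note "pets" vs "pet" still fails to match without stemming.
console.log(keywordOverlap(query, "downtown loft, pets welcome"));    // 0.5
console.log(keywordOverlap(query, "urban studio welcoming animals")); // 0 -- same meaning, no shared terms
```

Embedding-based search scores the second listing highly because the vectors for "loft"/"studio" and "pet-friendly"/"welcoming animals" land close together.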

Real-World Applications in Property Technology

At PropTechUSA.ai, we've observed how semantic search transforms property discovery. A user searching for "pet-friendly downtown loft" might find relevant listings described as "urban studio welcoming animals" or "city center apartment allowing pets" – matches that keyword search would miss entirely.

OpenAI Embeddings API: Technical Implementation

The OpenAI Embeddings API provides a straightforward interface for generating high-quality vector embeddings. The text-embedding-ada-002 model used throughout this guide offers strong performance across diverse text types at low cost; OpenAI's newer text-embedding-3-small and text-embedding-3-large models are drop-in replacements with the same request shape.

API Configuration and Setup

Begin by installing the necessary dependencies and configuring your environment:

```bash
npm install openai @pinecone-database/pinecone dotenv
```

```typescript
import { OpenAI } from 'openai';
import { config } from 'dotenv';

config();

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

interface EmbeddingResponse {
  embedding: number[];
  usage: {
    prompt_tokens: number;
    total_tokens: number;
  };
}

async function generateEmbedding(text: string): Promise<EmbeddingResponse> {
  try {
    const response = await openai.embeddings.create({
      model: 'text-embedding-ada-002',
      input: text,
      encoding_format: 'float',
    });

    return {
      embedding: response.data[0].embedding,
      usage: response.usage,
    };
  } catch (error) {
    console.error('Error generating embedding:', error);
    throw error;
  }
}
```

Building a Complete Semantic Search System

A production semantic search system requires several components: embedding generation, vector storage, similarity computation, and result ranking. Here's a comprehensive implementation:

```typescript
interface Document {
  id: string;
  content: string;
  metadata: Record<string, any>;
  embedding?: number[];
}

interface SearchResult {
  document: Document;
  similarity: number;
  score: number;
}

class SemanticSearchEngine {
  private documents: Map<string, Document> = new Map();
  private openai: OpenAI;

  constructor(apiKey: string) {
    this.openai = new OpenAI({ apiKey });
  }

  async addDocument(document: Document): Promise<void> {
    // Generate embedding for the document
    const embeddingResponse = await this.generateEmbedding(document.content);
    document.embedding = embeddingResponse.embedding;
    this.documents.set(document.id, document);
  }

  async search(query: string, limit: number = 10): Promise<SearchResult[]> {
    // Generate embedding for the search query
    const queryEmbeddingResponse = await this.generateEmbedding(query);
    const queryEmbedding = queryEmbeddingResponse.embedding;

    // Calculate similarity against every stored document
    const similarities: SearchResult[] = [];
    for (const [, document] of this.documents) {
      if (!document.embedding) continue;

      const similarity = this.cosineSimilarity(queryEmbedding, document.embedding);
      similarities.push({
        document,
        similarity,
        score: similarity * 100,
      });
    }

    // Sort by similarity and return top results
    return similarities
      .sort((a, b) => b.similarity - a.similarity)
      .slice(0, limit);
  }

  private async generateEmbedding(text: string): Promise<{ embedding: number[] }> {
    const response = await this.openai.embeddings.create({
      model: 'text-embedding-ada-002',
      input: text,
    });
    return { embedding: response.data[0].embedding };
  }

  private cosineSimilarity(a: number[], b: number[]): number {
    const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
    const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
    const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
    return dotProduct / (magnitudeA * magnitudeB);
  }
}
```

Batch Processing and Performance Optimization

For large document collections, process embeddings in batches to improve efficiency:

```typescript
class BatchEmbeddingProcessor {
  private openai: OpenAI;
  private batchSize: number;

  constructor(apiKey: string, batchSize: number = 100) {
    this.openai = new OpenAI({ apiKey });
    this.batchSize = batchSize;
  }

  async processBatch(texts: string[]): Promise<number[][]> {
    const batches = this.chunkArray(texts, this.batchSize);
    const allEmbeddings: number[][] = [];

    for (const batch of batches) {
      try {
        const response = await this.openai.embeddings.create({
          model: 'text-embedding-ada-002',
          input: batch,
        });

        const batchEmbeddings = response.data.map(item => item.embedding);
        allEmbeddings.push(...batchEmbeddings);

        // Rate limiting: pause briefly between batches
        await this.delay(100);
      } catch (error) {
        console.error('Batch processing error:', error);
        throw error;
      }
    }

    return allEmbeddings;
  }

  private chunkArray<T>(array: T[], size: number): T[][] {
    const chunks: T[][] = [];
    for (let i = 0; i < array.length; i += size) {
      chunks.push(array.slice(i, i + size));
    }
    return chunks;
  }

  private delay(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
```

Vector Database Integration and Scalability

For production applications, storing embeddings in memory isn't practical. Vector databases provide optimized storage and retrieval for high-dimensional embeddings.

Pinecone Integration Example

Pinecone offers a managed vector database service optimized for similarity search:

```typescript
import { Pinecone } from '@pinecone-database/pinecone';

interface PineconeSearchResult {
  id: string;
  content: string;
  similarity: number;
  metadata?: Record<string, any>;
}

class PineconeSemanticSearch {
  private pinecone: Pinecone;
  private indexName: string;
  private openai: OpenAI;

  constructor(pineconeApiKey: string, openaiApiKey: string, indexName: string) {
    this.pinecone = new Pinecone({ apiKey: pineconeApiKey });
    this.openai = new OpenAI({ apiKey: openaiApiKey });
    this.indexName = indexName;
  }

  async initializeIndex(): Promise<void> {
    try {
      await this.pinecone.createIndex({
        name: this.indexName,
        dimension: 1536, // OpenAI embedding dimension
        metric: 'cosine',
        spec: {
          serverless: {
            cloud: 'aws',
            region: 'us-east-1',
          },
        },
      });
    } catch (error) {
      console.log('Index may already exist:', (error as Error).message);
    }
  }

  async upsertDocument(document: Document): Promise<void> {
    const index = this.pinecone.index(this.indexName);

    // Generate embedding
    const embeddingResponse = await this.openai.embeddings.create({
      model: 'text-embedding-ada-002',
      input: document.content,
    });

    // Upsert to Pinecone, storing the content alongside the vector
    await index.upsert([{
      id: document.id,
      values: embeddingResponse.data[0].embedding,
      metadata: {
        content: document.content,
        ...document.metadata,
      },
    }]);
  }

  async search(query: string, topK: number = 10): Promise<PineconeSearchResult[]> {
    const index = this.pinecone.index(this.indexName);

    // Generate query embedding
    const queryEmbedding = await this.openai.embeddings.create({
      model: 'text-embedding-ada-002',
      input: query,
    });

    // Search Pinecone
    const searchResponse = await index.query({
      vector: queryEmbedding.data[0].embedding,
      topK,
      includeMetadata: true,
    });

    return searchResponse.matches?.map(match => ({
      id: match.id,
      content: match.metadata?.content as string,
      similarity: match.score || 0,
      metadata: match.metadata,
    })) || [];
  }
}
```

Hybrid Search Implementation

Combining semantic search with traditional keyword search often produces superior results:

```typescript
interface HybridSearchResult {
  document: Document;
  semanticScore: number;
  keywordScore: number;
  combinedScore: number;
}

class HybridSearchEngine {
  private semanticEngine: SemanticSearchEngine;
  // KeywordSearchEngine (e.g. a BM25 implementation) is assumed to exist
  // elsewhere with the same search(query, limit) signature.
  private keywordEngine: KeywordSearchEngine;

  constructor(openaiApiKey: string) {
    this.semanticEngine = new SemanticSearchEngine(openaiApiKey);
    this.keywordEngine = new KeywordSearchEngine();
  }

  async hybridSearch(
    query: string,
    semanticWeight: number = 0.7,
    keywordWeight: number = 0.3,
    limit: number = 10
  ): Promise<HybridSearchResult[]> {
    // Perform both searches concurrently, over-fetching to give the
    // fusion step more candidates
    const [semanticResults, keywordResults] = await Promise.all([
      this.semanticEngine.search(query, limit * 2),
      this.keywordEngine.search(query, limit * 2),
    ]);

    // Combine and normalize scores
    const combinedResults = this.combineResults(
      semanticResults,
      keywordResults,
      semanticWeight,
      keywordWeight
    );

    return combinedResults
      .sort((a, b) => b.combinedScore - a.combinedScore)
      .slice(0, limit);
  }

  private combineResults(
    semanticResults: SearchResult[],
    keywordResults: SearchResult[],
    semanticWeight: number,
    keywordWeight: number
  ): HybridSearchResult[] {
    const resultMap = new Map<string, HybridSearchResult>();

    // Process semantic results (rank-based normalization: 1.0 for the
    // top result, decreasing linearly)
    semanticResults.forEach((result, index) => {
      const normalizedScore = (semanticResults.length - index) / semanticResults.length;
      resultMap.set(result.document.id, {
        document: result.document,
        semanticScore: normalizedScore,
        keywordScore: 0,
        combinedScore: normalizedScore * semanticWeight,
      });
    });

    // Add keyword results, merging scores for documents both engines found
    keywordResults.forEach((result, index) => {
      const normalizedScore = (keywordResults.length - index) / keywordResults.length;
      const existing = resultMap.get(result.document.id);

      if (existing) {
        existing.keywordScore = normalizedScore;
        existing.combinedScore += normalizedScore * keywordWeight;
      } else {
        resultMap.set(result.document.id, {
          document: result.document,
          semanticScore: 0,
          keywordScore: normalizedScore,
          combinedScore: normalizedScore * keywordWeight,
        });
      }
    });

    return Array.from(resultMap.values());
  }
}
```
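The rank-based fusion above is easy to verify by hand. The standalone helper below (illustrative only, operating on document IDs rather than full results) reproduces the same arithmetic: each list contributes `(length - index) / length`, weighted, and documents found by both engines accumulate both contributions:

```typescript
// Rank-based score fusion: top rank in a list of n scores n/n = 1.0,
// the next (n-1)/n, and so on; weighted contributions from both lists sum.
function fuseRankings(
  semanticIds: string[],
  keywordIds: string[],
  semanticWeight = 0.7,
  keywordWeight = 0.3
): Map<string, number> {
  const scores = new Map<string, number>();
  semanticIds.forEach((id, i) => {
    const norm = (semanticIds.length - i) / semanticIds.length;
    scores.set(id, (scores.get(id) ?? 0) + norm * semanticWeight);
  });
  keywordIds.forEach((id, i) => {
    const norm = (keywordIds.length - i) / keywordIds.length;
    scores.set(id, (scores.get(id) ?? 0) + norm * keywordWeight);
  });
  return scores;
}

const fused = fuseRankings(['a', 'b'], ['b', 'c']);
console.log(fused.get('a')); // ≈ 0.7  (semantic rank 1 of 2: 1.0 × 0.7)
console.log(fused.get('b')); // ≈ 0.65 (0.5 × 0.7 + 1.0 × 0.3) -- found by both, ranks first
console.log(fused.get('c')); // ≈ 0.15 (keyword rank 2 of 2: 0.5 × 0.3)
```

Note how document `b`, mediocre in each individual ranking, wins overall because both engines agree it is relevant; this consensus effect is the main reason hybrid search outperforms either engine alone.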

Production Best Practices and Optimization

Building production-ready semantic search requires attention to performance, cost, and user experience considerations.

Embedding Caching and Storage Strategy

Embeddings are expensive to generate but cheap to store. Implement comprehensive caching:

```typescript
interface CachedEmbedding {
  embedding: number[];
  timestamp: number;
  text: string;
}

class EmbeddingCache {
  private cache = new Map<string, CachedEmbedding>();
  private maxAge = 7 * 24 * 60 * 60 * 1000; // 7 days

  async getEmbedding(text: string, openai: OpenAI): Promise<number[]> {
    const hash = this.hashText(text);
    const cached = this.cache.get(hash);

    if (cached && !this.isExpired(cached)) {
      return cached.embedding;
    }

    // Generate new embedding
    const response = await openai.embeddings.create({
      model: 'text-embedding-ada-002',
      input: text,
    });
    const embedding = response.data[0].embedding;

    // Cache the result
    this.cache.set(hash, {
      embedding,
      timestamp: Date.now(),
      text: text.substring(0, 100), // store a snippet for debugging
    });

    return embedding;
  }

  private hashText(text: string): string {
    // Simple placeholder -- use crypto.createHash in production
    return Buffer.from(text).toString('base64');
  }

  private isExpired(cached: CachedEmbedding): boolean {
    return Date.now() - cached.timestamp > this.maxAge;
  }
}
```
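As the comment in `hashText` notes, base64 is an encoding rather than a hash: keys grow with input size and expose the original text. In Node.js, a proper content hash is one call away via the built-in `crypto` module:

```typescript
import { createHash } from 'node:crypto';

// Fixed-length, collision-resistant cache key for arbitrary text.
function hashText(text: string): string {
  return createHash('sha256').update(text).digest('hex');
}

console.log(hashText('pet-friendly downtown loft').length); // always 64 hex characters
```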

Query Optimization and Result Ranking

Implement sophisticated ranking that considers multiple factors:

```typescript
interface RankingOptions {
  boostRecent?: boolean;
  boostPopular?: boolean;
  diversityFactor?: number; // reserved for a future diversity penalty; not yet applied
}

class AdvancedRanking {
  static rankResults(
    results: SearchResult[],
    query: string,
    options: RankingOptions = {}
  ): SearchResult[] {
    const {
      boostRecent = true,
      boostPopular = true,
    } = options;

    return results.map(result => {
      let score = result.similarity;

      // Boost recent content: the boost decays linearly over a year
      if (boostRecent && result.document.metadata.publishedAt) {
        const age = Date.now() - new Date(result.document.metadata.publishedAt).getTime();
        const daysSincePublished = age / (1000 * 60 * 60 * 24);
        const recencyBoost = Math.max(0, 1 - daysSincePublished / 365);
        score += recencyBoost * 0.1;
      }

      // Boost popular content (capped at +0.1)
      if (boostPopular && result.document.metadata.popularity) {
        const popularityBoost = Math.min(result.document.metadata.popularity / 1000, 0.1);
        score += popularityBoost;
      }

      // Small boost for results whose titles share terms with the query
      const titleSimilarity = this.calculateTitleSimilarity(
        query,
        result.document.metadata.title || ''
      );
      score += titleSimilarity * 0.05;

      return {
        ...result,
        similarity: score,
      };
    }).sort((a, b) => b.similarity - a.similarity);
  }

  private static calculateTitleSimilarity(query: string, title: string): number {
    const queryWords = new Set(query.toLowerCase().split(' '));
    const titleWords = new Set(title.toLowerCase().split(' '));
    const intersection = new Set([...queryWords].filter(x => titleWords.has(x)));
    const union = new Set([...queryWords, ...titleWords]);
    return intersection.size / union.size; // Jaccard similarity
  }
}
```

Monitoring and Analytics

Track search performance and user behavior:

💡 Pro Tip: Implement comprehensive logging to understand query patterns and optimize your semantic search system.

```typescript
interface SearchMetric {
  timestamp: Date;
  query: string;
  resultCount: number;
  topScore: number;
  avgScore: number;
  userId?: string;
  queryLength: number;
  hasResults: boolean;
}

interface AnalyticsReport {
  totalQueries: number;
  avgResultCount: number;
  successRate: number;
  topQueries: { query: string; count: number }[];
  avgQueryLength: number;
}

class SearchAnalytics {
  private metrics: SearchMetric[] = [];

  logSearch(query: string, results: SearchResult[], userId?: string): void {
    const metric: SearchMetric = {
      timestamp: new Date(),
      query,
      resultCount: results.length,
      topScore: results[0]?.similarity || 0,
      avgScore: results.length
        ? results.reduce((sum, r) => sum + r.similarity, 0) / results.length
        : 0, // guard against division by zero on empty result sets
      userId,
      queryLength: query.length,
      hasResults: results.length > 0,
    };

    this.metrics.push(metric);

    // Send to analytics service
    this.sendToAnalytics(metric);
  }

  generateReport(timeframe: 'day' | 'week' | 'month'): AnalyticsReport {
    const cutoff = this.getCutoffDate(timeframe);
    const recentMetrics = this.metrics.filter(m => m.timestamp > cutoff);
    const count = recentMetrics.length || 1; // avoid NaN averages when empty

    return {
      totalQueries: recentMetrics.length,
      avgResultCount: recentMetrics.reduce((sum, m) => sum + m.resultCount, 0) / count,
      successRate: recentMetrics.filter(m => m.hasResults).length / count,
      topQueries: this.getTopQueries(recentMetrics),
      avgQueryLength: recentMetrics.reduce((sum, m) => sum + m.queryLength, 0) / count,
    };
  }

  private getCutoffDate(timeframe: string): Date {
    const now = new Date();
    switch (timeframe) {
      case 'day': return new Date(now.getTime() - 24 * 60 * 60 * 1000);
      case 'week': return new Date(now.getTime() - 7 * 24 * 60 * 60 * 1000);
      case 'month': return new Date(now.getTime() - 30 * 24 * 60 * 60 * 1000);
      default: return new Date(0);
    }
  }

  private getTopQueries(metrics: SearchMetric[]): { query: string; count: number }[] {
    const queryCount = new Map<string, number>();
    metrics.forEach(m => {
      queryCount.set(m.query, (queryCount.get(m.query) || 0) + 1);
    });

    return Array.from(queryCount.entries())
      .map(([query, count]) => ({ query, count }))
      .sort((a, b) => b.count - a.count)
      .slice(0, 10);
  }

  private sendToAnalytics(metric: SearchMetric): void {
    // Implement your analytics service integration here
    console.log('Search metric:', metric);
  }
}
```

⚠️ Warning: Always implement rate limiting and error handling when working with external APIs to ensure system reliability.

Advanced Techniques and Future Considerations

As semantic search technology evolves, several advanced techniques can further enhance search quality and user experience.

Domain-Specific Fine-Tuning

While OpenAI's general-purpose embeddings work well across domains, fine-tuning can improve performance for specific use cases. In our experience at PropTechUSA.ai, property-specific fine-tuning has improved search relevance for real estate terminology and concepts.
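OpenAI's hosted embedding models cannot be fine-tuned directly; a common lightweight substitute is to train a small adapter (for example, a linear projection) on top of the frozen embeddings, using domain-labeled pairs. The sketch below shows only the inference side of that idea; the `applyAdapter` helper is illustrative, and the identity matrix stands in for weights that would be learned offline:

```typescript
// Apply a learned linear adapter to a frozen embedding: output[i] is the
// dot product of adapter row i with the input vector.
function applyAdapter(embedding: number[], adapter: number[][]): number[] {
  return adapter.map(row =>
    row.reduce((sum, weight, j) => sum + weight * embedding[j], 0)
  );
}

// Identity adapter: output equals input. A real adapter, trained on pairs of
// property descriptions that should rank together, would reshape the space.
const dim = 4; // real OpenAI embeddings have 1536 dimensions
const identity = Array.from({ length: dim }, (_, i) =>
  Array.from({ length: dim }, (_, j) => (i === j ? 1 : 0))
);

console.log(applyAdapter([0.2, 0.8, 0.3, 0.1], identity)); // [0.2, 0.8, 0.3, 0.1]
```

Because the adapter runs locally after the API call, it adds no per-query cost and can be retrained as your domain vocabulary evolves.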

Multi-modal Search Implementation

Extending semantic search beyond text to include images, documents, and structured data creates richer search experiences:

```typescript
interface MultiModalDocument {
  id: string;
  textContent: string;
  imageDescriptions: string[];
  metadata: {
    documentType: 'property' | 'contract' | 'report';
    location?: string;
    price?: number;
    features?: string[];
  };
}

class MultiModalSearchEngine {
  private openai: OpenAI;

  constructor(apiKey: string) {
    this.openai = new OpenAI({ apiKey });
  }

  async createCompositeEmbedding(document: MultiModalDocument): Promise<number[]> {
    const textEmbedding = await this.generateEmbedding(document.textContent);

    // Combine image descriptions into a single text for embedding
    const imageText = document.imageDescriptions.join(' ');
    const imageEmbedding = imageText ? await this.generateEmbedding(imageText) : null;

    // Weight and combine embeddings (80% text, 20% imagery)
    if (imageEmbedding) {
      return this.weightedCombine(textEmbedding, imageEmbedding, 0.8, 0.2);
    }
    return textEmbedding;
  }

  private async generateEmbedding(text: string): Promise<number[]> {
    const response = await this.openai.embeddings.create({
      model: 'text-embedding-ada-002',
      input: text,
    });
    return response.data[0].embedding;
  }

  private weightedCombine(
    embedding1: number[],
    embedding2: number[],
    weight1: number,
    weight2: number
  ): number[] {
    return embedding1.map((val, idx) =>
      val * weight1 + embedding2[idx] * weight2
    );
  }
}
```

The future of semantic search lies in understanding user intent, personalizing results, and seamlessly integrating multiple data modalities. As OpenAI continues to improve their embedding models and new techniques emerge, the gap between human understanding and machine comprehension continues to narrow.

Implementing semantic search with OpenAI embeddings represents a significant step forward in creating intuitive, intelligent search experiences. The techniques and patterns outlined in this guide provide a solid foundation for building production-ready systems that truly understand what users are looking for.

Ready to implement semantic search in your applications? Start with the basic patterns shown here, then gradually incorporate advanced features like hybrid search, sophisticated ranking, and comprehensive analytics. The investment in semantic search technology pays dividends in user satisfaction and engagement – transforming how people discover and interact with your content.
