When Airbnb scaled their search infrastructure to handle millions of property listings across 200+ countries, they didn't just deploy Elasticsearch; they architected a production-ready system that could handle 100,000+ queries per second while maintaining sub-100ms response times. The difference between a basic Elasticsearch setup and a production-ready search architecture often determines whether your application scales gracefully or fails under pressure.
Building enterprise-grade search infrastructure requires more than spinning up a few Elasticsearch nodes. It demands understanding cluster topologies, data modeling strategies, monitoring frameworks, and disaster recovery patterns that keep your search functionality running 24/7.
Understanding Elasticsearch Production Requirements
Production Elasticsearch deployments differ dramatically from development environments. While a single-node cluster might suffice for prototyping, production workloads demand resilience, scalability, and performance optimization across multiple dimensions.
High Availability Architecture Patterns
Elasticsearch achieves high availability through distributed cluster architecture. A production-ready cluster typically consists of dedicated master nodes, data nodes, and coordinating nodes, each serving specific functions within the search ecosystem.
Master nodes handle cluster-wide operations like index creation, node discovery, and shard allocation. Running three dedicated master nodes prevents split-brain scenarios while ensuring cluster coordination remains stable during node failures.
Data nodes store actual documents and execute search queries. These nodes require substantial memory and fast storage to maintain query performance under load. The number of data nodes depends on your data volume and query throughput requirements.
```yaml
node.name: master-node-1
node.roles: [master]
cluster.name: production-search
network.host: 10.0.1.10
discovery.seed_hosts: ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
cluster.initial_master_nodes: ["master-node-1", "master-node-2", "master-node-3"]
xpack.security.enabled: true
```
Capacity Planning and Resource Allocation
Effective capacity planning starts with understanding your data characteristics and query patterns. Document size, field types, indexing frequency, and search complexity all influence resource requirements.
Memory allocation follows the 50/50 rule: allocate 50% of available RAM to the Elasticsearch heap, leaving the remaining 50% for the operating system's file cache, which Lucene relies on heavily. Never exceed roughly 30-32GB of heap: beyond that threshold the JVM can no longer use compressed ordinary object pointers (compressed oops), and effective memory capacity actually drops.
Storage planning considers both primary and replica shards, plus overhead for merging, snapshots, and temporary operations. Plan for 20-30% additional storage beyond your raw data size to accommodate these operations.
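As a concrete sketch, heap sizing is set in Elasticsearch's jvm.options file. The 16g figures below are illustrative values for a data node with 32GB of RAM, following the 50/50 rule:

```
# jvm.options (illustrative values for a node with 32GB RAM)
-Xms16g   # initial heap; keep equal to max to avoid resize pauses
-Xmx16g   # max heap; stays under the ~32GB compressed-oops threshold
```

Setting initial and maximum heap to the same value prevents costly heap resizing at runtime.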
Core Architecture Components and Design Patterns
A robust Elasticsearch production architecture incorporates multiple layers of abstraction, from load balancing to data partitioning strategies that optimize both ingestion and query performance.
Cluster Topology Design
Modern Elasticsearch deployments often implement a three-tier architecture: coordinating nodes, master nodes, and data nodes. This separation allows independent scaling of different workload types.
Coordinating nodes handle client requests, aggregate results from data nodes, and manage query routing. These nodes require minimal storage but benefit from high CPU and memory for query coordination tasks.
```typescript
// Elasticsearch client configuration for production
import { Client } from '@elastic/elasticsearch';

const esClient = new Client({
  nodes: [
    'https://coord-1.search.company.com:9200',
    'https://coord-2.search.company.com:9200',
    'https://coord-3.search.company.com:9200'
  ],
  auth: {
    bearer: process.env.ELASTICSEARCH_TOKEN
  },
  maxRetries: 3,
  requestTimeout: 30000,
  sniffOnStart: true,
  sniffInterval: 300000
});
```
Index Architecture and Shard Strategy
Index design significantly impacts query performance and cluster stability. Time-based indices work well for log data, allowing efficient data lifecycle management through index rollover policies.
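For example, a rollover-driven index lifecycle policy along these lines keeps time-based indices bounded (the policy name and the size/age thresholds here are illustrative, not prescriptive):

```
PUT _ilm/policy/logs-rollover
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_primary_shard_size": "50gb", "max_age": "7d" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```

Attached to an index template with a rollover alias, this policy cuts a new index whenever a primary shard reaches 50GB or the index turns seven days old, then deletes indices after 30 days.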
For property search systems like those PropTechUSA.ai implements, geographic or categorical sharding often provides better query distribution. Routing documents by region or property type ensures related searches hit fewer shards.
```json
{
  "mappings": {
    "properties": {
      "property_id": { "type": "keyword" },
      "region": { "type": "keyword" },
      "location": { "type": "geo_point" },
      "address": {
        "type": "text",
        "analyzer": "standard",
        "fields": {
          "suggest": {
            "type": "completion",
            "analyzer": "simple"
          }
        }
      },
      "price": { "type": "scaled_float", "scaling_factor": 100 },
      "features": {
        "type": "nested",
        "properties": {
          "name": { "type": "keyword" },
          "value": { "type": "text" }
        }
      }
    }
  },
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1,
    "refresh_interval": "30s",
    "max_result_window": 50000
  }
}
```
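When documents are routed by region at index time, searches can pass the same routing value so the query touches only the shard(s) holding that region's documents. A minimal sketch, assuming a `region` keyword field is indexed alongside the mapping above (the helper and interface names are illustrative, not part of any SDK):

```typescript
// Build a routed search request: the routing key must match the value
// used when the documents were indexed, and a term filter on region
// guards against routing-key collisions on the same shard.
interface RegionalSearchRequest {
  index: string;
  routing: string;
  query: object;
}

function buildRegionalSearch(region: string, text: string): RegionalSearchRequest {
  return {
    index: 'properties',
    routing: region, // same routing value used at index time
    query: {
      bool: {
        must: [{ match: { address: text } }],
        filter: [{ term: { region } }] // guard against routing collisions
      }
    }
  };
}
```

The resulting object can be passed directly to the client's `search` method; the request then fans out to a single shard instead of all three.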
Data Ingestion Pipeline Architecture
Production data ingestion requires careful orchestration of indexing throughput, data validation, and error handling. Bulk indexing provides the highest throughput, but batch sizing affects memory usage and error isolation.
Logstash, Beats, or custom applications can handle data transformation before indexing. For real-time requirements, consider using Elasticsearch's ingest pipelines for lightweight data processing.
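A minimal ingest pipeline sketch (the pipeline name and field names are illustrative) that normalizes documents as they arrive, before any bulk request touches the index:

```
PUT _ingest/pipeline/property-normalize
{
  "description": "Lightweight cleanup before indexing",
  "processors": [
    { "trim": { "field": "address" } },
    { "lowercase": { "field": "region" } },
    { "set": { "field": "ingested_at", "value": "{{_ingest.timestamp}}" } }
  ]
}
```

Bulk and index requests can then reference the pipeline via the `pipeline` parameter, keeping transformation logic on the cluster rather than in every producer.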
```typescript
// Bulk indexing with error handling (v8 JavaScript client API)
async function bulkIndexProperties(properties: Property[]) {
  const operations = properties.flatMap(property => ([
    { index: { _index: 'properties', _id: property.id, routing: property.region } },
    {
      property_id: property.id,
      region: property.region,
      location: { lat: property.latitude, lon: property.longitude },
      address: property.address,
      price: property.price,
      features: property.features
    }
  ]));

  try {
    const response = await esClient.bulk({ operations, refresh: 'wait_for' });
    if (response.errors) {
      const failures = response.items
        .filter(item => item.index?.error)
        .map(item => item.index!.error);
      console.error('Indexing failures:', failures);
      // Handle partial failures appropriately (retry, dead-letter queue, etc.)
    }
    return response;
  } catch (error) {
    console.error('Bulk indexing error:', error);
    throw error;
  }
}
```
Implementation Strategies and Deployment Patterns
Successful Elasticsearch production deployments require systematic approaches to configuration management, monitoring integration, and operational procedures that maintain system reliability.
Container Orchestration and Infrastructure as Code
Kubernetes has become the de facto standard for Elasticsearch container orchestration, providing automated scaling, rolling updates, and resource management. The Elastic Cloud on Kubernetes (ECK) operator simplifies cluster lifecycle management.
```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: production-search
  namespace: search-system
spec:
  version: 8.11.0
  nodeSets:
    - name: master
      count: 3
      config:
        node.roles: ["master"]
        xpack.security.enabled: true
      podTemplate:
        spec:
          containers:
            - name: elasticsearch
              resources:
                requests:
                  memory: 2Gi
                  cpu: 500m
                limits:
                  memory: 2Gi
                  cpu: 1000m
    - name: data
      count: 6
      config:
        node.roles: ["data", "ingest"]
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 500Gi
            storageClassName: fast-ssd
      podTemplate:
        spec:
          containers:
            - name: elasticsearch
              resources:
                requests:
                  memory: 16Gi
                  cpu: 2000m
                limits:
                  memory: 16Gi
                  cpu: 4000m
```
Security Implementation and Access Control
Production Elasticsearch clusters require comprehensive security including TLS encryption, authentication, and role-based access control. X-Pack Security provides enterprise-grade security features integrated with the Elastic Stack.
Implement network-level security through VPCs, security groups, and firewall rules that restrict cluster access to authorized applications and administrators.
```typescript
// Role-based search client initialization
import fs from 'fs';
import { Client } from '@elastic/elasticsearch';

class SecureSearchClient {
  private client: Client;

  constructor(userRole: string) {
    this.client = new Client({
      nodes: process.env.ELASTICSEARCH_NODES?.split(','),
      auth: {
        username: `search_${userRole}`,
        password: process.env[`ES_PASSWORD_${userRole.toUpperCase()}`]
      },
      tls: {
        ca: fs.readFileSync('/path/to/ca.crt'),
        rejectUnauthorized: true
      }
    });
  }

  async searchProperties(query: SearchQuery, userContext: UserContext) {
    // Apply user-specific filters based on access level
    const filteredQuery = this.applyAccessFilters(query, userContext);
    return await this.client.search({
      index: 'properties',
      ...filteredQuery,
      timeout: '30s'
    });
  }

  private applyAccessFilters(query: SearchQuery, context: UserContext) {
    // Implement field-level and document-level security
    const filters = [];
    if (context.region) {
      filters.push({ term: { region: context.region } });
    }
    return {
      ...query,
      query: {
        bool: {
          must: query.query,
          filter: filters
        }
      }
    };
  }
}
```
Monitoring and Observability Integration
Comprehensive monitoring covers cluster health, performance metrics, and application-level search analytics. The Elastic Stack's built-in monitoring provides cluster insights, while custom metrics track business-specific KPIs.
Integrate with external monitoring systems like Prometheus, Datadog, or New Relic for unified observability across your infrastructure stack.
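Whatever system consumes the metrics, the alerting decision itself can be kept as a small, testable function. A sketch (the interface mirrors a subset of the `_cluster/health` response; the function name and thresholds are illustrative assumptions, not part of any SDK):

```typescript
// Map a _cluster/health response to an alert severity for external
// monitoring hooks. Thresholds here are illustrative starting points.
interface ClusterHealth {
  status: 'green' | 'yellow' | 'red';
  unassigned_shards: number;
  number_of_pending_tasks: number;
}

type Severity = 'ok' | 'warning' | 'critical';

function classifyHealth(health: ClusterHealth): Severity {
  if (health.status === 'red') return 'critical';       // primaries missing
  if (health.status === 'yellow') return 'warning';     // replicas unassigned
  if (health.unassigned_shards > 0) return 'warning';
  if (health.number_of_pending_tasks > 100) return 'warning'; // master backlog
  return 'ok';
}
```

Keeping the classification pure makes it trivial to unit-test and to reuse across Prometheus alert rules, Datadog monitors, or a bespoke health endpoint.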
Production Best Practices and Performance Optimization
Operating Elasticsearch in production requires adherence to proven practices that prevent common pitfalls while optimizing for performance, reliability, and maintainability.
Query Optimization and Caching Strategies
Query performance directly impacts user experience and system resource utilization. Profile slow queries with the search Profile API (set "profile": true in the request body) to identify bottlenecks in query execution.
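Profiling is opt-in per request. A minimal sketch (index and query are illustrative):

```
GET /properties/_search
{
  "profile": true,
  "query": { "match": { "address": "main street" } }
}
```

The response then includes a per-shard breakdown of query and collector timings alongside the normal hits.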
Implement query result caching at multiple levels: Elasticsearch's native query cache, application-level caching with Redis, and CDN caching for static aggregations.
```typescript
// Optimized property search with caching
class PropertySearchService {
  private cache: Redis;
  private esClient: Client;

  async searchWithCache(searchParams: SearchParams): Promise<SearchResult> {
    const cacheKey = this.generateCacheKey(searchParams);

    // Check application cache first
    const cached = await this.cache.get(cacheKey);
    if (cached) {
      return JSON.parse(cached);
    }

    // Optimize query for performance
    const query = {
      size: Math.min(searchParams.size || 20, 100),
      from: searchParams.from || 0,
      _source: {
        includes: ['property_id', 'address', 'price', 'location'],
        excludes: ['internal_notes', 'raw_data']
      },
      query: {
        bool: {
          must: this.buildMustClauses(searchParams),
          filter: this.buildFilterClauses(searchParams),
          should: this.buildBoostClauses(searchParams)
        }
      },
      aggs: {
        price_ranges: {
          range: {
            field: 'price',
            ranges: [
              { key: 'budget', to: 300000 },
              { key: 'mid', from: 300000, to: 700000 },
              { key: 'luxury', from: 700000 }
            ]
          }
        }
      },
      sort: this.buildSortClauses(searchParams)
    };

    const result = await this.esClient.search({
      index: 'properties',
      ...query,
      request_cache: true // enable the shard-level request cache
    });

    const searchResult = this.formatResults(result);

    // Cache results for 5 minutes
    await this.cache.setex(cacheKey, 300, JSON.stringify(searchResult));
    return searchResult;
  }
}
```
Scaling and Performance Tuning
Horizontal scaling requires careful shard distribution and resource allocation across nodes. Monitor shard sizes and query distribution to identify scaling bottlenecks.
Tune JVM garbage collection settings for your workload characteristics. G1GC is the default collector in modern Elasticsearch releases and performs well for most workloads; the older ConcurrentMarkSweep collector was deprecated and removed from recent JDKs, so focus tuning effort on heap sizing and G1 parameters rather than switching collectors.
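As a reference point, these G1 settings mirror the defaults commonly shipped in Elasticsearch's jvm.options; verify against the file bundled with your distribution before overriding anything:

```
-XX:+UseG1GC
-XX:G1ReservePercent=25
-XX:InitiatingHeapOccupancyPercent=30
```

Lowering the occupancy threshold starts concurrent marking earlier, trading a little CPU for fewer long pauses during heavy indexing.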
Backup and Disaster Recovery
Implement automated snapshot policies that balance recovery point objectives with storage costs. Store snapshots in geographically distributed locations using cloud storage services.
Test recovery procedures regularly to ensure backup validity and measure recovery time objectives. Document step-by-step recovery procedures for different failure scenarios.
```
PUT _slm/policy/daily-snapshots
{
  "schedule": "0 0 2 * * ?",
  "name": "<daily-snap-{now/d}>",
  "repository": "s3-backup-repo",
  "config": {
    "indices": "properties,listings,analytics",
    "ignore_unavailable": false,
    "include_global_state": false,
    "metadata": {
      "taken_by": "automated-policy",
      "taken_because": "daily backup"
    }
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}
```

Note that SLM schedules use Elasticsearch's cron syntax, which includes a leading seconds field; "0 0 2 * * ?" runs daily at 02:00.
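Recovery drills can start from a restore request like the following. The repository name mirrors the policy above, the snapshot name is illustrative, and the rename pattern restores into a scratch index so production indices stay untouched:

```
POST _snapshot/s3-backup-repo/daily-snap-2024.01.15/_restore
{
  "indices": "properties",
  "rename_pattern": "(.+)",
  "rename_replacement": "restored-$1",
  "include_global_state": false
}
```

Comparing document counts and spot-checking queries against the restored-properties index validates the snapshot without any risk to the live cluster.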
Scaling Your Search Architecture for the Future
Building production-ready Elasticsearch infrastructure requires balancing immediate needs with future scalability requirements. The architecture patterns and implementation strategies outlined here provide a foundation for search systems that can grow with your business needs.
Successful search implementations start with understanding your data characteristics, query patterns, and performance requirements. From there, systematic attention to cluster topology, security implementation, and operational procedures ensures your search infrastructure remains reliable under production workloads.
At PropTechUSA.ai, we've implemented these patterns across property search systems handling millions of listings with sub-second response times. The key lies in treating search infrastructure as a critical system component that deserves the same engineering rigor as your primary application architecture.
Ready to implement enterprise-grade search for your application? Start with a small, well-configured cluster following these production patterns, then scale systematically as your requirements grow. The investment in proper architecture pays dividends in system reliability, performance, and operational simplicity.
Take the next step: Download our Elasticsearch production deployment checklist and reference architecture templates to accelerate your search infrastructure implementation.