Amazon OpenSearch (Elasticsearch): a practical guide for fast, scalable search

Sep 12, 2025
aws, opensearch, elasticsearch, search

Amazon OpenSearch Service (formerly Amazon Elasticsearch Service) is a managed search and analytics engine. It shines for full‑text search, log analytics, metrics, APM, and now vector search with k‑NN.

When OpenSearch is the right tool

  • Need fast text search (prefix, fuzzy, relevance, highlighting)
  • Query/aggregate semi‑structured JSON at scale
  • Time‑series analytics (logs, metrics) with rollups/ISM
  • Vector similarity search for RAG/semantic features

If you just need structured filters/joins, start with Postgres. If you need search relevance or large time‑series analytics, OpenSearch fits.

Core building blocks

  • Index: Logical collection of documents. Choose shards/replicas per index.
  • Document: JSON you index and query.
  • Mapping: Field types and analyzer config (e.g., text vs keyword).
  • Analyzer: Tokenization + filters; controls search behavior.
  • ISM: Index State Management, OpenSearch's counterpart to Elasticsearch ILM; moves indices through a lifecycle (hot → warm → cold → delete) for cost control.
  • k‑NN: Vector fields and ANN indexes for semantic search.
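To see what an analyzer actually does to input text, the _analyze API is a quick sanity check (the text here is just an example):

POST _analyze
{
  "analyzer": "standard",
  "text": "Fast, scalable Search!"
}

The standard analyzer lowercases and strips punctuation, returning the tokens fast, scalable, and search — which is why "Search!" in a document matches a query for "search".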

Typical architectures

  • Search for app data: App → (queue/stream) → indexer → OpenSearch; app queries via REST/SDK.
  • Logs/metrics: App/agents → OpenSearch Ingestion / Firehose / Logstash → OpenSearch Dashboards.
  • RAG: Embed text (SageMaker/Bedrock/OpenAI) → store vectors in OpenSearch k‑NN; hybrid BM25 + vector queries.

Index design and mappings

Choose field types deliberately:

  • text for full‑text search (with an analyzer). Add a keyword sub‑field for exact filters, sorting, and aggregations.
  • Dates as date. IDs as keyword. Numbers with appropriate numeric types.
  • For arrays, OpenSearch treats each element as a separate value.

Minimal example mapping:

PUT my_articles
{
  "settings": { "number_of_shards": 3, "number_of_replicas": 1 },
  "mappings": {
    "properties": {
      "title":   { "type": "text", "analyzer": "standard", "fields": { "raw": { "type": "keyword" } } },
      "content": { "type": "text" },
      "tags":    { "type": "keyword" },
      "published_at": { "type": "date" },
      "embedding": { "type": "knn_vector", "dimension": 768 }
    }
  }
}

Query example (hybrid: keyword filter + text relevance):

POST my_articles/_search
{
  "query": {
    "bool": {
      "filter": { "terms": { "tags": ["aws", "search"] } },
      "must":   { "multi_match": { "query": "open search performance", "fields": ["title^3", "content"] } }
    }
  },
  "highlight": { "fields": { "content": {} } }
}

Vector similarity (k‑NN cosine):

POST my_articles/_search
{
  "size": 10,
  "query": {
    "knn": {
      "embedding": {
        "vector": [0.12, -0.04, ...],
        "k": 10
      }
    }
  }
}
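The "hybrid BM25 + vector" pattern from the RAG architecture above can be approximated with a bool query whose should clauses combine lexical and k‑NN scores. This is a sketch: raw BM25 and vector scores live on different scales, and newer OpenSearch versions offer a dedicated hybrid query with a normalization search pipeline for proper blending.

POST my_articles/_search
{
  "size": 10,
  "query": {
    "bool": {
      "should": [
        { "multi_match": { "query": "open search performance", "fields": ["title^3", "content"] } },
        { "knn": { "embedding": { "vector": [0.12, -0.04, ...], "k": 10 } } }
      ]
    }
  }
}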

Performance tuning that matters

  • Shards: Start small. 1–3 primary shards per index is typical. Too many shards waste memory and hurt query speed.
  • Replicas: 1 replica for HA. Increase only to scale reads.
  • Routing: If you have natural partitions (tenant ID), use custom routing to keep related docs in one shard.
  • Refresh interval: Raise it to 10–30s for heavy write throughput (default 1s). For log backfills, set it to -1, then restore it afterwards.
  • Doc model: Prefer denormalization; avoid parent/child except when truly needed.
  • Avoid deep pagination: Use search_after instead of from/size beyond a few thousand.
  • Caches: Run frequently repeated filters in filter context; OpenSearch caches their bitsets and skips scoring.
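The search_after pattern looks like this: sort on a stable key plus a tiebreaker, then feed the last hit's sort values into the next request (the sort values shown are placeholders from an imagined previous page):

POST my_articles/_search
{
  "size": 100,
  "sort": [ { "published_at": "desc" }, { "_id": "asc" } ],
  "search_after": [1726000000000, "article-4213"]
}

Unlike from/size, this stays cheap at any depth because each page only scans forward from the cursor.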

Cost control playbook

  • ISM: Hot (SSD) → warm (UltraWarm) → cold (cold storage) → delete after retention.
  • Right‑size: Choose memory‑optimized for aggregations, storage‑optimized for logs. Use Graviton where available.
  • Compression: Use best_compression for long‑lived analytic indices.
  • Serverless: For spiky/low‑ops workloads, consider OpenSearch Serverless to offload capacity planning.
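A minimal ISM policy sketch for the hot → warm → delete path (the policy name, index pattern, and ages are examples; the warm_migration action assumes UltraWarm is enabled on the domain):

PUT _plugins/_ism/policies/logs_retention
{
  "policy": {
    "description": "Warm after 7d, delete after 30d",
    "default_state": "hot",
    "states": [
      { "name": "hot", "actions": [], "transitions": [ { "state_name": "warm", "conditions": { "min_index_age": "7d" } } ] },
      { "name": "warm", "actions": [ { "warm_migration": {} } ], "transitions": [ { "state_name": "delete", "conditions": { "min_index_age": "30d" } } ] },
      { "name": "delete", "actions": [ { "delete": {} } ], "transitions": [] }
    ],
    "ism_template": [ { "index_patterns": ["logs-*"], "priority": 100 } ]
  }
}

The ism_template attaches the policy automatically to new indices matching the pattern, so rolling daily log indices age out without manual intervention.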

Security and access

  • VPC‑only access when possible.
  • Use IAM or Cognito; enable fine‑grained access control with document/field level security for multi‑tenant.
  • Enforce HTTPS, rotate the master user's credentials, restrict source IPs, and enable audit logs for sensitive data.
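With fine‑grained access control enabled, document‑level security can scope a role to one tenant's documents. A sketch via the security API (the role name and tenant_id field are hypothetical):

PUT _plugins/_security/api/roles/tenant_a_reader
{
  "index_permissions": [
    {
      "index_patterns": ["my_articles*"],
      "allowed_actions": ["read"],
      "dls": "{ \"term\": { \"tenant_id\": \"tenant-a\" } }"
    }
  ]
}

The dls filter is applied server‑side to every query the role runs, so tenants cannot see each other's documents even with arbitrary query DSL.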

Operations & monitoring

  • CloudWatch metrics: CPUUtilization, JVMMemoryPressure, ClusterStatus.red/yellow, MasterReachableFromNode. Alert early.
  • Slow logs: Enable index/query slow logs to spot bad queries/mappings.
  • Snapshots: Automated S3 snapshots for DR; practice restore.
  • Versioning: Plan blue/green domain upgrades; test queries against the new version before cutover.
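Manual snapshots require registering an S3 repository first. A sketch (bucket name and IAM role ARN are placeholders; on Amazon OpenSearch Service this request must be signed by a principal allowed to pass the role):

PUT _snapshot/my_s3_repo
{
  "type": "s3",
  "settings": {
    "bucket": "my-snapshot-bucket",
    "region": "us-east-1",
    "role_arn": "arn:aws:iam::123456789012:role/OpenSearchSnapshotRole"
  }
}

After that, PUT _snapshot/my_s3_repo/snapshot-2025-09-12 takes a snapshot, and restores use the _restore endpoint — worth rehearsing before you need it.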

Ingestion options

  • OpenSearch Ingestion (OSI) – managed, scalable pipelines.
  • Kinesis Firehose – easy for logs/metrics.
  • Logstash/Beats/Fluent Bit – agent‑based shipping.
  • Lambda indexers – transform app data, enrich with embeddings, then index.
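Whichever pipeline you pick, documents usually land via the _bulk API: NDJSON pairs of an action line and a source line (the documents below are made up to match the example mapping):

POST _bulk
{ "index": { "_index": "my_articles", "_id": "1" } }
{ "title": "OpenSearch shard sizing", "tags": ["aws", "search"], "published_at": "2025-09-12" }
{ "index": { "_index": "my_articles", "_id": "2" } }
{ "title": "Hybrid retrieval for RAG", "tags": ["search"], "published_at": "2025-09-10" }

Batching hundreds to a few thousand documents per _bulk call is far cheaper than indexing one document per request.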

Quick checklist for production

  • Correct mappings (text + keyword for titles, dates as date)
  • ISM with retention, hot→warm→cold
  • Shards sized to data (tens of GB per shard; avoid thousands of shards)
  • VPC + FGAC, IAM auth, audit logs
  • Slow logs + CloudWatch alarms
  • Snapshots to S3 (tested restore)
  • Hybrid search (BM25 + vector) if using semantic features

OpenSearch can deliver snappy search and scalable analytics, but it rewards deliberate index design, shard planning, and lifecycle/cost tuning. Start lean, measure, and iterate.