OpenSearch Addon
paas ships a managed OpenSearch addon backed by the OpenSearch
Kubernetes Operator
(opensearch.opster.io/v1). Each app gets a dedicated
OpenSearchCluster CR, a generated admin password, and an
OPENSEARCH_URL Secret pre-mounted at deploy time.
At a glance
| Capability | Default | Where to configure |
|---|---|---|
| Engine | OpenSearch 2.13.0 | version in the create payload |
| Cluster CR | opensearch.opster.io/v1::OpenSearchCluster | derived from app_id |
| HTTP port | 9200 | (fixed) |
| Plans | free / standard / pro | plan in the create payload |
| RAM/heap ratio | 2:1 (container memory = 2 × JVM heap) | pinned by OpensearchPlanConfig |
| Connection user | admin (cycle 2 default; scoped app_user in cycle 3+) | not configurable |
| Connection URL | injected as OPENSEARCH_URL env var | via os-{app_id}-url Secret |
Plans (cycle 2, corrected 2:1 ratio)
| Plan | Nodes | JVM Heap | RAM (container) | Storage |
|---|---|---|---|---|
| free | 1 | 256Mi | 512Mi | 5Gi |
| standard | 1 | 1Gi | 2Gi | 20Gi |
| pro | 3 | 2Gi | 4Gi | 50Gi |
The mapping lives in
paas_database::opensearch_provisioner::opensearch_plan_config.
Cycle 1 shipped RAM == heap (1:1 ratio, OOM-prone under
indexing load) — cycle 2 corrected to the documented 2:1 ratio
so the off-heap budget (Lucene file cache + direct memory +
native libs) doesn't crash the pod under load.
RAM/heap ratio is 2:1 — never less
Container memory must be 2 × JVM heap. The JVM heap
holds the young+old generations; the off-heap region needs
roughly the same budget for Lucene's file cache, JVM direct
memory, and native libraries. Sizing them equal (1:1)
starves off-heap and the pod OOM-kills under any real
indexing load. The arithmetic is pinned by
opensearch_plan_*_memory_is_2x_heap cargo tests.
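The plan table and the 2:1 rule can be sketched as follows. This is a minimal illustration, not the real opensearch_plan_config: the names PlanSketch and plan_sketch are hypothetical, sizes are expressed in MiB/GiB integers for simplicity, and returning None for an unknown plan is an assumption.

```rust
/// Hypothetical stand-in for the plan table; not the real OpensearchPlanConfig.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PlanSketch {
    pub nodes: u32,
    pub heap_mib: u32,
    pub storage_gib: u32,
}

impl PlanSketch {
    /// Container memory is derived from the heap rather than stored,
    /// so the 2:1 ratio cannot drift out of sync in this sketch.
    pub fn memory_mib(&self) -> u32 {
        self.heap_mib * 2
    }
}

/// Maps a plan name to its resources (values taken from the plan table).
/// Assumption: unknown plan names yield None.
pub fn plan_sketch(plan: &str) -> Option<PlanSketch> {
    match plan {
        "free" => Some(PlanSketch { nodes: 1, heap_mib: 256, storage_gib: 5 }),
        "standard" => Some(PlanSketch { nodes: 1, heap_mib: 1024, storage_gib: 20 }),
        "pro" => Some(PlanSketch { nodes: 3, heap_mib: 2048, storage_gib: 50 }),
        _ => None,
    }
}
```

Deriving the container memory from the heap is one way to make a 1:1 regression impossible to express; the actual code may instead store both values and rely on the cargo tests mentioned above.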
Free plan: heap 256Mi is below recommended floor
free ships 256Mi heap (with 512Mi container RAM at
the 2:1 ratio). The OpenSearch operator's recommended floor
is 1Gi heap. Under any production indexing load free
will OOM, so recommend_plan_upgrade("free", "opensearch")
returns Some("standard") and the create endpoint logs the
upgrade hint via tracing::info!. The dashboard surfaces
the hint via the polling loop.
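The upgrade hint can be sketched as a lookup against the heap floor. Only the ("free", "opensearch") result of Some("standard") comes from the text above; the function name recommend_upgrade_sketch, the PLAN_HEAPS_MIB table, and the behaviour for other addon types are assumptions made for illustration.

```rust
/// Heap floor taken from the text above (1Gi), expressed in MiB.
const RECOMMENDED_HEAP_FLOOR_MIB: u32 = 1024;

/// Plans in ascending size order with their JVM heap in MiB (from the plan table).
static PLAN_HEAPS_MIB: [(&str, u32); 3] = [("free", 256), ("standard", 1024), ("pro", 2048)];

/// Hypothetical sketch: suggests the smallest plan whose heap meets the floor
/// when the current plan is below it. Not the real recommend_plan_upgrade.
pub fn recommend_upgrade_sketch(plan: &str, addon_type: &str) -> Option<&'static str> {
    if addon_type != "opensearch" {
        // Assumption: other addon types are out of scope for this sketch.
        return None;
    }
    let current = PLAN_HEAPS_MIB.iter().find(|(name, _)| *name == plan)?;
    if current.1 >= RECOMMENDED_HEAP_FLOOR_MIB {
        return None; // already at or above the floor, no hint needed
    }
    PLAN_HEAPS_MIB
        .iter()
        .find(|(_, heap)| *heap >= RECOMMENDED_HEAP_FLOOR_MIB)
        .map(|(name, _)| *name)
}
```

A caller would log the returned hint rather than reject the request, which matches the described behaviour of creating the free cluster anyway.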
Lifecycle
flowchart LR
A["POST /v1/apps/{id}/addons<br/>{type:'opensearch', plan, version}"]
A --> B["addons.rs::create_addon_generic"]
B --> C["credentials::generate_password()"]
C --> D["ensure_opensearch_url_secret<br/>os-{app}-url Secret"]
D --> E["ensure_opensearch_cluster<br/>OpenSearchCluster CR Patch::Apply"]
E --> F["OpenSearch Operator<br/>provisions StatefulSet + Service"]
F --> G["os-{app}.{ns} Service<br/>HTTP 9200"]
H["dashboard polls<br/>poll_addon_status opensearch branch"]
H --> I["get_opensearch_cluster<br/>parse_opensearch_status"]
I -- "green / yellow" --> J["app_addons.status = 'Ready'<br/>ready_at stamped"]
I -- "red" --> K["status = 'Failed'"]
I -- "_/missing" --> L["status = 'Creating'"]
OPENSEARCH_URL format
Materialised in the K8s Secret named os-{app_id}-url (key:
OPENSEARCH_URL). The dashboard's paas config:set integration
mounts it into the app pod's environment automatically.
The URL has the form http://admin:{generated_password}@os-{app_id}.paas-apps:9200.
Components:
- http:// (scheme): cycle 2 doesn't ship TLS yet (SecurityPlugin disabled by default in this POC). Cycle 3+ switches to https:// via cert-manager-issued certs.
- admin: Operator-managed default user. Cycle 3+ will provision a scoped app_user via the SecurityPlugin REST API; the URL shape stays the same, just with a new password (no breaking change to client code).
- {generated_password}: paas_database::credentials::generate_password() emits a hex uuid v4 derivative with ≥ 60 chars of entropy. Never hardcoded.
- os-{app_id}: general.serviceName set in the CR spec. The Operator emits a Service of the same name.
- paas-apps: the namespace where every tenant addon lives.
- 9200: OpenSearch's standard HTTP wire port.
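Assembling those components can be sketched in a few lines. The function name opensearch_url_sketch and its two-argument signature are hypothetical; the real opensearch_url formatter may take different inputs.

```rust
/// Hypothetical sketch of the URL assembly described above.
/// Assumes the password is plain hex, so no percent-encoding is needed.
pub fn opensearch_url_sketch(app_id: &str, password: &str) -> String {
    // scheme://user:password@service.namespace:port
    format!("http://admin:{}@os-{}.paas-apps:9200", password, app_id)
}
```

For example, opensearch_url_sketch("abc123", "deadbeef") yields http://admin:deadbeef@os-abc123.paas-apps:9200. A password containing characters outside the URL-safe set would need escaping, which this sketch deliberately leaves out.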
Status lifecycle
| Operator .status.health | app_addons.status | Reason |
|---|---|---|
| green | Ready | All shards active, full replicas |
| yellow | Ready | Primary shards OK, some replicas unassigned. Queryable: better to let tenants connect and see the degraded indicator than block on a transient yellow during a node restart. |
| red | Failed | A primary shard is missing; the cluster is not queryable |
| (missing or unknown) | Creating | Operator hasn't populated .status.health yet (defensive, no 5xx surface) |
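The projection in the table can be sketched as a single match. The enum AddonStatusSketch and the function project_health_sketch are hypothetical names; the real parse_opensearch_status may read the CR status differently.

```rust
/// Hypothetical mirror of the app_addons.status values in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AddonStatusSketch {
    Creating,
    Ready,
    Failed,
}

/// Maps the operator's .status.health (if present) to an addon status.
/// Anything missing or unrecognised stays in Creating instead of erroring.
pub fn project_health_sketch(health: Option<&str>) -> AddonStatusSketch {
    match health {
        Some("green") | Some("yellow") => AddonStatusSketch::Ready,
        Some("red") => AddonStatusSketch::Failed,
        _ => AddonStatusSketch::Creating,
    }
}
```

The catch-all arm is what makes the poll defensive: an empty or unexpected health string never turns into an error response.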
Versions
2.13 (or 2.13.0) is the accepted value for version. The
default is OPENSEARCH_DEFAULT_VERSION = "2.13.0" — anything
else (or None) falls back to the default so a malformed client
payload can't ship a poisoned version to the operator.
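The fallback rule amounts to an allow-list. In this sketch the names DEFAULT_VERSION_SKETCH and resolve_version_sketch are hypothetical, and mapping the short form 2.13 to the full 2.13.0 string is an assumption about how the two accepted spellings are reconciled.

```rust
/// Mirrors the documented default; the constant name here is hypothetical.
pub const DEFAULT_VERSION_SKETCH: &str = "2.13.0";

/// Returns a version string that is safe to hand to the operator.
/// Only allow-listed values pass; everything else falls back to the default,
/// so client input is never forwarded verbatim.
pub fn resolve_version_sketch(requested: Option<&str>) -> &'static str {
    match requested {
        // Assumption: both accepted spellings resolve to the full version.
        Some("2.13") | Some("2.13.0") => "2.13.0",
        _ => DEFAULT_VERSION_SKETCH,
    }
}
```

Returning a &'static str rather than echoing the input means a malformed payload value cannot reach the CR spec even by accident.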
Validation tests (DoD)
APP_ID=... # your app's UUID
TOKEN=$(paas auth print-token)
PAAS_URL=https://runtime.di2amp.com
# 1. Create the OpenSearch addon
curl -sk -X POST \
-H "Authorization: Bearer $TOKEN" \
-H "content-type: application/json" \
-d '{"addon_type":"opensearch","plan":"standard","version":"2.13"}' \
$PAAS_URL/v1/apps/$APP_ID/addons | jq .
# 2. Poll until Ready (≤ 5 min after operator install)
for i in 1 2 3 4 5 6 7 8 9 10; do
HEALTH=$(kubectl -n paas-apps get opensearchcluster os-$APP_ID -o jsonpath='{.status.health}' 2>/dev/null)
echo "tick $i: $HEALTH"
[[ "$HEALTH" == "green" || "$HEALTH" == "yellow" ]] && break
sleep 30
done
# 3. Verify OPENSEARCH_URL is mounted on the app pod
kubectl -n paas-apps exec deployment/app-$APP_ID-web -- env | grep OPENSEARCH_URL
# Expect: OPENSEARCH_URL=http://admin:...@os-{APP_ID}.paas-apps:9200
# 4. Smoke the cluster from inside the cluster
URL=$(kubectl -n paas-apps get secret os-$APP_ID-url -o jsonpath='{.data.OPENSEARCH_URL}' | base64 -d)
# -i without -t, plus --quiet, keeps kubectl's own messages out of the pipe to jq
kubectl -n paas-apps run os-cli --rm -i --quiet --image=curlimages/curl --restart=Never -- \
  curl -sk "$URL/_cluster/health" | jq .
# 5. Index a doc, search it, delete the index
kubectl -n paas-apps exec deployment/app-$APP_ID-web -- sh -c '
curl -sk -X POST "$OPENSEARCH_URL/test/_doc?refresh=true" -H "content-type: application/json" -d "{\"hello\":\"world\"}";
curl -sk "$OPENSEARCH_URL/test/_search" | jq .;
curl -sk -X DELETE "$OPENSEARCH_URL/test";
'
Implementation pointers
| Concern | File |
|---|---|
| Plan → resources mapping (ratio 2:1) | crates/database/src/opensearch_provisioner.rs::opensearch_plan_config |
| RFC-1123 cluster name | crates/database/src/opensearch_provisioner.rs::opensearch_cluster_name (cross-addon sanitize_dns_label) |
| OpenSearchCluster spec builder | crates/database/src/opensearch_provisioner.rs::build_opensearchcluster_spec |
| OPENSEARCH_URL formatter | crates/database/src/opensearch_provisioner.rs::opensearch_url |
| Secret materialiser | crates/database/src/opensearch_provisioner.rs::ensure_opensearch_url_secret |
| CR get-or-create | crates/database/src/opensearch_provisioner.rs::ensure_opensearch_cluster |
| Status projection (green+yellow → Ready) | crates/database/src/opensearch_provisioner.rs::parse_opensearch_status |
| Cross-addon plan upgrade hint | crates/database/src/opensearch_provisioner.rs::recommend_plan_upgrade |
| Route handler | crates/control-plane/src/routes/addons.rs::create_addon_generic (opensearch branch) |
| Polling loop | crates/control-plane/src/routes/addons.rs::poll_addon_status (opensearch branch) |
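The RFC-1123 naming concern in the table can be illustrated with a small sanitizer. This is a sketch under assumptions: the names sanitize_dns_label_sketch and cluster_name_sketch are hypothetical, and the exact rules of the real sanitize_dns_label (replacement character, truncation) are not documented here.

```rust
/// Hypothetical sketch of an RFC-1123 label sanitizer: lowercase ASCII
/// alphanumerics and '-', at most 63 characters, no leading or trailing '-'.
pub fn sanitize_dns_label_sketch(raw: &str) -> String {
    let mut label: String = raw
        .chars()
        .map(|c| {
            let c = c.to_ascii_lowercase();
            // Every disallowed character (including non-ASCII) becomes '-'.
            if c.is_ascii_lowercase() || c.is_ascii_digit() { c } else { '-' }
        })
        .collect();
    // The string is pure ASCII at this point, so byte truncation is safe.
    label.truncate(63);
    label.trim_matches('-').to_string()
}

/// Builds the os-{app_id} name used for the CR and Service in this page.
pub fn cluster_name_sketch(app_id: &str) -> String {
    sanitize_dns_label_sketch(&format!("os-{}", app_id))
}
```

For a UUID app_id the sanitizer is effectively a no-op; it only matters if an identifier ever contains uppercase letters, underscores, or dots.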
Cluster pre-requisites
The OpenSearch Operator must be installed before the addon can
be provisioned. The helm repo opster.github.io returns 404, so
paas's hot-fix runbook applies the operator manifest directly
from the GitHub release with kubectl apply:
# Find the latest release tag
LATEST=$(curl -s https://api.github.com/repos/opensearch-project/opensearch-k8s-operator/releases/latest \
| grep tag_name | cut -d'"' -f4)
# Apply the operator manifest cluster-wide
kubectl apply -f "https://github.com/opensearch-project/opensearch-k8s-operator/releases/download/${LATEST}/opensearch-operator.yaml"
# Sanity:
kubectl get crd | grep opensearch.opster.io
# Expect: opensearchclusters.opensearch.opster.io
kubectl -n opensearch-operator-system get pods
# Expect: opensearch-operator-controller-manager-... 2/2 Running
Documented in bilans/HOTFIXES.md so the next operator running
the runbook gets the correct fallback first try.
Limits (cycle 2)
- SecurityPlugin: disabled by default (POC). The admin user is operator-managed. Cycle 3+ provisions a scoped app_user via the SecurityPlugin REST API.
- TLS: cycle 2 ships HTTP only. cert-manager-issued certs and the https:// URL switch land in cycle 3+.
- Index management (sub-brick 42d): out of scope for paas (confirmed in the requirements spec). Tenants manage their own indices via the REST API.
- OpenSearch Dashboards UI: out of scope for cycles 1-2.
- Snapshot/backup: out of scope.
Related concepts
- Add-ons — umbrella addon flow that mysql / postgres / valkey / opensearch / clickhouse all hang off.
- MySQL Addon — sister addon backed by Oracle MySQL Operator.
- ClickHouse Addon — sister addon backed by Altinity ClickHouse Operator.