Manual Scaling

Manual scaling lets you change two independent things on a process:

  • Replicas — how many pods serve the workload (horizontal).
  • Resources — CPU/memory request and limit per pod (vertical).

Replicas go through POST /scale; resources go through PUT /resources. The dashboard's Scale page combines both in one UI; the CLI keeps the existing paas scale web=3 for replicas and adds web=L syntax for size.

How it works

sequenceDiagram
  participant U as Operator
  participant CP as Control Plane
  participant DB as Postgres (paas_tenant_quotas)
  participant K as Kubernetes

  U->>CP: PUT /v1/apps/{app}/resources { process_type, size }
  CP->>CP: size_to_resources("L") → cpu=2, mem=2Gi
  CP->>DB: SELECT quota_cpu, quota_memory_mi WHERE tenant_id = …
  alt requested ≤ quota
    CP->>K: patch Deployment (resources.requests/limits)
    K->>K: rolling update (RollingUpdate strategy)
    CP-->>U: 200 OK
  else requested > quota
    CP-->>U: 422 quota_exceeded
  end

T-shirt sizes

Pick a size and the platform sets the CPU and memory requests and limits for you:

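| Size | CPU req | Memory req | CPU limit | Memory limit |
|------|---------|------------|-----------|--------------|
| Free | 0.25 | 256 MiB | 0.5 | 512 MiB |
| S | 0.5 | 512 MiB | 1 | 1 GiB |
| M | 1 | 1 GiB | 2 | 2 GiB |
| L | 2 | 2 GiB | 4 | 4 GiB |
| XL | 4 | 4 GiB | 8 | 8 GiB |
| 2XL | 8 | 8 GiB | 16 | 16 GiB |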

An unknown size silently falls back to Free so a typo can't push a pod into a bucket the cluster can't satisfy.
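
The lookup and its fallback can be sketched as follows. The function name size_to_resources comes from the diagram above; the Resources tuple, the dictionary layout, and the exact-match (case-sensitive) comparison are assumptions made for illustration, not the control plane's actual code.

```python
from typing import NamedTuple


class Resources(NamedTuple):
    cpu_request: float      # CPU cores
    memory_request_mi: int  # MiB
    cpu_limit: float        # CPU cores
    memory_limit_mi: int    # MiB


# Values copied from the size table above (1 GiB = 1024 MiB).
SIZE_CATALOGUE = {
    "Free": Resources(0.25, 256, 0.5, 512),
    "S": Resources(0.5, 512, 1, 1024),
    "M": Resources(1, 1024, 2, 2048),
    "L": Resources(2, 2048, 4, 4096),
    "XL": Resources(4, 4096, 8, 8192),
    "2XL": Resources(8, 8192, 16, 16384),
}


def size_to_resources(size: str) -> Resources:
    """Look up a size; anything unrecognised falls back to Free."""
    # Assumption: names are matched exactly, so "l" would also fall back.
    return SIZE_CATALOGUE.get(size, SIZE_CATALOGUE["Free"])
```

Under this sketch, size_to_resources("Lg") returns the Free bucket rather than raising, which matches the fallback described above.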

You can also bypass the catalogue and pass cpu / memory directly:

curl -X PUT https://runtime.di2amp.com/api/v1/apps/$APP/resources \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"process_type": "web", "cpu": "750m", "memory": "768Mi"}'

When size is set, it takes precedence over the per-field overrides. If no explicit limit is given, the limit defaults to twice the request: QoS stays Burstable, but a runaway pod can't starve its neighbours.
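
A minimal sketch of those two rules (size wins over cpu / memory, and the limit is twice the request) might look like this. The helper names, the simplified quantity parsing (only the m, Mi and Gi suffixes), and the requirement that cpu and memory be sent together are all assumptions for illustration.

```python
from typing import Optional

# Requests per size as (CPU millicores, memory MiB); copied from the size table.
SIZE_REQUESTS = {
    "Free": (250, 256), "S": (500, 512), "M": (1000, 1024),
    "L": (2000, 2048), "XL": (4000, 4096), "2XL": (8000, 8192),
}


def parse_cpu_millis(value: str) -> int:
    """Parse '750m' or '0.5' into millicores."""
    if value.endswith("m"):
        return int(value[:-1])
    return round(float(value) * 1000)


def parse_memory_mi(value: str) -> int:
    """Parse '768Mi' or '2Gi' into MiB (only these two suffixes are handled)."""
    if value.endswith("Gi"):
        return int(value[:-2]) * 1024
    if value.endswith("Mi"):
        return int(value[:-2])
    raise ValueError(f"unsupported memory quantity: {value}")


def resolve_resources(size: Optional[str] = None,
                      cpu: Optional[str] = None,
                      memory: Optional[str] = None) -> dict:
    """Return requests and limits; size wins when it is sent with cpu/memory."""
    if size is not None:
        # Unknown sizes fall back to Free.
        cpu_m, mem_mi = SIZE_REQUESTS.get(size, SIZE_REQUESTS["Free"])
    elif cpu is not None and memory is not None:
        cpu_m, mem_mi = parse_cpu_millis(cpu), parse_memory_mi(memory)
    else:
        # Assumption: a partial override is rejected in this sketch.
        raise ValueError("send either size or both cpu and memory")
    # No explicit limit: the limit is twice the request.
    return {
        "requests": {"cpu": f"{cpu_m}m", "memory": f"{mem_mi}Mi"},
        "limits": {"cpu": f"{cpu_m * 2}m", "memory": f"{mem_mi * 2}Mi"},
    }
```

For the curl body above, this yields requests of 750m / 768Mi and limits of 1500m / 1536Mi. The same doubling reproduces every limit in the size table.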

Quota policy

Every tenant has a CPU and a memory cap stored in paas_tenant_quotas. Defaults — applied when the tenant has no row yet — are 4 CPU / 4096 MiB, which fits any size up to L inclusive. Pushing past the cap returns:

{
  "error": {
    "code": "quota_exceeded",
    "message": "quota_exceeded: cpu requested=8 available=4"
  }
}

with HTTP 422 Unprocessable Entity. The dashboard surfaces this as a toast: "Quota exceeded — upgrade your plan or pick a smaller size."
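
The admission check can be sketched as below. The function names are hypothetical, and the source does not say whether the compared value is the request or the limit, or whether replicas are multiplied in, so the sketch simply compares whatever numbers it is given against the cap. The memory message format is assumed to mirror the documented CPU one.

```python
DEFAULT_QUOTA_CPU = 4            # cores, applied when the tenant has no row
DEFAULT_QUOTA_MEMORY_MI = 4096   # MiB, applied when the tenant has no row


def _fmt(n) -> str:
    """Render 8.0 as '8' and 0.5 as '0.5'."""
    return str(int(n)) if float(n).is_integer() else str(n)


def _quota_error(resource: str, requested, available) -> dict:
    """Build the documented error envelope."""
    return {
        "error": {
            "code": "quota_exceeded",
            "message": (
                f"quota_exceeded: {resource} "
                f"requested={_fmt(requested)} available={_fmt(available)}"
            ),
        }
    }


def admit(requested_cpu, requested_memory_mi,
          quota_cpu=DEFAULT_QUOTA_CPU,
          quota_memory_mi=DEFAULT_QUOTA_MEMORY_MI):
    """Return (http_status, body): 200 when within quota, otherwise 422."""
    # "requested <= quota" passes, as in the sequence diagram.
    if requested_cpu > quota_cpu:
        return 422, _quota_error("cpu", requested_cpu, quota_cpu)
    if requested_memory_mi > quota_memory_mi:
        return 422, _quota_error("memory", requested_memory_mi, quota_memory_mi)
    return 200, {}
```

Calling admit(8, 8192) against the defaults reproduces the error body shown above; admit(4, 4096) passes because a value equal to the cap is allowed.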

To raise a tenant's cap (operator-only, billing flow lands later):

INSERT INTO paas_tenant_quotas (tenant_id, quota_cpu, quota_memory_mi)
VALUES ('acme', 16, 32768)
ON CONFLICT (tenant_id) DO UPDATE
  SET quota_cpu = EXCLUDED.quota_cpu,
      quota_memory_mi = EXCLUDED.quota_memory_mi,
      updated_at = NOW();
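
To see how the upsert and the "no row yet" default interact, the sketch below runs the same statement against an in-memory SQLite database. SQLite is only a stand-in for Postgres here: CURRENT_TIMESTAMP replaces NOW(), and the column types and the helper names are guesses, since the real schema is not shown.

```python
import sqlite3

DEFAULT_QUOTA = (4, 4096)  # (CPU cores, memory MiB) when the tenant has no row

UPSERT = """
INSERT INTO paas_tenant_quotas (tenant_id, quota_cpu, quota_memory_mi)
VALUES (?, ?, ?)
ON CONFLICT (tenant_id) DO UPDATE
  SET quota_cpu = excluded.quota_cpu,
      quota_memory_mi = excluded.quota_memory_mi,
      updated_at = CURRENT_TIMESTAMP
"""


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory table; the column types are assumptions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE paas_tenant_quotas ("
        " tenant_id TEXT PRIMARY KEY,"
        " quota_cpu INTEGER NOT NULL,"
        " quota_memory_mi INTEGER NOT NULL,"
        " updated_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def set_quota(conn: sqlite3.Connection, tenant_id: str, cpu: int, memory_mi: int) -> None:
    """Insert the tenant's cap, or overwrite it if a row already exists."""
    conn.execute(UPSERT, (tenant_id, cpu, memory_mi))


def get_quota(conn: sqlite3.Connection, tenant_id: str) -> tuple:
    """Return (quota_cpu, quota_memory_mi), falling back to the defaults."""
    row = conn.execute(
        "SELECT quota_cpu, quota_memory_mi FROM paas_tenant_quotas"
        " WHERE tenant_id = ?",
        (tenant_id,),
    ).fetchone()
    return row if row is not None else DEFAULT_QUOTA
```

Running set_quota twice for the same tenant leaves a single row, which is why the statement is safe to re-run when adjusting a cap.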

UI

Open /apps/{id}/scale. Each process gets its own card:

  • Status line: 2/2 ready · 0.25 CPU · 256Mi
  • Size dropdown: Free / S / M / L / XL / 2XL
  • Apply button (disabled if the picked size matches the current).

Apply triggers PUT /resources; on success the page refreshes the scale query and toasts "Resources updated". On quota_exceeded (422), the toast tells the operator how to recover.
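
The card's behaviour can be approximated with three small functions: whether Apply is enabled, the PUT request it sends, and how the response is classified. This is a sketch only; the function names, the Content-Type header, and the outcome labels are assumptions, and the base URL is taken from the curl example in this page. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://runtime.di2amp.com/api"  # from the curl example


def apply_enabled(current_size: str, picked_size: str) -> bool:
    """Apply is disabled when the picked size matches the current one."""
    return picked_size != current_size


def build_resources_request(app: str, token: str,
                            process_type: str, size: str) -> urllib.request.Request:
    """Build (but do not send) the PUT /resources call."""
    body = json.dumps({"process_type": process_type, "size": size}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/v1/apps/{app}/resources",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed to be required
        },
    )


def classify_response(status: int, body: dict) -> str:
    """Map a response to the outcome the UI reacts to."""
    if status == 200:
        return "updated"          # page refreshes and toasts "Resources updated"
    if status == 422 and body.get("error", {}).get("code") == "quota_exceeded":
        return "quota_exceeded"   # page shows the quota toast
    return "error"                # hypothetical catch-all, not described above
```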

CLI

Replicas — same as before:

paas scale web=3

Size (cycle 2 helper, full CLI wire-up lands in cycle 3):

# parses to ScalePair::Size { process: "web", size: "L" }
paas scale web=L

web=Lg (a typo) fails immediately with "invalid value: Lg (expected integer or one of Free/S/M/L/XL/2XL)"; the CLI never silently coerces an unknown value.
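
The parsing rule (an integer means replicas, a known size name means size, anything else is an error) can be sketched like this. The Replicas and Size classes loosely mirror the ScalePair shape named in the comment above; their names and fields, and the parse_scale_pair function itself, are illustrative assumptions rather than the CLI's actual code.

```python
from dataclasses import dataclass
from typing import Union

SIZES = ("Free", "S", "M", "L", "XL", "2XL")


@dataclass(frozen=True)
class Replicas:
    process: str
    count: int


@dataclass(frozen=True)
class Size:
    process: str
    size: str


def parse_scale_pair(arg: str) -> Union[Replicas, Size]:
    """Parse 'web=3' into Replicas and 'web=L' into Size; reject anything else."""
    process, sep, value = arg.partition("=")
    if not sep or not process:
        raise ValueError(f"expected process=value, got: {arg}")
    # Plain ASCII digits mean a replica count ("2XL" is not all digits).
    if value.isascii() and value.isdigit():
        return Replicas(process, int(value))
    if value in SIZES:
        return Size(process, value)
    # No fallback here: a typo is an error, never a silent coercion.
    raise ValueError(
        f"invalid value: {value} (expected integer or one of {'/'.join(SIZES)})"
    )
```

Unlike the size lookup on the server side, this parser rejects unknown names up front, so the mistake surfaces before any request is made.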

API endpoints

| Verb | Path | Body | Notes |
|------|------|------|-------|
| GET | /v1/apps/{id}/scale | | Returns one row per process: {type, replicas, ready, size, cpu, memory} |
| POST | /v1/apps/{id}/scale | {process_type, replicas} | Replicas only (existing) |
| PUT | /v1/apps/{id}/resources | {process_type, size?, cpu?, memory?} | Resources, runs quota admission |

See also

  • Rolling Deploy — the engine that applies the resource patch with zero downtime
  • Blueprint paas.toml — declare default resources up-front in [resources] and [scaling]