Manual Scaling

Manual scaling lets you change two independent things on a process:

Replicas — how many pods serve the workload (horizontal).
Resources — CPU/memory request and limit per pod (vertical).

Replicas are changed with POST /v1/apps/{id}/scale; resources are changed with PUT /v1/apps/{id}/resources. The dashboard's Scale page combines both in one UI; the CLI keeps the existing paas scale web=3 form for replicas and adds a web=L form for size.

How it works

sequenceDiagram
  participant U as Operator
  participant CP as Control Plane
  participant DB as Postgres (paas_tenant_quotas)
  participant K as Kubernetes

  U->>CP: PUT /v1/apps/{app}/resources { process_type, size }
  CP->>CP: size_to_resources("L") → cpu=2, mem=2Gi
  CP->>DB: SELECT quota_cpu, quota_memory_mi WHERE tenant_id = …
  alt requested ≤ quota
    CP->>K: patch Deployment (resources.requests/limits)
    K->>K: rolling update (RollingUpdate strategy)
    CP-->>U: 200 OK
  else requested > quota
    CP-->>U: 422 quota_exceeded
  end
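The "patch Deployment" step in the diagram can be pictured as a strategic-merge patch on the pod template. The sketch below only builds the patch body; the function name and the container name "web" are assumptions for illustration, and the source does not say how the control plane actually issues the patch.

```python
import json


def build_resources_patch(container, cpu_req, mem_req, cpu_limit, mem_limit):
    """Build a strategic-merge patch body that sets one container's resources.

    Hypothetical helper: the real control-plane code is not shown in this page.
    """
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            # Strategic merge matches container entries by name.
                            "name": container,
                            "resources": {
                                "requests": {"cpu": cpu_req, "memory": mem_req},
                                "limits": {"cpu": cpu_limit, "memory": mem_limit},
                            },
                        }
                    ]
                }
            }
        }
    }


# Size L from the T-shirt table: 2 CPU / 2Gi requested, 4 CPU / 4Gi limit.
patch = build_resources_patch("web", "2", "2Gi", "4", "4Gi")
print(json.dumps(patch, indent=2))
```

Because the patch changes the pod template, Kubernetes starts a new rollout of the Deployment, which is the rolling update shown in the diagram.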

T-shirt sizes

Pick a size and let the platform pick CPU + memory:

Size	CPU request (cores)	Memory request	CPU limit (cores)	Memory limit
Free	0.25	256 MiB	0.5	512 MiB
S	0.5	512 MiB	1	1 GiB
M	1	1 GiB	2	2 GiB
L	2	2 GiB	4	4 GiB
XL	4	4 GiB	8	8 GiB
2XL	8	8 GiB	16	16 GiB

An unknown size falls back to Free without raising an error, so a typo can't push a pod into a bucket the cluster can't satisfy. The CLI is stricter and rejects unknown sizes up front (see CLI).
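The catalogue and its fallback can be sketched as a lookup table. The values are copied from the table above and the function name size_to_resources comes from the sequence diagram; the Resources type and the exact, case-sensitive match are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu_request: float      # cores
    memory_request_mi: int  # MiB
    cpu_limit: float        # cores
    memory_limit_mi: int    # MiB


# Values taken from the T-shirt size table.
SIZES = {
    "Free": Resources(0.25, 256, 0.5, 512),
    "S": Resources(0.5, 512, 1, 1024),
    "M": Resources(1, 1024, 2, 2048),
    "L": Resources(2, 2048, 4, 4096),
    "XL": Resources(4, 4096, 8, 8192),
    "2XL": Resources(8, 8192, 16, 16384),
}


def size_to_resources(size: str) -> Resources:
    """Look up a size; anything unknown falls back to Free."""
    return SIZES.get(size, SIZES["Free"])


print(size_to_resources("L"))
print(size_to_resources("Lg"))  # typo, resolves to the Free bucket
```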

You can also bypass the catalogue and pass cpu / memory directly:

curl -X PUT "https://runtime.di2amp.com/api/v1/apps/$APP/resources" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"process_type": "web", "cpu": "750m", "memory": "768Mi"}'

When size is set, it takes precedence over the per-field overrides. When no explicit limit is given, the limit defaults to twice the request, so the example above yields limits of 1500m CPU and 1536Mi memory. QoS stays Burstable, but a runaway pod can't starve its neighbours.
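The precedence and doubling rules can be sketched as follows. The helper names are hypothetical, and the quantity parsing handles only the forms used in this page (millicores or plain cores for CPU, Mi or Gi for memory); the platform's real parsing is not documented here.

```python
def parse_cpu_millicores(value: str) -> int:
    """'750m' -> 750, '2' -> 2000, '0.5' -> 500."""
    if value.endswith("m"):
        return int(value[:-1])
    return round(float(value) * 1000)


def parse_memory_mi(value: str) -> int:
    """'768Mi' -> 768, '2Gi' -> 2048. Other suffixes are out of scope here."""
    if value.endswith("Gi"):
        return int(value[:-2]) * 1024
    if value.endswith("Mi"):
        return int(value[:-2])
    raise ValueError(f"unsupported memory quantity: {value}")


def with_default_limits(cpu: str, memory: str) -> dict:
    """Derive limits as twice the request when no explicit limit is given."""
    cpu_m = parse_cpu_millicores(cpu)
    mem_mi = parse_memory_mi(memory)
    return {
        "requests": {"cpu": f"{cpu_m}m", "memory": f"{mem_mi}Mi"},
        "limits": {"cpu": f"{cpu_m * 2}m", "memory": f"{mem_mi * 2}Mi"},
    }


def resource_source(body: dict) -> str:
    """A size in the request body wins over cpu/memory overrides."""
    return "size" if body.get("size") else "fields"


print(with_default_limits("750m", "768Mi"))
print(resource_source({"process_type": "web", "size": "L", "cpu": "750m"}))
```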

Quota policy

Every tenant has a CPU and a memory cap stored in paas_tenant_quotas. Defaults — applied when the tenant has no row yet — are 4 CPU / 4096 MiB, which fits any size up to L inclusive. Pushing past the cap returns:

{
  "error": {
    "code": "quota_exceeded",
    "message": "quota_exceeded: cpu requested=8 available=4"
  }
}

with HTTP 422 Unprocessable Entity. The dashboard surfaces this as a toast: "Quota exceeded — upgrade your plan or pick a smaller size."
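A minimal sketch of the admission check, assuming the caller has already decided which figure counts as "requested" (the page does not say whether requests or limits are compared, or whether replicas are multiplied in). The function name is hypothetical, "available" is treated as the tenant cap, and the memory message simply mirrors the documented CPU message.

```python
DEFAULT_QUOTA_CPU = 4            # cores, used when the tenant has no row
DEFAULT_QUOTA_MEMORY_MI = 4096   # MiB, used when the tenant has no row


def check_quota(requested_cpu, requested_memory_mi, quota_row=None):
    """Return None when admitted, otherwise (http_status, error_body).

    quota_row is a (quota_cpu, quota_memory_mi) tuple, or None when the
    tenant has no row in paas_tenant_quotas yet.
    """
    quota_cpu, quota_memory_mi = quota_row or (
        DEFAULT_QUOTA_CPU,
        DEFAULT_QUOTA_MEMORY_MI,
    )
    checks = (
        ("cpu", requested_cpu, quota_cpu),
        # Assumption: memory failures use the same message shape as cpu.
        ("memory_mi", requested_memory_mi, quota_memory_mi),
    )
    for name, requested, available in checks:
        if requested > available:
            message = f"quota_exceeded: {name} requested={requested} available={available}"
            return 422, {"error": {"code": "quota_exceeded", "message": message}}
    return None


print(check_quota(8, 8192))               # rejected under the default cap
print(check_quota(8, 8192, (16, 32768)))  # admitted after the cap is raised
```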

To raise a tenant's cap (operator-only, billing flow lands later):

INSERT INTO paas_tenant_quotas (tenant_id, quota_cpu, quota_memory_mi)
VALUES ('acme', 16, 32768)
ON CONFLICT (tenant_id) DO UPDATE
  SET quota_cpu = EXCLUDED.quota_cpu,
      quota_memory_mi = EXCLUDED.quota_memory_mi,
      updated_at = NOW();

UI

Open /apps/{id}/scale. Each process gets its own card:

Status line: 2/2 ready · 0.25 CPU · 256Mi
Size dropdown: Free / S / M / L / XL / 2XL
Apply button (disabled when the selected size matches the current one).

Apply triggers PUT /resources; on success the page refreshes the scale query and toasts "Resources updated". On quota_exceeded (422), the toast tells the operator how to recover.

CLI

Replicas — same as before:

paas scale web=3

Size (cycle 2 helper, full CLI wire-up lands in cycle 3):

# parses to ScalePair::Size { process: "web", size: "L" }
paas scale web=L

A typo such as web=Lg fails immediately with "invalid value: Lg (expected integer or one of Free/S/M/L/XL/2XL)"; nothing is silently coerced.
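The parsing rule can be illustrated with a short Python sketch. It is not the CLI's own code (the ScalePair comment above points at a different implementation); the class and function names are hypothetical, and case-sensitive size matching is an assumption.

```python
from dataclasses import dataclass

SIZES = ("Free", "S", "M", "L", "XL", "2XL")


@dataclass(frozen=True)
class Replicas:
    process: str
    count: int


@dataclass(frozen=True)
class Size:
    process: str
    size: str


def parse_scale_pair(arg: str):
    """Parse PROCESS=VALUE into a replica count or a T-shirt size."""
    process, sep, value = arg.partition("=")
    if not sep or not process:
        raise ValueError(f"expected PROCESS=VALUE, got: {arg}")
    # A purely numeric value is a replica count; "2XL" is not numeric.
    if value.isascii() and value.isdigit():
        return Replicas(process, int(value))
    if value in SIZES:
        return Size(process, value)
    # No fallback here: an unknown value is an error.
    raise ValueError(
        f"invalid value: {value} (expected integer or one of {'/'.join(SIZES)})"
    )


print(parse_scale_pair("web=3"))
print(parse_scale_pair("web=L"))
```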

API endpoints

Verb	Path	Body	Notes
GET	`/v1/apps/{id}/scale`	—	Returns one row per process: `{type, replicas, ready, size, cpu, memory}`
POST	`/v1/apps/{id}/scale`	`{process_type, replicas}`	Replicas only (existing)
PUT	`/v1/apps/{id}/resources`	`{process_type, size?, cpu?, memory?}`	Resources, runs quota admission
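The three endpoints can be exercised from a small client. The sketch below only builds the requests with the standard library; the base URL is taken from the curl example earlier on this page, while the function names and the JSON Content-Type header are assumptions.

```python
import json
import urllib.request

BASE_URL = "https://runtime.di2amp.com/api"  # from the curl example


def build_request(method, path, token, body=None):
    """Build (but do not send) an authenticated request."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def get_scale(app, token):
    return build_request("GET", f"/v1/apps/{app}/scale", token)


def set_replicas(app, token, process_type, replicas):
    body = {"process_type": process_type, "replicas": replicas}
    return build_request("POST", f"/v1/apps/{app}/scale", token, body)


def set_size(app, token, process_type, size):
    body = {"process_type": process_type, "size": size}
    return build_request("PUT", f"/v1/apps/{app}/resources", token, body)


req = set_size("my-app", "example-token", "web", "L")
print(req.get_method(), req.full_url)
```

Passing a built request to urllib.request.urlopen sends it; a quota rejection would then surface as an urllib.error.HTTPError with code 422.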

Rolling Deploy — the engine that applies the resource patch with zero downtime
Blueprint paas.toml — declare default resources up-front in [resources] and [scaling]