Deploy Platforms Instead of Charts

Torque packages the stack root, profiles, app assets, public access checks, and run evidence into one repeatable release program. The proof run used real payment events and verified S3 objects, Redpanda schema contracts and offsets, ClickHouse rows, Iceberg tables, Trino queries, Spark batch output, replay/backfill, and public endpoints.

Platform architecture: payments stream in real time while batch features rebuild from the lake

The API writes raw events to Redpanda and S3. Flink scores the stream with Ray Serve, ClickHouse stores decisions, Spark commits Iceberg tables through a REST catalog, Trino exposes analyst SQL, and Argo schedules replayable feature jobs.

Stream path: Payments API (public event ingress) -> Redpanda (payments.raw topic) -> Flink (continuous risk job) -> Flink (feature window) -> Ray Serve (risk model endpoint) -> ClickHouse (payment decisions)

Batch path: Argo (scheduled workflow) -> Spark (bounded feature job) -> S3 + Iceberg (curated features)

Telemetry path: workloads (API, jobs, model service) -> SigNoz (service health surface) -> ClickHouse (observability backend)

The central object in this article is the Torque stack file. It is not a values file for one chart. It is an ordered deployment contract for a full fraud platform: host setup, Kubernetes access, cloud storage, data services, application workloads, public checks, batch processing, replay, and final verification.

Helmfile can coordinate Helm releases. Argo CD and Flux can keep Kubernetes objects in sync with Git. Terraform and Pulumi can create cloud resources. Argo Workflows can run jobs. This stack uses that kind of tooling where it fits, but the stack boundary is wider. The deployment starts with a remote Linux host, creates a Firecracker-backed k3s lab, opens a controlled tunnel for local Kubernetes access, creates or checks the S3 bucket, installs the platform services, deploys the fraud workloads, runs Spark, runs replay, and verifies the data path from outside the cluster.
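The idea of an ordered deployment contract can be sketched in a few lines of Python. This is only an illustration: the stage names paraphrase the sequence described above, and `Step`, `run_in_order`, and `STAGES` are hypothetical names, not Torque's actual stack file format or API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One stage of the deployment contract (hypothetical shape)."""
    name: str
    run: Callable[[], bool]  # returns True when the stage succeeded


def run_in_order(steps: List[Step]) -> List[str]:
    """Run stages in declared order and stop at the first failure.

    Returns the names of the stages that completed.
    """
    completed: List[str] = []
    for step in steps:
        if not step.run():
            break
        completed.append(step.name)
    return completed


# Stage names paraphrase the order given in the text; they are assumptions.
STAGES = [
    "host-setup",
    "firecracker-k3s-lab",
    "kubernetes-tunnel",
    "s3-bucket",
    "platform-services",
    "fraud-workloads",
    "spark-batch",
    "replay",
    "external-verification",
]

if __name__ == "__main__":
    # Stub every stage as succeeding, just to show the ordering.
    print(run_in_order([Step(name, lambda: True) for name in STAGES]))
```

The point of the sketch is the ordering guarantee: a later stage such as external verification never runs unless every earlier stage, down to host setup, has succeeded.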

The source package lives under stacks/fraud-platform. The rendered stack file is attached here as stack.yaml. The production-shaped entrypoint is stack-packaged.yaml, with profile values such as values-prod.yaml. The lab profile expects TORQUE_LAB_SSH for the target host and TORQUE_LAB_PUBLIC_IP for public endpoint checks.
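Before running the lab profile, it can help to confirm that both variables are set. The following is a minimal pre-flight sketch, not part of Torque itself; only the two variable names come from the text, and `missing_lab_vars` is a hypothetical helper.

```python
import os
from typing import List, Mapping

# The two variables the lab profile expects, as named in the text.
REQUIRED_LAB_VARS = ("TORQUE_LAB_SSH", "TORQUE_LAB_PUBLIC_IP")


def missing_lab_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required lab variables that are unset or blank."""
    return [name for name in REQUIRED_LAB_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_lab_vars(os.environ)
    print("missing:", ", ".join(missing) if missing else "none")
```

Passing the environment in as a mapping keeps the check testable without touching the real process environment.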

Platform Architecture

The platform has two data paths. The stream path starts at the payments API. That API is the public event ingress. It accepts generated card payment events, writes each raw event to the payments.raw Redpanda topic, and stores a raw JSON object in S3. Redpanda is more than a queue in this stack. It also exposes a schema registry with subjects for raw payment events, risk events, and payment decisions, and the verification step checks those subjects for backward compatibility.
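To make "backward compatibility" concrete: a new schema is backward compatible when it can still read records written with the previous one. The sketch below is a deliberately simplified model of that rule over flat field maps. It is not the schema registry's own check (it ignores type promotion, nested records, and unions), and the field names are invented for illustration.

```python
from typing import Any, Dict, List

# Simplified schema: field name -> {"type": ..., optional "default": ...}
Schema = Dict[str, Dict[str, Any]]


def backward_incompatibilities(old: Schema, new: Schema) -> List[str]:
    """List reasons the new schema could not read records written with the old one."""
    problems: List[str] = []
    for name, spec in new.items():
        if name not in old:
            # Old records lack this field, so the reader needs a default.
            if "default" not in spec:
                problems.append(f"new field '{name}' has no default")
        elif old[name]["type"] != spec["type"]:
            problems.append(f"field '{name}' changed type")
    # Fields dropped from the new schema are fine: the reader ignores them.
    return problems


if __name__ == "__main__":
    # Hypothetical raw payment schema and two candidate revisions.
    old = {"payment_id": {"type": "string"}, "amount": {"type": "double"}}
    with_default = {**old, "channel": {"type": "string", "default": "card"}}
    without_default = {**old, "channel": {"type": "string"}}
    print(backward_incompatibilities(old, with_default))
    print(backward_incompatibilities(old, without_default))
```

Under this simplified rule, adding a field with a default passes and adding a required field fails, which is the kind of change a verification step of this type is meant to catch before consumers break.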

Flink consumes the raw topic and builds the scoring request. It keeps the stream processor separate from the model service. Ray Serve owns the model endpoint and returns the score. Flink writes the decision to ClickHouse and also produces risk output back through Redpanda. ClickHouse is the fast operator store in this design: decisions, fraud rates, merchant behavior, and batch summaries are available without reading lake files.
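The separation between stream processor and model service can be illustrated with plain Python, where the scorer and the decision sink are injected as callables. This is a sketch of the shape only, not the Flink job or the Ray Serve deployment; the field names, the 0.8 threshold, and the "review"/"approve" actions are assumptions made up for the example.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


def build_scoring_request(event: Event) -> Dict[str, Any]:
    """Select the fields the model needs (hypothetical feature names)."""
    return {"amount": event["amount"], "merchant_id": event["merchant_id"]}


def decide(
    event: Event,
    score_fn: Callable[[Dict[str, Any]], float],   # stands in for the model endpoint
    write_decision: Callable[[Event], None],       # stands in for the decision store
    threshold: float = 0.8,                        # assumed cut-off, not from the source
) -> Event:
    """Build the request, ask the model for a score, and record the decision."""
    score = score_fn(build_scoring_request(event))
    decision = {
        "payment_id": event["payment_id"],
        "score": score,
        "action": "review" if score >= threshold else "approve",
    }
    write_decision(decision)
    return decision


if __name__ == "__main__":
    stored: List[Event] = []
    # Stub model: treat large amounts as risky.
    stub_model = lambda req: 0.9 if req["amount"] > 1000 else 0.1
    event = {"payment_id": "p-1", "amount": 2500.0, "merchant_id": "m-7"}
    print(decide(event, stub_model, stored.append))
```

Because the processor only depends on a callable that returns a score, the model can be replaced or redeployed without changing the stream logic, which is the property the paragraph above describes.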

The batch path starts from S3. Argo submits a Spark workflow after the platform is reachable. Spark reads raw payment objects and risk decisions, computes aggregate fraud features, writes a curated JSON artifact back to S3, inserts batch rows into ClickHouse, and commits three Iceberg tables through the REST catalog: raw_payments, risk_events, and batch_feature_summary. Trino is the analyst surface over both stores. It queries ClickHouse for low-latency decisions and Iceberg for lakehouse tables. SigNoz covers service health and uses ClickHouse as its telemetry store.
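As an illustration of what an aggregate fraud feature might look like, here is a plain-Python sketch that computes a per-merchant fraud rate from decision rows. It stands in for the Spark job rather than reproducing it; the source does not list the actual features, and the `merchant_id` and `flagged` fields are assumed names.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def merchant_fraud_rates(decisions: Iterable[Dict[str, Any]]) -> Dict[str, float]:
    """Return the share of flagged payments per merchant."""
    totals: Dict[str, int] = defaultdict(int)
    flagged: Dict[str, int] = defaultdict(int)
    for row in decisions:
        totals[row["merchant_id"]] += 1
        if row["flagged"]:
            flagged[row["merchant_id"]] += 1
    return {merchant: flagged[merchant] / totals[merchant] for merchant in totals}


if __name__ == "__main__":
    rows = [
        {"merchant_id": "m-1", "flagged": True},
        {"merchant_id": "m-1", "flagged": False},
        {"merchant_id": "m-2", "flagged": False},
    ]
    print(merchant_fraud_rates(rows))
```

A bounded job of this kind is replayable by construction: rerunning it over the same raw objects yields the same summary, which is what makes backfill straightforward.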

The Kubernetes layout is also part of the architecture. The platform labels Firecracker nodes by workload role: control, observability, events, processing, machine learning and batch, and analytics. Redpanda runs in the data namespace. Flink runs in the stream namespace. Ray and Spark run in the machine learning namespace. The API and generator run in the apps namespace. SigNoz and its ClickHouse store run in observability. Trino and Iceberg REST run on the analytics node so SQL access is separate from stream processing and model serving.
