You Don't Need Kubernetes

Introduction
In the modern DevOps landscape, Kubernetes (K8s) has become the unspoken default. It’s the resume-padder, the “industry standard,” and the tool every VC-backed startup assumes they need to reach mythical “Google scale.”
But here is the uncomfortable truth: For 90% of companies, Kubernetes is a premature optimization that burns money and developer cycles.
When we built Sensyze DataFlow—a complex engine requiring FastAPI, Next.js, Temporal workflows, and Dask distributed computing—we felt the pressure to adopt GKE (Google Kubernetes Engine). We crunched the numbers, looked at the operational complexity, and realized that managing K8s would require a dedicated platform team we didn’t have.
Instead, we chose the Slim Stack: Nomad + Consul + OpenTofu.
This guide is not a hot take. It is a data-driven, practical roadmap on how to build production-grade infrastructure that handles millions of requests without the Kubernetes tax.
1. The Kubernetes Tax: By The Numbers
Before looking at code, let’s look at the cold, hard data. Why are companies like Roblox, Trivago, and Cloudflare running Nomad at massive scale?
The “Control Plane” Overhead
Kubernetes isn’t just software; it’s a distributed operating system. To run it, you pay a “tax” in both compute and human hours.
| Feature | GKE / EKS (K8s) | Nomad + Consul |
|---|---|---|
| Control Plane Cost | ~$72/mo (Cluster Fee) + System Nodes | $0 (Runs on your existing nodes) |
| Agent RAM Usage | ~1.5 GB - 2 GB (Kubelet + Proxy) | ~60-100 MB (Nomad + Consul Agents) |
| Minimum Cluster | 3 Dedicated Master Nodes | 3 Shared Server Nodes (can run workloads) |
| Updates | Risky, often requires node rotation | Single binary swap (yum update nomad) |
The Impact: On a small cluster of 5 nodes, the K8s agents consume ~10 GB of RAM (5 × ~2 GB) just to exist. Nomad and Consul consume ~500 MB (5 × ~100 MB). The difference is roughly an extra server’s worth of memory that you pay for but cannot give to your workloads.
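The arithmetic behind these figures can be checked with a few lines of shell. This is a rough sketch, not a benchmark: it assumes the upper per-node figures from the table (about 2 GB for the Kubernetes agents, about 100 MB for Nomad plus Consul), a managed-cluster fee of about $0.10 per cluster-hour, and a 720-hour month. The function name `overhead_mb` is made up for illustration.

```shell
#!/bin/sh
# Rough cluster overhead, using the table's upper per-node figures as assumptions.

# overhead_mb NODES PER_NODE_MB -> total agent memory in MB
overhead_mb() {
  echo $(( $1 * $2 ))
}

nodes=5
k8s_mb=$(overhead_mb "$nodes" 2048)   # ~2 GB per node for kubelet + proxy
nomad_mb=$(overhead_mb "$nodes" 100)  # ~100 MB per node for Nomad + Consul agents

# Assumption: ~$0.10 per cluster-hour (10 cents), 720 hours per month.
cluster_fee_cents=$(( 10 * 720 ))

echo "K8s agents:      ${k8s_mb} MB"
echo "Nomad + Consul:  ${nomad_mb} MB"
echo "Difference:      $(( k8s_mb - nomad_mb )) MB"
echo "Cluster fee:     \$$(( cluster_fee_cents / 100 )) per month"
```

With these inputs the difference comes to 9740 MB, or about 9.5 GB, and the cluster fee to $72 per month. Swap in your own node count and measured agent footprint to see how the gap changes.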
Real-World Savings
- HashiCorp Data: Running 2 million containers on Nomad with Spot instances resulted in a 68% cost reduction compared to standard on-demand K8s clusters.
- Spot by NetApp (formerly Spotinst): Reports that running orchestrated workloads on Spot instances typically saves 60-80% on compute bills.
2. The Migration Guide: Building the Slim Stack
We use OpenTofu (the open-source fork of Terraform) to define everything. All infrastructure lives in code, and it follows the immutable-infrastructure model: servers are replaced with freshly provisioned ones rather than patched in place.
Step 1: The Infrastructure (OpenTofu)
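We don’t need fancy K8s Node Pools. We need simple, cheap Virtual Machines. We specifically use Spot Instances because Nomad recovers from preemption (node death) faster than K8s does with its default settings.
Why Nomad Wins on Spot: With default settings, Kubernetes can take several minutes to move pods off a dead node. The node is marked NotReady in under a minute, but pods tolerate that state for another 5 minutes before they are evicted and rescheduled. Nomad clients send heartbeats to the servers, so a node that stops responding is typically marked down within tens of seconds and its allocations are rescheduled right away. Better still, the shutdown script below drains the node as soon as the preemption notice arrives, so rescheduling starts before the VM disappears.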
main.tf - Provisioning a Robust Spot Instance
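resource "google_compute_instance" "nomad_client" {
  count        = 3
  name         = "nomad-worker-spot-${count.index}"
  machine_type = "e2-medium"
  zone         = "us-central1-a"

  # Required by the provider. The image and network are assumptions; adjust to your project.
  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12"
    }
  }
  network_interface {
    network = "default"
  }

  # Critical: Graceful shutdown script for Spot preemption.
  # GCP allows roughly 30 seconds after the preemption notice, so the drain deadline must fit inside it.
  metadata = {
    shutdown-script = "#!/bin/bash\nnomad node drain -enable -self -yes -deadline 25s"
  }
  metadata_startup_script = file("scripts/startup.sh")

  scheduling {
    preemptible        = true
    automatic_restart  = false
    provisioning_model = "SPOT"
  }
  service_account {
    scopes = ["cloud-platform"]
  }
}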
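The inline shutdown-script can also live in its own file, which makes the timing easier to reason about. The sketch below is a hypothetical `scripts/shutdown.sh`. It assumes the VM gets roughly 30 seconds between the preemption notice and power-off, and it derives the drain deadline from that window. The variable and function names are illustrative, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical scripts/shutdown.sh: a longer form of the inline shutdown-script.
# Assumption: roughly 30 seconds pass between the preemption notice and power-off.

PREEMPT_WINDOW_S=30   # assumed preemption window, in seconds
SAFETY_MARGIN_S=5     # time kept in reserve so the agent can report the drain

# drain_deadline WINDOW MARGIN -> deadline string for `nomad node drain`
drain_deadline() {
  echo "$(( $1 - $2 ))s"
}

# drain_self -> mark this node ineligible and migrate its allocations away
drain_self() {
  nomad node drain -enable -self -yes \
    -deadline "$(drain_deadline "$PREEMPT_WINDOW_S" "$SAFETY_MARGIN_S")"
}

# Drain only where the nomad binary is installed.
if command -v nomad >/dev/null 2>&1; then
  drain_self
fi
```

A deadline longer than the preemption window gains nothing, because the VM is powered off before the drain can finish. Keeping the deadline a few seconds inside the window gives allocations their best chance of a clean stop.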
Step 2: The Networking (Consul)
In Kubernetes, networking is a beast involving CNI plugins, CoreDNS, and Ingress Controllers. In our stack, Consul handles it all with zero magic.
Consul provides two things:
- Service Discovery: “Where is the Redis container?”
- Service Mesh (Optional): “Encrypt traffic between API and DB.”
How it works in practice:
We run a local DNS forwarder (dnsmasq) on every node. When your app tries to reach redis.service.consul, dnsmasq forwards the query to the Consul agent on the same node, which resolves it to a healthy instance. No central bottleneck, no complex iptables rules.
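The forwarding rule itself is a single line of dnsmasq configuration. The sketch below generates it, assuming Consul's DNS interface listens on its default address, 127.0.0.1 port 8600. The function name `consul_dnsmasq_conf` is made up for this example.

```shell
#!/bin/sh
# Sketch: forward *.consul lookups from dnsmasq to the local Consul agent.
# Assumption: Consul's DNS interface is on its default 127.0.0.1:8600.

# consul_dnsmasq_conf ADDR PORT -> prints the dnsmasq forwarding rule
consul_dnsmasq_conf() {
  printf 'server=/consul/%s#%s\n' "$1" "$2"
}

consul_dnsmasq_conf 127.0.0.1 8600
```

On a typical install you would redirect that output to a file under `/etc/dnsmasq.d/` (the exact path is an assumption and depends on your distribution), restart dnsmasq, and then confirm with `dig redis.service.consul` that a registered service resolves.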
Step 3: The Orchestrator (Nomad)
Nomad uses HCL, which is readable by humans, not just machines. A K8s Deployment + Service + Ingress + ConfigMap is easily 100+ lines of YAML. In Nomad, it’s one file.
dataflow.nomad.hcl - A Production Job
This job defines our API, including:
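- Canary Deployments: Start one instance of the new version alongside the old ones before the full rollout.
- Auto-promotion: Promote the canary automatically once its health checks pass, and revert to the last good version if they fail.
- Tags: Service tags that Traefik reads to configure routing and TLS automatically.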
job "dataflow-api" {
  datacenters = ["dc1"]
  type        = "service"

  # Rolling updates with Canary
  update {
    max_parallel     = 1
    canary           = 1
    min_healthy_time = "10s"
    healthy_deadline = "5m"
    auto_revert      = true
    auto_promote     = true
  }

  group "api" {
    count = 3 # High Availability

    network {
      port "http" { to = 8000 }
    }

    service {
      name = "dataflow-api"
      port = "http"
      # Integration with Traefik (Load Balancer)
      tags = [
        "traefik.enable=true",
        "traefik.http.routers.api.rule=Host(`api.appler.xyz`)",
        "traefik.http.routers.api.tls.certresolver=myresolver"
      ]
      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "server" {
      driver = "docker"
      config {
        image = "sensyze/dataflow-api:v2.1.0"
        ports = ["http"]
      }
      # The template below reads from Vault, so the task needs a vault block.
      # The policy name is an assumption; use one that can read secret/data/db.
      vault {
        policies = ["dataflow-api"]
      }
      # Inject secrets at runtime - No K8s Secrets complexity
      template {
        data = <<EOH
DATABASE_URL="{{ with secret "secret/data/db" }}{{ .Data.data.url }}{{ end }}"
REDIS_HOST="redis.service.consul"
EOH
        destination = "secrets/file.env"
        env         = true
      }
      resources {
        cpu    = 500 # MHz
        memory = 256 # MB
      }
    }
  }
}
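Before submitting a job file like the one above, it can be syntax-checked with `nomad job validate` and dry-run against the scheduler with `nomad job plan`. One detail worth knowing: `nomad job plan` exits with 1 when the plan would create or destroy allocations, which is the normal case for a new version and not an error. The sketch below wraps that logic. The function names `interpret_plan_exit` and `preflight` are hypothetical, and it assumes `nomad` is on the PATH with `NOMAD_ADDR` set.

```shell
#!/bin/sh
# Sketch of a pre-deploy check. Nothing runs until preflight is called.

# interpret_plan_exit CODE -> describes a `nomad job plan` exit code
interpret_plan_exit() {
  case "$1" in
    0) echo "no-changes" ;;  # nothing would be created or destroyed
    1) echo "changes" ;;     # allocations would be created or destroyed
    *) echo "error" ;;       # the plan itself failed
  esac
}

# preflight JOBFILE -> validate the file, then dry-run the scheduler
preflight() {
  nomad job validate "$1" || return 1
  nomad job plan "$1"
  code=$?
  outcome=$(interpret_plan_exit "$code")
  echo "plan outcome: $outcome"
  [ "$outcome" != "error" ]
}
```

Calling `preflight dataflow.nomad.hcl` in a CI step fails the step only on a real error, so a plan that reports pending changes does not block the deploy.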
3. The “Missing Manual”: Day 2 Operations
Most tutorials stop at “Hello World.” Here is how you actually run this.
Persistent Storage (The CSI Question)
“Stateless is easy, but where do I put my Database?” Nomad supports CSI (Container Storage Interface) plugins, just like Kubernetes. You can mount AWS EBS or GCP Persistent Disks seamlessly.
# Volume claim (goes inside the job's group block)
volume "mysql-data" {
  type            = "csi"
  source          = "mysql-vol" # ID of a volume already registered with Nomad
  read_only       = false
  attachment_mode = "file-system"
  access_mode     = "single-node-writer"
}
However, for 90% of startups, we recommend using managed databases (RDS, Cloud SQL). Don’t run your primary DB in any orchestrator if you are a small team. It’s not worth the sleep you’ll lose.
CI/CD Pipeline
You don’t need ArgoCD. A simple GitHub Actions workflow is enough to deploy to Nomad.
# .github/workflows/deploy.yml
name: Deploy to Nomad
on:
  push:
    branches: [ "main" ]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install Nomad
        run: |
          wget -O nomad.zip https://releases.hashicorp.com/nomad/1.6.0/nomad_1.6.0_linux_amd64.zip
          unzip nomad.zip && chmod +x nomad && sudo mv nomad /usr/local/bin/
      - name: Deploy Job
        env:
          NOMAD_ADDR: ${{ secrets.NOMAD_ADDR }}
          NOMAD_TOKEN: ${{ secrets.NOMAD_TOKEN }}
        run: |
          nomad job run -detach jobs/dataflow-api.nomad.hcl
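Note that `-detach` returns as soon as the job is submitted, so the workflow step succeeds even if the rollout later fails and is reverted. If you want the pipeline to reflect the real outcome, one option is to poll the latest deployment until it reaches a final state. The sketch below is one way to do that, not the only one. It assumes `nomad` and `jq` are on the PATH, that `NOMAD_ADDR` and `NOMAD_TOKEN` are exported as in the workflow, and that the deployment JSON exposes a `Status` field. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch of a post-deploy wait loop. Nothing runs until wait_for_deployment is called.

# is_terminal STATUS -> succeeds once a deployment has stopped progressing
is_terminal() {
  case "$1" in
    successful|failed|cancelled) return 0 ;;
    *) return 1 ;;
  esac
}

# wait_for_deployment JOB [TRIES] -> polls the latest deployment of JOB every 5 s
wait_for_deployment() {
  job=$1
  tries=${2:-60}
  while [ "$tries" -gt 0 ]; do
    status=$(nomad job deployments -latest -json "$job" | jq -r '.Status')
    if is_terminal "$status"; then
      echo "deployment $status"
      [ "$status" = "successful" ]
      return
    fi
    tries=$(( tries - 1 ))
    sleep 5
  done
  echo "timed out waiting for deployment" >&2
  return 1
}
```

Adding `wait_for_deployment dataflow-api` after the `nomad job run` line makes the step fail when the deployment fails, is cancelled, or does not finish within about five minutes.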
4. Case Studies: Who is actually doing this?
You won’t be alone. Some of the world’s leanest (and largest) engineering teams choose this stack.
Roblox: Scale without Sprawl
Roblox runs one of the largest Nomad deployments in the world, managing over 10,000 nodes.
- Why? They needed an orchestrator that was simpler to operate than Kubernetes at massive scale.
- Result: They achieved “single pane of glass” management for their game servers globally without the crushing complexity of managing 100+ K8s control planes.
Trivago: Efficiency & Isolation
Trivago migrated from bare-metal scripts to Nomad.
- Why? They wanted the benefits of container scheduling (bin packing, isolation) without the overhead.
- Key Quote: “Nomad is a single binary… It’s easy to understand, easy to debug, and easy to operate.” — Matthias Endler, Trivago.
Cloudflare: The Edge
Cloudflare uses Nomad to schedule workloads across its massive edge network in 200+ cities. K8s was simply too heavy to run in every single edge PoP, whereas Nomad’s minimal footprint fit perfectly.
Conclusion
Kubernetes is a masterpiece of engineering, but it solves problems you probably don’t have yet.
- Do you have 50+ engineering teams needing namespace isolation?
- Do you have complex compliance policies requiring OPA gatekeepers?
- Do you truly need custom CRDs for your operators?
If the answer is “No,” then Nomad is your cheat code. It gives you the velocity of a Platform-as-a-Service (like Heroku) with the control and cost-savings of bare metal.
Simplify your stack. Ship faster. Sleep better.