Distributed execution layer · partial early access

Containerized workload execution on distributed compute infrastructure

Submit Docker-based workloads. The system routes each job across heterogeneous nodes—GPU for training and inference, CPU for batch and backend, storage-backed for data-heavy pipelines—and runs it in an isolated environment.

Built for Docker-based reproducible workloads
forge submit
$ forge submit --image registry/io:train-v3 \
    --type training --gpu true

→ resolving container ........... ok
→ selecting node ............ gpu-a100-x4
→ scheduling job ........... job_8f12c4ad
→ status .................... running
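For scripted submissions, the same command can be assembled programmatically. This is a minimal Python sketch that assumes only the three flags shown in the demo above (`--image`, `--type`, `--gpu`). The helper name `build_submit_command` is hypothetical, and `--gpu false` is an assumed counterpart to the `--gpu true` shown; neither is a documented client API.

```python
import shlex


def build_submit_command(image: str, job_type: str, gpu: bool) -> list[str]:
    """Build the argv list for a `forge submit` call like the one shown above.

    Hypothetical helper: only the flag names come from the demo; the
    `--gpu false` spelling is an assumption.
    """
    if not image:
        raise ValueError("image must be a non-empty container reference")
    return [
        "forge", "submit",
        "--image", image,
        "--type", job_type,
        "--gpu", "true" if gpu else "false",
    ]


cmd = build_submit_command("registry/io:train-v3", "training", gpu=True)
# shlex.join quotes arguments safely if they ever contain spaces.
print(shlex.join(cmd))
```

Building an argument list rather than a single string keeps the image reference from being split or reinterpreted by a shell.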

Why this exists

Compute infrastructure is fragmented

Workloads are split across providers that do not talk to each other. Each one solves a slice of the problem and leaves the routing, packaging, and placement to you. We consolidate them behind a single execution layer that accepts a container and decides where it should run.

GPU clouds

Capacity locked to one vendor, billed per instance.

VPS providers

General-purpose CPU, no scheduling for batch jobs.

Isolated execution systems

One-off runners with no routing or portability.

forgehq.run

One unified execution layer. Submit a container; it runs on the right node.

Supported workloads

What you can run today

ML training

PyTorch, JAX, and Diffusers training runs on GPU-backed nodes.

pytorch · jax · diffusers

Inference pipelines

Model serving and batched inference with GPU acceleration.

serving · batched

Batch data processing

CPU-heavy ETL, transforms, and scheduled batch jobs.

etl · batch

Docker-based backend workloads

Any reproducible container that runs to completion.

docker · oci

Non-containerized workloads are not supported. Jobs that cannot be packaged into Docker or an equivalent container format are rejected during intake.

Execution model

How a job moves through the system

  1. Submit containerized workload

    Provide a Docker image or repository with a buildable container definition.

  2. System selects optimal compute node

    The scheduler matches your job to GPU, CPU, or storage-backed capacity.

  3. Execution runs in isolated environment

    Each job runs sandboxed, with no shared state across tenants.

  4. Output returned securely

    Artifacts and logs are returned over an authenticated channel.
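The four steps above can be read as a linear job lifecycle. The following Python sketch models that progression as a small state machine. The state names (`submitted`, `placed`, `running`, `returned`) are illustrative assumptions, not the system's actual status values; only `running` appears in the demo output.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "submitted"  # step 1: container handed to the system
    PLACED = "placed"        # step 2: scheduler has picked a node
    RUNNING = "running"      # step 3: executing in an isolated environment
    RETURNED = "returned"    # step 4: artifacts and logs delivered


# Each non-terminal state has exactly one successor, mirroring steps 1 to 4.
NEXT_STATE = {
    JobState.SUBMITTED: JobState.PLACED,
    JobState.PLACED: JobState.RUNNING,
    JobState.RUNNING: JobState.RETURNED,
}


def advance(state: JobState) -> JobState:
    """Move a job to its next state, or fail if it is already terminal."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.value} is a terminal state")
    return NEXT_STATE[state]


def lifecycle() -> list[str]:
    """Walk a job from submission to returned output."""
    state = JobState.SUBMITTED
    seen = [state.value]
    while state in NEXT_STATE:
        state = advance(state)
        seen.append(state.value)
    return seen


print(lifecycle())
```

A real scheduler would also need failure and retry states; they are left out here because the page does not describe them.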

// Routing decisions are based on performance needs and cost efficiency—not on which hardware you happen to rent.

Compute abstraction

Heterogeneous compute, one interface

You describe the work. The system selects the node class. There is no instance picker and no per-vendor capacity to manage.

GPU nodes · active

ML training / inference

CPU nodes · active

Batch and backend workloads

Storage nodes · future

Data-heavy pipelines

The system routes workloads, not hardware rentals. You never select an instance type—the scheduler places each job on the node class that fits its profile.
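To make the idea of placing a job by its profile concrete, here is a simplified Python sketch. The `JobProfile` fields, the matching rules, and the fallback to CPU while storage nodes are unavailable are all assumptions made for illustration. The actual scheduler is described only as weighing performance needs and cost efficiency, which this sketch does not model.

```python
from dataclasses import dataclass

# Availability as listed on this page: GPU and CPU active, storage future.
NODE_STATUS = {"gpu": "active", "cpu": "active", "storage": "future"}


@dataclass(frozen=True)
class JobProfile:
    """Hypothetical description of a job; field names are illustrative."""
    job_type: str            # e.g. "training", "inference", "batch", "backend"
    needs_gpu: bool = False
    data_heavy: bool = False


def select_node_class(profile: JobProfile) -> str:
    """Pick a node class from the profile alone; no instance types involved."""
    if profile.needs_gpu or profile.job_type in {"training", "inference"}:
        wanted = "gpu"
    elif profile.data_heavy:
        wanted = "storage"
    else:
        wanted = "cpu"
    # Assumed behavior: fall back to CPU if the preferred class is not active.
    if NODE_STATUS[wanted] != "active":
        wanted = "cpu"
    return wanted


print(select_node_class(JobProfile("training", needs_gpu=True)))
print(select_node_class(JobProfile("batch")))
print(select_node_class(JobProfile("batch", data_heavy=True)))
```

The caller supplies only a description of the work, and the node class is derived from it, which is the point of the abstraction.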

Pricing

Billed by what you execute

CPU workloads

$0.04 / vCPU-hr

Batch and backend jobs

  • Standard scheduling
  • Isolated execution
  • Logs + artifacts returned

GPU workloads

$1.80 / GPU-hr

Training and inference

  • GPU-backed nodes
  • Isolated execution
  • Logs + artifacts returned

Priority execution

Custom

Preferred scheduling tier

  • Front-of-queue placement
  • Reserved capacity
  • Direct intake review

Rates are indicative during early access and depend on node availability and job profile.
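As a rough guide, a job's cost can be estimated by multiplying the listed rate by the resources used and the run time. The Python sketch below assumes simple linear billing with no minimum charge or rounding increment, which the page does not confirm. The function names are hypothetical, and the rates are the indicative figures above.

```python
CPU_RATE_PER_VCPU_HR = 0.04  # indicative early-access rate, USD
GPU_RATE_PER_GPU_HR = 1.80   # indicative early-access rate, USD


def _estimate(units: int, hours: float, rate: float) -> float:
    """Linear estimate: units x hours x rate, rounded to cents."""
    if units < 0 or hours < 0:
        raise ValueError("units and hours must be non-negative")
    return round(units * hours * rate, 2)


def estimate_cpu_cost(vcpus: int, hours: float) -> float:
    """Estimated cost of a CPU workload in USD."""
    return _estimate(vcpus, hours, CPU_RATE_PER_VCPU_HR)


def estimate_gpu_cost(gpus: int, hours: float) -> float:
    """Estimated cost of a GPU workload in USD."""
    return _estimate(gpus, hours, GPU_RATE_PER_GPU_HR)


# 8 vCPUs for 3 hours: 8 x 3 x 0.04
print(estimate_cpu_cost(vcpus=8, hours=3))
# 4 GPUs for 2.5 hours: 4 x 2.5 x 1.80
print(estimate_gpu_cost(gpus=4, hours=2.5))
```

Whether a GPU job is also billed for the vCPUs on its node is not stated, so the two estimates are kept separate rather than summed.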

Submit workload

Send a reproducible job

Intake is reviewed during early access. The more precisely you specify your container and execution type, the faster the job is scheduled.

Workloads must be reproducible via Docker or an equivalent container format.

Optional compute hints