Jobs

Zatabase includes a built-in job orchestrator for running compute tasks alongside your data. Jobs execute as native processes or Docker containers, with real-time log streaming, artifact management, and cancellation support. For data-adjacent workloads, this removes the need for a separate job queue such as Celery, Sidekiq, or Bull.

Submitting a Job

Create a job by sending a POST request to /v1/jobs with the command, working directory, and optional environment variables. The example below also sets a job name and a timeout:

curl -s -X POST https://your-project.zatabase.io/v1/jobs \
  -H "Authorization: Bearer $ZATABASE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "nightly-export",
    "command": "python3 export.py --format parquet",
    "working_dir": "/opt/scripts",
    "environment": {
      "OUTPUT_DIR": "/tmp/exports",
      "ZATABASE_TOKEN": "'"$ZATABASE_TOKEN"'"
    },
    "timeout_seconds": 3600
  }' | jq

The response includes a job ID that you use to monitor progress:

{
  "job_id": "01HQR...",
  "status": "queued",
  "created_at": "2026-03-01T12:00:00Z"
}
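The same submission can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names build_job_request and submit_job are hypothetical, and the payload fields simply mirror the curl example above.

```python
import json
import urllib.request

# Placeholder host taken from the curl examples; replace with your project URL.
BASE_URL = "https://your-project.zatabase.io"


def build_job_request(token, name, command, working_dir,
                      environment=None, timeout_seconds=3600):
    """Build (but do not send) a POST /v1/jobs request like the curl example."""
    payload = {
        "name": name,
        "command": command,
        "working_dir": working_dir,
        "environment": environment or {},
        "timeout_seconds": timeout_seconds,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_job(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from sending keeps the payload easy to inspect or log before anything goes over the network. Calling submit_job(build_job_request(...)) should then return a dictionary containing the job_id shown in the response above.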

Job Lifecycle

queued -> running -> completed
                  -> failed
                  -> cancelled

Jobs move through these states without manual intervention, except for cancellation, which happens only on request (see Cancellation). A queued job is picked up by the next available worker. While a job runs, its stdout and stderr are streamed to the log endpoint. When the process exits, the job moves to completed (exit code 0) or failed (nonzero exit code).
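For client code that needs to reason about job states, the lifecycle can be modeled as a small transition table. The sketch below is illustrative only: the names TRANSITIONS, can_transition, and status_for_exit_code are hypothetical, and the table contains only the transitions drawn in the diagram above.

```python
# States after which a job no longer changes.
TERMINAL_STATES = {"completed", "failed", "cancelled"}

# Only the transitions drawn in the lifecycle diagram are listed.
# Whether a queued job can be cancelled directly is not stated in the
# documentation, so that edge is deliberately left out (assumption).
TRANSITIONS = {
    "queued": {"running"},
    "running": {"completed", "failed", "cancelled"},
    "completed": set(),
    "failed": set(),
    "cancelled": set(),
}


def can_transition(current, new):
    """Return True if the diagram allows moving from `current` to `new`."""
    return new in TRANSITIONS.get(current, set())


def status_for_exit_code(exit_code):
    """Map a process exit code to the terminal status described above."""
    return "completed" if exit_code == 0 else "failed"
```

A client can use TERMINAL_STATES to decide when to stop polling or close a log stream.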

Monitoring Jobs

List Jobs

curl -s https://your-project.zatabase.io/v1/jobs \
  -H "Authorization: Bearer $ZATABASE_TOKEN" | jq

Get Job Status

curl -s https://your-project.zatabase.io/v1/jobs/{job_id} \
  -H "Authorization: Bearer $ZATABASE_TOKEN" | jq
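When log streaming is not needed, a script can poll the status endpoint until the job reaches a terminal state. The sketch below assumes the status response includes the same status field as the submission response. The function wait_for_job and its parameters are hypothetical, and fetch_status stands for any callable that performs the GET request above and returns the parsed JSON.

```python
import time

TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_status, job_id, poll_seconds=5.0, max_polls=720,
                 sleep=time.sleep):
    """Poll until the job reaches a terminal state and return its last status.

    fetch_status: callable taking a job ID and returning a dict with a
    "status" key (assumed response shape).
    """
    for _ in range(max_polls):
        job = fetch_status(job_id)
        if job["status"] in TERMINAL_STATES:
            return job
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing the fetch and sleep functions as arguments keeps the loop testable without network access. The max_polls limit ensures a stuck job does not block the caller indefinitely.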

Stream Logs (WebSocket)

Connect to the WebSocket endpoint for real-time log output:

websocat "wss://your-project.zatabase.io/v1/jobs/{job_id}/logs?token=$ZATABASE_TOKEN"

Each message is a JSON object with stream (stdout or stderr), line, and timestamp fields.
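A consumer of the log stream might decode each message as follows. This is a sketch under assumptions: the LogLine type and parse_log_message helper are hypothetical, and the timestamp is kept as a raw string because its format is not specified here.

```python
import json
from dataclasses import dataclass


@dataclass
class LogLine:
    stream: str     # "stdout" or "stderr"
    line: str       # one line of process output
    timestamp: str  # left unparsed; the exact format is not documented here


def parse_log_message(raw):
    """Decode one WebSocket message into a LogLine."""
    message = json.loads(raw)
    if message["stream"] not in ("stdout", "stderr"):
        raise ValueError(f"unexpected stream: {message['stream']!r}")
    return LogLine(
        stream=message["stream"],
        line=message["line"],
        timestamp=message["timestamp"],
    )
```

Checking the stream field makes it straightforward to route stderr lines to a separate sink or highlight them in a terminal.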

Cancellation

Cancel a running job with a configurable grace period:

curl -s -X POST https://your-project.zatabase.io/v1/jobs/{job_id}/cancel \
  -H "Authorization: Bearer $ZATABASE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"timeout_seconds": 10}'

Zatabase sends SIGTERM to the process and waits up to timeout_seconds (10 seconds in the example above) for it to exit. If the process is still running after that, Zatabase kills it with SIGKILL.
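The terminate-then-kill pattern described above can be illustrated with Python's subprocess module. This is a general sketch of the technique, not Zatabase's implementation, and the function name stop_process is hypothetical.

```python
import subprocess
import sys


def stop_process(process, timeout_seconds=10.0):
    """Ask a process to exit, then force it if the grace period expires.

    Returns True if the process exited within the grace period and
    False if it had to be killed.
    """
    process.terminate()  # sends SIGTERM on POSIX systems
    try:
        process.wait(timeout=timeout_seconds)
        return True
    except subprocess.TimeoutExpired:
        process.kill()  # sends SIGKILL on POSIX systems
        process.wait()  # reap the process so it does not linger as a zombie
        return False


if __name__ == "__main__":
    # Start a long-running child process, then stop it with a 5-second grace period.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exited gracefully:", stop_process(child, timeout_seconds=5))
```

A job that needs to flush buffers or write a final artifact should handle SIGTERM and finish that work within the grace period, because SIGKILL cannot be caught.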

Artifacts

Jobs can produce artifacts that are stored in Zatabase’s content-addressed filesystem. Artifacts are indexed by SHA-256 hash, so identical files are stored only once.
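Content addressing means the storage key is derived from the file's bytes. The in-memory sketch below illustrates the deduplication property. It is a toy model, not Zatabase's storage layer, and the class name ContentAddressedStore is hypothetical.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store keyed by the SHA-256 digest of each blob."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store data and return its hex digest; identical data is stored once."""
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Return the blob stored under the given digest."""
        return self._blobs[digest]

    def __len__(self):
        return len(self._blobs)
```

Because the key is a hash of the content, a client can also verify a downloaded artifact by recomputing its SHA-256 digest and comparing it with the stored one.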

Upload an Artifact

curl -s -X POST https://your-project.zatabase.io/v1/jobs/{job_id}/artifacts \
  -H "Authorization: Bearer $ZATABASE_TOKEN" \
  -F "file=@output.parquet"

List Artifacts

curl -s https://your-project.zatabase.io/v1/jobs/{job_id}/artifacts \
  -H "Authorization: Bearer $ZATABASE_TOKEN" | jq

Download an Artifact

curl -O https://your-project.zatabase.io/v1/jobs/{job_id}/artifacts/{artifact_id} \
  -H "Authorization: Bearer $ZATABASE_TOKEN"

Worker Modes

The ZWORKER_MODE environment variable controls how jobs are executed:

Mode	Description
`auto`	Detect Docker availability; fall back to local if unavailable
`local`	Execute jobs as native child processes
`docker`	Execute jobs inside Docker containers

Local mode is fastest and simplest for trusted workloads. Docker mode provides process isolation, resource limits, and reproducible environments for untrusted or multi-tenant workloads.
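The auto fallback can be pictured with a short sketch. How Zatabase actually detects Docker is not documented here. This example assumes a check for a docker executable on the PATH, and it assumes that an unset ZWORKER_MODE behaves like auto. The function name resolve_worker_mode is hypothetical.

```python
import os
import shutil

VALID_MODES = ("auto", "local", "docker")


def resolve_worker_mode(mode=None, docker_available=None):
    """Return the effective worker mode: "local" or "docker"."""
    if mode is None:
        # Assumption: treat an unset variable as "auto".
        mode = os.environ.get("ZWORKER_MODE", "auto")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown ZWORKER_MODE: {mode!r}")
    if mode != "auto":
        return mode
    if docker_available is None:
        # Assumption: Docker counts as available if the CLI is on the PATH.
        docker_available = shutil.which("docker") is not None
    return "docker" if docker_available else "local"
```

A real availability check would likely also confirm that the Docker daemon is reachable, since the CLI can be installed without a running daemon.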

Note: In production deployments, container-based job execution is handled by ZLayer, Zatabase’s native orchestration fabric. The docker worker mode is primarily intended for local development and testing.

Use Cases

ETL pipelines: Export data from Zatabase, transform it, and import results back
ML training: Run training jobs close to the data, store model artifacts in Zatabase
Report generation: Generate reports on a schedule and store the output as artifacts
Data validation: Run validation scripts against ingested data and flag anomalies