# Batch Processing Example

This example demonstrates deploying batch processing workloads, including scheduled (cron) jobs, continuous queue workers, and one-shot tasks.
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                        Batch Processing                         │
│                                                                 │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐            │
│  │  Scheduled  │   │    Queue    │   │  Periodic   │            │
│  │    Jobs     │   │   Workers   │   │    Jobs     │            │
│  └─────────────┘   └─────────────┘   └─────────────┘            │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                      Message Queue                      │    │
│  │                       (RabbitMQ)                        │    │
│  └─────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────┘
```
## Application Definition
```yaml
apiVersion: core.oam.dev/v1alpha2
kind: Application
metadata:
  name: batch-processing
  labels:
    environment: production
spec:
  components:
    # Continuous Queue Worker
    - name: queue-worker
      type: batchworkload.nomad.oam.dev
      properties:
        driver: docker
        image: worker/queue-processor:latest
        args:
          - --queue
          - default
          - --workers
          - "4"
        resources:
          cpu: 2000m
          memory: 2048Mi
        restartPolicy:
          attempts: 3
          interval: 5m
          delay: 30s
          mode: fail
      traits:
        - type: scaler
          properties:
            replicas: 5
            min: 2
            max: 10
        - type: migration
          properties:
            maxParallel: 1
            healthCheck: task_states
        - type: servicediscovery.nomad.oam.dev
          properties:
            serviceName: queue-worker
            tags:
              - worker
              - queue
            port: metrics

    # Image Processing Worker
    - name: image-processor
      type: batchworkload.nomad.oam.dev
      properties:
        driver: docker
        image: worker/image-processor:v2.0
        args:
          - process
          - --input
          - s3://images/input
          - --output
          - s3://images/output
          - --resize
        resources:
          cpu: 4000m
          memory: 4096Mi
        restartPolicy:
          attempts: 2
          interval: 10m
          delay: 1m
          mode: fail
      traits:
        - type: scaler
          properties:
            replicas: 2
            min: 1
            max: 4

    # Daily Report Job (Cron)
    - name: daily-report
      type: cron-task
      properties:
        image: jobs/daily-report:v1.5
        schedule: "0 2 * * *" # 2 AM daily
        args:
          - generate
          - --date
          - yesterday
          - --format
          - pdf
        resources:
          cpu: 1000m
          memory: 1024Mi

    # Hourly Analytics Job (Cron)
    - name: hourly-analytics
      type: cron-task
      properties:
        image: jobs/analytics:v2.0
        schedule: "0 * * * *" # Every hour
        args:
          - analyze
          - --window
          - 1h
          - --output
          - s3://analytics/hourly
        resources:
          cpu: 2000m
          memory: 2048Mi

    # Weekly Backup Job (Cron)
    - name: weekly-backup
      type: cron-task
      properties:
        image: jobs/backup:v1.0
        schedule: "0 3 * * 0" # Sunday 3 AM
        args:
          - backup
          - --all-databases
          - --compress
        resources:
          cpu: 2000m
          memory: 4096Mi
        volumes:
          - source: backup-config
            destination: /config
            readOnly: true
      traits:
        - type: volume
          properties:
            name: backup-config
            type: csi
            source: backup-config
            mountPath: /config

    # Data Import Job (One-shot)
    - name: data-import
      type: task
      properties:
        driver: docker
        image: jobs/data-import:v3.0
        args:
          - import
          - --source
          - s3://data/import
          - --target
          - postgres://prod-db:5432/analytics
          - --batch-size
          - "1000"
        resources:
          cpu: 4000m
          memory: 8192Mi
        restartPolicy:
          attempts: 1
          mode: fail

    # ML Training Job (Long-running batch)
    - name: ml-training
      type: batchworkload.nomad.oam.dev
      properties:
        driver: docker
        image: ml/trainer:v1.0
        args:
          - train
          - --model
          - recommendation
          - --epochs
          - "100"
          - --data
          - s3://ml-data/training
        resources:
          cpu: 8000m
          memory: 16384Mi
        restartPolicy:
          attempts: 0
  scopes:
    - scopeRef:
        kind: networkscope.nomad.oam.dev
        name: batch-network
      properties:
        networkMode: bridge
    - scopeRef:
        kind: nodepool.nomad.oam.dev
        name: batch-pool
      properties:
        poolName: batch-pool
        datacenter:
          - dc1
          - dc2
        nodeClass: batch
    - scopeRef:
        kind: namespace.nomad.oam.dev
        name: batch-ns
      properties:
        namespace: batch-processing
        quota: batch-quota
```
## Job Types
### Continuous Workers

Run continuously, processing items from a queue:
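A minimal sketch, condensed from the `queue-worker` component above. The `restartPolicy` keeps the worker alive across failures, which is what distinguishes a continuous worker from a one-shot task:

```yaml
- name: queue-worker
  type: batchworkload.nomad.oam.dev
  properties:
    driver: docker
    image: worker/queue-processor:latest
    args: ["--queue", "default", "--workers", "4"]
    restartPolicy:
      attempts: 3    # retry a failed worker up to 3 times per interval
      interval: 5m   # window over which attempts are counted
      mode: fail     # stop retrying once attempts are exhausted
```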
### Cron Jobs

Run on a schedule defined by a standard five-field cron expression:
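A minimal sketch, condensed from the `daily-report` component above. The cron fields are, in order, minute, hour, day of month, month, and day of week:

```yaml
- name: daily-report
  type: cron-task
  properties:
    image: jobs/daily-report:v1.5
    schedule: "0 2 * * *"   # minute=0, hour=2 → runs at 02:00 every day
    args: ["generate", "--date", "yesterday", "--format", "pdf"]
```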
### One-shot Jobs

Run once to completion, then exit:
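A minimal sketch, condensed from the `data-import` component above. Unlike a continuous worker, a one-shot job uses the `task` type and a restart policy that does not keep it running after completion:

```yaml
- name: data-import
  type: task
  properties:
    driver: docker
    image: jobs/data-import:v3.0
    args: ["import", "--source", "s3://data/import", "--batch-size", "1000"]
    restartPolicy:
      attempts: 1   # at most one retry on failure
      mode: fail    # then mark the job failed rather than restarting
```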
## Scaling

Workers can scale based on queue depth:
```yaml
traits:
  - type: scaler
    properties:
      replicas: 5
      min: 2
      max: 10
      scaleMetric: custom   # Custom metric from the queue
      scaleTarget: 100      # Scale up when queue depth > 100
```
## Resource Planning
| Job Type        | CPU   | Memory | Use Case      |
|-----------------|-------|--------|---------------|
| Queue Worker    | 2000m | 2Gi    | Continuous    |
| Image Processor | 4000m | 4Gi    | Burst         |
| ML Training     | 8000m | 16Gi   | GPU workloads |
| Data Import     | 4000m | 8Gi    | Large data    |