# Azure Container Apps Jobs implementation guide: production design for batch, scheduled (cron), and event-driven jobs

> An implementation guide to designing Azure Container Apps Jobs for production. Using az CLI/ARM examples that follow the official Microsoft Learn docs, it covers the three trigger types (Manual, Schedule, Event), cron expressions (UTC), the replicaTimeout, retry, parallelism, and completion settings, KEDA event-driven jobs, self-hosted CI Runners, idempotent design, and monitoring execution history.

- Published: 2026-06-26
- Author: 友田 陽大
- Tags: Azure, Container Apps, Jobs, Batch, KEDA, Reliability, Idempotency, Architecture design
- URL: https://tomodahinata.com/en/blog/azure-container-apps-jobs-batch-scheduled-event-driven-guide
- Category: Azure Container Apps in production
- Pillar guide: https://tomodahinata.com/en/blog/azure-container-apps-production-guide

## Key points

- Container Apps jobs are tasks that "run for a finite duration and then stop." They share the environment, network, and logs with apps (resident services) and have three trigger types: Manual, Schedule (cron), and Event (KEDA).
- Jobs support neither Ingress nor Dapr. Because jobs presuppose retries (replicaRetryLimit), the job body must be idempotent: processing the same event twice must not break anything.
- The four key settings are replicaTimeout (max seconds to wait for a replica to complete), replicaRetryLimit (0 = no retry), parallelism (replicas per execution), and replicaCompletionCount (completions needed for success). Most jobs use parallelism=1 and replicaCompletionCount=1.
- An event-driven job is started by a KEDA scaler, typically one execution per event. It's also the staple implementation for self-hosted GitHub Actions Runners / Azure Pipelines agents.
- Cron expressions are evaluated in UTC. The execution history of schedule/event jobs is limited to the most recent 100. Query detailed logs in Log Analytics.

---

"I want to generate a report overnight," "I want to work through items piled up in a queue," "I want to run my own CI Runner": work that **runs and finishes**, rather than a resident service, is what Azure Container Apps Jobs are for. Jobs run on the same platform as resident apps, but a design mistake leads to **double execution, dropped work, or infinite retries.**

This article explains the **production design** of Container Apps Jobs, faithful to the [Microsoft Learn Jobs documentation](https://learn.microsoft.com/en-us/azure/container-apps/jobs). I've run [SQS-driven idempotent batches](/blog/aws-sqs-lambda-eventbridge-idempotent-async-processing-guide) in production on AWS. The principle "make retries the normal case = build idempotently" is the same on Azure. For ACA as a whole, see the [Azure Container Apps production-operations guide](/blog/azure-container-apps-production-guide).

---

## What a job is: a task that runs and finishes

> Azure Container Apps jobs enable you to run containerized tasks that run for a finite duration and then stop. (— [Jobs in Azure Container Apps](https://learn.microsoft.com/en-us/azure/container-apps/jobs))

Apps (resident services) and jobs run in **the same environment** and share network and logs. The difference is "whether it finishes."

| | App | Job |
|--|-------------|--------------|
| Nature | Runs **continuously** | Runs for a **finite duration** and stops |
| On failure | Restarts automatically if the container exits or crashes | A non-zero exit code is a failure. Retries are configurable |
| Example | HTTP API, web app, resident worker | Nightly batch, data migration, processing one queue item |

### App vs job (official examples)

| What you want to do | Choose |
|------------|-----|
| An HTTP server returning web content/API | **App** (HTTP scale rule) |
| Generate a report every night | **Job** (Schedule trigger + cron) |
| Continuously process a Service Bus queue | **App** (custom scale rule) |
| Process one item / a small batch from a queue and stop | **Job** (Event trigger) |
| Background processing that starts on demand and finishes | **Job** (Manual trigger) |
| A self-hosted CI Runner | **Job** (Event trigger) |

---

## The three triggers

> A job's trigger type determines how the job is started. (— [Jobs](https://learn.microsoft.com/en-us/azure/container-apps/jobs))

- **Manual**: on demand (CLI, portal, ARM API).
- **Schedule**: periodic with a cron expression.
- **Event**: triggered by an event via a KEDA scaler.

### Manual: on-demand execution

Use a manual trigger for one-off processing such as a data migration. Create the job once and start it whenever you need it.

```azurecli
az containerapp job create \
  --name migrate-job --resource-group my-rg --environment my-env \
  --trigger-type "Manual" \
  --replica-timeout 1800 --replica-retry-limit 0 \
  --replica-completion-count 1 --parallelism 1 \
  --image myregistry.azurecr.io/migrate:2026-06-26-a1b2c3d \
  --cpu "0.5" --memory "1.0Gi"

# Start the job (settings can also be overridden at start time)
az containerapp job start --name migrate-job --resource-group my-rg
```

> You can **override** settings at start time, for example to run the same job on different input by changing env vars or the start command. Note that `When you override a configuration, the job's entire template configuration is replaced`, so include every setting the execution needs, not only the ones you change.

### Schedule: cron (UTC)

Use a schedule trigger for periodic work, such as generating a report every day at midnight.

```azurecli
az containerapp job create \
  --name nightly-report --resource-group my-rg --environment my-env \
  --trigger-type "Schedule" --cron-expression "0 0 * * *" \
  --replica-timeout 1800 --replica-retry-limit 1 \
  --replica-completion-count 1 --parallelism 1 \
  --image myregistry.azurecr.io/report:2026-06-26-a1b2c3d \
  --cpu "0.5" --memory "1.0Gi"
```

Cron-expression examples (standard 5 fields):

| Expression | Meaning |
|----|-----|
| `*/5 * * * *` | Every 5 minutes |
| `0 */2 * * *` | Every 2 hours |
| `0 0 * * *` | Daily at midnight |
| `0 0 * * 0` | Every Sunday at midnight |
| `0 0 1 * *` | The 1st of every month at midnight |

> Important: `Cron expressions in scheduled jobs are evaluated in Coordinated Universal Time (UTC).` For "every day at 2am (JST)," write `0 17 * * *`, because 02:00 JST is 17:00 UTC on the previous day. When the conversion crosses midnight like this, any day-of-week or day-of-month field has to shift as well. Timezone slippage is a classic source of incidents, so convert deliberately.
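
The offset arithmetic behind that example can be sketched in Python. This is a minimal sketch, assuming a daily schedule and a fixed UTC offset (JST is UTC+9 with no daylight saving time); the function name `local_daily_to_utc_cron` is hypothetical and not part of any Azure tooling.

```python
from typing import Tuple


def local_daily_to_utc_cron(hour: int, minute: int, utc_offset_hours: float) -> Tuple[str, int]:
    """Convert a daily local wall-clock time to a 5-field UTC cron expression.

    Returns the cron string and the day shift relative to the local date
    (-1 = previous day in UTC, 0 = same day, +1 = next day).
    Assumes a fixed offset, so it does not handle daylight saving time.
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # Local time minus the offset gives UTC; work in minutes to support :30 offsets.
    total_minutes = hour * 60 + minute - round(utc_offset_hours * 60)
    day_shift, minutes_of_day = divmod(total_minutes, 24 * 60)
    utc_hour, utc_minute = divmod(minutes_of_day, 60)
    return f"{utc_minute} {utc_hour} * * *", day_shift


# 02:00 JST (UTC+9) is 17:00 UTC on the previous day.
print(local_daily_to_utc_cron(2, 0, 9))  # -> ('0 17 * * *', -1)
```

A non-zero day shift is the signal to double-check weekly or monthly schedules, since the day fields move along with the hour.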

### Event: KEDA event-driven

Use an event trigger to start an execution when, for example, a message arrives in a queue. The baseline is **1 event = 1 execution.**

```azurecli
az containerapp job create \
  --name queue-job --resource-group my-rg --environment my-env \
  --trigger-type "Event" \
  --replica-timeout 1800 \
  --image myregistry.azurecr.io/queue-job:2026-06-26-a1b2c3d \
  --cpu "0.5" --memory "1.0Gi" \
  --min-executions 0 --max-executions 10 \
  --scale-rule-name "queue" --scale-rule-type "azure-queue" \
  --scale-rule-metadata "accountName=mystorage" "queueName=myqueue" "queueLength=1" \
  --scale-rule-auth "connection=connection-string-secret" \
  --secrets "connection-string-secret=<QUEUE_CONNECTION_STRING>"
```

The differences in how apps and jobs use KEDA are covered in the [scaling guide](/blog/azure-container-apps-keda-autoscaling-scale-to-zero-event-driven-guide). In short, a scale rule decides the **replica count** for an app and the **execution count** for a job. A job fits when each event needs a fresh instance with dedicated resources, or when the processing is long-running.
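
On the container side, "1 event = 1 execution" means the job body takes one message, processes it, and exits. The following is a minimal sketch of that shape, not an Azure SDK example: the `Queue` protocol and `InMemoryQueue` are hypothetical stand-ins for whatever queue client you actually use, and real clients have different method names and visibility-timeout semantics.

```python
from typing import Callable, Dict, Optional, Protocol, Tuple


class Queue(Protocol):
    """Hypothetical minimal queue interface (an assumption for this sketch)."""

    def receive(self) -> Optional[Tuple[str, str]]: ...
    def delete(self, message_id: str) -> None: ...


def run_once(queue: Queue, handle: Callable[[str], None]) -> int:
    """Process at most one message and return a process exit code."""
    message = queue.receive()
    if message is None:
        return 0  # nothing to do: finish successfully
    message_id, body = message
    try:
        handle(body)
    except Exception as exc:
        print(f"processing failed for {message_id}: {exc}")
        return 1  # non-zero exit marks the replica as failed; the message is not deleted
    queue.delete(message_id)  # delete only after the work succeeded
    return 0


class InMemoryQueue:
    """Test double that keeps messages in a dict."""

    def __init__(self, messages: Dict[str, str]) -> None:
        self._messages = dict(messages)

    def receive(self) -> Optional[Tuple[str, str]]:
        for message_id, body in self._messages.items():
            return message_id, body
        return None

    def delete(self, message_id: str) -> None:
        self._messages.pop(message_id, None)

    def __len__(self) -> int:
        return len(self._messages)
```

In a real entrypoint you would end with `sys.exit(run_once(queue, handle))` so the exit code reaches the platform. Deleting the message only after success means a failed run leaves it available for redelivery, which is exactly why the handler has to be idempotent.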

---

## Job settings: the four key properties

| Setting | Property | Meaning |
|------|----------|-----|
| Max wait for completion | `replicaTimeout` | Max seconds to wait for replica completion. Cut off if exceeded |
| Retry limit | `replicaRetryLimit` | Number of retries for a failed replica. **`0` = no retry** |
| Parallelism | `parallelism` | Replicas per execution (often `1`) |
| Completion count | `replicaCompletionCount` | Completed replicas needed to count as success (≤ parallelism) |

> `The replicaTimeout setting takes precedence if it expires before all retries occur.` In other words, the **timeout wins over retries.** Even with 3 retries configured, the replica is cut off if the timeout comes first. Set the timeout comfortably longer than the expected processing time.
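
If you generate job definitions from code, the constraints in the table can be checked before deployment. This is a minimal sketch: `JobSettings` and `validate` are hypothetical names, the first three checks mirror the table above, and the last one is only my own headroom heuristic, not a rule from the documentation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class JobSettings:
    replica_timeout: int           # seconds
    replica_retry_limit: int       # 0 = no retry
    parallelism: int               # replicas per execution
    replica_completion_count: int  # completed replicas needed for success


def validate(settings: JobSettings, expected_runtime_seconds: int) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    if settings.replica_retry_limit < 0:
        problems.append("replica_retry_limit must be 0 or greater")
    if settings.parallelism < 1:
        problems.append("parallelism must be at least 1")
    if not 1 <= settings.replica_completion_count <= settings.parallelism:
        problems.append("replica_completion_count must be between 1 and parallelism")
    # Heuristic (assumption): the timeout should exceed the expected runtime.
    if settings.replica_timeout <= expected_runtime_seconds:
        problems.append("replica_timeout leaves no headroom over the expected runtime")
    return problems


# The parallel batch settings used in this article, with a 10-minute expected runtime.
print(validate(JobSettings(1800, 3, 5, 5), expected_runtime_seconds=600))  # -> []
```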

### Parallel batch (split processing)

To split a large amount of data and process it in parallel, raise `parallelism` and `replicaCompletionCount`.

```azurecli
az containerapp job create \
  --name batch-job --resource-group my-rg --environment my-env \
  --trigger-type "Schedule" --cron-expression "0 0 * * *" \
  --replica-timeout 1800 --replica-retry-limit 3 \
  --parallelism 5 --replica-completion-count 5 \
  --image myregistry.azurecr.io/batch:2026-06-26-a1b2c3d \
  --cpu "0.5" --memory "1.0Gi"
```

Each replica processes its assigned range (allocated via env vars or a queue), and the execution succeeds when all 5 replicas complete successfully. For large-scale parallel work, you can use this in a role similar to AWS Batch.
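
Splitting the input into ranges can be sketched as a pure function. This is a minimal sketch, assuming each replica somehow learns its own zero-based shard index; how that index reaches the replica (a claim from a queue, a lease in a database, and so on) is left open here, and `shard_range` is a hypothetical helper name.

```python
from typing import Tuple


def shard_range(total_items: int, shard_count: int, shard_index: int) -> Tuple[int, int]:
    """Return the half-open range [start, end) of items for one shard.

    Items are spread as evenly as possible; the first (total_items % shard_count)
    shards each take one extra item.
    """
    if total_items < 0:
        raise ValueError("total_items must be 0 or greater")
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    if not 0 <= shard_index < shard_count:
        raise ValueError("shard_index must be in [0, shard_count)")
    base, extra = divmod(total_items, shard_count)
    start = shard_index * base + min(shard_index, extra)
    end = start + base + (1 if shard_index < extra else 0)
    return start, end


# 103 items across 5 replicas (matching --parallelism 5).
print([shard_range(103, 5, i) for i in range(5)])
# -> [(0, 21), (21, 42), (42, 63), (63, 83), (83, 103)]
```

Because the ranges are deterministic, a retried replica recomputes exactly the same range, which keeps retries compatible with idempotent processing.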

---

## Idempotency: make retries the normal case

A job **presupposes retries.** A failed replica is retried according to `replicaRetryLimit`, and with event-driven jobs the same message can be redelivered. Therefore:

> ⚠️ **Build the job body to be idempotent.** Even if the same input (message, date, ID) is processed twice, the outcome must be the same as processing it once.

If payments or billing are involved, record whether an input has been processed under an **idempotency key** (message ID, order ID) and skip it the second time. This is the core of the design that achieved zero double charges in production on a payments platform, and the [design of idempotent async processing](/blog/aws-sqs-lambda-eventbridge-idempotent-async-processing-guide) applies as-is. Once a job doesn't break on retry, retries become the normal case, and you can enable auto-retry in production with confidence.
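
The idempotency-key check can be sketched as follows. This is a minimal sketch, not the implementation from the payments platform mentioned above: it uses an in-memory SQLite database purely as a stand-in for a shared durable store, and the table names and `charge_once` are hypothetical. The point it illustrates is that claiming the key and recording the effect happen in one transaction.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the tables if needed (stand-in for a real database)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS processed (idempotency_key TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE IF NOT EXISTS charges (order_id TEXT, amount INTEGER)")
    conn.commit()
    return conn


def charge_once(conn: sqlite3.Connection, idempotency_key: str, order_id: str, amount: int) -> bool:
    """Record a charge exactly once per idempotency key.

    Returns True if the charge was recorded, False if the key was already processed.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            # The PRIMARY KEY constraint rejects a second insert of the same key.
            conn.execute("INSERT INTO processed (idempotency_key) VALUES (?)", (idempotency_key,))
            conn.execute("INSERT INTO charges (order_id, amount) VALUES (?, ?)", (order_id, amount))
    except sqlite3.IntegrityError:
        return False  # duplicate delivery or retry: skip
    return True


conn = open_store()
print(charge_once(conn, "msg-1", "order-1", 500))  # -> True
print(charge_once(conn, "msg-1", "order-1", 500))  # -> False
```

When the side effect lives outside the database (for example a call to a payment provider), the same key would normally be passed along to that provider as well, since a local transaction alone cannot cover a remote call.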

---

## CI Runner: the staple of event-driven jobs

A powerful use of event-driven jobs is a **self-hosted CI Runner.**

> A self-hosted GitHub Actions runner or Azure Pipelines agent that runs when a new job is queued in a workflow or pipeline. (— [Jobs](https://learn.microsoft.com/en-us/azure/container-apps/jobs))

When a job is queued in a workflow, KEDA detects it and starts one execution of the Runner container, which disappears when the job is done. You **scale only when needed, without keeping a resident Runner**, which is good for both cost efficiency and security (every Runner is disposable).

---

## Job constraints and monitoring

### Constraints: no Ingress, no Dapr

> The following features aren't supported: Dapr; Ingress and related features such as custom domains and SSL certificates. (— [Jobs](https://learn.microsoft.com/en-us/azure/container-apps/jobs))

A job **has no Ingress** (it can't be reached from outside) and **can't use Dapr.** Put HTTP-receiving processing in an app, and use an app if you need Dapr service invocation. Note that when a job calls another app at startup, `sidecar containers (such as the Envoy proxy) are guaranteed to be ready before the main job container begins execution`, so you don't need retry logic just to wait for the sidecar. Failures of the target app itself are a separate concern and still need handling.

### Monitoring: execution history and logs

```azurecli
# Status of recent executions
az containerapp job execution list --name my-job --resource-group my-rg
```

> `The execution history for scheduled and event-based jobs is limited to the most recent 100 successful and failed job executions.` Only the **most recent 100** executions are kept. For auditing or detailed output beyond that, query the environment's **Log Analytics** workspace ([observability design](/blog/opentelemetry-observability-production-tracing-metrics-logs)). If you need long-term retention or alerts, combine Log Analytics with Azure Monitor alerts, for example to notify when failed executions exceed a threshold.
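
For a quick check outside Azure Monitor, the JSON from `az containerapp job execution list --output json` can be summarized with a few lines of Python. This is a minimal sketch under an assumption: it expects each entry to carry its state at `properties.status` with values such as `Succeeded` and `Failed`, so verify that shape against your own CLI output. `count_statuses` and `should_alert` are hypothetical helper names.

```python
import json
from collections import Counter


def count_statuses(executions_json: str) -> Counter:
    """Count executions per status from the CLI's JSON output."""
    executions = json.loads(executions_json)
    return Counter(
        # Assumed shape: {"name": ..., "properties": {"status": ...}}
        execution.get("properties", {}).get("status", "Unknown")
        for execution in executions
    )


def should_alert(executions_json: str, max_failed: int) -> bool:
    """Return True when the number of failed executions exceeds the threshold."""
    return count_statuses(executions_json)["Failed"] > max_failed


sample = json.dumps([
    {"name": "my-job-a", "properties": {"status": "Succeeded"}},
    {"name": "my-job-b", "properties": {"status": "Failed"}},
    {"name": "my-job-c", "properties": {"status": "Failed"}},
])
print(should_alert(sample, max_failed=1))  # -> True
```

Since the history only holds the most recent 100 executions, a check like this works for a quick look, while durable alerting belongs in Log Analytics and Azure Monitor.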

---

## Design checklist

- [ ] Correctly split "runs and finishes" processing into a **Job** and "resident" into an **App.**
- [ ] Choose the trigger: one-off = **Manual**, periodic = **Schedule (cron is UTC!)**, event = **Event.**
- [ ] **Build idempotently** (a retry or redelivery must not produce a double effect). Payments require an idempotency key.
- [ ] `replicaTimeout` sufficiently longer than the expected processing time. `replicaRetryLimit` per your retry policy (`0` disables it).
- [ ] Tag each image uniquely, for example by **commit SHA** (don't use `latest`).
- [ ] Design on the premise that Ingress and Dapr are unavailable. Put HTTP endpoints and Dapr usage in apps.
- [ ] **Execution history keeps only the most recent 100.** Handle auditing and alerts with Log Analytics + Azure Monitor.

---

## Summary

Container Apps Jobs handle **finite-duration tasks** on the same platform as resident apps. There are three triggers: Manual, Schedule (cron, UTC), and Event (KEDA). The keys to production quality are **correct trigger selection**, **UTC-aware cron**, and above all **idempotent design** (making retries the normal case). With these in place, you can safely run batch jobs, periodic processing, queue-driven work, and CI Runners using the same set of concepts.

For help designing idempotent batch, scheduled, and event-driven jobs, [contact me](/contact). For production operations as a whole, see the [Azure Container Apps production-operations guide](/blog/azure-container-apps-production-guide).
