"I want to generate a report overnight," "I want to handle processing piled up in a queue," "I want to run my own CI Runner": processing that runs and finishes, rather than a resident service, is the domain of Azure Container Apps jobs. Jobs run on the same platform as resident apps, but a design mistake leads to double execution, dropped work, and infinite retries.
This article explains the production design of Container Apps Jobs, faithful to the Microsoft Learn Jobs documentation. I've run SQS-driven idempotent batches in production on AWS. The principle "make retries the normal case = build idempotently" is the same on Azure. For ACA as a whole, see the Azure Container Apps production-operations guide.
What a job is: a task that runs and finishes
Azure Container Apps jobs enable you to run containerized tasks that run for a finite duration and then stop. (— Jobs in Azure Container Apps)
Apps (resident services) and jobs run in the same environment and share network and logs. The difference is "whether it finishes."
| | App | Job |
|---|---|---|
| Nature | Runs continuously | Runs for a finite duration and stops |
| On failure | Auto-restarts if the container drops | A non-zero exit is a failure. Retry configurable |
| Example | HTTP API, web app, resident worker | Nightly batch, data migration, processing one queue item |
App vs job (official examples)
| What you want to do | Choose |
|---|---|
| An HTTP server returning web content/API | App (HTTP scale rule) |
| Generate a report every night | Job (Schedule trigger + cron) |
| Continuously process a Service Bus queue | App (custom scale rule) |
| Process one item / a small batch from a queue and stop | Job (Event trigger) |
| Background processing that starts on demand and finishes | Job (Manual trigger) |
| A self-hosted CI Runner | Job (Event trigger) |
The three triggers
A job's trigger type determines how the job is started. (— Jobs)
- Manual: on demand (CLI, portal, ARM API).
- Schedule: periodic with a cron expression.
- Event: triggered by an event via a KEDA scaler.
Manual: on-demand execution
For one-off processing like data migration. Create it and start it when needed.
az containerapp job create \
--name migrate-job --resource-group my-rg --environment my-env \
--trigger-type "Manual" \
--replica-timeout 1800 --replica-retry-limit 0 \
--replica-completion-count 1 --parallelism 1 \
--image myregistry.azurecr.io/migrate:2026-06-26-a1b2c3d \
--cpu "0.5" --memory "1.0Gi"
# Start the job (settings can also be overridden)
az containerapp job start --name migrate-job --resource-group my-rg
You can override settings at start time (run the same job on different input by changing env vars or the start command). But the documentation adds a caveat: "When you override a configuration, the job's entire template configuration is replaced." Because the whole template is replaced, include every setting the execution needs, not only the ones you are changing.
Schedule: cron (UTC)
Like report generation every midnight.
az containerapp job create \
--name nightly-report --resource-group my-rg --environment my-env \
--trigger-type "Schedule" --cron-expression "0 0 * * *" \
--replica-timeout 1800 --replica-retry-limit 1 \
--replica-completion-count 1 --parallelism 1 \
--image myregistry.azurecr.io/report:2026-06-26-a1b2c3d \
--cpu "0.5" --memory "1.0Gi"
Cron-expression examples (standard 5 fields):
| Expression | Meaning |
|---|---|
| `*/5 * * * *` | Every 5 minutes |
| `0 */2 * * *` | Every 2 hours |
| `0 0 * * *` | Daily at midnight |
| `0 0 * * 0` | Every Sunday at midnight |
| `0 0 1 * *` | The 1st of every month at midnight |
Important: the documentation states, "Cron expressions in scheduled jobs are evaluated in Coordinated Universal Time (UTC)." For "every day at 2am (JST)," write `0 17 * * *` (17:00 UTC the previous day). Timezone slippage is a classic incident, so be careful.
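The JST-to-UTC arithmetic above can be sketched as a small shell helper. `local_to_utc_cron` is a hypothetical name of my own, not part of the Azure CLI, and it assumes a daily schedule with a whole-hour UTC offset and no daylight saving time (which holds for JST, UTC+9).

```shell
#!/bin/sh
# Convert a daily local wall-clock time to a UTC cron expression.
# Hypothetical helper: local_to_utc_cron HOUR MINUTE UTC_OFFSET_HOURS
# Assumes a whole-hour offset and no daylight saving time (true for JST, +9).
local_to_utc_cron() (
  hour=$1
  minute=$2
  offset=$3
  # Shift into UTC and wrap into 0..23; adding 24 keeps the value non-negative.
  utc_hour=$(( (hour - offset + 24) % 24 ))
  printf '%d %d * * *\n' "$minute" "$utc_hour"
)

local_to_utc_cron 2 0 9   # 02:00 JST
```

This only covers daily schedules. When the shift crosses midnight, a day-of-week or day-of-month field would also have to move by one day, which the sketch does not attempt.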
Event: KEDA event-driven
The job starts when a message arrives in a queue. The basic model is one event, one execution.
az containerapp job create \
--name queue-job --resource-group my-rg --environment my-env \
--trigger-type "Event" \
--replica-timeout 1800 \
--image myregistry.azurecr.io/queue-job:2026-06-26-a1b2c3d \
--cpu "0.5" --memory "1.0Gi" \
--min-executions 0 --max-executions 10 \
--scale-rule-name "queue" --scale-rule-type "azure-queue" \
--scale-rule-metadata "accountName=mystorage" "queueName=myqueue" "queueLength=1" \
--scale-rule-auth "connection=connection-string-secret" \
--secrets "connection-string-secret=<QUEUE_CONNECTION_STRING>"
The difference in KEDA use between apps and jobs is detailed in the scaling guide, but the gist is this: for apps a scale rule decides the replica count, and for jobs it decides the execution count. If each event needs a fresh instance with dedicated resources, or the processing is long-running, a job fits.
Job settings: the four key properties
| Setting | Property | Meaning |
|---|---|---|
| Max wait for completion | replicaTimeout | Max seconds to wait for replica completion. Cut off if exceeded |
| Retry limit | replicaRetryLimit | Number of retries for a failed replica. 0 = no retry |
| Parallelism | parallelism | Replicas per execution (often 1) |
| Completion count | replicaCompletionCount | Completed replicas needed to count as success (≤ parallelism) |
The documentation states, "The replicaTimeout setting takes precedence if it expires before all retries occur." In other words, the timeout wins over retries: even with 3 retries configured, the replica is cut off if the timeout comes first. Set a timeout sufficiently longer than the expected processing time.
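To make the interaction concrete, here is a rough estimate of how many attempts can actually run before the timeout. `effective_attempts` is a hypothetical helper, and it rests on two assumptions of mine: the timeout clock covers all attempts of a replica, and every attempt fails after the same number of seconds.

```shell
#!/bin/sh
# Estimate how many whole attempts can run before replicaTimeout expires.
# Hypothetical helper: effective_attempts TIMEOUT_S RETRY_LIMIT ATTEMPT_S
# Assumptions: the timeout covers all attempts of a replica, and each
# attempt fails after ATTEMPT_S seconds (ATTEMPT_S must be positive).
effective_attempts() (
  timeout=$1
  retry_limit=$2
  attempt=$3
  allowed=$(( retry_limit + 1 ))   # first try plus the configured retries
  fit=$(( timeout / attempt ))     # whole attempts that fit in the timeout
  if [ "$fit" -lt "$allowed" ]; then
    echo "$fit"
  else
    echo "$allowed"
  fi
)

effective_attempts 1800 3 700   # slow failures: fewer attempts than allowed
effective_attempts 1800 3 300   # fast failures: every allowed attempt fits
```

With the values used in this article (`--replica-timeout 1800`, `--replica-retry-limit 3`), attempts that each fail after 700 seconds leave room for only two whole attempts, so the configured retry count is never reached.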
Parallel batch (split processing)
To split a large amount of data and process it in parallel, raise parallelism and replicaCompletionCount.
az containerapp job create \
--name batch-job --resource-group my-rg --environment my-env \
--trigger-type "Schedule" --cron-expression "0 0 * * *" \
--replica-timeout 1800 --replica-retry-limit 3 \
--parallelism 5 --replica-completion-count 5 \
--image myregistry.azurecr.io/batch:2026-06-26-a1b2c3d \
--cpu "0.5" --memory "1.0Gi"
Each replica processes its assigned range (allocated via env vars or a queue), and the execution succeeds when all 5 replicas succeed. For large-scale parallelism, this can play a role similar to AWS Batch.
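One way to carve up the work is to compute a range per shard. The sketch below is an illustration only: `shard_range` is a hypothetical helper, and how a replica learns its own shard index (for example by claiming it from a queue) is left open, since I am not assuming the platform hands one out.

```shell
#!/bin/sh
# Compute the half-open item range [start, end) for one shard.
# Hypothetical helper: shard_range TOTAL SHARDS INDEX (INDEX is 0-based).
shard_range() (
  total=$1
  shards=$2
  index=$3
  base=$(( total / shards ))
  extra=$(( total % shards ))   # the first "extra" shards take one more item
  if [ "$index" -lt "$extra" ]; then
    start=$(( index * (base + 1) ))
    end=$(( start + base + 1 ))
  else
    start=$(( extra * (base + 1) + (index - extra) * base ))
    end=$(( start + base ))
  fi
  echo "$start $end"
)

# 103 items split across 5 replicas, matching --parallelism 5 above.
for i in 0 1 2 3 4; do
  shard_range 103 5 "$i"
done
```

The ranges are contiguous and cover every item exactly once, so a retried replica that recomputes its range works on the same items as before.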
Idempotency: make retries the normal case
A job presupposes retries. It retries via replicaRetryLimit, and for event-driven jobs the same message can be redelivered. So:
⚠️ Build the job body to be idempotent. Processing the same input (message, date, ID) twice must leave the same result as processing it once.
If payments or billing are involved, record "whether processed" under an idempotency key (message ID, order ID) and skip the second time. This is the core of the design that achieved 0 double charges in production on a payments platform, and the design of idempotent async processing applies as-is. "Doesn't break on retry" means you can treat retries as the normal case, which in turn means you can confidently enable auto-retry in production.
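The "record and skip" idea can be sketched in shell using an atomic `mkdir` as the processed marker. This is a minimal illustration, not the production design: `process_once` and `MARKER_DIR` are hypothetical names, the key is assumed to be safe as a directory name, and the marker directory is assumed to be shared by every attempt. A real payments system would record the key in a database with a unique constraint, in the same transaction as the side effect.

```shell
#!/bin/sh
# Run a command at most once per idempotency key.
# Hypothetical helper: process_once KEY COMMAND [ARGS...]
# Assumption: MARKER_DIR is storage shared by all attempts and replicas.
MARKER_DIR=${MARKER_DIR:-/tmp/job-markers}

process_once() (
  key=$1
  shift
  mkdir -p "$MARKER_DIR" || exit 1
  # mkdir without -p fails if the directory exists, so it is an atomic claim.
  if ! mkdir "$MARKER_DIR/$key" 2>/dev/null; then
    echo "skip $key"
    exit 0
  fi
  if "$@"; then
    echo "done $key"
  else
    rmdir "$MARKER_DIR/$key"   # release the claim so a retry can run again
    exit 1
  fi
)
```

Calling `process_once order-42 some-command` a second time with the same key prints `skip order-42` and does not run the command again. One known gap in this sketch: a crash between the claim and the end of the command leaves the marker behind, so the work would be skipped on retry. Closing that gap is exactly why the production version ties the marker and the side effect together transactionally.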
CI Runner: the staple of event-driven jobs
A powerful use of event-driven jobs is a self-hosted CI Runner.
A self-hosted GitHub Actions runner or Azure Pipelines agent that runs when a new job is queued in a workflow or pipeline. (— Jobs)
When a job is queued in a workflow, KEDA detects it and starts one execution of the Runner container, which disappears when done. You scale only when needed without keeping a resident Runner, so the configuration is excellent in both cost efficiency and security (the Runner is disposable).
Job constraints and monitoring
Constraints: no Ingress, no Dapr
The following features aren't supported: Dapr; Ingress and related features such as custom domains and SSL certificates. (— Jobs)
A job has no Ingress (it can't be reached from outside) and can't use Dapr. Put HTTP-receiving processing in an app, and prefer an app if you need Dapr service invocation. Note that when a job calls another app at startup, sidecar containers (such as the Envoy proxy) are guaranteed to be ready before the main job container begins execution, so you don't need to add connection-failure retries for app-to-app calls at startup.
Monitoring: execution history and logs
# Status of recent executions
az containerapp job execution list --name my-job --resource-group my-rg
The documentation states, "The execution history for scheduled and event-based jobs is limited to the most recent 100 successful and failed job executions." For auditing or detailed output beyond that, query the environment's Log Analytics (observability design). If you need long-term retention or alerts, build "notify when failed executions exceed a threshold" with Log Analytics + Azure Monitor alerts.
Design checklist
- Correctly split "runs and finishes" processing into a Job and "resident" into an App.
- Choose the trigger: one-off = Manual, periodic = Schedule (cron is UTC!), event = Event.
- Build idempotently (no double execution on retry/redelivery). Payments require an idempotency key.
- `replicaTimeout` sufficiently longer than the expected processing time. `replicaRetryLimit` per your retry policy (`0` disables it).
- The image tag is unique by commit SHA (no latest).
- Design on the premise that Ingress/Dapr can't be used. Move HTTP receiving and Dapr to apps.
- Execution history keeps only the most recent 100. Handle auditing/alerts with Log Analytics + Azure Monitor.
Summary
Container Apps Jobs is a feature that handles finite-duration tasks on the same platform as resident apps. There are three triggers: Manual / Schedule (cron, UTC) / Event (KEDA). The keys to production quality are correct trigger selection, UTC-aware cron, and above all idempotent design (making retries the normal case). You can safely run batch, periodic processing, queue-driven work, and CI Runners with the same vocabulary.
For designing and making idempotent batch, periodic, and event-driven jobs, contact me. For production operations as a whole, see the Azure Container Apps production-operations guide.