Fixed Wall-Time Retries
Set ScheduleToCloseTimeout on the Activity call to enforce a hard time budget across all retry attempts. Use this when a business SLA requires the Activity to succeed or fail within a defined window, regardless of how many individual attempts occur.
Overview
The Fixed Wall-Time Retries pattern enforces a maximum total elapsed time across all Activity retry attempts using ScheduleToCloseTimeout.
Use it when a business process must succeed or fail within a defined time budget, regardless of how many individual attempts occur.
Problem
StartToCloseTimeout limits how long a single Activity attempt may run before Temporal cancels it and schedules a retry.
It does not limit how long retries collectively may run.
A process with StartToCloseTimeout=5m and the default unlimited retry policy can run for days — each attempt times out at 5 minutes, then Temporal waits for the backoff delay and tries again, indefinitely.
When a business SLA makes lateness a failure (for example, a payment must be charged within two minutes, or an authorization check must complete within 30 seconds), you need a hard outer boundary that Temporal enforces automatically, without requiring the Workflow to track elapsed time itself.
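To put a number on "can run for days", the arithmetic can be sketched as below. This is a rough model, not Temporal code: the function name `elapsed_without_budget` is hypothetical, every attempt is assumed to run until its StartToCloseTimeout, and the default arguments are assumed to mirror Temporal's default retry policy (1-second initial interval, coefficient 2.0, maximum interval of 100 times the initial interval).

```python
from datetime import timedelta


def elapsed_without_budget(
    attempts: int,
    start_to_close: timedelta = timedelta(minutes=5),
    initial_interval: timedelta = timedelta(seconds=1),    # assumed default
    backoff_coefficient: float = 2.0,                      # assumed default
    maximum_interval: timedelta = timedelta(seconds=100),  # assumed default
) -> timedelta:
    """Total time consumed if every one of `attempts` attempts times out."""
    total = start_to_close * attempts
    delay = initial_interval
    # There is one backoff delay between each pair of consecutive attempts.
    for _ in range(attempts - 1):
        total += min(delay, maximum_interval)
        delay = min(delay * backoff_coefficient, maximum_interval)
    return total


if __name__ == "__main__":
    # With no ScheduleToCloseTimeout, nothing stops the count from growing.
    for n in (10, 100, 500):
        print(n, "attempts ->", elapsed_without_budget(n))
```

Under these assumptions, 500 consecutive timeouts already add up to more than two days, and nothing in the configuration prevents attempt 501.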
Solution
Set ScheduleToCloseTimeout on the Activity call options.
Its clock starts when the Activity is first scheduled and keeps running through every attempt and backoff delay, regardless of how many attempts have occurred.
If the timeout expires during an attempt, that attempt is cancelled.
If it expires between retries, the pending retry is abandoned and Temporal delivers an ActivityError to the Workflow.
The following describes each step:
- The two-minute budget clock starts the moment the Workflow schedules the Activity.
- Each attempt runs up to 30 seconds (StartToCloseTimeout). On failure, Temporal waits the backoff delay and retries.
- Retries continue until either the Activity succeeds or the two-minute budget is exhausted.
- When the budget expires, Temporal delivers an ActivityError to the Workflow, which can log, alert, or compensate.
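The steps above can be traced with a small worst-case timeline. This is a simplified sketch rather than Temporal's scheduler: the function name `simulate_worst_case` is hypothetical, queue latency is ignored, and every attempt is assumed to run until it times out. The retry values (5-second initial interval, coefficient 1.5, 30-second maximum interval) are the ones used in the implementation that follows.

```python
from datetime import timedelta


def simulate_worst_case(
    budget: timedelta,
    per_attempt: timedelta,
    initial_interval: timedelta,
    backoff_coefficient: float,
    maximum_interval: timedelta,
) -> list[tuple[int, timedelta, timedelta]]:
    """Return (attempt number, start offset, end offset) for each attempt,
    assuming every attempt runs until it is timed out."""
    timeline = []
    clock = timedelta(0)  # time since the Activity was first scheduled
    delay = initial_interval
    attempt = 1
    while clock < budget:
        # The total budget cuts the final attempt short.
        end = min(clock + per_attempt, budget)
        timeline.append((attempt, clock, end))
        clock = end + delay
        delay = min(delay * backoff_coefficient, maximum_interval)
        attempt += 1
    return timeline


if __name__ == "__main__":
    for number, start, end in simulate_worst_case(
        budget=timedelta(minutes=2),
        per_attempt=timedelta(seconds=30),
        initial_interval=timedelta(seconds=5),
        backoff_coefficient=1.5,
        maximum_interval=timedelta(seconds=30),
    ):
        print(f"attempt {number}: {start.total_seconds():g}s -> {end.total_seconds():g}s")
```

In this model four attempts start within the two-minute budget, and the fourth is cut off at the 120-second mark before it can use its full 30 seconds.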
Implementation
Enforcing a 2-minute SLA
Set both schedule_to_close_timeout (the total budget) and start_to_close_timeout (the per-attempt cap).
The retry policy controls the interval between attempts.
Temporal stops retrying automatically when the budget runs out.
- Python
- Go
- Java
- TypeScript
# workflows.py
from datetime import timedelta

from temporalio import workflow
from temporalio.common import RetryPolicy
from temporalio.exceptions import ActivityError, TimeoutError, TimeoutType

import activities


@workflow.defn
class PaymentAuthWorkflow:
    @workflow.run
    async def run(self, transaction_id: str) -> str:
        try:
            return await workflow.execute_activity(
                activities.authorize_transaction,
                transaction_id,
                schedule_to_close_timeout=timedelta(minutes=2),  # total budget
                start_to_close_timeout=timedelta(seconds=30),  # per attempt
                retry_policy=RetryPolicy(
                    initial_interval=timedelta(seconds=5),
                    backoff_coefficient=1.5,
                    maximum_interval=timedelta(seconds=30),
                ),
            )
        except ActivityError as e:
            # The underlying failure is chained as the cause of the ActivityError.
            cause = e.__cause__
            if isinstance(cause, TimeoutError) and cause.type == TimeoutType.SCHEDULE_TO_CLOSE:
                workflow.logger.error(
                    "Authorization failed: 2-minute SLA breached",
                    extra={"transaction_id": transaction_id},
                )
            # Re-raise so the Workflow still fails after the breach is logged.
            raise
// workflow.go
package payment

import (
	"errors"
	"time"

	enumspb "go.temporal.io/api/enums/v1"
	"go.temporal.io/sdk/temporal"
	"go.temporal.io/sdk/workflow"
)

func PaymentAuthWorkflow(ctx workflow.Context, transactionID string) (string, error) {
	ao := workflow.ActivityOptions{
		ScheduleToCloseTimeout: 2 * time.Minute,  // total budget
		StartToCloseTimeout:    30 * time.Second, // per attempt
		RetryPolicy: &temporal.RetryPolicy{
			InitialInterval:    5 * time.Second,
			BackoffCoefficient: 1.5,
			MaximumInterval:    30 * time.Second,
		},
	}
	ctx = workflow.WithActivityOptions(ctx, ao)

	var result string
	err := workflow.ExecuteActivity(ctx, AuthorizeTransaction, transactionID).Get(ctx, &result)
	if err != nil {
		// errors.As unwraps the ActivityError to find the underlying timeout.
		var timeoutErr *temporal.TimeoutError
		if errors.As(err, &timeoutErr) && timeoutErr.TimeoutType() == enumspb.TIMEOUT_TYPE_SCHEDULE_TO_CLOSE {
			workflow.GetLogger(ctx).Error(
				"Authorization failed: 2-minute SLA breached",
				"transactionID", transactionID,
			)
		}
		return "", err
	}
	return result, nil
}
// PaymentAuthWorkflowImpl.java
import io.temporal.activity.ActivityOptions;
import io.temporal.api.enums.v1.TimeoutType;
import io.temporal.common.RetryOptions;
import io.temporal.failure.ActivityFailure;
import io.temporal.failure.TimeoutFailure;
import io.temporal.workflow.Workflow;
import java.time.Duration;

public class PaymentAuthWorkflowImpl implements PaymentAuthWorkflow {

    private final PaymentActivities activities = Workflow.newActivityStub(
        PaymentActivities.class,
        ActivityOptions.newBuilder()
            .setScheduleToCloseTimeout(Duration.ofMinutes(2)) // total budget
            .setStartToCloseTimeout(Duration.ofSeconds(30))   // per attempt
            .setRetryOptions(RetryOptions.newBuilder()
                .setInitialInterval(Duration.ofSeconds(5))
                .setBackoffCoefficient(1.5)
                .setMaximumInterval(Duration.ofSeconds(30))
                .build())
            .build()
    );

    @Override
    public String run(String transactionId) {
        try {
            return activities.authorizeTransaction(transactionId);
        } catch (ActivityFailure e) {
            // Pattern matching for instanceof requires Java 16 or later.
            if (e.getCause() instanceof TimeoutFailure tf
                    && tf.getTimeoutType() == TimeoutType.TIMEOUT_TYPE_SCHEDULE_TO_CLOSE) {
                Workflow.getLogger(getClass()).error(
                    "Authorization failed: 2-minute SLA breached: " + transactionId, e
                );
            }
            throw e;
        }
    }
}
// workflows.ts
import * as wf from '@temporalio/workflow';
import type * as activities from './activities';

const { authorizeTransaction } = wf.proxyActivities<typeof activities>({
  scheduleToCloseTimeout: '2m', // total budget
  startToCloseTimeout: '30s', // per attempt
  retry: {
    initialInterval: '5s',
    backoffCoefficient: 1.5,
    maximumInterval: '30s',
  },
});

export async function paymentAuthWorkflow(transactionId: string): Promise<string> {
  try {
    return await authorizeTransaction(transactionId);
  } catch (err) {
    if (err instanceof wf.ActivityFailure) {
      const cause = err.cause;
      // TimeoutFailure exposes the timeout kind as `timeoutType`.
      if (cause instanceof wf.TimeoutFailure && cause.timeoutType === wf.TimeoutType.SCHEDULE_TO_CLOSE) {
        wf.log.error('Authorization failed: 2-minute SLA breached', { transactionId });
      }
    }
    throw err;
  }
}
Short SLA without a per-attempt timeout
For tighter budgets, such as a 30-second authorization window, you may omit StartToCloseTimeout and let ScheduleToCloseTimeout act as the only bound.
Temporal requires at least one timeout to be set; ScheduleToCloseTimeout alone satisfies that requirement.
- Python
- Go
- Java
- TypeScript
# workflows.py
result = await workflow.execute_activity(
    activities.authorize_transaction,
    transaction_id,
    schedule_to_close_timeout=timedelta(seconds=30),
    retry_policy=RetryPolicy(
        initial_interval=timedelta(seconds=3),
        backoff_coefficient=1.5,
    ),
)
// workflow.go
ao := workflow.ActivityOptions{
	ScheduleToCloseTimeout: 30 * time.Second,
	RetryPolicy: &temporal.RetryPolicy{
		InitialInterval:    3 * time.Second,
		BackoffCoefficient: 1.5,
	},
}
// Workflow.java
ActivityOptions.newBuilder()
    .setScheduleToCloseTimeout(Duration.ofSeconds(30))
    .setRetryOptions(RetryOptions.newBuilder()
        .setInitialInterval(Duration.ofSeconds(3))
        .setBackoffCoefficient(1.5)
        .build())
    .build()
// workflows.ts
const { authorizeTransaction } = wf.proxyActivities<typeof activities>({
  scheduleToCloseTimeout: '30s',
  retry: {
    initialInterval: '3s',
    backoffCoefficient: 1.5,
  },
});
Best practices
- Set both timeouts for clarity. Use ScheduleToCloseTimeout as the total SLA and StartToCloseTimeout as a per-attempt safety valve. Omitting StartToCloseTimeout means a single slow response can consume the entire budget.
- Cap MaximumInterval well below the SLA. If MaximumInterval is 2 hours and the SLA is 24 hours, at most about 12 retries fit once the backoff has plateaued. Tune the interval so the backoff plateaus at a value that allows meaningful retries within the budget.
- Handle ActivityError explicitly. When the SLA expires, Temporal delivers an error to the Workflow. Catch it to send an alert, trigger a compensation, or record a breach in an audit log.
- Distinguish SLA breaches from transient errors. Inspect the error cause: check that the ActivityError's cause is a TimeoutError with TimeoutType.SCHEDULE_TO_CLOSE (Python), a TimeoutFailure with TimeoutType.SCHEDULE_TO_CLOSE (TypeScript), or a timeout of type TIMEOUT_TYPE_SCHEDULE_TO_CLOSE (Go/Java) to separate an SLA breach from an application failure. This lets you log or alert specifically on SLA violations rather than treating all Activity errors the same way.
Common pitfalls
- Not accounting for ScheduleToStart delay in the budget. ScheduleToCloseTimeout begins when the Activity is first scheduled, which includes the time the task waits in the queue before a Worker picks it up. Under high load or insufficient Worker capacity, tasks can sit in the queue for seconds or minutes before the first attempt starts, consuming SLA budget before any work is done. Provision Workers with enough capacity for peak traffic, or use autoscaling, to keep ScheduleToStart latency negligible relative to the SLA window.
- Using StartToCloseTimeout alone for SLA enforcement. Each retry starts a fresh per-attempt clock, so a downstream system that keeps responding slowly can cause attempt after attempt to time out, with no limit on the total elapsed time.
- Setting ScheduleToCloseTimeout shorter than StartToCloseTimeout. If the total budget is shorter than a single attempt's maximum, the per-attempt timeout never takes effect: the total budget expires first, so a slow first attempt is cut off and no time remains for a retry.
- Ignoring the breach in the Workflow. Letting the ActivityError propagate without handling it means SLA breaches go unlogged and uncompensated.
- Not accounting for backoff delays in the budget. The total time includes both attempt durations and the backoff delays between them. A 1-hour budget with a 30-minute initial interval and coefficient 2.0 leaves room for at most two attempts, because the third would not start until 90 minutes in.
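Several of these pitfalls can be caught by checking the numbers before they reach an Activity call. The helper below is a hypothetical sketch, not part of any Temporal SDK: the name `check_activity_timeouts`, the warning texts, and the "fewer than three attempts" threshold are all assumptions made for illustration. It counts attempts in the most favorable case, where every attempt fails instantly, and assumes an unset maximum interval behaves as 100 times the initial interval.

```python
from datetime import timedelta
from typing import Optional


def check_activity_timeouts(
    schedule_to_close: timedelta,
    start_to_close: Optional[timedelta],
    initial_interval: timedelta,
    backoff_coefficient: float = 2.0,
    maximum_interval: Optional[timedelta] = None,
) -> list[str]:
    """Return human-readable warnings for a timeout and retry configuration."""
    if initial_interval <= timedelta(0) or backoff_coefficient < 1:
        raise ValueError("initial_interval must be positive and coefficient >= 1")

    warnings = []
    if start_to_close is None:
        warnings.append("no per-attempt timeout: one slow attempt can use the whole budget")
    elif start_to_close >= schedule_to_close:
        warnings.append("per-attempt timeout is not shorter than the total budget")

    # Assumed default cap when none is given: 100x the initial interval.
    cap = maximum_interval or initial_interval * 100

    # Count how many attempts can start if each one fails instantly.
    attempts, clock, delay = 1, timedelta(0), initial_interval
    while True:
        clock += min(delay, cap)
        if clock >= schedule_to_close:
            break
        attempts += 1
        delay = min(delay, cap) * backoff_coefficient

    if attempts < 3:  # arbitrary illustrative threshold
        warnings.append(f"backoff leaves room for only {attempts} attempt(s)")
    return warnings


if __name__ == "__main__":
    # The 1-hour budget with a 30-minute initial interval from the list above.
    print(check_activity_timeouts(
        schedule_to_close=timedelta(hours=1),
        start_to_close=timedelta(minutes=5),
        initial_interval=timedelta(minutes=30),
        backoff_coefficient=2.0,
    ))
```

Such a check could run in a unit test next to the Workflow definition, so that a change to the SLA or the retry policy is flagged before deployment.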
Related patterns
- Fixed Count of Retries: Bound by attempt count rather than elapsed time.
- Delayed Retry: Fixed-interval retry when the downstream unavailability window is known.
- Error Handling & Retry Patterns: Overview and decision tree for all retry patterns.