Retries & Timeouts

Add retries and timeouts to individual steps without wrapping your entire workflow in try/catch. For retries outside workflows (plain async/Result code), use tryAsyncRetry from awaitly/result/retry; see API Reference - Result retry.

Limit how long a step can run:

const data = await step.withTimeout(
  'slowOp',
  () => slowOperation(),
  { ms: 5000 }
);

The first argument is the step ID (used in events and timeout errors):

const data = await step.withTimeout(
  'slowOp',
  () => slowOperation(),
  { ms: 5000 }
);
/*
On timeout, the error includes the step ID for debugging:
StepTimeoutError { name: 'slowOp', ms: 5000 }
*/
import { isStepTimeoutError, getStepTimeoutMeta } from 'awaitly/workflow';

const result = await workflow.run(async ({ step, deps }) => {
  return await step.withTimeout('slowOp', () => deps.slowOperation(), { ms: 1000 });
});

if (!result.ok && isStepTimeoutError(result.error)) {
  const meta = getStepTimeoutMeta(result.error);
  console.log(`${meta.name} timed out after ${meta.ms}ms`);
}
/*
Output:
slowOp timed out after 1000ms
*/
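Under the hood, a step timeout can be pictured as a race between the operation and a timer. The sketch below is a simplified, standalone illustration of that pattern; the `TimeoutError` class and `withTimeout` function here are hypothetical stand-ins, not awaitly's internals.

```typescript
// Simplified sketch: race the operation against a timer.
// TimeoutError and withTimeout are illustrative stand-ins.
class TimeoutError extends Error {
  constructor(public stepId: string, public ms: number) {
    super(`${stepId} timed out after ${ms}ms`);
  }
}

function withTimeout<T>(
  stepId: string,
  fn: () => Promise<T>,
  ms: number
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    // Reject with a structured error once the deadline passes.
    const timer = setTimeout(() => reject(new TimeoutError(stepId, ms)), ms);
    fn().then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); }
    );
  });
}

// A step that never settles is rejected once the timer fires:
withTimeout('slowOp', () => new Promise<string>(() => {}), 10)
  .catch((err) => console.log(err.message)); // slowOp timed out after 10ms
```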

Control what happens when a timeout occurs using onTimeout:

// Default behavior - return timeout error
const data = await step.withTimeout(
  'slowOp',
  () => slowOperation(),
  { ms: 5000, onTimeout: 'error' }
);
// Returns StepTimeoutError when operation times out

Retry failed steps with configurable backoff:

const data = await step.retry(
  'fetchData',
  () => fetchData(),
  { attempts: 3 }
);
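Conceptually, a retry wraps the operation in a loop: run it, and on failure wait out the backoff delay before trying again. The sketch below illustrates that loop; `retryWithBackoff` and `delayFor` are hypothetical names for this page, not awaitly's implementation.

```typescript
// Hand-rolled retry loop with backoff (illustrative sketch).
type Backoff = 'fixed' | 'linear' | 'exponential';

const delayFor = (backoff: Backoff, attempt: number, delayMs: number): number => {
  switch (backoff) {
    case 'fixed': return delayMs;
    case 'linear': return delayMs * attempt;
    case 'exponential': return delayMs * Math.pow(2, attempt - 1);
  }
};

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  opts: { attempts: number; backoff?: Backoff; delayMs?: number }
): Promise<T> {
  const { attempts, backoff = 'fixed', delayMs = 0 } = opts;
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn(); // success: stop retrying
    } catch (err) {
      lastError = err;
      // No delay after the final attempt - we're done either way.
      if (attempt < attempts) await sleep(delayFor(backoff, attempt, delayMs));
    }
  }
  throw lastError; // all attempts exhausted
}

// A flaky operation that succeeds on its third call:
let calls = 0;
const flaky = async () => {
  calls++;
  if (calls < 3) throw new Error('flaky');
  return 'ok';
};

retryWithBackoff(flaky, { attempts: 3 }).then((v) => console.log(v)); // ok
```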

awaitly supports three backoff strategies. Each has different characteristics for different scenarios.

Same delay every time. Good for rate-limiting scenarios where you need consistent spacing.

{ attempts: 5, backoff: 'fixed', delayMs: 100 }
Fixed Backoff (delayMs: 100)
────────────────────────────
Attempt │ Delay after │ Visual
────────┼─────────────┼─────────────────────
1       │ 100ms       │ ████
2       │ 100ms       │ ████
3       │ 100ms       │ ████
4       │ 100ms       │ ████
5       │ (none)      │ final attempt, no delay
Total max wait: 400ms

Delay increases linearly. Balances retry speed with backoff pressure.

{ attempts: 5, backoff: 'linear', delayMs: 100 }
Linear Backoff (delayMs: 100)
─────────────────────────────
Attempt │ Delay after │ Visual
────────┼─────────────┼─────────────────────
1       │ 100ms       │ ████
2       │ 200ms       │ ████████
3       │ 300ms       │ ████████████
4       │ 400ms       │ ████████████████
5       │ (none)      │ final attempt, no delay
Total max wait: 1000ms

Delay doubles each time. The standard for network calls. Reduces load on struggling services.

{ attempts: 5, backoff: 'exponential', delayMs: 100 }
Exponential Backoff (delayMs: 100)
──────────────────────────────────
Attempt │ Delay after │ Visual
────────┼─────────────┼─────────────────────────────────
1       │ 100ms       │ ████
2       │ 200ms       │ ████████
3       │ 400ms       │ ████████████████
4       │ 800ms       │ ████████████████████████████████
5       │ (none)      │ final attempt, no delay
Total max wait: 1500ms (without a maxDelayMs cap)

Here’s how each strategy calculates delays:

// Helper to visualize backoff delays
const calculateDelay = (
  strategy: 'fixed' | 'linear' | 'exponential',
  attempt: number,
  delayMs: number,
  maxDelayMs?: number
): number => {
  let delay: number;
  switch (strategy) {
    case 'fixed':
      delay = delayMs;
      break;
    case 'linear':
      delay = delayMs * attempt;
      break;
    case 'exponential':
      delay = delayMs * Math.pow(2, attempt - 1);
      break;
  }
  return maxDelayMs ? Math.min(delay, maxDelayMs) : delay;
};

// Example usage
console.log(calculateDelay('exponential', 5, 100)); // 1600
console.log(calculateDelay('exponential', 5, 100, 1000)); // 1000 (capped)
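Running the same formula across every attempt reproduces the delay columns of the tables above. The `delaySequence` wrapper below is a convenience invented for this page, restated standalone so it runs on its own:

```typescript
// Compute the full per-attempt delay sequence for a backoff strategy.
type Strategy = 'fixed' | 'linear' | 'exponential';

const delaySequence = (
  strategy: Strategy,
  attempts: number,
  delayMs: number,
  maxDelayMs?: number
): number[] =>
  Array.from({ length: attempts }, (_, i) => {
    const attempt = i + 1;
    const raw =
      strategy === 'fixed' ? delayMs :
      strategy === 'linear' ? delayMs * attempt :
      delayMs * Math.pow(2, attempt - 1);
    // Apply the cap, if one is configured.
    return maxDelayMs ? Math.min(raw, maxDelayMs) : raw;
  });

console.log(delaySequence('exponential', 5, 100));
// [ 100, 200, 400, 800, 1600 ]
console.log(delaySequence('exponential', 8, 100, 5000));
// [ 100, 200, 400, 800, 1600, 3200, 5000, 5000 ]
```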

Use maxDelayMs to prevent delays from growing too large:

{
  attempts: 10,
  backoff: 'exponential',
  delayMs: 100,
  maxDelayMs: 5000, // Never wait more than 5 seconds
}
Exponential with Cap (delayMs: 100, maxDelayMs: 5000)
─────────────────────────────────────────────────────
Attempt │ Calculated │ Actual │ Visual
────────┼────────────┼─────────┼──────────────────────
1 │ 100ms │ 100ms │ ██
2 │ 200ms │ 200ms │ ████
3 │ 400ms │ 400ms │ ████████
4 │ 800ms │ 800ms │ ████████████████
5 │ 1600ms │ 1600ms │ ████████████████████
6 │ 3200ms │ 3200ms │ ████████████████████
7 │ 6400ms │ 5000ms │ ████████████████████ ← capped
8 │ 12800ms │ 5000ms │ ████████████████████ ← capped

Randomize delays to avoid the “thundering herd” problem, when many clients retry simultaneously after a service recovers.

{
  attempts: 3,
  backoff: 'exponential',
  delayMs: 100,
  jitter: true, // Adds random variation ±50%
}
Without Jitter (all clients)
────────────────────────────
Time →
100ms: ████████████████████  (all clients retry at once)

With Jitter (clients spread out)
────────────────────────────────
Time →
 80ms: ████
 95ms: ████████
112ms: ██████
140ms: ████████████████
Load distributed!

With jitter: true, the actual delay is randomized within ±50% of the base delay:

// With delayMs: 200 and jitter: true
// Possible delays: 100ms to 300ms (200 ± 50%)
// Internal calculation:
const baseDelay = 200;
const jitterRange = baseDelay * 0.5; // 100
const actualDelay = baseDelay - jitterRange + (Math.random() * jitterRange * 2);
// Results in: 100ms to 300ms
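Because the jitter formula only adds a bounded random term, its range can be checked empirically. The sampling code below is illustrative only, not part of awaitly:

```typescript
// Sample the jitter formula many times and confirm every delay
// stays within the documented ±50% band around delayMs: 200.
const jitteredDelay = (baseDelay: number): number => {
  const jitterRange = baseDelay * 0.5;
  return baseDelay - jitterRange + Math.random() * jitterRange * 2;
};

const samples = Array.from({ length: 1000 }, () => jitteredDelay(200));
const allInBand = samples.every((d) => d >= 100 && d < 300);
console.log(allInBand); // true
```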

Only retry certain errors. Don’t retry permanent failures:

const user = await step.retry(
  'fetchUser',
  () => fetchUser('1'),
  {
    attempts: 3,
    backoff: 'exponential',
    retryOn: (error) => {
      // Don't retry NOT_FOUND - the user doesn't exist
      if (error === 'NOT_FOUND') return false;
      // Don't retry INVALID_ID - it will never work
      if (error === 'INVALID_ID') return false;
      // Retry everything else (network errors, timeouts, etc.)
      return true;
    },
  }
);

// Retry only network/server errors
retryOn: (error) => {
  const noRetry = ['NOT_FOUND', 'UNAUTHORIZED', 'INVALID_INPUT', 'DUPLICATE'];
  return !noRetry.includes(error);
}

// Retry only rate limits
retryOn: (error) => error === 'RATE_LIMITED'

// Retry HTTP 5xx only
retryOn: (error) => error.status >= 500
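Since `retryOn` is just a predicate, it can be extracted and unit-tested on its own. The error-code strings below are the illustrative ones used in this section:

```typescript
// Extracted predicate: retry anything except known-permanent failures.
const noRetry = ['NOT_FOUND', 'UNAUTHORIZED', 'INVALID_INPUT', 'DUPLICATE'];
const shouldRetry = (error: string): boolean => !noRetry.includes(error);

console.log(shouldRetry('NETWORK_ERROR')); // true
console.log(shouldRetry('NOT_FOUND'));     // false

// Pass it straight to the retry options: { attempts: 3, retryOn: shouldRetry }
```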

Each attempt has its own timeout:

const data = await step.retry(
  'fetchData',
  () => step.withTimeout('fetchData', () => fetchData(), { ms: 2000 }),
  { attempts: 3, backoff: 'exponential', delayMs: 100 }
);
Timeline with Retry + Timeout
─────────────────────────────
├── Attempt 1 ────────────────────────► timeout at 2s
│ (wait 100ms)
├── Attempt 2 ────────────────────────► timeout at 2s
│ (wait 200ms)
├── Attempt 3 ────────────────────────► success or final failure
Total max time: 2s + 100ms + 2s + 200ms + 2s = 6.3s
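The 6.3s figure follows directly from the configuration: three 2s timeouts plus the exponential waits between attempts. A quick check of the arithmetic (standalone calculation, not an awaitly API):

```typescript
// Worst case: every attempt runs to its full timeout, and each failed
// attempt except the last is followed by an exponential backoff wait.
const attempts = 3;
const timeoutMs = 2000;
const delayMs = 100; // exponential: 100ms after attempt 1, 200ms after attempt 2

const waits = Array.from({ length: attempts - 1 }, (_, i) => delayMs * Math.pow(2, i));
const worstCaseMs = attempts * timeoutMs + waits.reduce((sum, w) => sum + w, 0);

console.log(worstCaseMs); // 6300
```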

awaitly’s retry options can be combined for sophisticated resilience patterns:

// Like AWS SDK default behavior
const data = await step.retry(
  'callExternalApi',
  () => callExternalApi(),
  {
    attempts: 5,
    backoff: 'exponential',
    delayMs: 100,
    maxDelayMs: 5000,
    jitter: true,
  }
);

You can also configure retry and timeout directly in step options:

const user = await step('Fetch user', () => fetchUser('1'), {
  retry: {
    attempts: 3,
    backoff: 'exponential',
    delayMs: 100,
    jitter: true,
  },
  timeout: {
    ms: 5000,
  },
});

import { createWorkflow } from 'awaitly/workflow';

const workflow = createWorkflow('workflow', { fetchUserFromApi, cacheUser });

const result = await workflow.run(async ({ step, deps }) => {
  // Retry API calls with production-ready settings
  const user = await step.retry(
    'fetchUser',
    () => step.withTimeout(
      'fetchUser',
      () => deps.fetchUserFromApi('123'),
      { ms: 3000 }
    ),
    {
      attempts: 3,
      backoff: 'exponential',
      delayMs: 200,
      maxDelayMs: 2000,
      jitter: true,
      retryOn: (error) => error !== 'NOT_FOUND',
    }
  );
  // Cache doesn't need retry - it's local and fast
  await step('cacheUser', () => deps.cacheUser(user));
  return user;
});

if (!result.ok) {
  console.log('Failed after retries:', result.error);
}
/*
Output (success):
{ ok: true, value: User }
Output (failure after 3 attempts):
{ ok: false, error: 'NETWORK_ERROR' }
Output (immediate failure, no retries):
{ ok: false, error: 'NOT_FOUND' }
*/
Option     │ Type                               │ Default     │ Description
───────────┼────────────────────────────────────┼─────────────┼───────────────────────────
attempts   │ number                             │ required    │ Max retry attempts
backoff    │ 'fixed' | 'linear' | 'exponential' │ 'fixed'     │ Delay growth strategy
delayMs    │ number                             │ 0           │ Base delay in milliseconds
maxDelayMs │ number                             │ undefined   │ Maximum delay cap
jitter     │ boolean                            │ false       │ Add random variation
retryOn    │ (error) => boolean                 │ () => true  │ Condition for retry
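The table translates naturally into a TypeScript shape. The interface below is an illustrative reconstruction from the table, not awaitly's exported type; consult the library's own declarations for the authoritative definition:

```typescript
// Illustrative option shape reconstructed from the table above.
type BackoffStrategy = 'fixed' | 'linear' | 'exponential';

interface RetryOptions<E = unknown> {
  attempts: number;                // required: max retry attempts
  backoff?: BackoffStrategy;       // default: 'fixed'
  delayMs?: number;                // default: 0 (base delay in ms)
  maxDelayMs?: number;             // default: undefined (no cap)
  jitter?: boolean;                // default: false
  retryOn?: (error: E) => boolean; // default: () => true
}

// The combined example from this page type-checks against it:
const opts: RetryOptions<string> = {
  attempts: 5,
  backoff: 'exponential',
  delayMs: 100,
  maxDelayMs: 5000,
  jitter: true,
  retryOn: (error) => error !== 'NOT_FOUND',
};
console.log(opts.attempts); // 5
```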

Learn about Caching →