Testing External Infrastructure
Previously: Why This Pattern Exists. We covered unit tests with mocks and integration tests with real databases. Now let’s handle external services.
So far we’ve talked about testing your own code. You control the database, you control the cache, you control the business logic.
But what about the payment provider? The email service? The third-party API your app depends on?
You don’t control those. And that changes everything.
The Two Kinds of Infrastructure
Section titled “The Two Kinds of Infrastructure”There’s a fundamental split in how you test infrastructure:
- Infrastructure you control - Your database, your cache, your message queue
- Infrastructure you don’t control - Stripe, SendGrid, external APIs
For infrastructure you control, you can use real instances with test data. For infrastructure you don’t control, you need different strategies.
graph TD
A[Infrastructure] --> B[You Control]
A --> C[You Don't Control]
B --> D[Real Instance<br/>Test Database<br/>Test Cache]
B --> E[Stubs/Helpers<br/>createAuthorisedUser<br/>createCustomerWithOrders]
C --> F[Sandbox<br/>Stripe Test Mode<br/>SendGrid Test API]
C --> G[HTTP Mocking<br/>MSW<br/>Nock]
style A fill:#475569,stroke:#0f172a,stroke-width:2px,color:#fff
style B fill:#64748b,stroke:#0f172a,stroke-width:2px,color:#fff
style C fill:#64748b,stroke:#0f172a,stroke-width:2px,color:#fff
style D fill:#94a3b8,stroke:#0f172a,stroke-width:2px,color:#0f172a
style E fill:#94a3b8,stroke:#0f172a,stroke-width:2px,color:#0f172a
style F fill:#cbd5e1,stroke:#0f172a,stroke-width:2px,color:#0f172a
style G fill:#cbd5e1,stroke:#0f172a,stroke-width:2px,color:#0f172a
linkStyle 0 stroke:#0f172a,stroke-width:3px
linkStyle 1 stroke:#0f172a,stroke-width:3px
linkStyle 2 stroke:#0f172a,stroke-width:3px
linkStyle 3 stroke:#0f172a,stroke-width:3px
linkStyle 4 stroke:#0f172a,stroke-width:3px
linkStyle 5 stroke:#0f172a,stroke-width:3px
Testing Infrastructure You Control
Section titled “Testing Infrastructure You Control”For infrastructure your app owns (your database, your cache, your internal services), use real instances with test data.
Stubs for Test Data
Section titled “Stubs for Test Data”Create helper functions that set up realistic test data. These aren’t mocks: they’re real database records, real cache entries, created with Faker for variety.
Note: These are sometimes called “fixtures” or “factories” in other testing frameworks. The key point is they create real data, not fake behavior. We use “stubs” here to mean “test data helpers.”
// src/test-utils/stubs.ts
import { faker } from '@faker-js/faker';
import prisma from '../db';
/**
* Creates an authorised user for integration testing.
* Returns the user and a valid API key for authentication.
*/
export async function createAuthorisedUser(overrides: {
email?: string;
role?: 'user' | 'admin';
} = {}) {
const user = await prisma.user.create({
data: {
email: overrides.email ?? faker.internet.email(),
name: faker.person.fullName(),
role: overrides.role ?? 'user',
passwordHash: await hashPassword('test-password-123'),
},
});
const apiKey = await prisma.apiKey.create({
data: {
userId: user.id,
key: `test_${faker.string.alphanumeric(32)}`,
name: 'Test API Key',
enabled: true,
},
});
return {
user,
apiKey: apiKey.key, // Use this in Authorization header
};
}
/**
* Creates a customer with orders for testing order workflows.
*/
export async function createCustomerWithOrders(overrides: {
orderCount?: number;
orderStatus?: 'pending' | 'paid' | 'shipped';
} = {}) {
const customer = await prisma.customer.create({
data: {
email: faker.internet.email(),
name: faker.person.fullName(),
},
});
const orders = await Promise.all(
Array.from({ length: overrides.orderCount ?? 1 }).map(() =>
prisma.order.create({
data: {
customerId: customer.id,
status: overrides.orderStatus ?? 'pending',
total: faker.number.int({ min: 1000, max: 100000 }),
},
})
)
);
return { customer, orders };
}Now your integration tests read clearly:
// src/orders/create-order.test.int.ts
import { describe, it, expect } from 'vitest';
import { createAuthorisedUser, createCustomerWithOrders } from '../test-utils/stubs';
import { createOrder } from './create-order';
import prisma from '../db';
describe('createOrder (integration)', () => {
it('creates order and charges customer', async () => {
// Arrange: Real database, real test data
const { user, apiKey } = await createAuthorisedUser({ role: 'user' });
const { customer } = await createCustomerWithOrders({ orderCount: 0 });
// Act: Call business logic with real deps
const result = await createOrder(
{
customerId: customer.id,
items: [{ productId: 'prod-1', quantity: 2 }],
},
{ db: prisma, paymentProvider: paymentProvider }
);
// Assert: Verify it's actually in the database
expect(result.ok).toBe(true);
if (result.ok) {
const dbOrder = await prisma.order.findUnique({
where: { id: result.value.id },
});
expect(dbOrder).not.toBeNull();
}
});
});Why this works:
- Real infrastructure catches real bugs (constraints, transactions, query issues)
- Faker generates unique data, so tests don’t interfere
- No cleanup needed: each test creates its own data (as long as your test database is reset between runs or uses ephemeral containers)
- Tests can run in parallel
When to use stubs:
- Database operations
- Cache interactions
- Internal service calls
- Any infrastructure you deploy and control
Testing Infrastructure You Don’t Control
Section titled “Testing Infrastructure You Don’t Control”For external services (payment providers, email services, third-party APIs), you have three options, in order of preference:
Option 1: Sandboxes (Best When Available)
Section titled “Option 1: Sandboxes (Best When Available)”Many services provide test sandboxes. Stripe, SendGrid, and Twilio all have test environments where you can send real HTTP requests and get predetermined responses.
Worldpay Example (Magic Values):
Some sandboxes use “magic values” in specific request fields to trigger deterministic responses. Worldpay’s sandbox allows you to use special strings in the cardHolderName field to force specific outcomes. These “magic values” are specific to Worldpay’s sandbox and only apply in test environments.
// src/payments/worldpay-payment.test.int.ts
import { describe, it, expect } from 'vitest';
import { processWorldpayPayment } from './worldpay-payment';
describe('processWorldpayPayment (integration)', () => {
it('handles card expired error', async () => {
// Magic value in cardHolderName triggers specific error response
const result = await processWorldpayPayment(
{
amount: 1000,
currency: 'GBP',
cardNumber: '4444333322221111',
expiryMonth: '12',
expiryYear: '2030',
cardHolderName: 'REFUSED33', // Magic value → "CARD EXPIRED"
},
{
worldpay: new WorldpayClient(process.env.WORLDPAY_TEST_KEY!),
}
);
expect(result.ok).toBe(false);
if (!result.ok) {
expect(result.error).toBe('CARD_EXPIRED');
}
});
it('handles hold card error', async () => {
const result = await processWorldpayPayment(
{
amount: 1000,
cardHolderName: 'REFUSED4', // Magic value → "HOLD CARD"
// ... other fields
},
{ worldpay: new WorldpayClient(process.env.WORLDPAY_TEST_KEY!) }
);
expect(result.ok).toBe(false);
if (!result.ok) {
expect(result.error).toBe('HOLD_CARD');
}
});
it('succeeds with authorised magic value', async () => {
const result = await processWorldpayPayment(
{
amount: 1000,
cardHolderName: 'AUTHORISED', // Magic value → successful payment
// ... other fields
},
{ worldpay: new WorldpayClient(process.env.WORLDPAY_TEST_KEY!) }
);
expect(result.ok).toBe(true);
});
});Common Sandbox Patterns:
Different providers use different mechanisms, but the pattern is the same: use documented test inputs to get predictable outcomes.
| Provider | Deterministic test input | Example |
|---|---|---|
| Stripe | Payment method ID | pm_card_chargeDeclined |
| Worldpay | Cardholder name value | REFUSED33 (in cardHolderName field) |
| PayPal | Amount or field value | Specific amount triggers error |
| Adyen | Test card number | 4111111111111111 (declined) |
| GoCardless | Test bank account | Fixed test account number |
Why sandboxes are best:
- Real HTTP calls catch integration issues (headers, auth, request format)
- Predetermined responses let you test error paths reliably
- No mocking code to maintain
- Tests run against the actual API contract
- In CI, keep sandbox tests small (smoke-level) and push most error path testing to MSW/nock to avoid rate limits and slow builds
When to use sandboxes:
- Service provides a test environment
- You need to verify request format and authentication
- Error scenarios are well-documented
Limitations:
- Not all services have sandboxes
- Some sandboxes have rate limits
- Network calls are slower than mocks
Option 2: MSW (Mock Service Worker)
Section titled “Option 2: MSW (Mock Service Worker)”When sandboxes aren’t available, use MSW to intercept HTTP requests at the boundary of your app. MSW works in both Node.js and browser environments.
Setup:
// src/test-utils/msw-handlers.ts
import { http, HttpResponse } from 'msw';
import { setupServer } from 'msw/node';
export const handlers = [
// Single handler for Stripe charges with branching logic
// MSW matches handlers in order, so put more specific cases first
// Note: Example uses /charges for brevity; use the endpoints your app actually uses (e.g., PaymentIntents)
http.post('https://api.stripe.com/v1/charges', async ({ request }) => {
const body = await request.formData();
const paymentMethod = body.get('payment_method');
const amount = body.get('amount');
// Optional: Assert request details (headers, auth, etc.) to verify integration
// Adjust this check to match your actual auth format
const authHeader = request.headers.get('authorization');
if (!authHeader?.startsWith('Bearer sk_test_')) {
return HttpResponse.json(
{ error: { message: 'Invalid API key' } },
{ status: 401 }
);
}
// Simulate decline for specific test card
if (paymentMethod === 'pm_card_declined') {
return HttpResponse.json(
{
error: {
type: 'card_error',
code: 'card_declined',
message: 'Your card was declined.',
},
},
{ status: 402 }
);
}
// Default to success
return HttpResponse.json({
id: 'ch_test_123',
object: 'charge',
amount: Number(amount),
status: 'succeeded',
currency: 'usd',
});
}),
// Mock email service
http.post('https://api.sendgrid.com/v3/mail/send', () => {
return HttpResponse.json(
{ message: 'success' },
{ status: 202 }
);
}),
];
export const server = setupServer(...handlers);In your test file:
// src/payments/charge-customer.test.int.ts
import { describe, it, expect, beforeAll, afterEach, afterAll } from 'vitest';
import { server } from '../test-utils/msw-handlers';
import { createCharge } from './charge-customer';
describe('createCharge (integration)', () => {
beforeAll(() => {
server.listen({ onUnhandledRequest: 'error' });
});
afterEach(() => {
server.resetHandlers();
});
afterAll(() => {
server.close();
});
it('succeeds when payment method is valid', async () => {
// Note: Even though we pass a real-looking Stripe client,
// MSW intercepts the HTTP request before it reaches the network
const result = await createCharge(
{
amount: 1000,
currency: 'usd',
paymentMethodId: 'pm_card_visa',
},
{ stripe: new Stripe('sk_test_...') }
);
expect(result.ok).toBe(true);
if (result.ok) {
expect(result.value.status).toBe('succeeded');
}
});
it('handles declined card', async () => {
const result = await createCharge(
{
amount: 1000,
paymentMethodId: 'pm_card_declined', // Triggers decline handler
},
{ stripe: new Stripe('sk_test_...') }
);
expect(result.ok).toBe(false);
if (!result.ok) {
expect(result.error).toBe('PAYMENT_DECLINED');
}
});
});Why MSW:
- Intercepts at the boundary of your app: your code makes real HTTP requests
- Works in Node.js and browser (same handlers for both)
- Type-safe with TypeScript
- Can override handlers per test
- You can also assert request payloads/headers inside handlers (e.g., check auth header) to lock down the integration
When to use MSW:
- Service doesn’t have a sandbox
- You need to test error scenarios that are hard to trigger in sandboxes
- You want fast, reliable tests without network calls
Reference: See nodejs-api-testing-examples-vitest for complete MSW examples.
Option 3: Nock
Section titled “Option 3: Nock”Nock is a Node.js-specific HTTP mocking library. It intercepts requests at a lower level than MSW.
Example:
// src/payments/charge-customer.test.int.ts
import { describe, it, expect, afterEach } from 'vitest';
import nock from 'nock';
import { createCharge } from './charge-customer';
describe('createCharge (integration)', () => {
afterEach(() => {
nock.cleanAll();
});
it('succeeds when payment method is valid', async () => {
nock('https://api.stripe.com')
.post('/v1/charges')
.reply(200, {
id: 'ch_test_123',
status: 'succeeded',
amount: 1000,
});
const result = await createCharge(
{
amount: 1000,
paymentMethodId: 'pm_card_visa',
},
{ stripe: new Stripe('sk_test_...') }
);
expect(result.ok).toBe(true);
});
it('handles declined card', async () => {
nock('https://api.stripe.com')
.post('/v1/charges')
.reply(402, {
error: {
type: 'card_error',
code: 'card_declined',
},
});
const result = await createCharge(
{
amount: 1000,
paymentMethodId: 'pm_card_declined',
},
{ stripe: new Stripe('sk_test_...') }
);
expect(result.ok).toBe(false);
});
});Nock vs MSW:
| Feature | Nock | MSW |
|---|---|---|
| Environment | Node.js only | Node.js + Browser |
| Interception level | Lower (http/https modules) | Higher (fetch/XHR) |
| Setup complexity | Simple | Slightly more setup |
| Type safety | Manual | Better with TypeScript |
When to use Nock:
- Node.js-only codebase
- You prefer the simpler API
- You’re already using it in your codebase
When to prefer MSW:
- You need browser testing too
- You want better TypeScript support
- You’re starting fresh
Reference: See nodejs-api-testing-examples-vitest for complete nock examples.
Testing Error Scenarios: Rate Limits, Retries, and Timeouts
Section titled “Testing Error Scenarios: Rate Limits, Retries, and Timeouts”Integration tests with MSW or Nock let you test error scenarios that are difficult to trigger with real services. This is especially valuable for testing resilience patterns like retries, backoffs, and timeout handling.
Testing Rate Limits (429 Errors)
Section titled “Testing Rate Limits (429 Errors)”You can simulate rate limiting to verify your retry logic and user experience:
// src/payments/charge-customer.test.int.ts
import { describe, it, expect, vi } from 'vitest';
import { server } from '../test-utils/msw-handlers';
import { createCharge } from './charge-customer';
describe('createCharge handles rate limits', () => {
it('retries after 429 with Retry-After header', async () => {
let attemptCount = 0;
server.use(
http.post('https://api.stripe.com/v1/charges', async ({ request }) => {
attemptCount++;
if (attemptCount === 1) {
// First attempt: rate limited
return HttpResponse.json(
{ error: { message: 'Rate limit exceeded' } },
{
status: 429,
headers: { 'Retry-After': '1' },
}
);
}
// Second attempt: succeeds
return HttpResponse.json({
id: 'ch_test_123',
status: 'succeeded',
});
})
);
const result = await createCharge(
{ amount: 1000, paymentMethodId: 'pm_card_visa' },
{ stripe: new Stripe('sk_test_...') }
);
expect(result.ok).toBe(true);
expect(attemptCount).toBe(2); // Verify retry happened
});
});Testing Timeouts
Section titled “Testing Timeouts”Simulate slow or unresponsive services to verify timeout handling:
it('handles timeout gracefully', async () => {
server.use(
http.post('https://api.stripe.com/v1/charges', async () => {
// Simulate slow response that exceeds timeout
await new Promise((resolve) => setTimeout(resolve, 5000));
return HttpResponse.json({ id: 'ch_test_123' });
})
);
const result = await createCharge(
{ amount: 1000, paymentMethodId: 'pm_card_visa' },
{
stripe: new Stripe('sk_test_...'),
timeout: 1000, // 1 second timeout
}
);
expect(result.ok).toBe(false);
if (!result.ok) {
expect(result.error).toBe('TIMEOUT');
}
});Testing Retry and Backoff Logic
Section titled “Testing Retry and Backoff Logic”Verify that your retry logic with exponential backoff works correctly:
it('implements exponential backoff on retries', async () => {
const attemptTimes: number[] = [];
let attemptCount = 0;
server.use(
http.post('https://api.stripe.com/v1/charges', async () => {
attemptCount++;
attemptTimes.push(Date.now());
if (attemptCount < 3) {
return HttpResponse.json(
{ error: { message: 'Temporary failure' } },
{ status: 503 }
);
}
return HttpResponse.json({ id: 'ch_test_123', status: 'succeeded' });
})
);
const result = await createCharge(
{ amount: 1000, paymentMethodId: 'pm_card_visa' },
{ stripe: new Stripe('sk_test_...') }
);
expect(result.ok).toBe(true);
expect(attemptCount).toBe(3);
// Verify backoff: time between attempts should increase
const delay1 = attemptTimes[1] - attemptTimes[0];
const delay2 = attemptTimes[2] - attemptTimes[1];
expect(delay2).toBeGreaterThan(delay1);
});Testing UX Impact in Frontend
Section titled “Testing UX Impact in Frontend”For frontend applications, you can test how errors affect the user experience:
// src/components/PaymentForm.test.tsx
import { describe, it, expect } from 'vitest';
import { render, screen, waitFor } from '@testing-library/react';
import { server } from '../test-utils/msw-handlers';
import { PaymentForm } from './PaymentForm';
describe('PaymentForm error handling', () => {
it('shows user-friendly message on rate limit', async () => {
server.use(
http.post('https://api.stripe.com/v1/charges', () => {
return HttpResponse.json(
{ error: { message: 'Rate limit exceeded' } },
{ status: 429 }
);
})
);
render(<PaymentForm />);
// ... trigger payment
await waitFor(() => {
expect(screen.getByText(/please try again in a moment/i)).toBeInTheDocument();
});
});
it('shows loading state during retry', async () => {
let attemptCount = 0;
server.use(
http.post('https://api.stripe.com/v1/charges', async () => {
attemptCount++;
if (attemptCount === 1) {
await new Promise((resolve) => setTimeout(resolve, 100));
return HttpResponse.json({ error: { message: 'Temporary failure' } }, { status: 503 });
}
return HttpResponse.json({ id: 'ch_test_123', status: 'succeeded' });
})
);
render(<PaymentForm />);
// ... trigger payment
// Verify loading indicator appears during retry
expect(screen.getByText(/processing/i)).toBeInTheDocument();
});
});Testing Cascading Failures
Section titled “Testing Cascading Failures”Verify how errors in one service affect dependent services:
it('handles payment service failure gracefully', async () => {
server.use(
http.post('https://api.stripe.com/v1/charges', () => {
return HttpResponse.json(
{ error: { message: 'Service unavailable' } },
{ status: 503 }
);
})
);
const result = await processOrder(
{
customerId: 'cust_123',
items: [{ productId: 'prod_1', quantity: 1 }],
},
{ db: prisma, stripe: new Stripe('sk_test_...') }
);
expect(result.ok).toBe(false);
if (!result.ok) {
expect(result.error).toBe('PAYMENT_SERVICE_UNAVAILABLE');
}
// Verify order was not created in database
const order = await prisma.order.findUnique({
where: { customerId: 'cust_123' },
});
expect(order).toBeNull();
});Why this matters:
- Reliability: Verify your retry and backoff logic works before production
- User experience: Test error messages and loading states in frontend
- Resilience: Ensure cascading failures are handled gracefully
- Performance: Test timeout handling without waiting for real network delays
These scenarios are nearly impossible to test reliably with real services, but integration tests with MSW or Nock make them straightforward.
Why Integration Tests, Not E2E
Section titled “Why Integration Tests, Not E2E”These are integration tests, not end-to-end (E2E) tests. Here’s the distinction:
Integration tests:
- Test integration with external services
- Use sandboxes or mocks (MSW/nock)
- Fast, reliable, no rate limits
- Cover many scenarios (success, errors, edge cases)
E2E tests:
- Test the entire system in production-like environment
- Make real calls to production services
- Slow, flaky, subject to rate limits
- Cover happy paths only
graph LR
A[Test Types] --> B[Unit Tests<br/>Mock everything<br/>Fast ms]
A --> C[Integration Tests<br/>Sandbox or MSW/Nock<br/>Fast seconds]
A --> D[E2E Tests<br/>Real production services<br/>Slow minutes]
style A fill:#475569,stroke:#0f172a,stroke-width:2px,color:#fff
style B fill:#64748b,stroke:#0f172a,stroke-width:2px,color:#fff
style C fill:#94a3b8,stroke:#0f172a,stroke-width:2px,color:#0f172a
style D fill:#cbd5e1,stroke:#0f172a,stroke-width:2px,color:#0f172a
linkStyle 0 stroke:#0f172a,stroke-width:3px
linkStyle 1 stroke:#0f172a,stroke-width:3px
linkStyle 2 stroke:#0f172a,stroke-width:3px
Why integration tests win:
- Coverage: You can test error scenarios that are hard to trigger in production
- Speed: No network latency, no rate limits
- Reliability: Tests don’t fail because a third-party service is down
- Cost: No charges for test API calls
When you still need E2E:
- Verify production configuration works
- Test against actual service behavior (not documented)
- Final smoke test before deployment
But for day-to-day development? Integration tests with sandboxes or MSW/nock give you 90% of the value with 10% of the pain.
Decision Matrix
Section titled “Decision Matrix”| Scenario | Solution | Why |
|---|---|---|
| Service has sandbox (Stripe, SendGrid) | Use sandbox | Real HTTP, predetermined responses, no mocking code |
| Service has no sandbox | MSW or Nock | Fast, reliable, covers error scenarios |
| Need browser + Node.js testing | MSW | Works in both environments |
| Node.js only | Nock or MSW | Either works, MSW has better TypeScript |
| Testing your own database/cache | Real instance + stubs | Catches real bugs, uses Faker for data |
Rule of thumb:
- Sandbox first - If the service provides one, use it
- MSW second - If no sandbox, MSW is more versatile
- Nock third - Only if you’re already using it or need Node-only
The Complete Pattern
Section titled “The Complete Pattern”Putting it all together:
// src/payments/process-payment.test.int.ts
import { describe, it, expect, beforeAll, afterEach, afterAll } from 'vitest';
import { createAuthorisedUser } from '../test-utils/stubs';
import { server } from '../test-utils/msw-handlers';
import { processPayment } from './process-payment';
import prisma from '../db';
describe('processPayment (integration)', () => {
beforeAll(() => {
server.listen({ onUnhandledRequest: 'error' });
});
afterEach(() => {
server.resetHandlers();
});
afterAll(() => {
server.close();
});
it('charges customer and creates order', async () => {
// Arrange: Real database with test data
const { user } = await createAuthorisedUser();
const customer = await prisma.customer.create({
data: {
userId: user.id,
email: user.email,
stripeCustomerId: 'cus_test_123',
},
});
// Act: Business logic with real DB; Stripe client wired, HTTP intercepted by MSW
const result = await processPayment(
{
customerId: customer.id,
amount: 1000,
paymentMethodId: 'pm_card_visa',
},
{
db: prisma,
stripe: new Stripe(process.env.STRIPE_TEST_SECRET_KEY!),
}
);
// Assert: Verify database state
expect(result.ok).toBe(true);
if (result.ok) {
const order = await prisma.order.findUnique({
where: { id: result.value.orderId },
});
expect(order).not.toBeNull();
expect(order?.status).toBe('paid');
}
});
});What’s happening:
- Real database - Uses
createAuthorisedUser()stub to set up test data - Stripe client wired, HTTP intercepted by MSW - Real Stripe client instance, but MSW intercepts HTTP requests before they reach the network
- Integration test - Tests the full flow: database → business logic → external API
This gives you confidence that your code works end-to-end, without the flakiness and cost of real external services.
The Rules
Section titled “The Rules”- Use stubs for infrastructure you control - Real instances with test data helpers
- Prefer sandboxes when available - Real HTTP calls, predetermined responses
- Use MSW or Nock when no sandbox - Fast, reliable, covers error scenarios
- These are integration tests - Not E2E, but they cover most scenarios
- Test error paths - Sandboxes and mocks let you test failures reliably
What’s Next
Section titled “What’s Next”We’ve covered testing your own code and testing external services. But there’s one more piece: how do you structure your code so it’s easy to test in the first place?
That’s where the fn(args, deps) pattern comes in.
Next: Functions Over Classes. Learn how explicit dependency injection makes everything testable.