Circuit Breaker Pattern with Fallback Strategies

Introduction

The Circuit Breaker pattern prevents cascading failures in distributed systems: it wraps calls to a remote dependency, tracks their failures, and stops sending requests to a dependency that keeps failing. Combined with fallback strategies, it lets a system degrade gracefully instead of failing outright.

This pattern is essential for building resilient microservices architectures, especially when dealing with unreliable external dependencies like third-party APIs, databases, or downstream services.

Core Concept

A circuit breaker acts like its electrical namesake: it monitors calls for failures and “trips” when they exceed a threshold, blocking further calls to the failing service. The implementations below count consecutive failures rather than a failure rate. The circuit breaker operates in three states:

  1. Closed: Normal operation, requests pass through
  2. Open: Failure threshold exceeded, requests fail immediately without calling the service
  3. Half-Open: Testing if the service has recovered by allowing limited requests
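
The transitions between these three states can be summarised as a small lookup table. The sketch below is illustrative only: the event names (`threshold_exceeded`, `timeout_elapsed`, `probes_succeeded`, `probe_failed`) are labels assumed for this example and do not come from any library.

```python
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


# Hypothetical event names chosen for this sketch.
TRANSITIONS = {
    (State.CLOSED, "threshold_exceeded"): State.OPEN,
    (State.OPEN, "timeout_elapsed"): State.HALF_OPEN,
    (State.HALF_OPEN, "probes_succeeded"): State.CLOSED,
    (State.HALF_OPEN, "probe_failed"): State.OPEN,
}


def next_state(state: State, event: str) -> State:
    """Return the state after `event`; unknown pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Note that the only way out of Open is the timeout, and the only way back to Closed is through Half-Open.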

When to Use

Ideal scenarios:

Avoid when:

Implementation in Go

package circuitbreaker

import (
    "errors"
    "sync"
    "time"
)

type State int

const (
    StateClosed State = iota
    StateOpen
    StateHalfOpen
)

type CircuitBreaker struct {
    mu              sync.RWMutex
    state           State
    failureCount    int
    successCount    int
    lastFailureTime time.Time

    // Configuration
    maxFailures          int
    timeout              time.Duration
    halfOpenMaxSuccesses int
}

func New(maxFailures int, timeout time.Duration) *CircuitBreaker {
    return &CircuitBreaker{
        state:                StateClosed,
        maxFailures:          maxFailures,
        timeout:              timeout,
        halfOpenMaxSuccesses: 2,
    }
}

// Call runs fn through the breaker. If the circuit is open or fn fails,
// it returns the fallback's result when a fallback is provided.
func (cb *CircuitBreaker) Call(fn func() (interface{}, error), fallback func() (interface{}, error)) (interface{}, error) {
    cb.mu.Lock()
    // Transition from Open to Half-Open if timeout expired
    if cb.state == StateOpen && time.Since(cb.lastFailureTime) > cb.timeout {
        cb.state = StateHalfOpen
        cb.successCount = 0
    }
    state := cb.state
    cb.mu.Unlock()

    // Fast fail if circuit is open
    if state == StateOpen {
        if fallback != nil {
            return fallback()
        }
        return nil, errors.New("circuit breaker is open")
    }

    // Execute the function
    result, err := fn()

    // Record the outcome under the lock, but release it before running the
    // fallback so that a slow fallback cannot block other callers.
    cb.mu.Lock()
    if err != nil {
        cb.onFailure()
    } else {
        cb.onSuccess()
    }
    cb.mu.Unlock()

    if err != nil {
        if fallback != nil {
            return fallback()
        }
        return nil, err
    }
    return result, nil
}

func (cb *CircuitBreaker) onSuccess() {
    if cb.state == StateHalfOpen {
        cb.successCount++
        if cb.successCount >= cb.halfOpenMaxSuccesses {
            cb.state = StateClosed
            cb.failureCount = 0
        }
    } else {
        cb.failureCount = 0
    }
}

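// onFailure records a failed call. Callers must hold cb.mu.
func (cb *CircuitBreaker) onFailure() {
    cb.failureCount++
    cb.lastFailureTime = time.Now()

    // Any failure while Half-Open reopens the circuit immediately.
    if cb.state == StateHalfOpen || cb.failureCount >= cb.maxFailures {
        cb.state = StateOpen
    }
}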

Usage example:

cb := circuitbreaker.New(5, 60*time.Second)

// With fallback to cached data
result, err := cb.Call(
    func() (interface{}, error) {
        return fetchFromAPI()
    },
    func() (interface{}, error) {
        return fetchFromCache(), nil
    },
)

Implementation in Python

from enum import Enum
from datetime import datetime, timedelta
from typing import Callable, Optional, TypeVar, Generic
import threading

T = TypeVar('T')

class State(Enum):
    CLOSED = 1
    OPEN = 2
    HALF_OPEN = 3

class CircuitBreaker(Generic[T]):
    def __init__(
        self, 
        max_failures: int = 5,
        timeout: timedelta = timedelta(seconds=60),
        half_open_max_successes: int = 2
    ):
        self._state = State.CLOSED
        self._failure_count = 0
        self._success_count = 0
        self._last_failure_time: Optional[datetime] = None
        self._max_failures = max_failures
        self._timeout = timeout
        self._half_open_max_successes = half_open_max_successes
        self._lock = threading.RLock()
    
    def call(
        self, 
        fn: Callable[[], T],
        fallback: Optional[Callable[[], T]] = None
    ) -> T:
        with self._lock:
            state = self._state
            
            # Transition from OPEN to HALF_OPEN
            if (state == State.OPEN and 
                self._last_failure_time and
                datetime.now() - self._last_failure_time > self._timeout):
                self._state = State.HALF_OPEN
                self._success_count = 0
                state = State.HALF_OPEN
        
        # Fast fail if circuit is open
        if state == State.OPEN:
            if fallback:
                return fallback()
            raise Exception("Circuit breaker is open")
        
        # Execute the function
        try:
            result = fn()
            with self._lock:
                self._on_success()
            return result
        except Exception:
            with self._lock:
                self._on_failure()
            if fallback:
                return fallback()
            raise
    
    def _on_success(self):
        if self._state == State.HALF_OPEN:
            self._success_count += 1
            if self._success_count >= self._half_open_max_successes:
                self._state = State.CLOSED
                self._failure_count = 0
        else:
            self._failure_count = 0
    
    def _on_failure(self):
        self._failure_count += 1
        self._last_failure_time = datetime.now()
        
        # Any failure while HALF_OPEN reopens the circuit immediately
        if (self._state == State.HALF_OPEN or
                self._failure_count >= self._max_failures):
            self._state = State.OPEN

Usage example:

# Create a circuit breaker instance
user_service_cb = CircuitBreaker(max_failures=3, timeout=timedelta(seconds=30))

def get_user_from_api(user_id: int) -> dict:
    # API call implementation
    pass

def get_user_from_cache(user_id: int) -> dict:
    # Cache fallback implementation
    return {"id": user_id, "name": "Cached User", "stale": True}

# Use with fallback
user = user_service_cb.call(
    lambda: get_user_from_api(123),
    fallback=lambda: get_user_from_cache(123)
)
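
A decorator form can be more convenient when every call to a function should go through the same breaker. The sketch below assumes only that the breaker object exposes `call(fn, fallback=None)` as in the class above. `circuit_protected` and `_DemoBreaker` are hypothetical names introduced for this example; `_DemoBreaker` is an always-closed stand-in so the snippet runs without the full class.

```python
import functools
from typing import Any, Callable, Optional


def circuit_protected(breaker: Any, fallback: Optional[Callable[..., Any]] = None):
    """Route every call to the decorated function through `breaker.call`."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Bind the arguments now so the breaker receives zero-argument callables.
            bound_fallback = (
                (lambda: fallback(*args, **kwargs)) if fallback is not None else None
            )
            return breaker.call(lambda: fn(*args, **kwargs), fallback=bound_fallback)

        return wrapper

    return decorator


class _DemoBreaker:
    """Hypothetical stand-in so the sketch runs by itself; it never opens."""

    def call(self, fn, fallback=None):
        try:
            return fn()
        except Exception:
            if fallback is not None:
                return fallback()
            raise


def cached_user(user_id: int) -> dict:
    return {"id": user_id, "name": "Cached User", "stale": True}


@circuit_protected(_DemoBreaker(), fallback=cached_user)
def get_user(user_id: int) -> dict:
    raise ConnectionError("user service unreachable")  # simulate an outage


user = get_user(123)  # served by cached_user because the call failed
```

The fallback receives the same arguments as the decorated function, so it can look up the matching cache entry.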

Fallback Strategies

1. Cached Response

Return stale data from cache, marked as potentially outdated:

def with_cache_fallback(cache_key: str):
    return lambda: {
        **cache.get(cache_key),
        '_stale': True,
        '_cached_at': cache.get_timestamp(cache_key)
    }

2. Default Value

Return a safe default when the service is unavailable:

func defaultUserFallback() (interface{}, error) {
    return &User{
        ID: 0,
        Name: "Guest",
        Permissions: []string{"read"},
    }, nil
}

3. Degraded Functionality

Reduce functionality but keep core features working:

def degraded_search_fallback(query: str):
    # Use simpler, local search instead of full-featured API
    return simple_local_search(query, max_results=10)

4. Queue for Later

Queue the request for async processing when service recovers:

func queueForLaterFallback(request Request) (interface{}, error) {
    queue.Enqueue(request)
    return &Response{
        Status: "queued",
        Message: "Your request will be processed when service recovers",
    }, nil
}

5. Alternative Service

Route to a backup service or data source:

def alternative_service_fallback():
    # Try secondary API or data source
    return backup_api_client.get_data()
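
These strategies are not mutually exclusive. One possible way to combine them is to try each in order and return the first that succeeds, for example cache first, then a default value. The helper name `chain_fallbacks` and the two sample fallbacks are assumptions made for this sketch.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def chain_fallbacks(fallbacks: Sequence[Callable[[], T]]) -> Callable[[], T]:
    """Build a single fallback that tries each strategy in order."""
    if not fallbacks:
        raise ValueError("at least one fallback is required")

    def combined() -> T:
        last_error: Optional[Exception] = None
        for fallback in fallbacks:
            try:
                return fallback()
            except Exception as exc:  # this strategy failed; try the next one
                last_error = exc
        assert last_error is not None
        raise last_error  # every strategy failed; surface the last error

    return combined


def from_cache() -> dict:
    raise KeyError("cache miss")  # simulate an empty cache


def default_user() -> dict:
    return {"id": 0, "name": "Guest", "permissions": ["read"]}


user_fallback = chain_fallbacks([from_cache, default_user])
guest = user_fallback()  # cache misses, so the default value is returned
```

The combined callable takes no arguments, so it can be passed directly as the `fallback` parameter of the breakers shown earlier.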

React (TypeScript) Implementation for the Frontend

import { useState, useEffect, useCallback } from 'react';

enum CircuitState {
  CLOSED,
  OPEN,
  HALF_OPEN,
}

interface CircuitBreakerConfig {
  maxFailures: number;
  timeout: number;
  halfOpenMaxSuccesses: number;
}

class FrontendCircuitBreaker {
  private state: CircuitState = CircuitState.CLOSED;
  private failureCount = 0;
  private successCount = 0;
  private lastFailureTime: number | null = null;
  private config: CircuitBreakerConfig;

  constructor(config: CircuitBreakerConfig) {
    this.config = config;
  }

  async call<T>(
    fn: () => Promise<T>,
    fallback?: () => T | Promise<T>
  ): Promise<T> {
    // Check if we should transition from OPEN to HALF_OPEN
    if (
      this.state === CircuitState.OPEN &&
      this.lastFailureTime &&
      Date.now() - this.lastFailureTime > this.config.timeout
    ) {
      this.state = CircuitState.HALF_OPEN;
      this.successCount = 0;
    }

    // Fast fail if circuit is open
    if (this.state === CircuitState.OPEN) {
      if (fallback) {
        return await fallback();
      }
      throw new Error('Circuit breaker is open');
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      if (fallback) {
        return await fallback();
      }
      throw error;
    }
  }

  private onSuccess() {
    if (this.state === CircuitState.HALF_OPEN) {
      this.successCount++;
      if (this.successCount >= this.config.halfOpenMaxSuccesses) {
        this.state = CircuitState.CLOSED;
        this.failureCount = 0;
      }
    } else {
      this.failureCount = 0;
    }
  }

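  private onFailure() {
    this.failureCount++;
    this.lastFailureTime = Date.now();

    // Any failure while HALF_OPEN reopens the circuit immediately
    if (
      this.state === CircuitState.HALF_OPEN ||
      this.failureCount >= this.config.maxFailures
    ) {
      this.state = CircuitState.OPEN;
    }
  }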

  getState(): CircuitState {
    return this.state;
  }
}

// React Hook
export function useCircuitBreaker<T>(
  apiFn: () => Promise<T>,
  fallbackFn?: () => T,
  config: CircuitBreakerConfig = {
    maxFailures: 3,
    timeout: 30000,
    halfOpenMaxSuccesses: 2,
  }
) {
  const [cb] = useState(() => new FrontendCircuitBreaker(config));
  const [data, setData] = useState<T | null>(null);
  const [error, setError] = useState<Error | null>(null);
  const [isLoading, setIsLoading] = useState(false);
  const [circuitState, setCircuitState] = useState(CircuitState.CLOSED);

  const execute = useCallback(async () => {
    setIsLoading(true);
    setError(null);

    try {
      const result = await cb.call(apiFn, fallbackFn);
      setData(result);
      setCircuitState(cb.getState());
    } catch (err) {
      setError(err as Error);
      setCircuitState(cb.getState());
    } finally {
      setIsLoading(false);
    }
  }, [apiFn, fallbackFn, cb]);

  return { data, error, isLoading, execute, circuitState };
}

Usage in a React component:

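function UserProfile({ userId }: { userId: number }) {
  // Memoize both callbacks. The hook's execute depends on their identity, so
  // inline arrow functions would recreate it on every render and re-run the effect.
  const fetchUser = useCallback(
    () =>
      fetch(`/api/users/${userId}`).then(r => {
        // fetch resolves on HTTP errors, so reject explicitly to count them as failures
        if (!r.ok) throw new Error(`HTTP ${r.status}`);
        return r.json();
      }),
    [userId]
  );
  const guestFallback = useCallback(
    () => ({ id: userId, name: 'Guest', cached: true }),
    [userId]
  );

  const { data, error, isLoading, execute, circuitState } = useCircuitBreaker(
    fetchUser,
    guestFallback,
    { maxFailures: 3, timeout: 30000, halfOpenMaxSuccesses: 2 }
  );

  useEffect(() => {
    execute();
  }, [userId, execute]);

  if (isLoading) return <div>Loading...</div>;
  if (error) return <div>Error: {error.message}</div>;

  return (
    <div>
      {circuitState === CircuitState.OPEN && (
        <div className="alert">Service temporarily unavailable. Showing fallback data.</div>
      )}
      <div>User: {data?.name}</div>
    </div>
  );
}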

Trade-offs

Advantages

Disadvantages

Best Practices

  1. Service-specific configuration: Tune thresholds based on each service’s SLA
  2. Observability: Emit metrics for circuit state changes and fallback invocations
  3. Graceful fallbacks: Always provide meaningful fallback responses when possible
  4. Timeout integration: Combine with timeouts to prevent hanging on slow services
  5. Testing: Test circuit breaker transitions in integration tests
  6. Documentation: Document fallback behavior for API consumers
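
The observability and testing practices above become easier when the breaker accepts a clock and a state-change callback from outside. The following is a minimal sketch under stated assumptions, not a production implementation: `ObservableBreaker`, its `clock` parameter, and its `on_state_change` hook are hypothetical names for this example. It is not thread-safe, and it closes after a single successful probe to keep the example short.

```python
import time
from typing import Callable, List, Tuple


class ObservableBreaker:
    """Minimal breaker with an injectable clock and a state-change hook."""

    def __init__(
        self,
        max_failures: int,
        timeout_s: float,
        clock: Callable[[], float] = time.monotonic,
        on_state_change: Callable[[str, str], None] = lambda old, new: None,
    ):
        self.state = "closed"
        self._failures = 0
        self._opened_at = 0.0
        self._max_failures = max_failures
        self._timeout_s = timeout_s
        self._clock = clock
        self._on_state_change = on_state_change

    def _set_state(self, new: str) -> None:
        if new != self.state:
            old, self.state = self.state, new
            self._on_state_change(old, new)  # e.g. increment a metrics counter

    def call(self, fn: Callable[[], object], fallback: Callable[[], object]) -> object:
        if self.state == "open":
            if self._clock() - self._opened_at < self._timeout_s:
                return fallback()  # still open: fail fast
            self._set_state("half_open")  # timeout elapsed: allow a probe
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self.state == "half_open" or self._failures >= self._max_failures:
                self._opened_at = self._clock()
                self._set_state("open")
            return fallback()
        self._failures = 0
        self._set_state("closed")
        return result


def failing() -> object:
    raise ConnectionError("down")


now = [0.0]  # fake clock, advanced manually by the test
events: List[Tuple[str, str]] = []
cb = ObservableBreaker(
    max_failures=2,
    timeout_s=30.0,
    clock=lambda: now[0],
    on_state_change=lambda old, new: events.append((old, new)),
)

cb.call(failing, lambda: "fallback")
cb.call(failing, lambda: "fallback")  # second failure trips the breaker
now[0] = 31.0  # move past the timeout without sleeping
recovered = cb.call(lambda: "ok", lambda: "fallback")
```

Because the test advances the fake clock instead of sleeping, all three transitions can be exercised in milliseconds, and the recorded `events` list doubles as a check that each transition would emit a metric.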

Conclusion

The Circuit Breaker pattern with fallback strategies is essential for building resilient distributed systems. By preventing cascading failures and providing graceful degradation, it enables systems to maintain partial functionality during outages rather than complete failure.

For principal engineers, implementing circuit breakers across service boundaries is a key architectural decision that significantly improves system reliability and user experience during failure scenarios.