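Functions-as-a-Service platforms (AWS Lambda, Google Cloud Run functions, Azure Functions) scale out automatically to very high concurrency, but that elasticity shifts risk onto downstream systems, billing, portability, and the provider itself.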

Key Questions

  • Downstream Readiness: Can your databases, queues, and third-party APIs absorb bursty traffic without throttling?
  • Cost Predictability: Have you modeled worst-case concurrency? A sudden spike can multiply your bill in minutes.
  • Portability: How much do platform-specific features (event payloads, IAM models) complicate a multi-cloud or on-prem fallback strategy?
  • Platform Bugs: Outages happen. Do your incident runbooks define clear escalation paths to cloud support and temporary mitigation steps (e.g., circuit breakers, fallback services)?

Recommended Actions

  • Implement rate limiting and back-pressure to protect downstream systems.
  • Tag all functions with cost allocation labels and configure budget alerts.
  • Abstract business logic into portable modules so only thin adapters depend on provider-specific runtimes.
  • Run chaos tests that simulate throttling, cold starts, and provider-side failures.
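
To make the rate-limiting and back-pressure point concrete, here is a minimal sketch of a token-bucket limiter placed in front of a downstream call. The names TokenBucket and call_downstream, and the rate and capacity values in the usage note, are illustrative assumptions and not part of any provider SDK. The limiter is process-local and not thread-safe, so it only bounds traffic from a single function instance; capping total traffic across many concurrent instances would also need a platform concurrency limit or a shared counter.

```python
import time
from typing import Callable


class TokenBucket:
    """Admit about `rate` calls per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if rate <= 0 or capacity <= 0:
            raise ValueError("rate and capacity must be positive")
        self.rate = rate
        self.capacity = capacity
        self._clock = clock              # injectable so tests can control time
        self._tokens = float(capacity)   # start full so an initial burst is allowed
        self._last = clock()

    def try_acquire(self) -> bool:
        """Consume one token and return True if available, otherwise return False."""
        now = self._clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


def call_downstream(bucket: TokenBucket, request: str) -> str:
    """Hypothetical handler: shed load instead of forwarding once the budget is spent."""
    if not bucket.try_acquire():
        # Back-pressure signal: the caller can retry later or enqueue the request.
        return "rejected"
    return f"forwarded:{request}"
```

With TokenBucket(rate=2.0, capacity=2), two back-to-back requests are forwarded and a third is rejected until roughly half a second has passed. Returning an explicit rejection, instead of blocking, keeps the function's billed duration short and lets the caller decide whether to retry or queue.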