SLA vs SLO vs SLI: The Alphabet Soup of Reliability Explained

Stop confusing these acronyms. A clear, practical guide to defining Service Level Indicators, Objectives, and Agreements for your team.

Jesus Paz

If you walk into a DevOps interview, you will be asked the difference between SLI, SLO, and SLA. Here is the definitive guide, with no fluff.

The Pyramid of Reliability

Think of them as layers.

  • SLI (Indicator): The raw number.
  • SLO (Objective): The internal goal for that number.
  • SLA (Agreement): The contract with your customers, including the penalty you pay if you miss the promised level.

1. SLI (Service Level Indicator)

“What are we measuring?” An SLI is a concrete metric that tells you how the service is performing right now.

  • Good SLI: “The percentage of HTTP requests to /api/checkout that return 200 OK within 500ms.”
  • Bad SLI: “How happy users are” (Too vague).

Formula: (Good Events / Total Events) * 100
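
As a rough illustration of the formula, the sketch below counts "good" events for the checkout SLI described above. The sample request data and the helper name `sli_percent` are hypothetical; a real system would pull these counts from its metrics store.

```python
def sli_percent(good_events: int, total_events: int) -> float:
    """Return the SLI as a percentage: (good / total) * 100."""
    if total_events == 0:
        # Assumption: with no traffic we report 100% rather than divide by zero.
        return 100.0
    return 100 * good_events / total_events


# Hypothetical sample of requests to /api/checkout: (HTTP status, latency in ms).
requests = [(200, 120), (200, 480), (200, 730), (500, 95), (200, 310)]

# A request is "good" if it returned 200 OK within 500ms.
good = sum(
    1 for status, latency_ms in requests
    if status == 200 and latency_ms <= 500
)

print(sli_percent(good, len(requests)))  # 60.0
```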

2. SLO (Service Level Objective)

“What is our target?” This is an internal goal set by the Engineering and Product teams. It balances reliability with velocity. If you set an SLO of 100%, you can never deploy updates (because deployments carry risk).

  • Target: “99.9% of requests should be successful.”
  • Error Budget: This means 0.1% of requests can fail. This is your “Budget” to break things while experimenting.
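
To make the error budget concrete, here is a minimal sketch that turns an SLO target into a number of requests allowed to fail. The function name, the request volume, and the failure count are made up for illustration; `Decimal` is used only to avoid floating-point rounding on values like 99.9.

```python
from decimal import Decimal


def error_budget_requests(slo_percent: str, total_requests: int) -> int:
    """Number of requests that may fail in the window without missing the SLO."""
    allowed_failure_ratio = (Decimal(100) - Decimal(slo_percent)) / Decimal(100)
    return int(allowed_failure_ratio * total_requests)


# Hypothetical window: 1,000,000 requests against a 99.9% SLO.
budget = error_budget_requests("99.9", 1_000_000)
failed_so_far = 400  # made-up figure for illustration

print(budget)                  # 1000
print(budget - failed_so_far)  # 600 failures still "affordable"
```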

Alerting: You should alert on your SLO, specifically on how quickly the error budget is being consumed: “We are burning our Error Budget too fast!”
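
One common way to express "burning too fast" is a burn rate: the observed error ratio divided by the error ratio the SLO allows. A burn rate of 1 means the budget runs out exactly at the end of the window; higher values mean it runs out sooner. The sketch below assumes that definition, and the alert threshold of 10 is an arbitrary illustrative value, not a recommendation from the article.

```python
def burn_rate(failed: int, total: int, slo_percent: float) -> float:
    """Observed error ratio divided by the error ratio the SLO allows."""
    if slo_percent >= 100:
        # A 100% SLO leaves no error budget, so a burn rate is undefined.
        raise ValueError("SLO must be below 100% to have an error budget")
    if total == 0:
        return 0.0
    observed_error_ratio = failed / total
    allowed_error_ratio = (100 - slo_percent) / 100
    return observed_error_ratio / allowed_error_ratio


def should_alert(failed: int, total: int, slo_percent: float,
                 threshold: float = 10.0) -> bool:
    """Alert when the budget burns at least `threshold` times too fast.

    The default threshold is a hypothetical value chosen for this example.
    """
    return burn_rate(failed, total, slo_percent) >= threshold


# Hypothetical counts from the last hour, against a 99.9% SLO.
print(should_alert(failed=50, total=10_000, slo_percent=99.9))   # False (about 5x)
print(should_alert(failed=200, total=10_000, slo_percent=99.9))  # True (about 20x)
```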

3. SLA (Service Level Agreement)

“What happens if we fail?” This is a contract between the Business and the Customer. Lawyers write this. It usually involves money.

  • Contract: “If availability drops below 99.5%, we will refund 10% of your monthly subscription.”

Key Insight: Your SLO should always be stricter than your SLA. If SLA is 99.5% and SLO is 99.5%, you have no buffer.

  • Set SLA at 99.5%.
  • Set SLO at 99.9%. This gives you a safety margin to fix issues before you owe customers money.
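
To see how large that safety margin is, the sketch below converts both targets into allowed downtime. It assumes availability is measured as uptime over a 30-day month, which the article does not specify; the function name is hypothetical.

```python
from decimal import Decimal

MINUTES_PER_30_DAYS = 30 * 24 * 60  # 43,200 minutes; assumes a 30-day month


def allowed_downtime_minutes(target_percent: str) -> Decimal:
    """Minutes of downtime per 30-day month permitted by an availability target."""
    allowed_ratio = (Decimal(100) - Decimal(target_percent)) / Decimal(100)
    return allowed_ratio * MINUTES_PER_30_DAYS


sla_minutes = allowed_downtime_minutes("99.5")  # 216 minutes
slo_minutes = allowed_downtime_minutes("99.9")  # 43.2 minutes

# The gap is the time you can overshoot the SLO before the SLA is breached.
buffer_minutes = sla_minutes - slo_minutes      # 172.8 minutes

print(f"SLA allows {sla_minutes:.1f} min, SLO allows {slo_minutes:.1f} min, "
      f"buffer {buffer_minutes:.1f} min")
```

Under these assumptions, missing the 99.9% SLO still leaves nearly three hours of downtime in the month before the 99.5% SLA refund applies.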

Summary Table

Term | Stands For | Audience          | Example
SLI  | Indicator  | Engineers         | “Latency is 120ms”
SLO  | Objective  | Product Mgr       | “Target < 150ms 99% of time”
SLA  | Agreement  | Lawyers/Customers | “If > 200ms, we pay you”

Cluster Uptime helps you track your SLIs (via latency graphs) so you can hit your SLOs and avoid breaking your SLAs.

Jesus Paz

Founder
