Why Are Distributed Systems So Hard?

USENIX via YouTube

Overview

This course explains why distributed systems are hard to operate. It covers the history of distributed computing, debunks myths about the CAP theorem, and shows why network partitions are inevitable. It also introduces popular consensus algorithms and discusses how to design systems for adaptability by taking human factors into account. The material is delivered as lectures on topics such as shared-nothing architecture, unreliable message delivery, and consensus algorithms. It is intended for anyone who wants to understand the intricacies of distributed systems and how to mitigate the associated risks.

Syllabus

Introduction
Agenda
Storytime
Data Evolution
Scaling
Cloud Computing
Why Scale Horizontally
What Does It Mean To Run A Distributed System
A Note On Distributed Computing
Summary
Shared Nothing Architecture
Unreliable Message Delivery
Why Are We Fenced Off
Building Observability
What We Can Know
The CAP Theorem
C
Replication Lag
Consistency is a Spectrum
Availability is Not Binary
Partition Tolerance
Hardware
Hardware Failure
Cables
Sharks
Kevlar
Network Partitions
Resource Isolation
Process Suspension
Network Glitch
People Do Bad Things
Why Does This Matter
Practical Reality
The Correctness Result
Mitigation Strategies
Consensus Algorithms
The Woods Theorem
Building Mental Models
Incident Analysis
Blameless Discussions
Mental Models
Human Failure
Alert Fatigue
User Mindsets
Designing Systems for Humans
HugOps

Taught by

USENIX
