I think it is interesting how the evolution of computing has led to many layers of storage and caching. As a developer you have to keep a mental model of all these layers. As an example, below are the approximate latencies for an instance (server) using the different services offered by Amazon Web Services.

  1. CPU register: 0.3ns
  2. L1 cache: 1ns
  3. L2 cache: 3ns
  4. L3 cache: 10ns
  5. Main memory: 30ns
  6. Local SSD: 0.1 ms
  7. Local hard drive: 10ms
  8. Network drive (EBS): 20ms
  9. Object store (S3): 100ms
  10. Tape store (Glacier): 1 hour
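One way to get a feel for these numbers is to rescale them so a register access takes one "human" second. A minimal sketch using the latencies from the list above (the dictionary and scaling factor are just illustrative):

```python
# Latencies from the list above, all converted to nanoseconds.
latencies_ns = {
    "CPU register": 0.3,
    "L1 cache": 1,
    "L2 cache": 3,
    "L3 cache": 10,
    "Main memory": 30,
    "Local SSD": 0.1e6,              # 0.1 ms
    "Local hard drive": 10e6,        # 10 ms
    "Network drive (EBS)": 20e6,     # 20 ms
    "Object store (S3)": 100e6,      # 100 ms
    "Tape store (Glacier)": 3600e9,  # 1 hour
}

# Rescale so one register access (0.3 ns) becomes one second.
scale = 1 / 0.3
for name, ns in latencies_ns.items():
    print(f"{name}: {ns * scale:,.0f} s")
```

On this scale main memory takes well over a minute and an S3 read takes about a decade, which makes it obvious why each layer caches the one below it.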

Luckily we don’t have to think about all of the above when we are developing an application. The first five layers (from the CPU registers up to main memory) can usually just be thought of as memory. And most developers don’t use every service mentioned above; for example, an Amazon instance with a local SSD doesn’t have a local hard drive.

On the other hand, there are many caching layers on top of the ones above that application developers do have to think about:

  • Database caching (Redis and Memcached)
  • Page caching (Apache / Varnish)
  • Separate data centers and regions
  • Content Delivery Networks
  • Browser caching
  • JavaScript execution (asynchronous and deferred)
  • Geographic latency (speed of light)
  • Routing latency (internet backbone)
  • Connection latency (mobile connections)
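The database caching layer at the top of that list usually follows the cache-aside pattern: check the cache, fall through to the database on a miss, then populate the cache. A minimal sketch, where a plain dict with expiry timestamps stands in for Redis/Memcached and `query_database` is a hypothetical slow database call:

```python
import time

cache = {}          # stand-in for Redis/Memcached
TTL_SECONDS = 60    # how long a cached value stays fresh

def query_database(key):
    # Hypothetical slow database round trip.
    time.sleep(0.01)
    return f"value-for-{key}"

def get(key):
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.monotonic() < expires_at:
            return value                  # cache hit: served from memory
    value = query_database(key)           # cache miss: go to the database
    cache[key] = (value, time.monotonic() + TTL_SECONDS)
    return value

get("user:42")   # first call misses and hits the database
get("user:42")   # second call is served from the cache
```

The TTL is the usual knob here: it trades staleness against how often the slower layer underneath gets hit, which is the same trade-off every layer in this post is making.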

And there are even more complex techniques like HTTP streaming, precompiling assets, asynchronous code, client-side frameworks and WebSockets. In conclusion, there are a lot of Matryoshka dolls to consider when building an application.
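The point of asynchronous code in this context is that it hides latency by overlapping waits instead of adding them up. A small sketch with `asyncio`, where `asyncio.sleep` stands in for a network round trip:

```python
import asyncio
import time

async def fetch(name):
    # Stand-in for a 0.1 s network call (e.g. an API or database request).
    await asyncio.sleep(0.1)
    return name

async def main():
    start = time.monotonic()
    # Three independent calls run concurrently; sequentially they would
    # take ~0.3 s, but with gather the waits overlap into ~0.1 s.
    results = await asyncio.gather(fetch("a"), fetch("b"), fetch("c"))
    return results, time.monotonic() - start

results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")
```

The same idea shows up at every layer: prefetching into CPU caches, pipelined disk reads, and parallel HTTP requests in the browser are all ways of overlapping waits.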

Update: Found this awesome time-slider overview of latencies.
