I think it is interesting how the evolution of computing has led to many layers of storage and caching. As a developer you have to keep a mental model of all these layers. As an example, below are the approximate latencies for an instance (server) using the different services offered by Amazon Web Services.

  1. CPU register: 0.3ns
  2. L1 cache: 1ns
  3. L2 cache: 3ns
  4. L3 cache: 10ns
  5. Main memory: 30ns
  6. Local SSD: 0.1 ms
  7. Local hard drive: 10ms
  8. Network drive (EBS): 20ms
  9. Object store (S3): 100ms
  10. Tape store (Glacier): 1 hour
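One way to get a feel for these numbers is to rescale them so a register access takes one "human" second. A minimal sketch using the latencies from the list above (the dictionary and scaling factor are just illustrative):

```python
# Latencies from the list above, all converted to nanoseconds.
latencies_ns = {
    "CPU register": 0.3,
    "L1 cache": 1,
    "L2 cache": 3,
    "L3 cache": 10,
    "Main memory": 30,
    "Local SSD": 0.1e6,              # 0.1 ms
    "Local hard drive": 10e6,        # 10 ms
    "Network drive (EBS)": 20e6,     # 20 ms
    "Object store (S3)": 100e6,      # 100 ms
    "Tape store (Glacier)": 3600e9,  # 1 hour
}

# Rescale so one register access (0.3 ns) becomes one second.
scale = 1 / 0.3
for name, ns in latencies_ns.items():
    print(f"{name}: {ns * scale:,.0f} s")
```

On this scale main memory takes well over a minute and an S3 read takes about a decade, which makes it obvious why each layer caches the one below it.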

Luckily we don’t have to think about all of the above when we are developing an application. The first five layers (from the CPU registers up to main memory) can usually just be thought of as memory. And most developers don’t use every service mentioned above; for example, an Amazon instance with a local SSD doesn’t have a local hard drive.

On the other hand, there are many caching layers on top of the ones above that application developers do have to think about:

  • Database caching (Redis and Memcached)
  • Page caching (Apache / Varnish)
  • Separate data centers and regions
  • Content Delivery Networks
  • Browser caching
  • JavaScript execution (asynchronous and deferred)
  • Geographic latency (speed of light)
  • Routing latency (internet backbone)
  • Connection latency (mobile connections)
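The database caching layer at the top of that list usually follows the cache-aside pattern: check the cache, fall through to the database on a miss, then populate the cache. A minimal sketch, where a plain dict with expiry timestamps stands in for Redis/Memcached and `query_database` is a hypothetical slow database call:

```python
import time

cache = {}          # stand-in for Redis/Memcached
TTL_SECONDS = 60    # how long a cached value stays fresh

def query_database(key):
    # Hypothetical slow database round trip.
    time.sleep(0.01)
    return f"value-for-{key}"

def get(key):
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.monotonic() < expires_at:
            return value                  # cache hit: served from memory
    value = query_database(key)           # cache miss: go to the database
    cache[key] = (value, time.monotonic() + TTL_SECONDS)
    return value

get("user:42")   # first call misses and hits the database
get("user:42")   # second call is served from the cache
```

The TTL is the usual knob here: it trades staleness against how often the slower layer underneath gets hit, which is the same trade-off every layer in this post is making.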

And there are even more complex techniques like HTTP streaming, precompiling assets, asynchronous code, client-side frameworks and WebSockets. In conclusion, there are a lot of Matryoshka dolls to consider when building an application.
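The point of asynchronous code in this context is that it hides latency by overlapping waits instead of adding them up. A small sketch with `asyncio`, where `asyncio.sleep` stands in for a network round trip:

```python
import asyncio
import time

async def fetch(name):
    # Stand-in for a 0.1 s network call (e.g. an API or database request).
    await asyncio.sleep(0.1)
    return name

async def main():
    start = time.monotonic()
    # Three independent calls run concurrently; sequentially they would
    # take ~0.3 s, but with gather the waits overlap into ~0.1 s.
    results = await asyncio.gather(fetch("a"), fetch("b"), fetch("c"))
    return results, time.monotonic() - start

results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")
```

The same idea shows up at every layer: prefetching into CPU caches, pipelined disk reads, and parallel HTTP requests in the browser are all ways of overlapping waits.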

Update: Found this awesome time-slider overview of latencies.
