Computing Scaling Laws

Latency, defined as the delay between transmitting a signal at one end and receiving it at the other, is bounded by two fundamental factors: the physical propagation of signals and the processing overhead at each stage.

Figure 6: Source: Why is everyone in such a rush?, Latency Numbers Every Programmer Should Know

Electrical signals in copper travel at roughly 60–70% of the speed of light, while optical signals in fiber travel at roughly 65% of the speed of light. At transatlantic scales (about 5,600 km between New York and London), a round trip takes at least ~37 ms at the vacuum speed of light and ~57 ms at fiber speed, before any processing occurs.
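The propagation floor is simple arithmetic, and a minimal sketch makes it easy to try other distances. The function name `propagation_delay_ms` and the New York to London distance are assumptions introduced for illustration; the 0.65 velocity factor for fiber is the figure quoted in the text.

```python
C_KM_PER_S = 299_792.458  # speed of light in vacuum


def propagation_delay_ms(distance_km: float, velocity_factor: float = 1.0) -> float:
    """One-way propagation delay in milliseconds.

    velocity_factor is the fraction of c at which the signal travels:
    1.0 for vacuum, roughly 0.65 for fiber (the figure used in the text).
    """
    if not 0 < velocity_factor <= 1:
        raise ValueError("velocity_factor must be in (0, 1]")
    return distance_km / (C_KM_PER_S * velocity_factor) * 1000


# Assumed great-circle distance, New York to London (illustrative only;
# real cable routes are longer).
TRANSATLANTIC_KM = 5_570

one_way_vacuum = propagation_delay_ms(TRANSATLANTIC_KM)
one_way_fiber = propagation_delay_ms(TRANSATLANTIC_KM, velocity_factor=0.65)

print(f"vacuum: {one_way_vacuum:.1f} ms one way, {2 * one_way_vacuum:.1f} ms round trip")
print(f"fiber:  {one_way_fiber:.1f} ms one way, {2 * one_way_fiber:.1f} ms round trip")
```

No amount of faster hardware at either end lowers these numbers; only a shorter path or a faster medium does.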

Within a processor, latency is determined by circuit depth and physical distance. Register access is measured in fractions of a nanosecond. Cache latency grows with distance from the core: L1 sits closest to the core and is fastest, while L3 is shared across cores and slower. RAM introduces additional delay from DRAM row activation and bus arbitration.

Moving on to storage, magnetic HDDs are dominated by seek time, the time a physical read head takes to move into position. SSDs eliminate mechanical movement but introduce controller processing and NAND flash charge sensing. NVMe reduces bus overhead by connecting directly to the CPU via PCIe.
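To get a feel for why mechanical movement dominates, here is a rough sketch of a single random HDD access. It adds rotational latency (the wait for the platter to turn to the right sector) as a second mechanical term, which the text does not discuss. The drive parameters and the SSD figure are assumed, illustrative values, and `hdd_access_ms` is a hypothetical helper.

```python
def hdd_access_ms(avg_seek_ms: float, rpm: int, transfer_ms: float = 0.0) -> float:
    """Approximate average time for one random HDD access, in milliseconds."""
    # On average the platter must turn half a revolution before the
    # target sector passes under the head.
    rotational_ms = 0.5 * 60_000 / rpm
    return avg_seek_ms + rotational_ms + transfer_ms


# Assumed, illustrative parameters (not taken from the text or a datasheet).
hdd_ms = hdd_access_ms(avg_seek_ms=9.0, rpm=7200)
ssd_ms = 0.1  # assumed ~100 microseconds for a flash read

print(f"HDD random access: ~{hdd_ms:.1f} ms")
print(f"SSD random access: ~{ssd_ms:.1f} ms ({hdd_ms / ssd_ms:.0f}x faster under these assumptions)")
```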

Finally, on distant networks, each hop introduces propagation delay plus processing at routers, switches and protocol stacks. Wireless links add radio encoding and scheduling overhead. Satellite (GEO) latency is almost entirely propagation: a signal must travel ~72,000 km up to geostationary orbit and back down, about 240 ms at the speed of light.
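The "propagation plus processing per hop" model can be sketched as a sum over hops. The `Hop` class, the 1 ms transponder overhead, and the choice to treat radio as travelling at the vacuum speed of light are all assumptions made for illustration. The GEO altitude of 35,786 km is the standard figure, and the path assumes the satellite is directly overhead.

```python
from dataclasses import dataclass

C_KM_PER_S = 299_792.458  # speed of light in vacuum


@dataclass
class Hop:
    name: str
    distance_km: float
    velocity_factor: float  # fraction of c on this link
    processing_ms: float    # router/switch/protocol overhead (assumed values)

    @property
    def propagation_ms(self) -> float:
        return self.distance_km / (C_KM_PER_S * self.velocity_factor) * 1000

    @property
    def total_ms(self) -> float:
        return self.propagation_ms + self.processing_ms


def path_latency_ms(hops: list[Hop]) -> float:
    """One-way latency of a path: the sum of every hop's delay."""
    return sum(h.total_ms for h in hops)


GEO_ALTITUDE_KM = 35_786

# Ground station -> satellite -> ground station, satellite directly overhead.
geo_path = [
    Hop("uplink", GEO_ALTITUDE_KM, 1.0, processing_ms=0.0),
    Hop("transponder", 0.0, 1.0, processing_ms=1.0),  # hypothetical overhead
    Hop("downlink", GEO_ALTITUDE_KM, 1.0, processing_ms=0.0),
]

total = path_latency_ms(geo_path)
propagation = sum(h.propagation_ms for h in geo_path)
print(f"GEO one-way: {total:.0f} ms, of which {propagation / total:.1%} is propagation")
```

A request and its response each traverse this path, so the propagation component of a full request/response exchange is about double the one-way figure.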

We can see that latency spans roughly nine to ten orders of magnitude, from sub-nanosecond register access to hundreds of milliseconds for satellite communication, so optimising a system can often begin by identifying which communication layer is the bottleneck.
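As a rough illustration of both points, the sketch below computes the span in orders of magnitude and picks out the dominant layer in a latency breakdown. Every number here is an assumed, order-of-magnitude placeholder rather than a measurement, and the exact span depends on which endpoints are chosen. The helper names `orders_of_magnitude` and `bottleneck` are hypothetical.

```python
import math

# Illustrative order-of-magnitude latencies in seconds (assumed, not measured).
LATENCY_S = {
    "register": 0.3e-9,
    "L1 cache": 1e-9,
    "RAM": 100e-9,
    "NVMe SSD": 100e-6,
    "HDD seek": 10e-3,
    "transatlantic RTT": 60e-3,
    "GEO satellite RTT": 500e-3,
}


def orders_of_magnitude(fast_s: float, slow_s: float) -> float:
    """How many powers of ten separate two latencies."""
    return math.log10(slow_s / fast_s)


def bottleneck(breakdown: dict[str, float]) -> tuple[str, float]:
    """Return the slowest layer and its share of the total latency."""
    total = sum(breakdown.values())
    layer = max(breakdown, key=breakdown.get)
    return layer, breakdown[layer] / total


span = orders_of_magnitude(min(LATENCY_S.values()), max(LATENCY_S.values()))
print(f"span: {span:.1f} orders of magnitude")

# Hypothetical breakdown of one request, in seconds.
request = {
    "CPU work": 50e-6,
    "RAM": 5e-6,
    "SSD read": 200e-6,
    "network RTT": 60e-3,
}
layer, share = bottleneck(request)
print(f"bottleneck: {layer} ({share:.1%} of total)")
```

With a breakdown like this, shaving microseconds off the CPU work would be invisible next to the network round trip, which is why locating the dominant layer comes first.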