Shifts in the performance balance impact users as their applications hit different bottlenecks. To squeeze out maximal performance, an application must balance its demand for the various components. Sweeping optimization of an application for a specific computer system is uneconomical for most software, however. It reduces portability between systems, while, with rapidly developing hardware, software porting is the most promising strategy for achieving long-term high performance. Instead, developers rely on a software layer between applications and physical resources, the operating system (OS), to abstract those resources to maintain portability, and to do so at minimal cost. The OS must balance two goals: extracting maximal performance on a wide range of computer systems and requiring minimal to no involvement from application software. In the worst case, legacy operating system design can exacerbate performance bottlenecks and artificially reduce application throughput by orders of magnitude [HHL+97]. To avoid such situations, operating system design itself needs to adapt in concert with the underlying hardware.
This appendix examines the effects of ongoing computer architecture developments on the prominent class of streaming I/O applications. We plot the recent history of hardware evolution and extrapolate to the near future, based on industry predictions, to understand why scaling of application performance with Moore's Law is increasingly unlikely under the present system abstractions.
Effortless application scaling with Moore's Law has effectively ended as a result of four closely related developments in computer architecture. More and more applications are hitting what has become known as the ``Memory Wall'' [WM95]. At the same time, sequential applications see performance growth stall as processor development moves from clock frequency scaling to on-chip parallelism. Mass parallelism is beginning to spark core heterogeneity [ABC+06a,LBKH07] and, as a result, processor diversity.