Automation

The complexity and diversity of computer systems render optimization tedious, if not outright daunting. Traditional system configuration is essentially human-centric: documentation is offline and verbose, and software interfaces must be discovered manually. Such irregular system structure forces administration, including optimization, to be performed by hand. As system complexity and diversity grow, this path becomes increasingly costly, leading to excessive maintenance cost [BDKZ93,SPL03] and suboptimal resource utilization.

Even though hand-crafted programs invariably beat automatically optimized competitors, automation is the preferred route to containing the cost of unavoidable and increasing administration complexity. In complex optimization tasks, such as VLSI logic synthesis or aerodynamic structural design, automation has long been commonplace. Its use in online decision support is far less common. Manual optimization is a logical choice for applications that have large development budgets and no need to be portable, such as high-performance scientific and business-critical applications that run on dedicated 'big-iron' machines, but the task is too difficult and time-consuming to ask of all developers for all applications [Pat07]. For portable applications, automatic optimization offers a more cost-effective and robust alternative. While it cannot extract the same level of performance from a specific machine, it will readily outperform poorly adapted manual code across a range of systems.

A related objection revolves around trust: self-optimizing code is sometimes considered less deterministic than hand-written applications and therefore intrinsically less trustworthy. This position is incorrect. The stricter constraints that self-optimization imposes on software structure make the system more understandable: strict logic enables formal code verification on at least one level of operation (the component architecture), an assurance that exceeds the trustworthiness of manual coding, which has been shown to have an error rate of around 3 faults per thousand lines of code [Hat97,DTB04].

To offload tedious decision making to support logic, software has to be amenable to introspection by this logic. The principal solution is to restrict system operation to conform to a simple model. A large body of work has accumulated on the architectural requirements of self-managed systems [BCDW04,KM07,OGT+99], much of which advocates a component-based approach [KM07]. System information is not trivial to convey, and a semantic gap between model and reality will produce solutions of little practical use. The challenge thus becomes to select a model with enough practical utility. In system administration today, automation is mostly limited to scripting of frequently used command combinations. In the case of (Unix) pipelines, utility is not in question. Their ubiquity is a good indication of the demand for configuration automation. It is no coincidence that the most common form of automation is the Unix shell script. Their simple semantics make automation straightforward and robust; automation of less structured systems will be much harder. If needed, techniques such as information gap theory (Appendix B.2.1, Uncertainty) can establish trust in the optimization outcome even when considerable uncertainty exists in individual input parameters.
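
The idea of restricting a system to a simple model can be illustrated with a minimal sketch. It assumes a hypothetical Component type whose stages declare the kind of data they consume and produce; the type labels and stage names are illustrative only and are not taken from the system described here. Under that assumption, checking that a pipeline is well-formed reduces to comparing adjacent declarations:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Component:
    """A pipeline stage that declares what it consumes and produces.

    Hypothetical model: the string labels stand in for data types.
    """
    name: str
    consumes: str
    produces: str


def check_pipeline(stages: List[Component]) -> List[str]:
    """Return the mismatches between adjacent stages (empty if none)."""
    errors = []
    for upstream, downstream in zip(stages, stages[1:]):
        if upstream.produces != downstream.consumes:
            errors.append(
                f"{upstream.name} produces {upstream.produces!r} "
                f"but {downstream.name} consumes {downstream.consumes!r}"
            )
    return errors


# Illustrative stages, loosely modeled on a Unix pipeline.
read_file = Component("read", consumes="path", produces="lines")
filter_lines = Component("filter", consumes="lines", produces="lines")
count_lines = Component("count", consumes="lines", produces="number")

valid = check_pipeline([read_file, filter_lines, count_lines])
invalid = check_pipeline([read_file, count_lines, filter_lines])
```

Here `valid` is an empty list, while `invalid` holds a single mismatch between the count and filter stages. Because every stage exposes the same small interface, support logic can inspect and rearrange a configuration without knowing what any stage does internally, which is the property that makes pipelines comparatively easy to automate.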



willem 2010-02-03