Less Is More, But Too Little Ain't Enough

Disclaimer: This is an editorial (or if you prefer, fan-fiction) piece that discusses the development of a number of important technologies and some of the individuals who participated in them. Many of these eminent colleagues are still active. The history I relate is based on my own recollections and anecdotes I have heard, so it may differ from your own knowledge or experience. I apologize in advance for errors or misstatements, and I am very open to correction and perhaps adding alternative accounts of this history. If I've got it wrong, please help me to understand your point of view.

From an early point in the development of computer systems, modularity has been an important tool for managing complexity. Layering is a particular form of modularity, in which certain "lower layer" modules are used to manage fundamental resources and implement general tools, and "higher layer" applications are expressed using them. This leads to a tension: should the lower layers be "fat", providing complex tools which allow the implementation of applications to be simple? Or should lower layers be "thin", providing simple tools which allow greater flexibility in the implementation of applications?

Once a lower layer has been defined, there is an incentive to make its implementation more complex while maintaining as much backward compatibility as possible. This allows the implementation of applications to be largely preserved while obtaining the benefits of new implementation strategies.

On the other hand, keeping the lower layer simple and pushing complexity into the implementation of higher layers allows for greater longevity of low level implementations. It also avoids problems caused by the incorporation into lower layers of complexity which does not serve all applications well, or which becomes obsolete.

In the 1970s and 1980s there were a number of very influential designs that incorporated the latter strategy, keeping lower layers simple and requiring more complexity in higher layers. These included the C language, the Unix kernel, Reduced Instruction Set Computing (RISC), the Internet Protocol Suite and Redundant Array of Inexpensive Disks (RAID). In each case there were serious questions raised about whether keeping the lower layer simple would impose costs on higher layers that would make cost/performance prohibitive. There were also questions about whether low level simplicity might prove incapable of equaling the reliability and ease of use of more complex designs. In each case, the less-is-more approach had to be justified and then defended against the introduction of more complex features over time.

Ken Thompson did not have to justify the minimalist design of the Unix kernel to anyone - he sat down and wrote 10,000 lines of code. Once Unix was widely adopted within Bell Labs, he did, however, have to defend his design against extensions to the kernel requested by application developers. The anecdote I heard was that when requests were made to the Unix Development Group for such extensions, it was Ken's job to say "no". He had to be convinced that there was an actual lack of expressiveness before an extension would be considered. He did not see it as the operating system's job to make the work of application developers easier. But he could not defend Unix against extensions introduced by the Berkeley Software Distribution (BSD) group.

John Hennessy and David Patterson used extensive analysis to predict the performance of RISC processors, and were able to start a company to build one. The principles of RISC design have since been appropriated in the form of simple microengines that support the implementation of more complex instruction set architectures. The effectiveness of this strategy has led to the wide use of RISC principles, while RISC instruction set architectures themselves have survived but not dominated.

The less-is-more strategy became a defining aspect of the Internet Protocol Suite when the hop-by-hop techniques adopted by Vinton Cerf and Bob Kahn in the early ARPANET proved unable to provide end-to-end reliability. Applying a less-is-more strategy to the delivery of datagrams in the wide area eliminated the requirement of high reliability from the datagram delivery layer of communication (IP), pushing it up to the transport layer (TCP). This decision was not an obvious one to Cerf, who had to be convinced to adopt it in a meeting of ARPANET architects at Marina Del Rey (year?). The meeting is recounted in this discussion between Cerf and Reed: https://www.nethistory.info/Archives/tcpiptalk.html

Although it was not easy, it was found that many other functions, including flow and congestion control, could be similarly pushed up to higher layers. The simplicity that this allowed in datagram delivery was particularly important because of the difficulty of controlling the characteristics of a decentralized global infrastructure distributed across physical and administrative barriers.
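
To make the division of labor concrete, here is a minimal, purely illustrative Python sketch of the idea (not real networking code; the channel class and function names are my own invention): the lower layer is free to drop packets, and the endpoints recover by numbering messages, acknowledging what arrives and retransmitting until acknowledged.

import random

# A toy best-effort "datagram layer": it may silently drop what it is given.
# This stands in for IP; nothing here is real networking code.
class LossyChannel:
    def __init__(self, loss_rate=0.3, seed=42):
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)
        self.queue = []

    def send(self, packet):
        if self.rng.random() >= self.loss_rate:   # best effort: sometimes lost
            self.queue.append(packet)

    def receive(self):
        return self.queue.pop(0) if self.queue else None

# End-to-end reliability lives at the endpoints: a stop-and-wait sender that
# numbers each message and retransmits until the receiver acknowledges it.
def reliable_transfer(messages, data_channel, ack_channel, max_tries=50):
    delivered = []
    expected_seq = 0                              # receiver-side state
    for seq, payload in enumerate(messages):      # sender-side loop
        acked = False
        for _ in range(max_tries):
            data_channel.send((seq, payload))     # (re)transmit the current message

            # Receiver endpoint: deliver new data in order, ack whatever arrives.
            packet = data_channel.receive()
            while packet is not None:
                pseq, body = packet
                if pseq == expected_seq:
                    delivered.append(body)
                    expected_seq += 1
                ack_channel.send(pseq)            # duplicates are re-acked
                packet = data_channel.receive()

            # Sender endpoint: advance once this message's ack comes back.
            ack = ack_channel.receive()
            while ack is not None:
                if ack == seq:
                    acked = True
                ack = ack_channel.receive()
            if acked:
                break
        if not acked:
            raise RuntimeError(f"gave up on message {seq}")
    return delivered

if __name__ == "__main__":
    data, acks = LossyChannel(), LossyChannel()
    print(reliable_transfer(["less", "is", "more"], data, acks))

Everything that makes the transfer reliable lives at the two ends; the lossy channel in the middle stays as simple as possible, which is the essence of pushing a function up a layer.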

Using the less-is-more strategy, the Internet began to grow explosively and was adapted to a wide variety of implementation environments. Those who had adopted similar strategies did not tend to write a lot about the intuition behind their technical methods. Many of them were practitioners or highly skilled engineers, and either wrote few articles or expressed their ideas in narrow technical terms rather than broad generalizations. But not David Reed.

Then an MIT graduate student, Reed explained the less-is-more principle in a way that highlighted the common impulse behind many disparate design efforts. Examples could be found in the work of his professor, Jerry Saltzer, reaching back to the dawn of operating systems, and in their collaborative work with researcher David Clark on the Internet. Together they wrote a paper, "End-to-End Arguments in System Design", which sought to bring these disparate ideas together in a unifying framework.

https://web.mit.edu/Saltzer/www/publications/endtoend/endtoend.pdf

The impact of this paper was huge. It is one of the most cited papers in applied computer science. For decades, end-to-end arguments have been used, and are still used, to defend the simplicity of the Internet's datagram delivery service. When development groups come to the Internet Architecture Board to ask for extensions, the End-to-End Arguments say "no".

As powerful as these arguments have been in the effort to preserve simplicity, the paper itself has proven difficult to interpret. It makes explicit some concepts that may have been implicit within the system design community, but it also makes assumptions about the structure of wide area systems and uses terms that it does not define. While Reed has always insisted that all that is necessary to obtain correct design guidance is to read the paper, interpretations have varied widely.

To muddy the waters further, the central principle of the paper is itself qualified by broad exceptions to the strict application of the less-is-more concept (performance and security), and these exceptions have been added to over the decades in articles published by the authors. In addition, some dire predictions which claimed to be based on End-to-End Arguments, such as that Network Address Translation would spell the end of Internet scalability, have not been borne out in practice.

Criticism of the End-to-End Arguments paper has been used to dismiss the less-is-more approach to network design. Some see the paper as a post-hoc attempt to appropriate a core element of a widely used design concept, give it a name and claim ownership of it. Some point to the use of end-to-end arguments to quash new directions in Internet architecture as evidence of self-interest in those who apply them. It is still a kind of orthodoxy among some network architects, but it is increasingly ignored in industry and by those seeking new directions.

My point of view is that the End-to-End Arguments touch on a fundamental reality of layered system architecture, and should not be ignored. However, there are aspects of the original paper that point to ways in which the less-is-more principle might be rejuvenated and made more relevant to current design choices.

The first is the reconsideration of this assumption, made in one of the first paragraphs of the paper:

"In a system that includes communications, one usually draws a modular boundary around the communication subsystem and defines a firm interface between it and the rest of the system."

This may be an accurate statement of usual practice, but it does not explain why such a boundary should be drawn. If one questions the role of pure communication as the common layer to which storage and processing are connected, then the issue of what "less-is-more" means in the common layer can be revisited.

Historically, concatenating local links through datagram forwarding to create end-to-end wide area paths was an expedient and powerful step. This approach leaves almost no room for policy or alternative services that make use of storage or processing. To many, such "promiscuous forwarding" may seem like the only feasible approach to creating a global information and communication environment. It has also led to a common digital infrastructure that is unsafe and almost impervious to fundamental change.

An alternative is to build more general but still constrained local services which are then combined (federated) on the basis of policy. This approach is more general and has the potential to be safer and more capable of serving the public good. Services such as access to distributed data or the remote invocation of operations, implemented more locally, need not be clients of a global communication infrastructure. They can instead be used as tools to build globally distributed services through federation. Local control is a two-edged sword: it can be used to impose oppressive rules and may result in global anarchy, but it can also be used to ensure the self-determination of independent communities.
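
As a purely illustrative sketch (the domains, topics and policies below are invented, and this is not a proposal for any real protocol), the contrast between promiscuous forwarding and policy-based federation can be put in a few lines of Python: a request crosses a community boundary only when that community's own policy allows it.

class Domain:
    def __init__(self, name, serves, allow_peer):
        self.name = name
        self.serves = set(serves)        # topics this community answers locally
        self.allow_peer = allow_peer     # local policy: which peers to ask, and for what
        self.peers = []

    def federate_with(self, peer):
        self.peers.append(peer)

    def resolve(self, topic, hops=0):
        if topic in self.serves:
            return f"{topic} served by {self.name}"
        if hops >= 4:                    # crude scope limit
            return None
        for peer in self.peers:
            # Federation: cross the boundary only when local policy permits,
            # unlike a promiscuous forwarder that relays everything everywhere.
            if self.allow_peer(peer.name, topic):
                answer = peer.resolve(topic, hops + 1)
                if answer is not None:
                    return answer
        return None

if __name__ == "__main__":
    library = Domain("library", {"archives"}, allow_peer=lambda p, t: True)
    atlas = Domain("atlas", {"maps"}, allow_peer=lambda p, t: True)
    school = Domain("school", {"courses"}, allow_peer=lambda p, t: t != "archives")
    school.federate_with(library)
    school.federate_with(atlas)
    print(school.resolve("courses"))     # answered locally
    print(school.resolve("maps"))        # answered by a peer the school's policy permits
    print(school.resolve("archives"))    # None: the school's own policy declines to federate

In the last line the school's policy refuses to reach the library's archives even though a peer could serve them; whether such a refusal is oppressive or self-determining is exactly the two-edged sword described above.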

The second is the question of what "less" and "more" really mean. The paper talks about the "placement of a function" within the layers and its "complete and correct" implementation. Different design examples have different notions of what constitutes a "function", and there is rarely a clear notion of when an implementation of an unreliable function is "complete and correct". Ken Thompson was particularly insistent on "minimality" and "orthogonality" in his design of the Unix kernel. At times the Internet Protocol has been approvingly termed "dumb". In RISC, one important metric was the amount of hardware devoted to control and datapaths versus cache memory. In this piece I have used the terms "simple" and "complex", but they are really placeholders for a number of different possible ways of understanding what it means to "keep it simple".

I have written a paper based on a formal model of layering which characterizes layers according to their "logical strength". This notion is derived from the idea that any module has, in principle, a specification that can be expressed in some form of logic (although a distributed system may require a complex form of "modal" logic to describe in full). The use of logical strength derives from the fact that any specification makes guarantees which can be expressed as logical statements. Some such statements are stronger than others, meaning that they make more guarantees.

Using this characterization, I was able to prove a simple result: a weaker module may support fewer applications, but it can itself be supported by more underlying implementations. This result, which I named "The Hourglass Theorem", provides a specific guideline for the design of a common module: it should be as weak as possible while still being able to support the class of applications that are deemed necessary by the designer.

[Figure: The Hourglass Theorem. Artist: Terry Moore.]

"On The Hourglass Model", Micah Beck, Communications of the ACM, July 2019, Vol. 62 No. 7, Pages 48-57.

Interpreting "less" as "weaker" provides a possible way to design a module that can have many implementations without necessarily being limited to communication. If we consider weak forms of storage (e.g. leases, limited size and best-effort) and computation (e.g. limited CPU cycles, memory and best-effort) then we may be able define a common layer that is weak but is still able to to support client services other than point-to-point communication.

The question of what would be gained if a way of converging data movement, storage and processing could be universally and easily deployed is answered, in part, by considering how adopting a communication-only network has shaped our current information and communication environment. In a recent paper we argue that the exclusion of distributed storage and processing from the services provided to clients by the Internet Protocol Suite has contributed directly to the rise of Content Delivery Networks and the Distributed Cloud as expensive, private alternatives which are defined and controlled by hypergiant corporations instead of the broader networking community.

"How We Ruined The Internet", Micah Beck and Terry Moore, June 2023.

In that sense, the success of the Internet as a unicast communication service may be due in part to taking logical weakness (not providing storage or processing services) so far that it cannot meet the requirements of the application community except through augmentation by private infrastructure. Too little ain't enough.



