What belongs in an Operating System?

Recently I got addicted to the YouTube videos of interviews and talks by the Bell Labs titans, especially on the history of UNIX. They give me such a warm and fuzzy feeling! It's fascinating to learn how and why they made their choices, and you learn a lot. One of the great talks on the subject I recently ran into was by Rob Pike: https://www.youtube.com/watch?v=_2NI6t2r_Hs. I've been watching his videos on Go a lot, but this is the first time I heard him talk about both the history of UNIX and his own. Remarkable video, deeply touching.

I was surprised to learn that he started on an IBM/360 writing in PL/C, a subset of PL/I (https://en.wikipedia.org/wiki/PL/C). PL/I is where I started, with punch cards, on a Soviet knock-off of the IBM/360 known in the US as the Ryad line (https://en.wikipedia.org/wiki/ES_EVM). I still have a soft spot for the IBM 360/370 and what followed: MVS/XA and MVS/ESA, the last major version of this magnificent technology that I worked with, some 30 years ago. To this day I consider this line the best software product ever engineered and built, and I have read everything by Fred Brooks I could lay my hands on.

Just like Rob, I entered the UNIX world after the mainframes, only a bit later: in the mid-1990s, during the migration from mainframes to the client-server craze. Specifically, Solaris and HP-UX. Only, unlike Rob, I wasn't excited at all. I was dismayed. Compared to IBM's MVS, UNIX felt utterly deficient. The first thing I could not believe was missing was indexed files. Yes, UNIX files are just dumb byte sequences. Hierarchical organization is nice, but how do I do something like this? In case it's not obvious, the code reads the record with the key value 'foo' from an indexed file into a struct.

DCL INPUT FILE RECORD DIRECT ENV(REGIONAL(3))KEYED, 
INREC CHAR(160) DEF INRECX,
1 INRECX UNALIGNED,
	2 PADA CHAR(23),
	2 PKDEC FIXED DECIMAL (11),
	2 PADB CHAR(21),
	2 FLTNUM FLOAT(16),
	2 PADC CHAR(13),
	2 BINNUM FIXED BINARY(16),
	2 PADD CHAR(26),
	2 CHTER CHAR(8),
	2 PADE CHAR(51);

READ FILE(INPUT) INTO(INRECX) KEY('foo');        
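
For contrast, here is a minimal Go sketch of what that single READ turns into when the OS only hands you a byte stream. Everything in it is an assumption made for illustration: the 8-byte key at the start of each record, the in-memory index, and all the names (indexedFile, buildIndex, makeRecord and so on) are hypothetical and come neither from PL/I nor from any UNIX API. A real ISAM implementation would persist the index and handle updates.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"strings"
)

const (
	recordLen = 160 // fixed record size, mirroring INREC CHAR(160)
	keyLen    = 8   // assumption: the first 8 bytes of a record hold its key
)

var errNotFound = errors.New("key not found")

// indexedFile pairs a file of fixed-length records with a key index.
type indexedFile struct {
	data  io.ReaderAt
	index map[string]int64 // key -> byte offset of the record
}

// buildIndex scans the records once and remembers the offset of each key.
func buildIndex(data io.ReaderAt, size int64) (*indexedFile, error) {
	f := &indexedFile{data: data, index: make(map[string]int64)}
	key := make([]byte, keyLen)
	for off := int64(0); off+recordLen <= size; off += recordLen {
		if _, err := data.ReadAt(key, off); err != nil {
			return nil, err
		}
		f.index[strings.TrimRight(string(key), " ")] = off
	}
	return f, nil
}

// read returns the whole record stored under key.
func (f *indexedFile) read(key string) ([]byte, error) {
	off, ok := f.index[key]
	if !ok {
		return nil, errNotFound
	}
	rec := make([]byte, recordLen)
	if _, err := f.data.ReadAt(rec, off); err != nil {
		return nil, err
	}
	return rec, nil
}

// makeRecord pads key and payload with blanks into one fixed-length record.
func makeRecord(key, payload string) []byte {
	return []byte(fmt.Sprintf("%-*s%-*s", keyLen, key, recordLen-keyLen, payload))
}

// openBytes indexes records held in memory; a real program would open a file.
func openBytes(raw []byte) (*indexedFile, error) {
	return buildIndex(bytes.NewReader(raw), int64(len(raw)))
}

// payloadOf strips the key and the trailing blanks from a record.
func payloadOf(rec []byte) string {
	return strings.TrimRight(string(rec[keyLen:]), " ")
}

func main() {
	raw := append(makeRecord("bar", "first"), makeRecord("foo", "second")...)
	f, err := openBytes(raw)
	if err != nil {
		panic(err)
	}
	rec, err := f.read("foo")
	if err != nil {
		panic(err)
	}
	fmt.Println(payloadOf(rec))
}
```

That is some sixty lines of library code to stand in for one declaration and one statement, and it still has no persistent index.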

I took this for granted in my favorite PL/I (still my favorite language; it tops C and Go for me, both of which I also love). The answer was "use an RDBMS". Really? Why do I need an RDBMS if I don't need any R? Besides, in those days there was no free MySQL or Postgres yet (they only appeared in the second half of the 1990s, and who wants to use immature DBs in production?), while the vendor DB licenses cost an arm and a leg. Here is Oracle's pricing from 2005, after the prices had dropped:

"For example, a 4-way, dual core processor server which previously had a list license fee of $320,000 (4 * 2 [cores] * $40,000) would now have a list license fee of $240,000 (0.75 * 8 [cores] * $40,000)." https://www.dba-oracle.com/t_licensing_pricing.htm

Or, from a 2001 interview with Larry Ellison:

"We want to make a comparison between Oracle and IBM as easy as possible. It's simple. This is our price, $40,000 per processor. They're priced at $20,000 per processor, but look at all of the things that [DB2] doesn't include." https://www.computerworld.com/article/2582598/q-a--ellison-responds-to-users--ibm-on-database-pricing.html

Even today:

Sybase SQL Anywhere Advanced Edition, 1 Chip license, MSRP:$21,999.00 https://texas.gs.shidirect.com/product/29713096/Sybase-SQL-Anywhere-Advanced-Edition

This was a shock and a complete paradigm shift. I had participated in many projects that relied heavily on indexed flat files. I could not imagine that a server OS (I was already familiar with MS-DOS) would lack this feature, which I considered absolutely essential. Sure, there were products on the market that delivered this functionality. Informix's C-ISAM comes to mind (IBM bought Informix in 2001, I think). But I expected a serious OS to offer it out of the box.

Here is what I expected from an operating system, coming from the IBM MVS environment:

  • compilers for multiple popular languages
  • feature-rich file system with indexed access
  • multitasking/multiprocessing/concurrency. Oh, IBM/360 PL/I had concurrency implemented in the language itself. Did you hear that, Go? It looked like this (the example is from https://teampli.net/mirrors/robinv/enterp.htm). The PL/I (F) manual I found that covers this is dated 1968, and it's the fourth edition.

TASKER:PROC OPTIONS(MAIN REENTRANT);
        DCL TASKS( 2 )          EVENT;
        DCL TASK_READY(2)       EVENT;
        DCL TASK_COMPLETE(2)    EVENT;
        DCL THE_TASKS(2)        ENTRY           INIT(TASK1,TASK2);
        DCL NUMBER_OF_TASKS     FIXED BIN(15)   INIT(2);
        DCL TASK_INDEX          FIXED BIN(15);
        DCL (STATUS,COMPLETION) BUILTIN;

        DO TASK_INDEX = 1 TO NUMBER_OF_TASKS;
                STATUS( TASK_READY(TASK_INDEX ) ) = 0;
                CALL THE_TASKS( TASK_INDEX ) EVENT( TASKS( TASK_INDEX ) );
                COMPLETION( TASK_READY (TASK_INDEX ) ) = '1'B;
        END;

        WAIT( TASK_COMPLETE(1), TASK_COMPLETE(2) );

        DO TASK_INDEX = 1 TO NUMBER_OF_TASKS;
                COMPLETION( TASK_COMPLETE (TASK_INDEX ) ) = '0'B;
                STATUS( TASK_READY( TASK_INDEX ) ) = 4;
                COMPLETION( TASK_READY( TASK_INDEX ) ) = '1'B;
                WAIT( TASKS( TASK_INDEX ) );
        END;
END;        
Oh, I could go on and on about PL/I. How about polymorphism?

DCL E GENERIC
       (E1 WHEN (*), E2 WHEN (*,*));
DCL E1 ENTRY (FIXED) EXTERNAL;
DCL E2 ENTRY (FIXED, FLOAT) EXTERNAL;

  • Robust scripting languages and tools to create UI. The mainframe had CLIST, REXX and TSO/ISPF with panels. Rob took a jab at TSO in his video: he said it took him 20 minutes to log in. I had no such issues, but I was working with it some 15 years later.
  • Networking. IBM had SNA. But UNIX ushered in TCP/IP, which gave birth to the Internet we have today. So, UNIX wins in the long run.
  • Obviously memory management, CPU scheduling, drivers, etc. The stuff the UNIX kernel does.
  • Security and user management. I think I will stop here.
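
To be fair to Go on the concurrency item, here is a rough counterpart of the PL/I tasking example: start two tasks and wait for both to complete. It is only a sketch under my own assumptions. runTasks, task1 and task2 are hypothetical names, and a sync.WaitGroup stands in for the PL/I EVENT variables, which it does not model one to one (there is no STATUS or READY handshake here).

```go
package main

import (
	"fmt"
	"sync"
)

// runTasks starts every task in its own goroutine, blocks until all of
// them have finished, and returns their results in task order.
func runTasks(tasks []func() string) []string {
	results := make([]string, len(tasks))
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		go func(i int, task func() string) {
			defer wg.Done()
			results[i] = task() // each goroutine writes only its own slot
		}(i, task)
	}
	wg.Wait() // plays the role of WAIT( TASK_COMPLETE(1), TASK_COMPLETE(2) )
	return results
}

// task1 and task2 are placeholders for real work.
func task1() string { return "task1 done" }
func task2() string { return "task2 done" }

func main() {
	for _, r := range runTasks([]func() string{task1, task2}) {
		fmt.Println(r)
	}
}
```

The Go version is shorter, but the idea of attaching a task and waiting on its completion event is the same one the 1968 manual describes.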

So, I read a bit on the history and philosophy of UNIX. I learned that Ken Thompson wrote the initial UNIX distro within 3 weeks, while his wife was away. Gosh, I thought, I wish she had stayed home :-) So, the initial UNIX distro with an assembler, an editor, a file system, and a handful of commands is an operating system? It did have a task scheduler: Brian Kernighan recalled that the original PDP-7 had 2 terminals connected to it. But... it's like calling a bicycle a car! Not even a single compiler! Still, he and Bell Labs were clearly onto something, since it caught on like wildfire. I guess that was mostly driven by the low cost and, once it included the C compiler, the relative ease of extension. As Brian Kernighan recalled in another video on UNIX history, up until 1973 it was exclusively assembler. The physical size likely played a role too: you cannot put a mainframe on a nuclear sub.

I guess the years of working on the Multics OS made the pendulum swing, and the guys went for extreme minimalism. Hence the name UNIX: uni instead of multi. Speaking of Multics: it was written in PL/I (https://www.multicians.org/pl1-raf.html). I had the lucky opportunity to work for quite a few years with a descendant of Multics, the Stratus VOS operating system (https://en.wikipedia.org/wiki/Stratus_VOS). Now, this is what I would call an adult operating system. Not only does it have indexed files, it also implements physical persistent queues at the OS level. A subset of PL/I was also the language of choice in VOS, unless you did telecom, which was traditionally done in C. Another difference from IBM PL/I was that many features IBM PL/I provided through the language itself became system calls in VOS PL/I. For example, writing a message to a synchronous bidirectional queue and getting a response would look like this:

call s$call_server( port_id,
                             msg_priority,
                             msg_subject,
                             msg_length_in,
                             msg,
                             reply_length_in,
                             reply_length_out,
                             reply,
                             error_code);        
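
The same request/reply pattern can be sketched in Go with channels. This is an in-memory illustration only and not the VOS API: the request type and the serve, callServer and startServer functions are hypothetical names, and nothing here is persistent, which is exactly the part VOS gave you at the OS level.

```go
package main

import "fmt"

// request carries a message plus a private channel for the reply.
type request struct {
	subject string
	msg     string
	reply   chan string
}

// serve answers each request on its own reply channel until the queue is closed.
func serve(queue <-chan request) {
	for req := range queue {
		req.reply <- "ack:" + req.subject + ":" + req.msg
	}
}

// callServer sends a message and blocks until the server has replied,
// loosely mirroring a synchronous s$call_server call.
func callServer(queue chan<- request, subject, msg string) string {
	reply := make(chan string, 1)
	queue <- request{subject: subject, msg: msg, reply: reply}
	return <-reply
}

// startServer creates a queue and runs a server goroutine on it.
func startServer() chan request {
	queue := make(chan request)
	go serve(queue)
	return queue
}

func main() {
	queue := startServer()
	defer close(queue)
	fmt.Println(callServer(queue, "orders", "hello"))
}
```

If the process dies, every message in flight dies with it. A persistent queue survives that, which is why I wanted it from the OS.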

So, should an OS be minimalistic or full fledged and feature rich?

Jonathan Blow, whose rants I enjoy immensely and with whom I agree most of the time, believes that Linux is too huge and that a good OS should contain a fraction of the stuff it does (https://www.youtube.com/watch?v=k0uE_chSnV8). Hm, the drivers in the user space? I don't see how that can work with, say, NIC cards.

Cloud computing and containerization change the paradigm. Journey back to mainframes.

Suddenly many of the services I expected from a decent OS are available again, but they live outside of the container, the box and its OS. There are queues: AWS SNS/SQS, MSK (Amazon's hosted Kafka) or Amazon MQ (hosted ActiveMQ or RabbitMQ). RDS with MySQL or Postgres is quite an affordable replacement for ISAM, in addition to providing the RDBMS functionality. S3 covers flat files, and so on. It's the container orchestrator, Nomad or Kubernetes (which the clouds also provide as a hosted, out-of-the-box solution), that becomes your OS. You don't really care that much anymore about what Linux provides to you and how it is structured. You specify the resources for your container in a Kubernetes manifest, almost the same way you used to do it in IBM/360 JCL. From an IBM manual published in 1971:

//N JOB  REGION=90K,TIME=(4,30),MSGLEVEL=(2,0)

The TIME Parameter
TIME=(minutes,seconds) | 1440
minutes specifies the maximum number of minutes the job can use the CPU. The number of minutes must be less than 1440 (24 hours).

The REGION Parameter
REGION=valueK
value specifies the number of contiguous 1024-byte areas of main storage to be allocated to each job step. The number can range from one to five digits but may not exceed 16383.

Your k8s persistent volumes and volume claims are beginning to resemble this:

//STEP1 EXEC PGM=CONVERT
//INPUT1 DD DSNAME=A.B.C, DISP=OLD
//INPUT2 DD DSNAME=FILE, DISP=OLD, UNIT=2400, VOLUME=SER=54333
//BUF DD UNIT=2400, SEP=(INPUT1,INPUT2)
//OUTPUT DD DSNAME=ALPHA, UNIT=TAPE, DISP=(,KEEP), AFF=BUF

That 'SEP=(INPUT1,INPUT2)' in the BUF DD statement above tells the system to use an I/O channel separate from those used by the INPUT1 and INPUT2 datasets. Yes, IBM/360 had I/O channels: basically full-fledged computers with their own assembly language.


So, what does the future look like? I expect the abstractions for most cloud services to mature and expose stable, well-known interfaces. That would bring about libraries and packages with well-understood semantics that make writing to a mounted volume and to an S3 bucket look identical. And who knows, maybe eventually the Go people, smart as they are, will build these semantics directly into the language itself, just like they did with concurrency. So that your code will look as beautiful as

DCL INPUT FILE RECORD DIRECT ENV(REGIONAL(3)) KEYED;

READ FILE(INPUT) INTO(INRECX) KEY('foo');        
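
Until a language grows such semantics, a library interface gets part of the way there. Below is a minimal Go sketch under stated assumptions: Store is a hypothetical interface of my own, not part of Go or of any cloud SDK; dirStore writes to a directory such as a mounted volume; and memStore merely stands in for an object store like an S3 bucket, where a real implementation would call the provider's SDK.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// Store is a hypothetical keyed-storage interface shared by all backends.
type Store interface {
	Put(key string, data []byte) error
	Get(key string) ([]byte, error)
}

// dirStore keeps each key as a file under root, e.g. a mounted volume.
type dirStore struct{ root string }

func (d dirStore) Put(key string, data []byte) error {
	return os.WriteFile(filepath.Join(d.root, key), data, 0o600)
}

func (d dirStore) Get(key string) ([]byte, error) {
	return os.ReadFile(filepath.Join(d.root, key))
}

// memStore is an in-memory stand-in for an object store such as S3.
type memStore struct{ objects map[string][]byte }

func (m memStore) Put(key string, data []byte) error {
	m.objects[key] = append([]byte(nil), data...) // store a private copy
	return nil
}

func (m memStore) Get(key string) ([]byte, error) {
	data, ok := m.objects[key]
	if !ok {
		return nil, errors.New("no such key: " + key)
	}
	return data, nil
}

// newTempDirStore creates a dirStore in a fresh temporary directory and
// returns a cleanup function that removes it again.
func newTempDirStore() (dirStore, func(), error) {
	dir, err := os.MkdirTemp("", "store")
	if err != nil {
		return dirStore{}, nil, err
	}
	return dirStore{root: dir}, func() { os.RemoveAll(dir) }, nil
}

// roundTrip writes a value and reads it back; it never learns which
// backend it is talking to.
func roundTrip(s Store, key, value string) (string, error) {
	if err := s.Put(key, []byte(value)); err != nil {
		return "", err
	}
	data, err := s.Get(key)
	return string(data), err
}

func main() {
	vol, cleanup, err := newTempDirStore()
	if err != nil {
		panic(err)
	}
	defer cleanup()
	for _, s := range []Store{vol, memStore{objects: map[string][]byte{}}} {
		got, err := roundTrip(s, "foo", "bar")
		fmt.Println(got, err)
	}
}
```

The calling code in roundTrip is identical for both backends, which is the property I would like to see standardized.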

So, the whole cloud, with all its services abstracted behind well-known and well-understood interfaces and supported by your language of choice, either via language constructs or via well-defined and widely known libraries/packages, becomes your operating system. Why should your concurrency be confined to a single box? Those beautiful goroutines could run on other boxes, why not?

The future is bright. It looks and feels like an IBM Mainframe! And Linux can be as minimal as possible. It should be in this scenario.
