
The Hypocrisy and Hollow Promises of RISC

Paul McKneely

President, technoventure, inc.

Published: May 6, 2019

A heated debate (really a fight for market share) called CISC vs. RISC started back in the 1990s. The debate continues to this day in some circles. The two acronyms are generalized categories that try to distinguish between computer architectures based on the philosophies of their designers (and promoters). What it was really about was that newcomers wanted part (if not all) of the market share enjoyed by the few large semiconductor manufacturers like Intel and Motorola. I would not really care so much about the debate if the claims made by RISC manufacturers about their products were not full of holes and contradictions. As in any philosophical war such as Democracy vs. Communism, the proponents of each side often carefully choose established words to imply things that are not true. This is the essence of propaganda. Though both sides of the aforementioned political debate have used this tool a great deal to promote their respective causes, Communism depends more on disgruntled people believing their lies while Democracy depends more on people discovering the truth about how the real world works and accepting that there is a degree of cruelty in a free open market.

Before I get started, let me say that CISC stands for Complex Instruction Set Computer while RISC stands for Reduced Instruction Set Computer. The promoters of RISC brought a list of grievances to the public's attention akin to the one nailed on the chapel door of the Roman Catholic Church in Wittenberg, Germany in 1517 by Martin Luther. Though the CISC vs. RISC debate has also been called a religious one, the Internet was used instead of a physical door frequently visited by the ruling incumbents. Among the list of claims brought forth by the promoters of RISC, the only one that I think is valid is that RISC architectures are easier and less costly to engineer (though usually more complicated to program) than traditional CISC architectures, so they can be brought to market more quickly. And let's face it: the RISC contenders for market share had less money, and “time means money” as they say. So rephrasing the RISC promoters' argument in terms they would never use might sound something like this: “RISC is a quick-and-dirty technology to get to market quickly and cheaply...and you get what you pay for.”

Don't get me wrong. I think that RISC is a very good special-purpose technology, and it is really not a new idea. The original 8-bit processors such as the MOS 6502 and the Motorola 6809 literally have reduced instruction sets. The problem I have with the RISC agenda is that its promoters threaten to replace my general-purpose computer with what they claim to be a very fast special-purpose computer that is difficult to program and only suitable for a narrow set of applications. Without using profanity, what I have to say about that is (as hillbillies are known to say) “Thems be fightin' words!”

The complexity of an instruction set alone does not represent the full feature set that makes up the RISC promoters' manifesto. But let's first look at that claim. Let me say that I have a (bad?) habit of looking deeply into the instruction set of any processor architecture before I decide to spend much time using it. That means that I not only look at the assembly language but also look deep into the machine language (i.e. the binary instruction encoding). I've looked at ARM and its instruction set is not simple. For those of you who don't know, ARM is the RISC architecture that makes up the heart of probably 90% of cell phones. ARM became popular (in my opinion) because the company did not make chips. It just licensed its designs to those who do. This made room for many companies to “jump on the bandwagon” and get market share for themselves. ARM is only one kind of RISC processor. There are others such as Alpha, SPARC, PowerPC, Itanium, MIPS etc. But we have arrived at 2019 and the lowly x86 (now in the metamorphic form known as AMD64) still dominates the desktop PC market. The latest contest is the Intel vs. AMD race and it seems that the only stronghold for ARM is in cell phones. Why is this?

While “RISC vs. CISC” implies that any particular processor is either at one end of the spectrum or the other, the reality is that there is a spectrum and any one architecture could lie anywhere along the line between one end and the other. We are not arguing about individual architectures here. We are here to evaluate the philosophical claims made by the proponents of RISC. So let's go on and look at more of their claims one at a time. We have already established that ARM and others like it do not have reduced instruction sets. So far, that gives RISC a score of 0 out of 1. Why that is should become more apparent as we go on.

The next claim we will look at is that a processor with a large number of registers can run faster than one that has few registers. This idea is true on its own (i.e. taken out of context) because values in registers are closer to the core than those in memory. They are even closer than those in cache. Access times increase as the data needed by instructions becomes farther from the core. Also, registers require much shorter addresses than do memory variables and this also has the benefit of increased execution speed.
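The point about shorter addresses can be put in numbers. The Python sketch below only counts how many bits an instruction must spend to name one storage location out of a given total. The function name and the register counts are illustrative assumptions, not figures taken from any particular processor manual.

```python
def specifier_bits(num_locations: int) -> int:
    """Bits needed to name one of num_locations distinct storage locations."""
    if num_locations < 1:
        raise ValueError("need at least one location")
    # (n - 1).bit_length() is an exact integer form of ceil(log2(n)).
    return (num_locations - 1).bit_length()


# Hypothetical register files of 8, 16 and 32 registers.
bits_8_regs = specifier_bits(8)
bits_16_regs = specifier_bits(16)
bits_32_regs = specifier_bits(32)

# A byte-addressed memory operand in a flat 32-bit address space.
bits_memory = specifier_bits(2 ** 32)
```

Naming a register costs a handful of bits while a full memory address costs an entire word, which is the encoding advantage described above. As the next paragraph argues, that advantage belongs to any design with a large register file, whatever label it carries.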

Where the RISC people are caught in a lie by implication is that there is nothing inherent in CISC that gives it a small register set. Indeed, this feature of any computer architecture is totally up to its designer and really has nothing to do with the CISC vs. RISC debate. The only reason why the RISC proponents even added this item to their list of grievances is that, at the time they entered the stage, the most popular architectures had evolved from their 8-bit ancestors and inherited a limited number of registers. So the RISC side loses this argument as well and their score now becomes 0 out of 2.

Next on our agenda is the claim that Orthogonal Instruction Sets are easier to write compilers for than are Non-orthogonal Instruction Sets. This is a true statement when taken alone, but it adds nothing to the CISC vs. RISC debate. What Orthogonality means is that every instruction can use all of the general-purpose registers equally and no general-purpose operation is tied to any particular register. A counterexample is that the shift and loop counts in the x86 must use the CX register. Here again we have a disgruntled newcomer who claims that the current popular computer architecture is the only way that CISC systems can be designed. As before, the orthogonality of any computer is up to its designer and is not controlled by whether it is a RISC architecture or a CISC one.

Or maybe not in this case. Let's dig a little deeper. One item on the RISC manifesto nailed to the chapel door is that “Every instruction should be fixed in width”. By their nature, general-purpose processors need instructions that have highly variable needs (which are tied to complexity). This variable complexity implies a natural variability in instruction size. Because the instruction width is fixed in RISC computers, the orthogonality of the forms that instructions can take is severely hampered. The ARM instruction set uses a fixed 32-bit width (unless operating in its reduced Thumb mode, in which case instructions have a fixed 16-bit width). By contrast, AMD64 (the 64-bit incarnation of the x86) has no such constraint, and its instructions can be anywhere from 1 to 15 bytes long as needed. So this means that RISC instruction sets can't be as orthogonal as a well-designed CISC instruction set. This gives RISC a score of 0 out of 3 claims.
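One way to see the squeeze that a fixed width imposes is to budget the bits of a single instruction word. The sketch below is Python, and the format it describes (an 8-bit opcode, two register fields, 16 registers) is a made-up example for illustration, not the encoding of ARM or any other real architecture.

```python
def immediate_bits(instr_width: int, opcode_bits: int,
                   reg_fields: int, reg_count: int) -> int:
    """Bits left over for a constant after the opcode and register fields."""
    reg_bits = (reg_count - 1).bit_length()  # bits to name one register
    remaining = instr_width - opcode_bits - reg_fields * reg_bits
    if remaining < 0:
        raise ValueError("fields do not fit in the instruction word")
    return remaining


# Hypothetical fixed 32-bit format: 8-bit opcode, two register fields,
# 16 registers. Whatever remains is all the room a constant can have.
room = immediate_bits(32, 8, 2, 16)

# A full 32-bit constant cannot ride along in this format.
fits_full_word = room >= 32
```

With these assumed field sizes only 16 bits remain, so loading an arbitrary 32-bit constant takes more than one instruction or a separate load from memory. A variable-length encoding can simply append the extra bytes to the one instruction that needs them.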

So let's talk about caches. RISC proponents concede that large caches are needed in order for RISC processors to achieve high performance. But wait, a need is not a benefit. Indeed, CISC processors can benefit from large caches too. However, there is a grain of truth to the concept that, if the part of the chip die that implements logic is smaller, there is room for more cache on the same die. How much cache it takes to make up (in speed) for the loss of an increment of logic is not a simple question to answer. But we do know that (theoretically), if your processor is smarter, instructions can be smaller because it can do more with less when the fixed instruction size constraint is removed.

One thing is clear: the simpler RISC logic block is not able to deal with complex instructions, so there is a lot more code wastage when using a fixed instruction width. Because of this, Code Density for CISC processors is better than that of RISC processors. Not only are single RISC instructions less efficient in their code densities (averages as bad as 2 to 1), one RISC instruction can't do the work of a well-designed CISC instruction. So you might need two or three RISC instructions to complete an action that might only require a single CISC instruction. This bloats an already bloated instruction stream so that RISC typically needs 2 to 6 times as much code thrown at it as that needed by a well-designed CISC processor to accomplish the same task.
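The 2-to-6 range follows from multiplying the two effects just described. The Python sketch below takes the article's own figures at face value (a per-instruction density penalty of about 2 to 1, and one to three RISC instructions per CISC instruction). The function and variable names are illustrative, and the figures are the estimates stated above, not measurements.

```python
def code_expansion(bytes_ratio: float, instruction_ratio: float) -> float:
    """Total code-size ratio from two independent factors.

    bytes_ratio:       bytes per RISC instruction / bytes per CISC instruction
    instruction_ratio: RISC instructions needed per CISC instruction
    """
    if bytes_ratio <= 0 or instruction_ratio <= 0:
        raise ValueError("ratios must be positive")
    return bytes_ratio * instruction_ratio


# Density penalty of 2 to 1, with one to three instructions per action.
low = code_expansion(2.0, 1.0)   # one RISC instruction does the job
high = code_expansion(2.0, 3.0)  # three RISC instructions are needed
```

The two factors compound rather than add, which is why a modest per-instruction penalty becomes a large difference in the size of the instruction stream.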

“But speed is everything!” the RISC proponents chant. Let's recall that the CISC vs. RISC debate got going during the 1990s. What has happened since then? Well, I'll tell you. It is the phenomenon of Multi-core and Parallel Processing. As the semiconductor industry started to reach the end of Moore's Law (more like Moore's Trend), processor manufacturers started to add additional cores to their architectures. This is when the RISC processor's need for more instruction bandwidth grew a head with fangs, reached around and bit its promoters in the rear. When the instruction stream only needed to feed one processor, the promoters of RISC believed that their need for more bandwidth was manageable. But when it needed to feed multiple cores, it made their goals unreachable. I can hear the RISC makers saying, “Holy cow! Did anyone see this coming?” If any of the RISC people did see it coming, they kept quiet about it because it was going to be very damaging to their cause.

As I said before, CISC processors can have large caches as well. And when seeing how large some of the primary, secondary and tertiary caches are on most of the high-end AMD64 processors, it doesn't take much to understand that a lot more smarts could be put into a CISC by giving up just a little of that cache. Two to four CISC processors can be kept busy on an instruction stream that feeds only one RISC processor. The CISC promoters don't have the market cornered on cache memory. So what all this means is that a special-purpose RISC core can run faster than a general-purpose CISC core given a special-purpose application. But it also means that a whole slew of CISC cores will run circles around a single RISC core given the same general-purpose application and a memory system with a fixed instruction bandwidth. Add a slew of RISC processors and they will just asphyxiate on the lack of bandwidth. So we will give RISC a score of 0 out of 5 claims.
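The bandwidth argument can be sketched as simple division. The Python model below is deliberately crude: it assumes a fixed instruction bandwidth shared by all cores and ignores caches entirely, which is the condition stated above. The units and numbers are hypothetical and were chosen only to show the shape of the argument.

```python
def cores_sustained(fetch_bandwidth: float, demand_per_core: float) -> int:
    """Whole cores that a fixed instruction bandwidth can keep busy."""
    if demand_per_core <= 0:
        raise ValueError("demand_per_core must be positive")
    return int(fetch_bandwidth // demand_per_core)


BANDWIDTH = 16.0     # hypothetical instruction bytes per unit time
DENSE_DEMAND = 4.0   # hypothetical bytes per unit time for a dense-code core
EXPANSION = 4.0      # assumed code-size ratio, inside the 2-to-6 range

dense_cores = cores_sustained(BANDWIDTH, DENSE_DEMAND)
bloated_cores = cores_sustained(BANDWIDTH, DENSE_DEMAND * EXPANSION)
```

Under these assumptions the same memory system keeps four dense-code cores busy but only one core running the expanded code, which matches the "two to four" figure given above.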

One principle that everyone seems to overlook is that systems can run faster when they use local decision-making. It makes me think about military blunders made during WWII. When the Navy had to send its decrypted messages to Washington for review, oftentimes a time-critical decision wouldn't be made until long after it was too late to do anything about it. The same thing applies to CPU function. If the instruction contains more smarts (as in CISC processors), the processor can keep things going faster because it won't have to go out and fetch three more instructions to know what ultimately needs to be done.

Well, at least ARM had people snowed long enough to get more than a toehold in market share. I'm happy for them. I will give them one point for being right about one thing (1 out of 6). That is that RISC can get you to market faster, especially if you are able to use propaganda to give prospective customers a false sense of reality.

Tim Healy

Producer/Director at Megastar Productions

5 years ago

Interesting stuff. Well said Paul.
