Why Learn Computer Architecture? A Case-Study Approach

Computers have truly transformed our lives over the last three decades and are now an integral part of them. Moore's law has fortunately held, with emerging VLSI technologies cramming ever more components onto silicon chips. There have also been major advances in computer architecture itself, such as multi-core processors and GPU computing. But each paradigm shift in the computer industry brings costs: power constraints, reliability issues, and design complexity.

Students of Computer Science and Electronics learn computer architecture at some point in their degree programs, but have you ever asked yourself why exactly you need to learn it? The computing landscape is very different from what it was 10-20 years ago; industry is in the middle of a large paradigm shift that makes many novel system designs possible. That makes today a very exciting time to study computer architecture. This article takes a case-study approach to discuss why learning computer architecture is essential. The figure below shows where this discipline sits in the computing stack and what that implies.

Where does computer architecture lie in the computing stack?

Abstraction and Levels of Transformation

With abstraction, a higher-level entity only needs to know the interface to the lower level, not how the lower level is implemented. For instance, a programmer does not necessarily need to know how a computer executes instructions or which particular ISA (Instruction Set Architecture) lies underneath. This abstraction arises through what we call the "Levels of Transformation", i.e. the levels of the computing stack, depicted in the diagram below. It shows how a real-world computing problem is ultimately solved by moving electrons.

Figure: Levels of Transformation (the computing stack)

Abstraction is helpful for a good reason: it increases the productivity of programmers towards the top of the stack, who need not know about the decisions made at lower levels. But this is good only as long as everything goes well. What if the program you wrote slows down even though you chose the best semantics and algorithm? What if it consumes a lot of energy? What if someone compromises your system and you have no idea how? This is exactly why you need to know what happens underneath, at the ISA and microarchitecture levels. One such scenario, the "memory performance hog", is discussed in the next section.

Multi-Core Systems and Memory Performance Hogs

The memory performance hog is a relevant example because it was detected on a multi-core processor, the mainstream technology of today. In a typical multi-core system, each core has its own L2 cache, but a single DRAM controller acts as the interface to all the DRAM banks. Multiple small cores are simpler and consume less power than one large core. Yet, according to the USENIX Security paper "Memory Performance Attacks: Denial of Memory Service in Multi-Core Systems", unexpected slowdowns were discovered when MATLAB and GCC ran simultaneously on different cores: MATLAB hardly slowed down at all, while GCC ran about three times slower than when running alone. Now a direct question: can you figure out why the application slowed down if you don't know how the underlying system works? Can you fix the problem without knowing what happens underneath? It looks as though GCC was given lower priority than MATLAB, which should not happen. Let's dig deeper into the stack and understand the situation.

Figure: A typical multi-core system with DRAM banks.

The disparity in slowdowns turns out to be caused by the DRAM memory controller, which is shared by all the cores, unfairly prioritizing MATLAB over GCC; its scheduling algorithm was simply not designed for this scenario. The root cause is the row buffer in each DRAM bank. Whenever a row of a DRAM bank is accessed, the entire row is first placed into the row buffer, so any subsequent access to the same row is served straight from the buffer. This is called a row hit, and it works much like a cache. For a single-core system, it increases the overall throughput of the DRAM system. The scheduling policy used here is FR-FCFS (First-Ready, First-Come-First-Served), which services row-hit accesses first and older accesses next. As a result, the DRAM controller unfairly prioritizes applications with high row-buffer locality, i.e. processes that keep accessing the same row, as MATLAB does. MATLAB has sequential memory access, very high row-buffer locality, and is memory intensive; GCC, on the other hand, has random memory access with very low row-buffer locality. So with a row size of 8 KB and a request size of 64 B, up to 128 of MATLAB's requests can be serviced before a single request of GCC.
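To make the policy concrete, here is a minimal Python sketch of FR-FCFS for a single DRAM bank. The Request type, the request stream, and the row numbers are my own illustrative assumptions, not details from the paper. Even though the GCC request is the oldest in the queue, every MATLAB row hit bypasses it:

```python
# A minimal sketch of FR-FCFS for a single DRAM bank. The request stream,
# row numbers, and thread names are illustrative, not from the paper.
from collections import namedtuple

Request = namedtuple("Request", ["arrival", "thread", "row"])

def fr_fcfs_pick(queue, open_row):
    """FR-FCFS: service row hits first; break ties by arrival order (age)."""
    hits = [r for r in queue if r.row == open_row]
    return min(hits if hits else queue, key=lambda r: r.arrival)

# A MATLAB-like thread streams through row 7; a GCC-like thread wants row 3.
queue = [
    Request(0, "gcc", 3),      # oldest request in the queue
    Request(1, "matlab", 7),
    Request(2, "matlab", 7),
    Request(3, "matlab", 7),
]

open_row = 7                   # assume row 7 is already in the row buffer
served = []
while queue:
    req = fr_fcfs_pick(queue, open_row)
    queue.remove(req)
    open_row = req.row         # servicing a request opens its row
    served.append(req.thread)

print(served)  # ['matlab', 'matlab', 'matlab', 'gcc'] -- oldest request waits
```

Scale this toy queue up to a real 8 KB row with 64 B requests, and the same preference lets up to 128 consecutive MATLAB row hits bypass one waiting GCC request.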

Figure: A memory performance hog, as discussed in this article.

Now that you have this understanding, you should be able to suggest a solution. First, which part of the computing stack should the solution live in? The system software? The compiler? Or the memory controller? Think critically and comment with your solution to the problem. As a hint, the solution lies in the part that is common to all the cores. To wrap up: new problems always arise with paradigm shifts, and they affect all parts of the computing stack. Solving them requires thinking critically and broadly, along with knowing the fundamentals and principles very well. Understanding this lets you exploit advances and changes in the underlying technologies at the right time. On the whole, if you understand both the hardware and the software, you can even revolutionize the way computers are built. That should be enough for this article; I hope you take something useful away from it. Stay tuned for more articles on electronics and computer networks. Have fun!

Neeraj Kumar Cheryala

Engineer at Qualcomm

4 years ago

The solution proposed by the authors of the very paper that identified the problem is the STFM (Stall-Time Fair Memory) scheduling policy. For more information, check out: https://people.inf.ethz.ch/omutlu/pub/stfm_micro07.pdf
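Roughly speaking, STFM estimates how much each thread is slowed down by sharing the memory system and, when the ratio between the most and least slowed-down threads exceeds a threshold, prioritizes the most slowed-down thread instead of blindly favoring row hits. Below is a hedged Python sketch of that idea; the thread names, slowdown numbers, and threshold value are illustrative assumptions, and the real controller estimates slowdowns in hardware rather than taking them as inputs:

```python
# A rough sketch of the idea behind STFM (Stall-Time Fair Memory scheduling).
# Per-thread slowdown estimates are assumed to be given; thread names, the
# slowdown values, and the threshold alpha are illustrative, not from the paper.
from collections import namedtuple

Request = namedtuple("Request", ["arrival", "thread", "row"])

def fr_fcfs_pick(queue, open_row):
    """Baseline FR-FCFS: row hits first, ties broken by age."""
    hits = [r for r in queue if r.row == open_row]
    return min(hits if hits else queue, key=lambda r: r.arrival)

def stfm_pick(queue, open_row, slowdown, alpha=1.1):
    """If unfairness (max slowdown / min slowdown) exceeds alpha, serve the
    most slowed-down thread; otherwise behave like plain FR-FCFS."""
    unfairness = max(slowdown.values()) / min(slowdown.values())
    if unfairness > alpha:
        victim = max(slowdown, key=slowdown.get)        # most slowed-down thread
        victim_reqs = [r for r in queue if r.thread == victim]
        if victim_reqs:
            return fr_fcfs_pick(victim_reqs, open_row)  # pick among its requests
    return fr_fcfs_pick(queue, open_row)

queue = [Request(0, "gcc", 3), Request(1, "matlab", 7), Request(2, "matlab", 7)]
slowdown = {"matlab": 1.05, "gcc": 2.90}                # gcc suffers the most
print(stfm_pick(queue, open_row=7, slowdown=slowdown).thread)  # -> 'gcc'
```

Note the design point: the fairness enforcement lives in the memory controller itself, the one component shared by all cores, which matches the hint in the article.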
