登录查看更多内容

What is wrong with this (system)verilog code snippet?

Bert Verrycken

ASIC | hwaccelerators | let's connect

发布日期: 2024年9月29日

A stupid post today about some code from some open source repo seemed to have gathered some interest. Since I am known for never shutting up, I want to come back the the code itself and tell you what I think. Which can be interpreted any way you want, you can even disagree with it.

This the code

I have classified my unnecessary and uncalled for criticism in three categories:

Nitpicking, only done by really bad characters.
Close but no cigar.
Real problems with the circuit.

And I could have missed things, because I'm a hallucinating human, not a large language model.

Nitpicking

So, you lined out your input ports with lots of spaces, except sum_o. What is that about? When I first started pressing keys on a keyboard and producing HDL code, I did not like those alignments. And I truly hate this ASCII art bloating. But over the years, I have developed some kind of appreciation for aligning for example the signal declarations and the port definitions. But with parameters and complex types, it becomes a nuisance. There are ways around this, to get the typedefs from an external file which I prefer over typing the same "signed [DATA_WIDTH_SUM-1:0]", also because one typedef is less error prone than typing three times the same parameter - 1 and forgetting the minus one for example. You see, I wasn't joking when I said I was nitpicking.

I also like my adders to come with a parameter that allows me to build larger math. And then you want to have control over the "is the output registered or not?". Because I'm totally insane, I have created a package that defines a class with "adders" so I can parameterize those adders (via a class) and use this to build more complex math blocks. You know, in AI everything is math and you need wider arithmetic in some places, and in some places you need very small ones. Some I want to combine in one clock cycle and some I need to register so the worst case path is contained.

And last nit but not least, the annoying habit of "_i" and "_o" or "i_" and "o_". I'm sorry, but the bad digital designers are lazy. meaning they don't want to keep learning every day. The good digital designers are lazy too. But in the sense that they want to automate things. Connecting adders in an ALU for example, is tedious, error prone and boring work. So you like to script the connections, so my sum_o needs to connect to another module's sum_i. Sure, you can add that to the scripting, but I said we are lazy so I'd rather not. There is a much simpler convention that can always immediately tell you if a port is an input or and output and from which module it comes. If you prefix the outputs with a three letter capital letter prefix for the module it comes from. Obviously, that can get complicated too, but it brings me more info than the above coding style. If I have a module where I am using ADD_sum, I know this is an output for the ADDER and in the adder itself I have to create the output, in other modules I them as input from the adder.

It seems the nitpick thing went out of control fast, three paragraphs fgs.

Close but no cigar

Parameters are a good thing. But they can be a bit of a PITA as well. Because you have to satisfy two conflicting commandments:

Thou shall always use a descriptive name for a signal, function, task or parameter. people working with a, b and c for complex calculations for example, should be promoted to management quickly so their influence on the code base will be minimal.
Thou shall use very brief names for parameters because the port list gets out of hand quickly. This is because most people do not use typedefs and packages. Sure, there are tips and tricks there to make life easier as well but this falls out of the scope of this rant.

领英推荐

Breaking the Linear Mold: A Brief Dive Into…

Xencia Technology Solutions 1 年前

openEuler SIGs News & Updates: Charting the Path…

openEuler 2 个月前

Sparkworks - Native Hardware Acceleration in Arkworks

Ingonyama 1 年前

I will come back to my package with class adder functions (methods if you prefer that). See, I like to deliver a module finished, that means design + sanity verification + synthesis/STA/DFT clean. If I have a module, I will write an sdc script and some verification for the first parameter used (or maybe multiple) but not for all. For me the complexity of an adder or a counter for that matter is not enough to qualify as a module. I don't want 50.000 modules in my design. At the same time, I want to have reuse and I'm lazy so parameterized reuse. That is why I like the package approach. The package is not intended to be synthesized. It needs to be verified and then used in modules where it gets synthesized.

Conclusion? Good for the parameter use but if we look at it in levels:

An adder module with parameters is the baseline.
An adder module with parameter is level 1.
An adder FUNCTION with parameter is level 2.

Again a very subjective matter, but to me it makes sense to think in LeGO terms while not blowing up the number of different LEGO pieces in the process.

The Real Deal with Bill McNeal

I am always trying to put little thingies that nobody understands in my writing. For my own pleasure, so forgive me for naming this chapter like that. There are real problems with this piece of code. Because if you add two 4-bit wide numbers, you don't get a 4-bit number as the sum. No, you get a 5-bit number. Or 4-bits plus carry. And that is one of the plus point sof VHDL, and why people hate it, because it will not compile if you sum two 4-bit numbers and assign it to a 4-bit output sum. In verilog you can, but then when somebody runs the lint tool or runs synthesis much later in the process, they flag this issue and the code needs to change. Because for an adder and a counter, assuming we are counting up, you need to tell what happens to the counter when it reaches the maximum value and what to do when it overflows. If in our example of the 4-bit adder, you ignore the carry bit, you have implemented a wrap around adder without any means of detecting it overflowed. For counters, you have counters that increment and saturate or increment and wrap around. Whatever you do, you need to make it clear what you want to do with the 5th bit. And rightfully so. Again, coming back to my adder package, you have a function called incsat and one called incwrap. Using those function calls, immediately makes it clear what is intended.

The reset comes last, while in digital hardware, it actually comes first! While every HDL has been designed to abstract the underlying hardware, there are fundamental differences between FPGA and ASIC designs:

An ASIC needs an asynchronous reset. To enter the reset and immediately put all IOs in a safe state (especially in case of a clock failure), we need an asynchronous enter. To make sure all flops in the same clock/reset domain all come out of reset at the same time (even spread over an area with skew and OCV), we need to exit reset synchronously.
A FPGA is an ASIC but with configurable digital logic. Designs for FPGA need a synchronous reset because of the way the reset for flops is implemented inside the device. More importantly, because the FPGA has configurable logic, there is overhead in routing and logical blocks (groupings of boolean logic). They have a speed disadvantage versus an ASIC with the same HDL design (if we compare in the same tech node). And to have maximum performance, every flop that doesn't need a reset, should not have a reset. Because the same circuit with reset will be slower than without. So, state and configuration require a reset value, but data that gets overwritten (for example the sum in an adder) every clock will be able to run faster without a reset.

Often ASIC is prototyped in FPGA so this poses a challenge in speed of the clock for the FPGA. You don't want to change the code for resets specifically for prototyping. You want the code to be close to the final ASIC code that gets taped out. But you ahve to run the FPGA logic at considerable lower clock frequency than the ASIC. And your hand coded modules can handle that probably, but then you have external communication protocols like DDR, PCIe, ethernet, ... where the controller and PHY are hard macro's and you need to run at speed because also your host (a PC you buy over the counter) can not just turn down its clock with a factor ten.

Blocking Conclusion

There are surprisingly a lot of things to say about a few lines of code, even about something as simple as an adder in systemverilog. Some my find it useful, some might not, which is all fine. Writing is my therapy, not yours :-). A last thing I didn't mention is commenting your code. Comments are valuable to understand later what the code does. You might have forgotten or someone else needs to understand your code. Commenting is a necessary skill which has been a contentious issue for decades. Now, an adder doesn't need much commenting. But if you don't use the addsat or addinc function calls, you actually haven't specified what kind of adder you implemented. Without comment about the intended functionality, we have no way of knowing if the extra carry out bit that gets ignored was important or not. So, comment your code but keep it relevant. The only comment in the code in the screenshot is about keeping the output, an unnecessary else statement because non-blocking sequential logic in verilog keeps its value if not assigned in a clock cycle anyway. Originally, based on how the event handler worked decades ago in verilog, we had a mandatory two-process coding style, one for the combinational part and one for the sequential part. The "_nxt" and "_ff" coding style, iykyk. I'm not sure it is still needed, maybe someone has studied how todays tools handle that, but this one always style looks like it merged both styles without knowing too much about blocking and non-blocking statements in systemverilog. So, the only commented line is actually a redundant line and should be deleted. In a pure combinational always block, it would make sense because there you have to cover all if-then-else conditions with an assignment.But this isn't the case here.

comments

HW accelerators eating AI

2,727 位关注者

Jose T. de Sousa

5 个月

Indeed very subjective... Objectivity is passing the customer criteria.

1 次回应

Kevin Cameron

5 个月

The problem with this is really that you are writing RTL, which is like writing parallel assembly code. You really want to be at the level above doing asynchronous descriptions (no clock spec) where you send it a pair of numbers, and it sends you the sum back. I tried getting support for that into SV over a decade ago, and they wouldn't do it, so I built a C++ prototype solution instead. https://parallel.cc

1 次回应

Luigi C. Filho

5 个月

I totally agree with your point, I think people give too much importance to naming conventions (not that is not) rather the electronics theory itself

2 次回应

查看更多评论

要查看或添加评论，请登录

Bert Verrycken的更多文章

Life is full of surprises: Rammstein meets Meta meets Gauss

2024年11月24日

Life is full of surprises: Rammstein meets Meta meets Gauss

An actual selfie I just took: Grouchy Smurf I'm deep into my unpopular take period now. People wonder why I write, well…
I ?? to create an AI digital IC engineer (2/n)

2024年11月16日

I ?? to create an AI digital IC engineer (2/n)

As a child, I was very interested in Battlestar Galactica. Later it became clear the special effects were amateuristic,…

2 条评论
I ?? to create an AI digital IC engineer (1/n)

2024年10月5日

I ?? to create an AI digital IC engineer (1/n)

Link to AIVRAL paper on arxiv: https://arxiv.org/pdf/2409.

8 条评论
Open to Work

2024年7月29日

Open to Work

I have more than 26 years of industry experience and participated in more than 20 product tape-outs. Over the years, I…

6 条评论
Arthritis Book: The Road Back

2022年10月9日

Arthritis Book: The Road Back

This is not medical advice, I am not a doctor. I am not affiliated (had to buy the book second hand, it is old).

2 条评论
Verification interviews: the mood, the mad and the mugly!

2021年2月16日

Verification interviews: the mood, the mad and the mugly!

Some random thoughts on interviews and interviewers. I’m still on both sides of the coin, after over two decades of…
Analog compute, AI HW Summit 2020

2020年10月9日

Analog compute, AI HW Summit 2020

I am digital, so, it is unusual to promote analog, but for Mythic I have to make an exception. Credit images/video:…

1 条评论
Nvidia enters the SERVER arena

2020年9月14日

Nvidia enters the SERVER arena

Yes, I am aware ARM licenses its processor cores and doesn’t produce them. As you know Nvidia is a GPU producer mainly,…

7 条评论
WW2 and COVID-19: it's science?!?

2020年9月13日

WW2 and COVID-19: it's science?!?

There are many articles written about his specific analysis, I just googled one: "During World War II, the Navy…

6 条评论
Dark Social Media v6.6.6

2020年8月29日

Dark Social Media v6.6.6

Quora is full of marketing Bonobo's, I get that. And I do learn a trick or two from those primates from time to time.

1 条评论

See all articles

What is wrong with this (system)verilog code snippet?

Bert Verrycken

ASIC | hwaccelerators | let's connect

This the code

Nitpicking

Close but no cigar

领英推荐

The Real Deal with Bill McNeal

Blocking Conclusion

HW accelerators eating AI

2,727 位关注者

Bert Verrycken的更多文章

社区洞察

其他会员也浏览了

Functional Mock-up Units (FMUs) in Simulation Models ??

The Byte Whisperers: Decoding the Secret Language of Software

Debugging Consciousness: A System Architect's Perspective

Ghost in the Machine

Deterministic Finite Automata

Idioms for Polymorphism and Templates

Dynamic and Static Polymorphism

Self-Healing Memory System

std::execution: Asynchronous Algorithms

Bend - A Rust-Backed Language That Supercharges Parallel Computing Like Never Before

This the code

Nitpicking

Close but no cigar

领英推荐

The Real Deal with Bill McNeal

Blocking Conclusion

HW accelerators eating AI

2,727 位关注者

Bert Verrycken的更多文章

Life is full of surprises: Rammstein meets Meta meets Gauss

I ?? to create an AI digital IC engineer (2/n)

I ?? to create an AI digital IC engineer (1/n)

Open to Work

Arthritis Book: The Road Back

Verification interviews: the mood, the mad and the mugly!

Analog compute, AI HW Summit 2020

Nvidia enters the SERVER arena

WW2 and COVID-19: it's science?!?

Dark Social Media v6.6.6

社区洞察

其他会员也浏览了

Functional Mock-up Units (FMUs) in Simulation Models ??

The Byte Whisperers: Decoding the Secret Language of Software

Debugging Consciousness: A System Architect's Perspective

Ghost in the Machine

Deterministic Finite Automata

Idioms for Polymorphism and Templates

Dynamic and Static Polymorphism

Self-Healing Memory System

std::execution: Asynchronous Algorithms

Bend - A Rust-Backed Language That Supercharges Parallel Computing Like Never Before