(B)ASIC: how do we deal with multiple clocks?
One of the major differences between software and hardware is the concept of a clock. It defines the rhythm at which the sequential elements inside the chip update. Note that a complex System-on-Chip (SoC) has many clocks at different frequencies, each distributed to a part of the silicon. At the bottom of this post, you will find links to the previous posts. The “ASIC FUNDAMENTALS” series of blog posts starts with “synchronous design” and the rest follows chronologically after that first post. In this post, we talk about multiple clocks in a design. We try to answer the question: how do we get from one clock domain to the other?
Are two clocks (a)synchronous?
In a previous post, we saw the KISS principle in action: keep it simple and stupid. Specifically, to simplify, we use the period of a square wave (the clock) to constrain the path through the logic. Every path starts at the output of one sequential element and ends at the input of another sequential element. Some paths contain little combinatorial logic, and their delay is therefore small compared to the clock constraint. In principle, we could add more logic to those paths and get results in one clock cycle instead of several, maximizing the processing we do per cycle because we have the room to do so. Other paths, the critical paths, need quite some effort to keep the logic delay within the clock period. Hence, we care only about critical paths, if any. Whenever the design meets timing, we do not optimize further; we consider the timing closed for that block.
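To make the clock constraint concrete, here is a minimal sketch of the slack calculation for one path. The function name and all delay numbers are made up for illustration, and the clock-to-output delay of the launching flip-flop is an assumption on our side; a real timing tool also accounts for skew, jitter and margins.

```python
def setup_slack_ns(clock_period_ns, clk_to_q_ns, logic_delay_ns, setup_ns):
    """Time left over in one clock period for a flip-flop to flip-flop path.

    Positive slack: the path meets timing. Negative slack: the path is
    critical and violates the clock constraint.
    """
    return clock_period_ns - (clk_to_q_ns + logic_delay_ns + setup_ns)


# Hypothetical numbers for a 100 MHz clock (10 ns period).
easy_path = setup_slack_ns(10.0, clk_to_q_ns=0.5, logic_delay_ns=3.0, setup_ns=0.25)
critical_path = setup_slack_ns(10.0, clk_to_q_ns=0.5, logic_delay_ns=9.5, setup_ns=0.25)

print(f"easy path slack:     {easy_path:+.2f} ns")
print(f"critical path slack: {critical_path:+.2f} ns")
```

The first path has room to absorb more logic; the second one is where the effort goes.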
But what if we need both the positive edge and the negative edge of the clock in our design? A sequential element triggers on only one active edge. Thus we have a mix of flip-flops, some clocked on the rising edge, some clocked on the falling edge. At first sight, it looks like the same clock with just a 180-degree phase shift, right? Unfortunately, the phase shift requires an inverter. Every sequential element that connects to the same clock wire, with no logic between the clock and its clock input pin, belongs to the same clock domain. Sequential elements in the same clock domain are synchronous to each other. If we need an inverter for the negative-edge clock, the result is essentially asynchronous to the original. The gate delays the edges relative to the edges of the original clock. We call that delay “clock skew”. By default, we consider all clocks asynchronous. It requires an explicit definition to make them synchronous. Importantly, the front-end team and the back-end team both need this information. The back-end team will need to compensate for the skew between the rising-edge clock domain and the falling-edge clock domain.
Ok, fine, understood. But what if the clock frequency is different? Let’s consider an SPI slave that has a data out, a data in, a clock (input!) and a chip select. BTW, Master Out Slave In (MOSI = data from master to slave) and Master In Slave Out (MISO = data from slave to master) are specific SPI terms. Assume we have an on-board processor. It needs to send and receive the data in a much faster system clock domain. Now, recall the setup and hold time of a flip-flop. In this system, the two clocks are asynchronous. The system clock is fast and does not have the same source clock as the SPI. The SPI clock comes from an external master and is slower.
If we copy a data byte from one clock domain to the other, there is a potential issue. Imagine the active edge of the SPI clock coinciding with the active edge of the fast clock. A bit in the SPI clock domain that didn’t change value is static. It transfers to the fast clock domain as expected. But the bits in the SPI domain that change state are potentially somewhere between the low and high threshold. For each of those bits, the fast clock domain captures either the old value or the new one. Hence the copy of the byte can contain a mix of old and new bit values, and copying the data produces intermittent errors.
Therefore, we never synchronize bit vectors (multiple bits) directly; we use a handshake mechanism. The slow clock domain signals that data is ready with a one-bit flag. The fast clock domain acknowledges with a one-bit flag. Both flags pass through a dual synchronizer into the other clock domain. Why two sequential elements to synchronize? Well, we must account for the metastable state that the first element potentially enters. The characterization of the technology cell (flip-flop) defines how long that state takes to resolve. This time must be shorter than the clock period we use. If that condition holds, then a dual synchronizer on a single bit is safe: the second flip-flop always samples a valid one or zero.
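A small sketch of why copying a byte across domains goes wrong. It is a plain Python model, not HDL, and it assumes that every bit that changes resolves independently to its old or its new value; the function name and the example byte values are ours.

```python
from itertools import product


def possible_samples(old, new, width=8):
    """All values the receiving domain could capture if every changing bit
    independently lands on its old or its new value."""
    changing = [b for b in range(width) if ((old ^ new) >> b) & 1]
    results = set()
    for choice in product((0, 1), repeat=len(changing)):
        value = old
        for bit, take_new in zip(changing, choice):
            if take_new:
                value ^= 1 << bit  # this bit already shows the new value
        results.add(value)
    return sorted(results)


old_byte, new_byte = 0x0F, 0x10  # five bits change at the same edge
samples = possible_samples(old_byte, new_byte)
corrupt = [v for v in samples if v not in (old_byte, new_byte)]

print(f"{len(samples)} possible captured values, {len(corrupt)} are neither old nor new")
```

Only two of the possible captured values are legitimate. A single-bit flag does not have this problem: it can only be old or new, and both are valid.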
By construction, a first stage that goes metastable on a zero-to-one transition resolves to either a zero or a one. In the worst case, we see the state change one synchronizer clock period later. The same holds for the one-to-zero change. A request-acknowledge handshake requires storing the data in the SPI domain, so that the value is stable while the other domain reads it. Only after the acknowledgment do we allow the data to change.
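The request-acknowledge handshake can be sketched as a small behavioral model. This is Python, not synthesizable HDL, and it rests on assumptions: the synchronizers are ideal (metastability itself is not modeled), the clock periods 37 and 10 are arbitrary time units, and all names are hypothetical.

```python
class TwoFlopSync:
    """Ideal dual synchronizer: two flip-flops in series in the receiving domain."""

    def __init__(self):
        self.q1 = 0
        self.q2 = 0

    def clock(self, d):
        # Both flops update on the same edge: q2 takes the old q1, q1 samples d.
        self.q2, self.q1 = self.q1, d


def simulate(words, slow_period=37, fast_period=10, end_time=5000):
    req = ack = 0              # one-bit flags, one per domain
    data_reg = None            # holding register in the slow (SPI) domain
    req_sync = TwoFlopSync()   # req crossing into the fast domain
    ack_sync = TwoFlopSync()   # ack crossing into the slow domain
    pending = list(words)
    received = []

    edges = sorted(
        [(t, "slow") for t in range(slow_period, end_time, slow_period)]
        + [(t, "fast") for t in range(fast_period, end_time, fast_period)]
    )
    for _, domain in edges:
        if domain == "slow":
            ack_seen = ack_sync.q2          # value before this edge
            ack_sync.clock(ack)
            if req == 0 and ack_seen == 0 and pending:
                data_reg = pending.pop(0)   # data may only change here
                req = 1
            elif req == 1 and ack_seen == 1:
                req = 0
        else:
            req_seen = req_sync.q2          # value before this edge
            req_sync.clock(req)
            if req_seen == 1 and ack == 0:
                received.append(data_reg)   # data_reg is stable by now
                ack = 1
            elif req_seen == 0 and ack == 1:
                ack = 0
    return received


print(simulate([0xA5, 0x3C, 0xFF]))
```

Only the two flags cross the domain boundary through synchronizers. The data byte is read while it is guaranteed not to change, which is why the holding register is needed.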
Pro-tip 1:
If you have a naming convention for synchronizer flip-flops (applied consistently and unique to synchronizers), a simple “grep” command on the HDL source reveals all synchronizers in your design. This is important for various reasons.
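As an illustration, here is the same search done as a small Python sketch instead of grep. The `_sync_ff` suffix is a made-up naming convention and the HDL snippet is a hypothetical example; substitute whatever convention your team uses.

```python
import re

# Hypothetical convention: every synchronizer flip-flop name ends in _sync_ff<N>.
SYNC_NAME = re.compile(r"\b\w+_sync_ff\d*\b")

HDL_SOURCE = """\
reg req_sync_ff1, req_sync_ff2;
reg [7:0] data_hold;
always @(posedge clk_fast) begin
  req_sync_ff1 <= req;
  req_sync_ff2 <= req_sync_ff1;
end
"""


def find_synchronizers(hdl_text):
    """Return the sorted, unique signal names that follow the convention."""
    return sorted(set(SYNC_NAME.findall(hdl_text)))


names = find_synchronizers(HDL_SOURCE)
print(names)
```

The search is only as good as the discipline behind the naming: a synchronizer that does not follow the convention stays invisible.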
Pro-tip 2:
If the frequency of one clock is an integer multiple of another and both derive from the same source clock, it is possible to make them synchronous. The back-end team then aligns the rising edges. If the source clocks are different, there is always a “non-zero ppm” difference in clock frequency, and the edges slowly drift apart.
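To get a feeling for how fast the edges drift, here is a back-of-the-envelope sketch. The 100 MHz frequency and the 50 ppm offset are example values we picked, not numbers from a datasheet.

```python
def seconds_to_slip_one_period(freq_hz, ppm_difference):
    """Time until two nominally equal clocks have drifted by one full period.

    The frequency difference is freq_hz * ppm * 1e-6, and the edges slip one
    whole cycle after 1 / (frequency difference) seconds.
    """
    delta_f_hz = freq_hz * ppm_difference * 1e-6
    return 1.0 / delta_f_hz


slip_s = seconds_to_slip_one_period(100e6, 50)  # two "100 MHz" clocks, 50 ppm apart
cycles = slip_s * 100e6

print(f"edges slip a full period every {slip_s * 1e6:.0f} us (about {cycles:.0f} cycles)")
```

So even with a small ppm difference, every possible edge alignment, including the bad ones, comes around many times per second.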
What did we learn in this post?
In this post, we learned about asynchronous clock domains. Every clock in the design is asynchronous by default, unless we explicitly declare clocks synchronous. That declaration becomes a requirement for the back-end team to make them synchronous. It always comes with a cost. And clocks that do not share the same source clock, and thus have independent frequencies, are impossible to make synchronous.
Applause for ourselves!!!
(support what you like) Pet my ego with a click on the like button!
#semiconductors #asic #fpga #technology #VHDL #verilog #systemverilog
Prerequisites to this post:
(1) 1st post: What is synchronous design?
(2) 2nd post: How do we design an ASIC?
(3) 3rd post: What’s a clock?
(4) 4th post: What are the important clock parameters?
(5) 5th post: Do we choose an internal or external clock?
Comment with what you want to find out next about ASIC or FPGA design.
If you have multiple clocks and want to check for CDC (clock domain crossing) errors, you can use this modeling technique - https://www.v-ms.com/ICCAD-2014.pdf All the other approaches to CDC are based on static analysis; that one will work with UVM and Verilog-AMS.