Reliable CDC constraints #2: Counters and FIFOs

Passing multi-bit counter words between clock domains is the foundation of asynchronous FIFOs. The well-known and most efficient way to do this is with Gray-coded values. This article will show, however, that this can fail spectacularly in at least three different ways unless we know what we are doing and use proper constraints.

This article will not explain how to write an asynchronous FIFO, nor the details about Gray code. There are plenty of articles about that, and Google is your friend. Instead, we will focus on the limitations of the design, how it can fail, and what timing constraints are needed for reliable operation.

A clock domain crossing (CDC) for a single-step counter value is usually done like this:

The structure is similar to the single-bit level CDC discussed in article #1, but applied to parallel bits. Using Gray code ensures that only one bit changes value each time the counter increments or decrements. If the clocks are unrelated, we will sometimes sample a value in domain 1 while it is transitioning. But since we use Gray code, this can only happen for one bit at a time. Whether metastability recovery results in the old value or the new value does not matter; the result is valid either way.
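As a quick sanity check of the single-bit-change property, here is a minimal Python sketch (a software model, not RTL). The function names are made up for illustration, and the three-bit width is just an assumption that matches the examples in this article.

```python
def to_gray(value: int) -> int:
    """Convert a binary-coded integer to its Gray-coded equivalent."""
    return value ^ (value >> 1)


def from_gray(gray: int) -> int:
    """Convert a Gray-coded integer back to binary."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


WIDTH = 3  # Assumed counter width, matching the examples in the text.

for count in range(1 << WIDTH):
    following = (count + 1) % (1 << WIDTH)
    changed_bits = to_gray(count) ^ to_gray(following)
    # Exactly one set bit means exactly one Gray bit toggles on this step,
    # including the wrap-around from the highest value back to zero.
    assert bin(changed_bits).count("1") == 1
    assert from_gray(to_gray(count)) == count
    print(f"{count} (decimal) = {count:03b} (binary) = {to_gray(count):03b} (Gray)")
```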

Or that's the idea at least. Let's think about how this design can fail.

Error mode #1: Counter jumps

This first failure condition follows directly from the text above: this CDC works only if the counter is continuous, meaning that in each domain 0 clock cycle it may only increment by one, decrement by one, or keep its old value. Say instead that the counter value in domain 0 jumps from 3 to 5:

  • 3 (decimal) = 011 (binary) = 010 (Gray)
  • 5 (decimal) = 101 (binary) = 111 (Gray)

As the Gray value in domain 0 is transitioning from 010 to 111, we might sample a metastable value in domain 1 that settles on 011, which equals decimal 2. The counter in domain 0 never had this value, and the CDC has obviously failed.
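The set of words that domain 1 might capture during such a jump can be enumerated with a short Python sketch. It assumes a simplified model where every changing bit is independently seen as either its old or its new value; the helper names are hypothetical.

```python
def to_gray(value: int) -> int:
    """Convert a binary-coded integer to its Gray-coded equivalent."""
    return value ^ (value >> 1)


def from_gray(gray: int) -> int:
    """Convert a Gray-coded integer back to binary."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def possible_samples(old_gray: int, new_gray: int, width: int) -> set:
    """Every word that can be captured while the bits move from old to new."""
    words = set()
    for choice in range(1 << width):
        # Each bit of 'choice' selects whether we see the new (1) or the
        # old (0) value of that bit. Unchanged bits are unaffected.
        words.add((old_gray & ~choice) | (new_gray & choice))
    return words


# A legal single step: only the old and the new value can be seen.
step = sorted(from_gray(w) for w in possible_samples(to_gray(3), to_gray(4), 3))
print("3 -> 4 can be sampled as:", step)

# A jump: values that the counter never held show up.
jump = sorted(from_gray(w) for w in possible_samples(to_gray(3), to_gray(5), 3))
print("3 -> 5 can be sampled as:", jump)
```

Under this model the jump can be captured as 2, 3, 4 or 5. The counter only ever held 3 and 5.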

Solution

This limitation is intrinsic and cannot be remedied without changing the topology. Assertions can be placed in RTL so that any misuse is caught in simulation. Other than that, the user needs to know the limitations of this CDC topology and only instantiate it when appropriate.
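The condition such an assertion would check can be sketched in Python. This is only a model of the check, not the RTL assertion itself, and it assumes the counter wraps at a power of two (2**width). The function names are invented for this example.

```python
def is_continuous_step(previous: int, current: int, width: int) -> bool:
    """True if the counter kept its value or moved by exactly one step.

    Assumes the counter wraps at 2**width, so the step from the highest
    value to zero (and back) counts as a single step.
    """
    modulus = 1 << width
    step = (current - previous) % modulus
    return step in (0, 1, modulus - 1)


def check_trace(trace, width: int) -> None:
    """Raise AssertionError at the first illegal step in a list of values."""
    for previous, current in zip(trace, trace[1:]):
        assert is_continuous_step(previous, current, width), (
            f"Counter jumped from {previous} to {current}"
        )


check_trace([6, 7, 0, 0, 1, 0, 7], width=3)  # Passes, includes wrap-around.

try:
    check_trace([2, 3, 5], width=3)
except AssertionError as error:
    print("Caught:", error)
```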

A counter value that is not continuous should be seen as a general data word with correlated bits. We will explore CDC topologies suitable for words like these in future articles.

Error mode #2: Glitches

Say that we want to be "smart" and save some flip-flops (FFs). The FFs in domain 0 seem redundant, so let's remove them. Timing still passes, so we seem to have margin. Right?

This is similar to error mode #5 in article #1 (see there for more background), and it will not work. The combinatorial binary-to-Gray calculation is implemented as N 2-input LUTs. Multiple LUT inputs can change each clock cycle, meaning the LUT output will have glitches and transients. Without the FF in domain 0, glitches can be sampled in domain 1, resulting in completely faulty values.
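To make the glitch concrete, here is a small Python sketch (again a model, not RTL). It assumes each Gray bit is a single 2-input XOR of two adjacent binary bits, and that the two inputs can reach the LUT at slightly different times. The function name is made up for illustration.

```python
def xor_lut_outputs(old_binary: int, new_binary: int, gray_bit: int):
    """Model one binary-to-Gray XOR LUT during a counter update.

    Returns (old output, set of possible transient outputs, new output),
    where the transients cover either input arriving before the other.
    """
    upper_old = (old_binary >> (gray_bit + 1)) & 1
    lower_old = (old_binary >> gray_bit) & 1
    upper_new = (new_binary >> (gray_bit + 1)) & 1
    lower_new = (new_binary >> gray_bit) & 1

    transients = {
        upper_new ^ lower_old,  # Upper input arrives first.
        upper_old ^ lower_new,  # Lower input arrives first.
    }
    return upper_old ^ lower_old, transients, upper_new ^ lower_new


# Counter goes from 3 (011) to 4 (100). Gray bit 1 is '1' both before
# and after, but both of its LUT inputs change.
old_out, transient_out, new_out = xor_lut_outputs(0b011, 0b100, gray_bit=1)
print("before:", old_out, "transient:", transient_out, "after:", new_out)
```

In this model Gray bit 1 is supposed to stay at 1 throughout, yet it can briefly show 0. Sampling that transient in domain 1 gives a word that is neither the old nor the new value.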

Solution

We have to sample the LUT output with an FF in domain 0 so that LUT glitches are eliminated. Or, in other words, the "async_reg" FF chain in domain 1 must be driven by an FF in domain 0. Same as for the single-bit level CDC.

Error mode #3: Intra-counter skew

Without any timing constraints on the CDC, the implementation tool might route a three-bit counter CDC like this:

Without constraints, the routing delays are undefined, possibly infinite, and vary between builds. To illustrate how a routing delay difference results in failure, assume an example counter that is stable at 3, and then transitions quickly through 4 and 5:

  • 3 (decimal) = 011 (binary) = 010 (Gray)
  • 4 (decimal) = 100 (binary) = 110 (Gray)
  • 5 (decimal) = 101 (binary) = 111 (Gray)

Say that the routing of the MSB is 1.5 clock 1 periods slower than that of the other bits, as shown in the picture above. The value sampled in domain 1 will go through the sequence:

  • 010 (Gray) = 011 (binary) = 3 (decimal)
  • 010 (Gray) = 011 (binary) = 3 (decimal)
  • 011 (Gray) = 010 (binary) = 2 (decimal)
  • 111 (Gray) = 101 (binary) = 5 (decimal)

The input counter never had the value of 2, meaning our CDC has failed.
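The sequence above can be reproduced with a small Python timing model. The concrete numbers are assumptions chosen for illustration: time is counted in tenths of a clock 1 period, the counter changes to 4 at time 2 and to 5 at time 10, and domain 1 samples at times -5, 5, 15 and 25. Metastability is ignored.

```python
def to_gray(value: int) -> int:
    """Convert a binary-coded integer to its Gray-coded equivalent."""
    return value ^ (value >> 1)


def from_gray(gray: int) -> int:
    """Convert a Gray-coded integer back to binary."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


# Assumed stimulus: (time, new value). The counter holds 3 before that.
COUNTER_CHANGES = [(2, 4), (10, 5)]

# Assumed routing delay per Gray bit, index 0 = LSB. One clock 1 period
# is 10 time units, so the MSB is 1.5 periods slower than the others.
ROUTING_DELAY = [0, 0, 15]

SAMPLE_TIMES = [-5, 5, 15, 25]  # Rising edges of clock 1.


def counter_at(time: int) -> int:
    """Counter value in domain 0 at the given time."""
    value = 3
    for change_time, new_value in COUNTER_CHANGES:
        if time >= change_time:
            value = new_value
    return value


def sampled_gray(time: int) -> int:
    """Gray word seen in domain 1, each bit delayed by its own routing."""
    word = 0
    for bit, delay in enumerate(ROUTING_DELAY):
        word |= to_gray(counter_at(time - delay)) & (1 << bit)
    return word


sampled = [sampled_gray(time) for time in SAMPLE_TIMES]
decoded = [from_gray(word) for word in sampled]
for word, value in zip(sampled, decoded):
    print(f"{word:03b} (Gray) = {value:03b} (binary) = {value} (decimal)")
```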

Solution

A timing constraint must be applied to limit the routing skew between the bits. See the Timing Constraints section below.

Error mode #4: Latency

Similar to error mode #3 above and article #1 error mode #3, we must control the latency of the CDC. Otherwise, if left to the routing engine, it is undefined, possibly infinite, and varies between builds.

In an asynchronous FIFO, a counter CDC like this is used to pass the write pointer to the read domain, where it is used to calculate read_valid. If the latency of our counter CDC is high, the latency of data flow through our system will be high, which is bad in almost all conceivable applications.

Solution

A timing constraint must be applied to limit the latency. See the Timing Constraints section below.

Timing constraints

The discussion and analysis above came up with a few things that must be ensured in our code to achieve stable operation:

  1. Set the "async_reg" attribute, so the FF chain maximizes metastability recovery.
  2. Set an upper bound for latency.
  3. Set an upper bound for skew.

Setting the attribute is done in RTL. Controlling the latency is done with the TCL timing constraint "set_max_delay". Controlling the skew is done with the TCL timing constraint "set_bus_skew". The upper bound for skew should be the period of clock 0. This ensures that words sampled in domain 1 are always valid, since at most one bit can change value in each clock 0 period.
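The reasoning behind the skew bound can be checked by brute force in Python. This is a simplified model under stated assumptions: a three-bit counter that increments every clock 0 period, integer time units, per-bit routing delays anywhere between zero and the allowed skew, and no metastability. All names and numbers are made up for the sketch.

```python
from itertools import product

WIDTH = 3
T0 = 4  # Assumed clock 0 period, in arbitrary integer time units.


def to_gray(value: int) -> int:
    return value ^ (value >> 1)


def from_gray(gray: int) -> int:
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def count_at(time: int) -> int:
    """Free-running counter in domain 0, one increment per clock 0 period."""
    return (time // T0) % (1 << WIDTH)


def sampled_word(time: int, delays) -> int:
    """Gray word seen in domain 1, each bit delayed by its own routing."""
    word = 0
    for bit, delay in enumerate(delays):
        word |= to_gray(count_at(time - delay)) & (1 << bit)
    return word


def find_failures(max_skew: int):
    """Try every delay combination and sample time, collect invalid samples."""
    failures = []
    for delays in product(range(max_skew + 1), repeat=WIDTH):
        for time in range(max_skew, max_skew + 2 * T0 * (1 << WIDTH)):
            decoded = from_gray(sampled_word(time, delays))
            # Values the counter really held while the bits were launched.
            held = {
                count_at(time - delay)
                for delay in range(min(delays), max(delays) + 1)
            }
            if decoded not in held:
                failures.append((delays, time, decoded))
    return failures


print("skew = clock 0 period:", len(find_failures(T0)), "failures")
print("skew > clock 0 period:", len(find_failures(T0 + 2)), "failures")
```

In this model no invalid word is found as long as the skew stays within one clock 0 period, while a larger skew produces words that the counter never held.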

My RTL implementation of this CDC is available on GitHub, with rendered documentation available on the website. This CDC is used in the asynchronous FIFO implementation (GitHub, website).

The corresponding constraint file is available on GitHub. This file must be added in Vivado with the command "read_xdc -ref resync_counter <path>/resync_counter.tcl". If you use tsfpga for build automation, the constraint file is added automagically to your Vivado project.

Utilizing this scoped constraint file means that the "resync_counter" and "asynchronous_fifo" entities can be instantiated freely anywhere in your design as reusable black boxes, and the user never has to write a line of TCL or set any attributes.

In summary

The discussion above shows that there are quite a few things we need to keep in mind to make this design reliable.

  1. Use Gray code, so that only one bit per clock cycle changes value.
  2. Use at least two FFs in an "async_reg" chain to maximize metastability recovery.
  3. Make sure the FF chain is driven by an FF in domain 0.
  4. Never increment/decrement the input counter by more than 1 each clock cycle. Assert that this never happens.
  5. Set an upper bound for intra-counter skew.
  6. Set an upper bound for latency.

All these are solved by using my provided designs, linked above.

Other topologies

I will write further articles about CDC and other FPGA topics in the future. Please "Follow" or "Connect" with me here on LinkedIn so you don't miss them. You can also check out the other articles in this series:

Sai Vaibhav Batchu

Cirrus Logic | Masters at UT Austin | Texas Instruments India

6 months

Why does the website mention "Note that unlike e.g. resync_level.vhd, it is safe to drive the input of this entity with LUTs as well as FFs." in the resync_counter.vhd section?

Oliver Bründler

FPGA/Embedded System Architect | Maintainer of Open Logic HDL Standard Library | FPGA Lecturer

7 months

One special case of the "counter jumps" problem is the wraparound. Gray code only works reliably for counters wrapping at powers of two - otherwise the wraparound toggles more than one bit and hence is unsafe. Most obvious impact to your daily life: always configure async FIFO depth to a power of two.

Michael Jørgensen

Senior FPGA Developer at Weibel Scientific

10 months

On Xilinx FPGAs I use the XPM macros to implement CDCs. I'm led to believe that this is sufficient to get all the constraints correct and thus eliminate all the error modes given in your excellent article. What are your thoughts on this?

Rudolf Usselmann

ASIC/FPGA design engineer

10 months

Totally disagree with you. A linear advancing gray code will always work across CDC. There are no glitches.

Guillaume JOLI

Digital Front-end Lead on mixed signal SoC ASICs

11 months

Unfortunately, set_bus_skew is not a standard SDC command but a Xilinx-specific command. set_max_skew is the Altera Timequest (ex intelFPGA ;)) equivalent. set_data_check can be used in DC, but for STA only.

More articles by Lukas Vik

  • The many ways to mess up ready/valid handshaking
  • Reliable CDC constraints #5: Asynchronous FIFO
  • Reliable CDC constraints #4: Build tool settings
  • Reliable CDC constraints #3: Pulses
  • Reliable CDC constraints #1
