FPGA Congestion

Sampath VP

ASIC/FPGA Design Professional | SoC Architecture | Technology Evangelist | IEEE Reviewer

Published: May 23, 2020

In Project Everest devices, which will deploy billions of transistors on a 7nm process, a network-on-chip will help reduce congestion across the device. Plunify’s algorithms have found cases where moving logic out of hardwired blocks can improve overall timing. It may also reduce the congestion caused by the need to dedicate routing to connect to the fixed-location cores. A common technique for reducing congestion overall is to focus on the Rent exponent, a measure of the number of connections each block within the design needs.

Tool reports provide connectivity measurements for each block, and the optimisation tools can focus effort on elements that tend to increase the Rent exponent. These strategies reduce congestion by selectively reducing the utilisation of structures that tend to increase the Rent exponent and congestion.
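As a rough illustration of what the Rent exponent measures, the sketch below fits Rent's rule, T = t * G^p, to a set of (block size, terminal count) pairs using a least-squares line in log-log space. Here G is the number of cells in a block, T is the number of external connections, and p is the Rent exponent. The function name and the sample numbers are assumptions made for illustration; they do not come from any tool report.

```python
import math


def estimate_rent_exponent(samples):
    """Fit log(T) = log(t) + p * log(G) by ordinary least squares.

    samples: (cell_count, terminal_count) pairs, all values > 0.
    Returns (p, t): the Rent exponent and the Rent coefficient.
    """
    samples = list(samples)
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    xs = [math.log(g) for g, _ in samples]
    ys = [math.log(t) for _, t in samples]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("block sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx
    t = math.exp(mean_y - p * mean_x)
    return p, t


# Hypothetical measurements: (cells in block, external connections).
blocks = [(50, 40), (400, 150), (3200, 520), (25000, 1900)]
p, t = estimate_rent_exponent(blocks)
print(f"Rent exponent p = {p:.2f}, coefficient t = {t:.2f}")
```

A block whose fitted exponent is noticeably higher than the rest of the design needs more external connections per cell, so it is a reasonable place to start looking for congestion.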

During the training phase, the complete C-to-FPGA flow is run once on several HLS-based applications to obtain routing congestion metrics. The C/C++ specifications of the designs are synthesized into RTL models through the HLS flow, and the RTL descriptions are then run through the implementation flow to generate the congestion metrics. With the trained model, the highly congested regions in the source code of the target design can be detected during the prediction phase, and users can resolve congestion issues in the HLS flow without running the time-consuming RTL implementation flow.

Designs that have high utilization are more susceptible to congestion. Generally, a design that uses more than 80% of the Slice LUTs becomes very difficult to route while still meeting timing. The actual percentage at which this difficulty appears is highly design dependent and is affected by factors such as the number of control sets (clock, reset, and enable) in the design and the presence of high fanout nets. Over-utilization of other site types such as FFs, LUTRAMs, block RAMs, and DSP sites can also lead to congestion.

One suggestion to overcome such congestion would be to balance the utilization of different types of sites. For instance, if many LUTRAMs are causing an over-utilization of LUTs, then moving some of these to block RAM sites could help.
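A quick way to spot an unbalanced design is to compare utilization across site types. The sketch below is a minimal example, assuming the used and available counts have already been copied by hand from a utilization report into a dictionary. The function name, the sample numbers, and the choice to apply the 80% figure to every site type are assumptions for illustration.

```python
def flag_overutilized(usage, threshold=0.80):
    """Return (site_type, ratio) pairs whose utilization exceeds threshold.

    usage: dict mapping site type to a (used, available) tuple.
    The result is sorted with the most heavily used site type first.
    """
    flagged = []
    for site_type, (used, available) in usage.items():
        if available <= 0:
            raise ValueError(f"no available sites given for {site_type}")
        ratio = used / available
        if ratio > threshold:
            flagged.append((site_type, ratio))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical numbers, not taken from a real device or design.
usage = {
    "Slice LUTs": (172000, 200000),
    "LUTRAM": (60000, 90000),
    "Block RAM": (120, 600),
    "DSP": (300, 1000),
}
for site_type, ratio in flag_overutilized(usage):
    print(f"{site_type}: {ratio:.0%} utilized")
```

In this made-up case the LUTs are over the threshold while the block RAMs are mostly idle, which is the situation where moving some LUTRAM-based memories into block RAM is worth trying.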

report_high_fanout_nets - Finding high fanout nets can be crucial in fighting congestion. Specifically, non-clock control signals that have a high fanout can cause congestion. You can make a list of high fanout nets with synchronous drivers and use the following command to relieve congested areas.

phys_opt_design -force_replication_on_nets

If a high fanout net is driven from a LUT and cannot be replicated with phys_opt_design, then the net's driver can either be manually replicated in the RTL, or a global buffer (BUFG) can be added if the added delay is acceptable.
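The selection step described above can be sketched as a small filter that keeps non-clock nets with a register driver and a large fanout, then formats them into a single phys_opt_design call. This is only a sketch: the record layout, the function name, the net names, and the fanout threshold of 1000 are all assumptions, and the net list would have to be transcribed from the high fanout report.

```python
def replication_command(nets, min_fanout=1000):
    """Build a phys_opt_design command for high fanout, register-driven nets.

    nets: list of dicts with keys 'name', 'fanout', 'driver_type'
          ('FF' or 'LUT') and 'is_clock'.
    Returns the Tcl command string, or None if no net qualifies.
    """
    selected = [
        net["name"]
        for net in nets
        if not net["is_clock"]
        and net["driver_type"] == "FF"
        and net["fanout"] >= min_fanout
    ]
    if not selected:
        return None
    return "phys_opt_design -force_replication_on_nets [get_nets {%s}]" % " ".join(selected)


# Hypothetical nets for illustration only.
nets = [
    {"name": "core/rst_sync", "fanout": 5200, "driver_type": "FF", "is_clock": False},
    {"name": "core/en_mux", "fanout": 3100, "driver_type": "LUT", "is_clock": False},
    {"name": "clk_200", "fanout": 40000, "driver_type": "FF", "is_clock": True},
    {"name": "dma/valid", "fanout": 300, "driver_type": "FF", "is_clock": False},
]
print(replication_command(nets))
```

LUT-driven nets such as core/en_mux in this example are deliberately left out of the command, since those are the ones that may need manual replication in the RTL or a BUFG instead.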

report_design_analysis - The report_design_analysis -complexity command can also be used to see if a design is more complex and susceptible to congestion.

report_design_analysis -help gives additional information on how to use this. From the IDE, navigate to

Tools -> Report -> Report Design Analysis.

Example command:

report_design_analysis -congestion -complexity -hierarchical_depth 10

This reports the complexity (the Rent exponent) for up to 10 levels of hierarchy. The -congestion option also lists the most heavily utilized routing tiles.

report_qor_suggestions - Running the report_qor_suggestions command on a routed design can give valuable feedback on constraint, Tcl, and design changes that can help with congestion problems. The focus of the command is QoR improvements relating to timing-critical paths, but it also makes congestion-specific suggestions when particular congestion scenarios are detected.

Logic related to Congestion - Find the logic that occupies the congested tiles by building a schematic. This logic can then be checked for high-fanout nets that contribute directly to the congestion.

Local vs Global Congestion - Congestion can be local to a certain region of the device, even when the overall device utilization is low. In this case, certain restrictions such as I/O connections or area constraints such as pblocks can cause the congestion and should be checked. Try loosening or removing pblock constraints to see how the results differ.
