Accelerated Computing Series Part 1: Custom IP & Control Plane

Neeraj Kumar, PhD

AI Scientist | ex-CTO@iRFMW, Huawei, BertLabs | IISc | Reconf. Edge Compute (FPGAs/GPUs), RTOS/Linux | Sensor Fusion | Radar, SDR | Cognitive Systems

Published: September 16, 2024

Welcome to Part 1 of the series, where we explore the basics of IP generation and control. The usual approach is to take existing HDL (Verilog/VHDL) code and wrap it with an AXI interface. For simplicity, we instead use Xilinx Vivado’s IP generation workflow and modify the result with a few Verilog tweaks to add some very simple logic: a counter. The counter’s four MSBs drive the four LEDs on our FPGA board. We use the Pynq-z2, a very popular entry-level board for the Pynq (Python + Zynq) framework. We’ll maybe discuss Pynq in another series, but today we’re going to treat it as a custom FPGA board. We’ll build and boot Linux on it from scratch, and see a couple of ways to control our custom IP in the PL, both from user space.

Hey Pynq, blink them LEDs!

Generating HDL

The first step is to create the HDL design, for which we need an empty project in Vivado. You’ll need to get the board files for the Pynq-z2 board from the TUL website and add them to Vivado. Alternatively, during the project creation phase, search for the board and download the board files if they are not already installed. This will initialize the board constraints & settings for you.

Once the project is created, create a block design, add the ZYNQ7 processing system and run block automation. This will initialize the PS settings & peripherals according to the board. Now you are ready to add a blank custom IP.

You can follow the steps described for the reference design on Xilinx’s docs page, which essentially does the following:

1) In the block design, select Tools -> Create and Package New IP.

2) Create a new AXI4 peripheral and select default properties.

3) In the add interfaces page, select Edit IP and Finish.

4) A new project window opens that lets you edit your custom IP. You’ll see a Package IP window displaying steps involved as follows:

5) In the Hierarchy view of the Sources tab, double click to open myip_v1_0 (or whatever name you gave in the earlier steps). Now is the time to modify the Verilog code according to the reference design linked above.

Basically, you’ll need to add led ports, map port connections, add a 28-bit counter register, and assign 4 MSBs to the led ports. Following this, you’ll need to find the placeholder: ‘// Add user logic here’ and add the following logic that increments the count while the slave register slv_reg0 is set to 0x1. If unset, it stops and won’t increment.

// Add user logic here
    // on positive edge of input clock
    always @( posedge S_AXI_ACLK)
    begin
        //if reset is set, set count = 0x0
        if ( S_AXI_ARESETN == 1'b0 )
            begin
                count <= 28'b0;
            end
        else
            begin
                // when slv_reg0 is set to 0x1, increment count
                if ( slv_reg0 == 32'h1 )
                    begin
                        count <= count+1;
                    end
                else
                    begin
                        count <= count;
                    end
            end
     end
// User logic ends

6) Save the changes and go back to packaging options, select Ports and Interfaces.

7) Click the Merge Changes from Ports and Interfaces Wizard link. This will update the ports and interfaces, and you’ll see the ‘leds’ port appear here.

8) Under Packaging Steps, select Review and Package, and click Re-Package IP. Close the project.

9) You should now be able to search and import the IP in your project. Run Connection Automation to connect it to the Zynq7 PS’s M_AXI_GP0 port.

10) Right-click on the leds port and select Make External.

Your design should look like the one illustrated below:

11) Since we’re using the Pynq-z2 board instead of the ZC702 used in the Xilinx reference design, this is where you need to deviate from that design. In the Flow Navigator, click on RTL Analysis -> Open Elaborated Design. Then select Window -> IO Ports, and make sure you set the following for the leds port:

12) Finally, select Generate Bitstream. Once done, select File -> Export -> Export Hardware, select Include bitstream in the dialog box that follows. You should remember where you save this ‘.xsa’ file.

Building Linux

To build Linux from scratch for our board, we need to choose a build system. Two of the most popular ones are Buildroot and Yocto. A build system is essentially a bunch of scripts & configurations (layers, recipes) that let you choose the packages & their versions, cross-compilers, and libraries required to compile the kernel for a particular hardware architecture, and generate a bootable image and a rootfs. Build systems also help in building the first-stage bootloader, and u-boot and its environment, which then loads the kernel. They are also augmented by vendor-provided tools that generate other essential stuff, such as the device-tree blob (dtb) & bitstream that are passed to both u-boot and the kernel at boot time.

In this series, we’ll use the Petalinux build system from Xilinx/AMD. It is a set of commands on top of the Yocto build system that smooths the build process for Xilinx/AMD devices and integrates well with their ecosystem products, Vivado for HDL & Vitis for embedded software. Let’s get started.

First, we need to download and install Petalinux for our OS. I have built and tested this project with Petalinux 2023.1 on Ubuntu 22.04, but other versions should work too. Once installed, we need to create a project, and there are a couple of ways to do it. If you are working on a standard Xilinx/AMD board, you should be able to get a board support package (bsp) file for it, which you pass to the ‘petalinux-create’ command. If, on the other hand, you are working on a custom board for which a bsp may not be available, as is the case with the Pynq-z2, you instead pass the .xsa (hardware handoff) file generated from the Vivado project above. Execute the following commands:

$ petalinux-create -t project --template zynq --name ptlnx23.1_pynqz2
$ cd ptlnx23.1_pynqz2
$ petalinux-config --get-hw-description=</path/to/xsa_file>

The petalinux-config command will open up a configuration user interface window.

We don’t need to change anything here for our project, but I prefer to keep our rootfs on the second partition of the sdcard, instead of the initramfs (default in petalinux). You’ll find it in the Image Packaging Configuration:

Save & exit. If you wish to add more to the image, such as Avahi services, auto-root login (not recommended in production), or busybox tools (needed for the devmem command), you can do so by configuring the rootfs:

$ petalinux-config -c rootfs

Before we build the kernel and bootable images for our board, we need to compile the driver provided with the reference design as a Linux kernel module (LKM):

$ petalinux-create -t modules --name blink --enable

This creates a template module in the <project-directory>/project-spec/meta-user/recipes-modules/ directory. Overwrite the contents of blink.c with the file provided by the reference design and add blink.h to the same folder. You also need to edit the blink.bb recipe to include blink.h in SRC_URI. Follow the reference design for detailed instructions.

Finally execute:

$ petalinux-build

This will create images for the first stage bootloader (fsbl), U-Boot, and Linux, along with system.dtb and system.bit, in the <project-directory>/images/linux directory.

Note that Petalinux has extracted system.dtb and system.bit from the .xsa file we passed to it above. The bit file configures the PL hardware, and the kernel is made aware of that hardware via the system.dtb device tree, which also contains information about the rest of the PS configuration & peripherals.

To get a bootable image execute:

$ petalinux-package --boot --fsbl --fpga --u-boot --force

This will create a BOOT.BIN image consisting of fsbl, bitstream, u-boot, and system.dtb.

We need to create an sdcard with two partitions: BOOT (fat32) & ROOTFS (ext4).

Copy boot.scr, BOOT.BIN & uImage (Linux) into the BOOT partition, and extract rootfs.tar.gz into the ROOTFS partition.

Boot the board from the sdcard. If you reach the login prompt, everything went well.

Bringing it all together

Have a look at the following architecture.

So far we have covered creating the custom IP and assigning its ports to the onboard LEDs on the PL side, and building and booting the kernel on the PS side. But you won’t see anything happening to the LEDs yet. That’s because the default value of the register is 0. You need to write 0x1 to the register to enable the counter. So, how do we write 0x1 to the register from user space?

There are two approaches, and both involve char devices as we saw in the Linux device driver series.

Approach 1:

The first approach is a quick & dirty one using a userspace tool, devmem. It maps the requested physical address (here an I/O register rather than RAM) into its own virtual address space via mmap on a file descriptor of the ‘/dev/mem’ char device. If proper care is not taken, one can write to a wrong location and cause a kernel crash.

Let’s check the contents of the register. The register address is 0x43C00000, as shown in the Vivado Address Editor.

root@petalinux23:~# devmem 0x43C00000
0x00000000

So, its value is indeed 0x0. Let’s set it to 0x1, check again, and then clear it:

root@petalinux23:~# devmem 0x43C00000 w 0x1
root@petalinux23:~# devmem 0x43C00000
0x00000001
root@petalinux23:~# devmem 0x43C00000 w 0x0
root@petalinux23:~# devmem 0x43C00000
0x00000000

You should see the LEDs blinking while the register holds 0x1, and stop changing once it is written back to 0x0.

Approach 2:

The second approach is a safer, driver-based memory mapping of the register address. Here, there’s no chance of accidentally writing outside the area allocated by the kernel for the device.

If you look at the driver initialization function, you’ll see ioremap being called with the register’s physical address and a size of 0x100, which maps that range to a kernel virtual address. Any driver reads/writes to this virtual address translate to reads/writes of the register via the MMU.

static int __init blink_init(void)
{
	...	
	mmio = ioremap(BLINK_CTRL_REG,0x100);
	...
}

You’ll see the driver code registering ioctl commands to set (0x1) and reset (0x0) this register virtual address:

static void set_blink_ctrl(void)
{
	printk("KERNEL PRINT : set_blink_ctrl \n\r");
	*(unsigned int *)mmio = 0x1;
}

static void reset_blink_ctrl(void)
{
	printk("KERNEL PRINT : reset_blink_ctrl \n\r");
	*(unsigned int *)mmio = 0x0;
}

long device_ioctl(struct file *file,
                  unsigned int ioctl_num,    /* ioctl command number */
                  unsigned long ioctl_param) /* ioctl argument */
{
	char *temp;

	switch (ioctl_num) {
	case IOCTL_ON_LED:
		temp = (char *)ioctl_param;
		set_blink_ctrl();
		break;
	case IOCTL_STOP_LED:
		temp = (char *)ioctl_param;
		reset_blink_ctrl();
		break;
	}
	return SUCCESS;
}

This Fops callback structure is registered in the module initialization function, so any userspace read, write, or ioctl call on the device will trigger these functions:

struct file_operations Fops = {
	.owner = THIS_MODULE,
	.read = device_read,
	.write = device_write,
	.unlocked_ioctl = device_ioctl,
	.open = device_open,
	.release = device_release, /* close */
};

User application to talk to the driver

Now we need a user application that talks to this driver via ioctl commands. As per the reference design doc, create a Linux application project in Vitis with the empty application template. Add the provided linux_blinkled_app.c and blink.h files to the sources folder of the project and build it.

You need to copy the generated linux_blinkled_app.elf somewhere in the sdcard rootfs for easy access.

Loading the driver

Since the driver is a loadable kernel module, it is not built into the kernel and needs to be loaded first. Navigate to /lib/modules/<kernel_version>/extra, and run the command:

$ insmod blink.ko

To create the char device, run the following command:

$ mknod /dev/blink_dev c 244 0

Executing the application

Make the .elf application executable:

$ chmod +x linux_blinkled_app.elf

and execute:

$ ./linux_blinkled_app.elf

The application will ask you to enter 1/0 to set/reset the register, with the same effect that we saw above with the devmem command.

Congrats, now you know how to make a custom IP with some registers, and control it through these registers via device drivers and userspace commands in Linux.

In this part we used the AXI-Lite protocol, but we didn’t discuss it. In the next part, we’ll go deeper into the actual AXI protocols and bring in AXI-Stream for the data plane.

Stay tuned!

Other parts in this series:

Part 0: Linux, FPGAs, GPUs, and some coffee!

Part 2: Streaming Dataplane & Linux Drivers

Part 3: Deep Learning Accelerator on FPGA & Linux Drivers

Part 4: Smart Camera with NLP on FPGA & Linux

Part 5: FPGA over PCIe & Linux

My previous related series:

Embedded Linux Weekend Hacking: Linux Device Drivers
