SQL - born for functional coverage (1)
In the previous post I claimed that SQL queries over log files could deliver all the forms of coverage available in SystemVerilog coverage syntax, and much more. In this post I'll focus on showing that SQL can do everything you can do with SystemVerilog coverage syntax, and in the next post I'll focus on the "much more". By the time we're done with these two posts, I hope you'll be starting to see the possibilities that moving from SV coverage to SQL opens up. Needless to say, all the SQL query examples I'll be showing can be fine-tuned, focused and extended without rerunning any simulations, which would be impossible with the SV coverage flow. Since I assume readers of these posts are not SQL experts (neither am I, btw), and have no plans to become ones, I will explain the queries at a high level only, which should give a good enough understanding of the possibilities and of a few limitations. For those few who would like to get their hands dirty and dive deep into the SQL, you can find all the code examples, and a few more, running in this SQL fiddle.
For the purpose of the examples, assume that we have parsed the transaction log files collected on an AXI interface and placed the transactions in an SQL table called axi_if_1. I'll get to the process of parsing the log files and placing them in SQL tables a couple of posts down, so for the moment just assume it's solved. I guarantee there's very little magic in it beyond some regular expressions and an SQL CREATE TABLE statement. The first few lines and columns of the table axi_if_1 are shown below.
time rd_wr addr burst len
150 RD 165377426 INCR 12
250 RD 2310710676 FIXED 13
350 WR 2328599037 FIXED 15
360 WR 2921785595 INCR 6
500 RD 1490070710 INCR 0
550 RD 3668650794 FIXED 8
1000 WR 1314868187 INCR 13
1100 RD 3114753989 FIXED 10
1110 RD 2032547025 FIXED 14
1120 WR 2834194867 INCR 4
1490 RD 4294967295 FIXED 8
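For reference, here's a minimal sketch of the kind of CREATE TABLE statement that could back this table. The column types here are my assumptions, not necessarily what the fiddle uses:

CREATE TABLE axi_if_1 (
time BIGINT, # simulation timestamp
rd_wr VARCHAR(2), # 'RD' or 'WR'
addr BIGINT, # AXI address (fits 32-bit unsigned values)
burst VARCHAR(8), # 'FIXED', 'INCR' or 'WRAP'
len INT # burst length
);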
To warm ourselves up, let's try to figure out which burst type values have been covered on this interface. The following query gives us all the distinct values in the burst column:
SELECT DISTINCT burst FROM axi_if_1
burst
INCR
FIXED
Those of you who are even slightly familiar with AXI will immediately note that we have a coverage hole here, as the WRAP value doesn't appear in any transaction. However, if you don't know AXI, you can't tell there's a coverage hole from the results. Why don't we see a hole in the table above? Well, obviously, SQL can only look at the transactions that happened, not at those that didn't. Therefore, without additional information about the expected coverage space, it can't tell us anything about coverage holes. This behaviour, although obvious, is somewhat surprising if you're used to SV coverage, since SV does provide information about coverage holes without any explicit input about the expected coverage space from the user. For example, if you have the following:
typedef enum {FIXED = 0, INCR = 1, WRAP = 2} burst_type_t;
//…
burst_type_t burst;
covergroup axi_tr;
burst : coverpoint burst;
endgroup
SV will automatically show a coverage hole for the WRAP value, because it uses the type definition as a hint about the coverage space. Since SQL doesn't have access to SV type information, it can't pull off SV's trick. To provide it with the information that SV obtains via type definitions, we will need to create tables like the ones below for our enum types. Such information is rather easy to get into a log file, so for the moment don't worry about how it got there.
rdwr_type:
name value
RD 0
WR 1
burst_type:
name value
FIXED 0
INCR 1
WRAP 2
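In case you'd like to reproduce this at home, here's a minimal sketch of how these two tables could be created and populated. This is my own setup; the fiddle may do it differently:

CREATE TABLE rdwr_type (name VARCHAR(8), value INT);
CREATE TABLE burst_type (name VARCHAR(8), value INT);
INSERT INTO rdwr_type VALUES ('RD', 0), ('WR', 1);
INSERT INTO burst_type VALUES ('FIXED', 0), ('INCR', 1), ('WRAP', 2);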
And now that we have the type information, we can modify our query to look like the one below, where we (the numbers refer to the comments in the code snippet):

1. Select the name column from the burst_type table shown above. This column contains our expected values.
2. Match each name value to all transactions with a corresponding burst field from axi_if_1. The only value that will be left unmatched by any transaction is WRAP.
3. Collapse all lines that share an identical burst_type.name into one, leaving us with the 3 entries we were expecting.
4. Polish it using an *IF* function, to present TRUE/FALSE values.
SELECT DISTINCT # (3)
t.name AS expected, # (1)
IF(axi.burst IS NOT NULL, 'TRUE', 'FALSE') AS hit # (4)
FROM burst_type t
LEFT JOIN axi_if_1 axi
ON t.name = axi.burst # (2)
expected hit
INCR TRUE
FIXED TRUE
WRAP FALSE
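As a side note, SV reports typically show just the holes. That view is one extra WHERE clause away; a small sketch of my own, not strictly needed for what follows:

SELECT t.name AS hole
FROM burst_type t
LEFT JOIN axi_if_1 axi ON t.name = axi.burst
WHERE axi.burst IS NULL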
Moving to cross coverage is where you'll first see that the title of this post is no exaggeration. Assume we want to cross burst type with rd/wr. Given the two type definitions shown above, all we have to do to generate a table containing the expected coverage space is perform the following query. Note how SQL stole the keyword CROSS from SystemVerilog cross coverage.
SELECT
t1.name AS burst,
t2.name AS direction
FROM burst_type t1
CROSS JOIN rdwr_type t2
burst direction
FIXED RD
FIXED WR
INCR RD
INCR WR
WRAP RD
WRAP WR
To match those expected values to the actual values, we add a left join with the AXI transactions table, but this time we match lines where both burst and rd/wr are equal.
SELECT DISTINCT
t1.name AS burst,
t2.name AS direction,
IF(axi.burst IS NOT NULL, 'TRUE', 'FALSE') AS hit
FROM burst_type t1
CROSS JOIN rdwr_type t2
LEFT JOIN axi_if_1 axi
ON t1.name=axi.burst AND t2.name=axi.rd_wr
burst direction hit
INCR RD TRUE
FIXED RD TRUE
FIXED WR TRUE
INCR WR TRUE
WRAP RD FALSE
WRAP WR FALSE
On AXI it is common to cross direction and memory segment, to make sure all segments were accessed. Using SQL variables, we can easily split the address range into 4, and generate a new list of expected buckets with the following query. The *SELECT ctr etc.* part in the brackets is simply SQL's somewhat funny way of generating a column with the numbers 0,1,2,3.
SET @buckets = 4;

SELECT
t1.name AS direction,
t2.ctr * 1000000000 AS segment
FROM rdwr_type t1
CROSS JOIN (SELECT ctr FROM ctr_to_100 WHERE ctr < @buckets) t2
direction segment
RD 0
WR 0
RD 1000000000
WR 1000000000
RD 2000000000
WR 2000000000
RD 3000000000
WR 3000000000
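A word about the helper table ctr_to_100: I'm assuming it's simply a pre-built table holding the numbers 0 to 99. If your MySQL is 8.0 or later, one way to build such a table is with a recursive CTE (again, the fiddle may do it differently):

CREATE TABLE ctr_to_100
WITH RECURSIVE seq (ctr) AS (
SELECT 0
UNION ALL
SELECT ctr + 1 FROM seq WHERE ctr < 99
)
SELECT ctr FROM seq;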
Matching it to the actual results only requires one additional *AND* in the condition that does the matching. I also added an *ORDER BY* at the end, so that the combinations are easy to follow.
SELECT DISTINCT
t1.name AS direction,
t2.ctr * 1000000000 AS segment,
IF(axi.addr IS NOT NULL, 'TRUE', 'FALSE') AS hit
FROM rdwr_type t1
CROSS JOIN (SELECT ctr FROM ctr_to_100 WHERE ctr < @buckets) t2
LEFT JOIN axi_if_1 axi ON
t1.name = axi.rd_wr AND
t2.ctr = FLOOR(axi.addr/1000000000)
ORDER BY direction, segment
direction segment hit
RD 0 TRUE
RD 1000000000 TRUE
RD 2000000000 TRUE
RD 3000000000 TRUE
WR 0 FALSE
WR 1000000000 TRUE
WR 2000000000 TRUE
WR 3000000000 FALSE
To ignore one of the combinations in the cross, just exclude it using WHERE:
SELECT DISTINCT
t1.name AS burst,
t2.name AS direction,
IF(axi.burst IS NOT NULL, 'TRUE', 'FALSE') AS hit
FROM burst_type t1 CROSS JOIN rdwr_type t2
LEFT JOIN axi_if_1 axi ON
t1.name=axi.burst AND
t2.name=axi.rd_wr
WHERE
t1.name <> 'INCR' OR
t2.name <> 'RD'
burst direction hit
FIXED RD TRUE
FIXED WR TRUE
INCR WR TRUE
WRAP RD FALSE
WRAP WR FALSE
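SV's illegal_bins has a natural analogue as well: instead of excluding a combination, query for the transactions that hit it, and fail the run if anything comes back. A quick sketch, where I'm hypothetically treating WRAP writes as illegal:

SELECT * FROM axi_if_1
WHERE burst = 'WRAP' AND rd_wr = 'WR'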
To finish our 101 SV-to-SQL dictionary: to get a coverage number for your covergroups, just replace TRUE with 1 and FALSE with 0, and average over the hit column. Here's how you'd do that on a cross of burst × direction × segment:
SELECT AVG(hit)*100 AS coverage_number FROM (
SELECT DISTINCT
t1.name AS burst,
t2.name AS direction,
t3.ctr * 1000000000 AS segment,
IF(axi.addr IS NOT NULL, 1, 0) AS hit
FROM burst_type t1
CROSS JOIN rdwr_type t2
CROSS JOIN (SELECT ctr FROM ctr_to_100 WHERE ctr < @buckets) t3
LEFT JOIN axi_if_1 axi ON
t1.name=axi.burst AND
t2.name=axi.rd_wr AND
t3.ctr=floor(axi.addr/1000000000)) t4
coverage_number
29.1666
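And since per-bin hit counts are just a GROUP BY away, SV's *at_least* option maps naturally too. This one is my own extension, not part of the original set of examples:

SELECT burst, rd_wr, COUNT(*) AS hits
FROM axi_if_1
GROUP BY burst, rd_wr
ORDER BY hits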
Starting to get the hang of it? Even enjoying it in some weird way? You can take a deeper look at the SQL fiddle and tweak the above examples, and a few more, to implement any missing features of SV coverage that you fancy. At a high level, I can say that practically anything you're used to doing with SV coverage syntax can be done with SQL. Furthermore, you can wrap the SQL in Python to create coverage functions that are almost 1-to-1 with the "good old" SV API, and keep all the SQL syntax under the hood. As promised in the first post, I will show how this is done when wrapping this series of posts up. However, we have not gathered here to get more of the same. Yes, all the above queries are dynamic, can be run long after simulation has ended, and can be modified and debugged on the fly, but still, they only give you the same information in a more convenient way. It would be unfair to SQL to leave the comparison here, as it can do much more with the data at hand. In the next post, we will go well beyond SV coverage.
Comments

Staff Verification Engineer (4 years ago): I find this really interesting. Just today I tried similar experiments with non-SQL databases, i.e. MongoDB, and it seems to work very well. I prefer non-SQL because I was able to modify the coverage model more easily and the query mechanism seemed easier to implement. Thank you for sharing.

Lead Software Engineer at Instinct Science (6 years ago): Have you looked at databases that are created specifically for log file storage, like AWS Kinesis or Apache Kafka? These tools have been optimized for this process and seem like they'd be more efficient than trying to represent log data in a relational way and using SQL.

Founder & CEO | Semiconductor Innovator | MBA | 30+ Years in EDA, Hardware Design & Verification | Building Brands & Global Teams (6 years ago): Interesting article. But if you have a huge log file that needs to be parsed, won't it be a big challenge to scale? Although SV coverage has its own scalability issues. I wonder which is the lesser of the two evils.

Author's reply: Anupam Bakshi, thanks for the question. To set the context, huge log files that need to be parsed are by no means unique to HW verification. Actually, when you compare HW verification log files to those generated by other applications using SQL for data analysis, such as all sorts of web services and web servers, you might not call them *huge* anymore. Still, many such applications make daily use of various SQL services on the cloud for their data analysis. Now, answering your question specifically: parsing and ingesting logs into a coverage database is usually a per-regression activity - looking at coverage for a regression, rather than for one test, is the more common use case. This means that, depending on your requirements, you can use the entire regression run time to upload (assuming cloud SQL), parse and ingest records. So basically you parse/ingest logs from one test while other tests are running. You can also parse/ingest logs while the test itself is running, which would require slightly more complicated scripting, but nothing too complicated. Finally, if you really want to see coverage instantaneously (which is not likely), since we're talking cloud, you can always get enough computing power to do so. Parsing/ingesting speed is a direct function of your compute. Last comment in this TL;DR answer: as you probably know, ingesting, and especially merging, of databases from different simulations takes time with regular SV solutions too. The fact that ingestion time is part of run time doesn't mean it is not there. Merge time is more obvious, and can be pretty long depending on the size of your database.

Senior Director, ASIC Verification and Emulation at Micron Technology (6 years ago): Interesting concept, Avidan. I enjoy reading your articles and might even give it a try soon! Keep sharing innovative ideas.