(Full disclosure: though I work for Amazon, anything you read below represents my own personal opinions and experiments and nothing beyond that. In my day job at Amazon I have little to do with either HW verification or AWS.)

In hindsight, making coverage collection, coverage analysis and checking a standard part of verification languages (e, SV), verification tools, and standard verification environments was a mistake from the industry perspective. Unlike stimuli, analysis and checking might have some nuances when it comes to RTL verification, but these nuances are far from justifying a set of proprietary language features, tools and methodologies. Working with proprietary techniques for RTL verification analysis and checking has cut us off from new technologies developed in these areas, has made recruiting and training harder than they should have been, and has probably made us pay more for products that can do less. Worse, it has cut RTL verification off from analysis and checking in other parts of the hardware design flow, from SystemC and Matlab simulations to emulation and FPGA prototyping.

With signs that EDA is finally moving to the cloud – Xilinx already has an FPGA design flow on AWS, Mentor has announced Veloce on AWS, and Cadence are hopefully going to announce some form of AWS service at DAC – RTL verification’s “splendid isolation” might be nearing its end. When verification data is on S3, it will be pretty hard to prevent curious engineers from using every tool out there to explore, visualize and analyze it. And when they discover what’s out there, it will be impossible for them to go back. Though initially, like any disruption, this might seem like a scary prospect, I believe that in the long term it will make both DV teams and tool providers focus on their strengths, rather than waste their time reinventing what was already invented elsewhere.

In this post, the fourth one in a series, I’ll be showing just how easy it is to get coverage data out of logs stored on AWS S3 using SQL queries. In previous posts I have explained why a full post-simulation flow makes much more sense when it comes to coverage(1), and have shown how SQL can deliver anything SystemVerilog covergroups(2)/assertions(3) can, cutting the long turn-around time associated with both of these. Today, we will be connecting almost all the dots, and showing the flow from testbench output to SQL results, as described in the simple diagram below. If you feel like running all the steps yourself, all you have to do is get yourself an AWS account, and then use this Jupyter notebook to run it all. For those of you worried about AWS costs, I’ve been preparing and running this demo and others for less than 1 cent (and even that is probably rounded up). For those of you who don’t know Jupyter, it is a way to embed executable Python code into documents. Take a look and you’ll fall in love.

As can be seen above, SystemVerilog’s role is reduced from analysing the data using covergroups to merely printing the information into a log file to be post-processed later. An example of one way to do this is shown in the excerpt below (which you can find running on edaplayground here). The first $display, which can print to a different file or as the first line of the transaction file, provides information about the types of the fields that will be printed into the log later. As discussed at length in the 2nd post, this allows our SQL queries to display information about coverage holes, and also to create the right buckets for integers of different sizes. The second $display just prints those same fields into the log file. Pretty straightforward.

//initialization section: print type information for the fields in our log

$display("# Transaction meta: %s, %d, %s, %d, %d, %s", $typename(tr.dir), $size(tr.addr), $typename(tr.burst), $size(tr.len), $size(tr.id), $typename(tr.lock));


//run section: print the interesting parts of each transaction into the log

$display("# Time: %t, dir: %s, addr: %d, burst: %s, len: %d, id: %d, lock: %s,",$time(), tr.dir.name, tr.addr, tr.burst.name, tr.len, tr.id, tr.lock.name);


To get cross coverage including holes, we need to put the type information from the first SV $display statement into a format that can be read into a database. To do this we write some Python glue code (look it up in the Jupyter notebook if you’re curious) that expands the output of the first $display statement into the following CSV/JSON files. The types.json file lists all the enum values for each enumeration we use in our log. The columns.csv file provides type information for each field printed by the $display statement that prints the transaction.

> cat simple_tb/types_info/types.json

{"enum_type_name": "axi_vip::dir_t", "enum_string": "RD", "enum_int": "0"}
{"enum_type_name": "axi_vip::dir_t", "enum_string": "WR", "enum_int": "1"}
{"enum_type_name": "axi_vip::burst_t", "enum_string": "FIXED", "enum_int": "0"}
{"enum_type_name": "axi_vip::burst_t", "enum_string": "INCR", "enum_int": "1"}
{"enum_type_name": "axi_vip::burst_t", "enum_string": "WRAP", "enum_int": "2"}
{"enum_type_name": "axi_vip::lock_t", "enum_string": "NORMAL", "enum_int": "0"}
{"enum_type_name": "axi_vip::lock_t", "enum_string": "EXCLUSIVE", "enum_int": "1"}
{"enum_type_name": "axi_vip::lock_t", "enum_string": "LOCKED", "enum_int": "2"}

> cat simple_tb/test1/axi_master_1/columns/columns.csv

dir,      axi_vip::dir_t,    0
addr,     int,               32
burst,    axi_vip::burst_t,  0
len,      int,               4
id,       int,               4
lock,     axi_vip::lock_t,   0
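The notebook's glue code is not reproduced in this post, so here is only a minimal sketch of what such a step could look like. It assumes the simulator prints $typename for an enum as `enum{RD=32'sd0,WR=32'sd1}axi_vip::dir_t` (the exact format varies between tools), that enum literals are decimal, and that the field names are supplied by hand. The name `parse_meta` is hypothetical.

```python
import re

# One token per printed field: either an enum $typename string or a plain width.
TOKEN_RE = re.compile(r"enum\{[^}]*\}[\w:.$]+|\d+")
ENUM_RE = re.compile(r"enum\{([^}]*)\}([\w:.$]+)")
# Enum member such as RD=32'sd0 or RD=0 (decimal literals only, by assumption).
MEMBER_RE = re.compile(r"(\w+)=(?:\d+'s?d)?(\d+)")


def parse_meta(meta_line, field_names):
    """Split the '# Transaction meta:' line into column rows and enum rows."""
    payload = meta_line.split("meta:", 1)[1]
    tokens = TOKEN_RE.findall(payload)
    if len(tokens) != len(field_names):
        raise ValueError("meta line does not match the expected field list")

    columns, enums, seen = [], [], set()
    for name, token in zip(field_names, tokens):
        enum_match = ENUM_RE.fullmatch(token)
        if enum_match is None:
            # Plain integer field: the token is its width in bits.
            columns.append((name, "int", int(token)))
            continue
        body, type_name = enum_match.groups()
        columns.append((name, type_name, 0))  # size 0 marks an enum column
        if type_name in seen:
            continue  # each enum type is listed once in types.json
        seen.add(type_name)
        for label, value in MEMBER_RE.findall(body):
            enums.append({"enum_type_name": type_name,
                          "enum_string": label,
                          "enum_int": value})
    return columns, enums
```

Writing `columns` out with the csv module and `enums` as one JSON object per line would give files shaped like the two listings above.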

Now, assuming we have the logs and type information files in the following directory structure, we just upload them to S3 using the AWS CLI commands below, and hop, we’re done with step 2. Note that while transaction data belongs to a specific test, type data is more likely to be relevant to any test being run, which is why the first is under test1 while the second is directly under simple_tb.

> tree simple_tb

simple_tb
├── test1
│   └── axi_master_1
│       ├── columns
│       │   └── columns.csv
│       └── log
│           └── transactions.log
└── types_info
    └── types.json

> aws s3 mb s3://coverage-demo/

> aws s3 sync simple_tb/ s3://coverage-demo/simple_tb --delete
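If you would rather drive the upload from the notebook than from the shell, something along these lines might do. This is a rough sketch, not the notebook's code: `plan_uploads` and `upload` are hypothetical names, and the upload step assumes boto3 is installed and AWS credentials are configured. Unlike `aws s3 sync --delete`, it neither skips unchanged files nor removes stale objects from the bucket.

```python
import os


def plan_uploads(local_root, key_prefix):
    """Return (local_path, s3_key) pairs for every file under local_root."""
    plan = []
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for filename in filenames:
            local_path = os.path.join(dirpath, filename)
            relative = os.path.relpath(local_path, local_root)
            # S3 keys always use forward slashes, whatever the local OS uses.
            key = "/".join([key_prefix.strip("/")] + relative.split(os.sep))
            plan.append((local_path, key))
    return sorted(plan, key=lambda pair: pair[1])


def upload(plan, bucket):
    """Upload each planned file; needs boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the planning step runs without it

    s3 = boto3.client("s3")
    for local_path, key in plan:
        s3.upload_file(local_path, bucket, key)
```

With the tree above, `upload(plan_uploads("simple_tb", "simple_tb"), "coverage-demo")` would place the files under the same keys the CLI command produces.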

To turn these logs into SQL tables, we use an AWS service called Athena, which basically creates SQL databases and tables from files on S3. For example, to put our transaction log into a table, we just run the following query. Athena can open JSON or CSV natively, and can figure out a user-defined format such as our transactions.log file if you give it the right regular expression:

CREATE EXTERNAL TABLE coverage_demo.axi_if1_transactions (
   `time` bigint,
   `dir` string,
   `addr` bigint,
   `burst` string,
   `len` smallint,
   `id` smallint,
   `lock` string
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
    WITH SERDEPROPERTIES (
        'input.regex'='# Time: *([^ ^,]*), dir: *([^ ^,]*), addr: *([^ ^,]*), burst: *([^ ^,]*), len: *([^ ^,]*), id: *([^ ^,]*), lock: *([^ ^,]*),'
    ) LOCATION 's3://coverage-demo/simple_tb/test1/axi_master_1/log/'
    TBLPROPERTIES ('has_encrypted_data'='false');
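Before paying for an Athena query, it can be handy to check the regular expression locally. The sketch below applies the same pattern with Python's re module to a made-up log line; the sample values and the `parse_log_line` name are assumptions for illustration. It uses a whole-line match, which is my understanding of how the RegexSerDe applies the pattern.

```python
import re

# Same pattern as 'input.regex' in the CREATE TABLE statement.
LOG_REGEX = re.compile(
    r"# Time: *([^ ^,]*), dir: *([^ ^,]*), addr: *([^ ^,]*), "
    r"burst: *([^ ^,]*), len: *([^ ^,]*), id: *([^ ^,]*), lock: *([^ ^,]*),"
)
COLUMNS = ("time", "dir", "addr", "burst", "len", "id", "lock")
INT_COLUMNS = ("time", "addr", "len", "id")


def parse_log_line(line):
    """Return a dict for one transaction line, or None if it does not match."""
    match = LOG_REGEX.fullmatch(line.rstrip("\n"))
    if match is None:
        return None
    row = dict(zip(COLUMNS, match.groups()))
    for name in INT_COLUMNS:
        row[name] = int(row[name])  # mirrors the bigint/smallint columns
    return row


# Hypothetical line in the shape produced by the second $display.
sample = ("# Time:                  150, dir: RD, addr:       4096, "
          "burst: INCR, len:  3, id:  1, lock: EXCLUSIVE,")
print(parse_log_line(sample))
```

Each capture group maps to one table column in order, so the seven groups must line up with the seven columns declared in the table.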


And now we’re ready to query our tables as we wish. For example, we can run the query from the last post that finds interrupted exclusive read/exclusive write pairs.

select first_tr.addr as addr, first_tr.time as read_time,  min(middle_tr.time) as interrupted_at, min(second_tr.time) as write_time
from (
    select row_number() over () as num, inner1.time, inner1.addr, inner1.dir, inner1.lock from
        axi_if1_transactions inner1
    where
        inner1.dir = 'WR' or
        inner1.lock = 'EXCLUSIVE'
    order by inner1.addr, inner1.time
    ) first_tr,
    (
    select row_number() over () as num, inner1.time, inner1.addr, inner1.dir, inner1.lock from
        axi_if1_transactions inner1
    where
        inner1.dir = 'WR' or
        inner1.lock = 'EXCLUSIVE'
    order by inner1.addr, inner1.time
    ) second_tr,
    (
    select row_number() over () as num, inner1.time, inner1.addr, inner1.dir, inner1.lock from
        axi_if1_transactions inner1
    where
        inner1.dir = 'WR' or
        inner1.lock = 'EXCLUSIVE'
    order by inner1.addr, inner1.time
    ) middle_tr
where first_tr.addr = second_tr.addr and
         second_tr.addr = first_tr.addr and
     first_tr.lock = 'EXCLUSIVE' and
     second_tr.lock = 'EXCLUSIVE' and
     first_tr.dir = 'RD' and
     second_tr.dir = 'WR' and
     middle_tr.dir = 'WR' and
     first_tr.num < middle_tr.num and
     middle_tr.num < second_tr.num
group by 1,2;
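For readers who find the three-way self join hard to follow, here is the same idea written as a plain Python loop. It is only a sketch of how I read the query's intent, run on transactions already parsed into dicts; the function name and the sample data in the usage note are made up.

```python
from collections import defaultdict


def interrupted_exclusive_pairs(transactions):
    """Return (addr, read_time, interrupted_at, write_time) tuples."""
    # Keep only writes and exclusive accesses, grouped per address in time order.
    by_addr = defaultdict(list)
    for tr in sorted(transactions, key=lambda t: t["time"]):
        if tr["dir"] == "WR" or tr["lock"] == "EXCLUSIVE":
            by_addr[tr["addr"]].append(tr)

    results = []
    for addr, trs in by_addr.items():
        for i, first in enumerate(trs):
            if first["dir"] != "RD" or first["lock"] != "EXCLUSIVE":
                continue
            later = trs[i + 1:]
            # Earliest write after the exclusive read: the candidate interrupter.
            middle_idx = next(
                (k for k, t in enumerate(later) if t["dir"] == "WR"), None)
            if middle_idx is None:
                continue
            # Earliest exclusive write that comes after that interrupting write.
            second = next(
                (t for t in later[middle_idx + 1:]
                 if t["dir"] == "WR" and t["lock"] == "EXCLUSIVE"), None)
            if second is not None:
                results.append(
                    (addr, first["time"], later[middle_idx]["time"], second["time"]))
    return results
```

An exclusive read at time 10, a normal write at 20 and an exclusive write at 30, all to the same address, would be reported as one interrupted pair, while an exclusive read directly followed by its exclusive write would not.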

Or we can run one of our cross coverage examples from post #2, and get a coverage hole:

select distinct expected_values.burst, expected_values.dir, if(axi_if1_transactions.burst is not null, 'TRUE', 'FALSE') as covered from axi_if1_transactions right outer join (
  select enum1_values.burst, enum2_values.dir from (
  select enums_info.enum_string as burst from enums_info, (
    select columns_info.column_type from columns_info 
    where column_name = 'burst'
    ) column_meta 
  where column_meta.column_type = enums_info.enum_type_name
  ) enum1_values
cross join (
  select enums_info.enum_string as dir from enums_info, (
    select columns_info.column_type from columns_info 
    where column_name = 'dir'
    ) column_meta 
  where column_meta.column_type = enums_info.enum_type_name
  ) enum2_values
) expected_values 
on expected_values.burst = axi_if1_transactions.burst and
   expected_values.dir = axi_if1_transactions.dir
order by expected_values.burst, expected_values.dir
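The logic of this query is easier to see outside SQL: build every expected combination from the type information, then mark which ones actually showed up in the log. The Python sketch below does this in memory; the function name and argument shapes are assumptions, with `enum_rows` shaped like the lines of types.json and `column_types` mapping each field to the type name from columns.csv.

```python
from itertools import product


def cross_coverage(transactions, enum_rows, column_types, fields):
    """Return (value, ..., covered) for every combination of the enum fields."""
    values_per_field = []
    for field in fields:
        type_name = column_types[field]
        values_per_field.append(
            [row["enum_string"] for row in enum_rows
             if row["enum_type_name"] == type_name])
    # Combinations that really occurred (plays the role of the outer join match).
    observed = {tuple(tr[field] for field in fields) for tr in transactions}
    # Every expected combination (plays the role of the cross join).
    return [combo + (combo in observed,)
            for combo in sorted(product(*values_per_field))]
```

Filtering the result for rows whose last element is False gives the coverage holes, the same rows the SQL marks as 'FALSE'.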

The only piece of the puzzle still missing is how to visualize these tables linked to a test plan, alongside other forms of coverage such as legacy SV coverage, formal coverage and similar. In the next post, the last in this series, we will explore some of the available options and compare them.

Tudor Timisescu

Verification Gentleman

6 years ago

IMO, constraints and coverage are too far apart from each other, when they model pretty much the same aspects. What I cover, I also want to randomize. This might not apply strictly in the other direction, as ideally I would like to randomize more than I cover, just to have a chance to see if what I covered wasn't enough (i.e. I find more bugs even though I reached 100% coverage). Once you have more complicated crosses that contain exclude bins and those excluded bins you also want to constrain away you're going to have a real bad time maintaining not only two views of the same information, but also across different languages.


