Latency Testing in a Nanosecond World
As components in the network have become faster and more feature-rich, the methods used to test and verify a vendor's claims of superiority over its competitors must keep pace.
Many elements such as switches, FPGA feed handlers, servers and NICs now operate in the sub-200 nanosecond latency range. This article is my opinion; I invite you to share yours.
The Essentials
- Use OEM parts for the test rig from well-known vendors in the space who test and verify their own equipment before it goes to market. It is tempting to cut costs with aftermarket parts, but if the results look sketchy, the rig will be the first thing questioned.
- Always measure from an external copy. Never use SPAN or measurement mechanisms built into the devices you are testing. I always use a good-quality optical TAP to take a copy of the traffic and send it to a TAP aggregation device.
- Timestamp as close as possible to the TAP, using a TAP aggregation device capable of nanosecond ingress timestamping that ideally writes the timestamp into the header of the copied packet. That way, if there is congestion or buffering in the packet broker network, the timestamp still reflects the capture point rather than the analysis tool.
- Use equal cable lengths so you are working with a known quantity. Latency for standard fibre is about 4.9 nanoseconds per metre (5 nanoseconds is what most people use, but who is counting? Oh yes, YOU!!), so you can subtract the latency introduced by the TAP infrastructure at each hop.
- Use min, mean, 50th, 99th and 99.99th percentiles for analysis. Max is too much of an unknown in my opinion: was it the device, or a measurement error?
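The cable-length correction above can be sketched in a few lines. This is a minimal illustration, assuming ~4.9 ns per metre for standard fibre; the function name and the cable lengths are hypothetical, not from any specific tool:

```python
# Sketch: subtracting known fibre propagation delay from a raw measurement.
# Assumes ~4.9 ns per metre for standard fibre; the example cable lengths
# are illustrative only.

FIBRE_NS_PER_METRE = 4.9

def corrected_latency_ns(raw_delta_ns: float,
                         north_cable_m: float,
                         south_cable_m: float) -> float:
    """Remove the propagation delay of the TAP-to-aggregator cable runs."""
    cable_ns = (north_cable_m + south_cable_m) * FIBRE_NS_PER_METRE
    return raw_delta_ns - cable_ns

# Example: 2 m of fibre on each side adds ~19.6 ns that is not device latency.
print(f"{corrected_latency_ns(250.0, 2.0, 2.0):.1f}")  # prints 230.4
```

With equal, known cable lengths this correction is a constant you can subtract with confidence; with unknown lengths it becomes another source of error.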
The Basic Architecture
Often we are pushed for a decision. If you are not fortunate enough to have a permanent test rig into which you can insert the devices being tested, then at a minimum use an external traffic generator to do both the sending of the data and the measurement.
A few words of caution with this approach however:
- You are relying on the accuracy of the test tool: not only its ability to send the data with the pattern and interframe gap you specify, but also the accuracy of what it reports back to you.
- These tools usually work with around 30 nanoseconds of accuracy. Fine (or maybe not) if you are measuring a 600 ns device, where the margin of error is 5%, but what about a Layer 1 switch that can replicate in 5 nanoseconds?
- If you are doing a more complex measurement, such as a raw-market-data-to-normalized test, the traffic generator, unless it is more sophisticated, will be unable to tie the pre-ingress data back to the post-egress data.
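The margin-of-error point is worth making concrete. A quick sketch, assuming the ±30 ns tool accuracy figure above (check your own tool's datasheet):

```python
# Sketch: how tool timestamp uncertainty swamps fast devices.
# Assumes a +/-30 ns generator accuracy figure, as discussed above.

TOOL_UNCERTAINTY_NS = 30.0

def relative_error(device_latency_ns: float) -> float:
    """Measurement uncertainty as a fraction of the device's own latency."""
    return TOOL_UNCERTAINTY_NS / device_latency_ns

print(f"{relative_error(600):.0%}")  # prints 5% for a 600 ns switch
print(f"{relative_error(5):.0%}")    # prints 600% for a 5 ns Layer 1 device
```

At 5 ns of device latency, the tool's own uncertainty is six times the thing being measured, so the numbers are effectively meaningless.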
The Preferred Architecture
If you have a dedicated environment to test in, or are building one (because let's be honest, if you are playing in the nanosecond arena, you should), then something like the following is what you should be testing in.
This looks complicated for measuring one device, but it is necessary:
1. As the traffic leaves the traffic generator, a copy is taken from the passive optical TAP north of the device being tested.
2. That copy is sent to the TAP aggregation device, which on ingress applies a timestamp (more on this later) and possibly an index value to identify the TAP if there are multiple copies of the same data.
3. The TAP aggregation device forwards the packet, with the added values, to the tool port where your analysis tool is connected.
4. In parallel to steps 2 and 3, the traffic egresses the device and a copy is taken from the passive optical TAP south of the device being tested.
5. Steps 2 and 3 are repeated for the packet originating from the south TAP.
6. The analysis tool now has two copies of the packet: one north of the device and one south. Be sure here that the analysis tool uses the timestamp inserted by the TAP aggregation device for its latency measurement.
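The pairing logic in the final step can be sketched as below. This is a simplified illustration, not a vendor format: the field names and the idea of matching copies by a payload hash are assumptions, and real aggregators encode the timestamp in a proprietary header or trailer:

```python
# Sketch: pairing north and south copies of the same packet and taking the
# delta of the aggregator-inserted timestamps. Field names and hash-based
# matching are illustrative assumptions, not a specific vendor's format.

import hashlib
from typing import NamedTuple

class TaggedPacket(NamedTuple):
    payload: bytes        # original frame as captured
    hw_timestamp_ns: int  # nanosecond timestamp added by the TAP aggregator

def latency_deltas(north: list[TaggedPacket],
                   south: list[TaggedPacket]) -> list[int]:
    """Match packets by payload hash; return south-minus-north deltas in ns."""
    north_by_hash = {hashlib.sha256(p.payload).digest(): p.hw_timestamp_ns
                     for p in north}
    deltas = []
    for p in south:
        key = hashlib.sha256(p.payload).digest()
        if key in north_by_hash:  # unmatched packets are simply skipped here
            deltas.append(p.hw_timestamp_ns - north_by_hash[key])
    return deltas
```

Note that hashing the payload only works when the device forwards the packet unmodified; for a raw-to-normalized feed handler test you would instead key on a business identifier (order ID, sequence number) present in both copies.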
The TAP Aggregation device
As you have probably deduced, the TAP aggregation device is a vitally important part of the measurement. When selecting a device for this function, a few considerations should be made:
- The device must ingress-timestamp in hardware. A software-derived timestamp is useless for this type of measurement.
- The device must have an extremely accurate mechanism for keeping time. It should have a Pulse-Per-Second (PPS) input from a GPS unit (or a technology of comparable accuracy) to discipline its clock. Using PPS, we can achieve timestamp precision in the single-digit nanoseconds or better. At the very least, use a PTP feed if PPS is not possible.
- Ideally, a non-blocking architecture.
There are a few vendors in the market who operate in this space and approach it in different ways.
The Analysis Tool
Once the data is captured, you need a tool to process it and provide you with metrics. You could probably find, or even write, a basic tool that compares the timestamps between the two copies of the data and reports on the delta between them. However, it is important that the tool understands the timestamp format the TAP aggregation device is adding.
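Once you have a list of per-packet deltas, the summary statistics recommended earlier (min, mean, 50th, 99th and 99.99th percentiles) are straightforward to compute. A minimal sketch using only the standard library, with a simple nearest-rank percentile (a real tool would use a more refined interpolation):

```python
# Sketch: the min / mean / p50 / p99 / p99.99 summary suggested earlier,
# using only the standard library and nearest-rank percentiles.

from statistics import mean

def percentile(sorted_ns: list[int], p: float) -> int:
    """Nearest-rank percentile of an already-sorted sample."""
    idx = max(0, round(p / 100 * len(sorted_ns)) - 1)
    return sorted_ns[idx]

def summarize(deltas_ns: list[int]) -> dict[str, float]:
    """Summary statistics for a sample of per-packet latency deltas (ns)."""
    s = sorted(deltas_ns)
    return {
        "min": s[0],
        "mean": mean(s),
        "p50": percentile(s, 50),
        "p99": percentile(s, 99),
        "p99.99": percentile(s, 99.99),
    }
```

Remember that a 99.99th percentile only means something with a large sample: with fewer than ten thousand packets it collapses onto the maximum, which is exactly the statistic argued against above.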
If you are looking to do more detailed analysis, such as multi-hop latency measurement, tick-to-trade, or market data feed handling, there are a few commercially available solutions in the market which can perform the task.
How do you measure?
Thanks for reading this article. The great thing about technology is that it enables us to solve problems in different ways. This is just my approach and opinion; I would love to hear yours!
Comments
Network Engineering at Optiver (4 years ago): Really good article Himesh! I would add a PPS feed to the measuring tool to discipline its clock. Now, what about the optical TAP: would you go with 50/50 or 70/30? Does it make any difference?
Head of Product Management at Beeks Financial Cloud (4 years ago): Really nice article! Thank you for posting. I think one important point to make is that you should ensure that you can continue to monitor the latency profile of the device (using optical TAPs, or similar) once it enters production. This is because (as Perry Young has noted) you will almost certainly find that your traffic generator has different characteristics from production traffic (using a replay of actual recorded production traffic can help mitigate this). You may also find that there are more concurrent demands on the device's resources than you allowed for in your testing. This also raises the challenge of ensuring that you have the tooling to continually monitor latency across many different devices, and that you have the packet data storage you need if required to dig below the tool output. Packet data storage is still required because, in my experience, a device vendor will never take the output of the analysis tool as evidence of a problem, but will take a packet capture as proof of a latency or loss issue with their device.
Ph.D. | BizDev and Sales Eng Lead at Safran Navigation and Timing (4 years ago): Nice article! Very useful for local testing. It becomes a little more complex in production, but it is a matter of choosing where you want to timestamp and how to synchronize the timestampers.