Use ASN.1 and Protocol Buffers for efficient binary serialisation
When every millisecond of latency matters, the size of the serialized data gets a great deal of attention. If that requirement is paired with the need for a strongly typed interface, then two beautiful formats will, in my opinion, not disappoint: ASN.1 and Protocol Buffers (protobuf). Let's take a brief look at each of them.
ASN.1 has been around since the early days of computer communication. Its first known usage dates to the 1980s, and it gained traction quickly, with early adoption towards the start of the 1990s. In the air traffic and telecommunications worlds, ASN.1 has been the first choice for most of the years since. Standardization picked up in the early 1990s, and the ASN.1 tools space matured soon after. By the mid-1990s, ASN.1 was well established both as a standard and in the maturity of its commercial tools.
From a syntax perspective, ASN.1 is very rich. It has built-in primitive types, and you can create your own types with constructs such as SEQUENCE, SET and SEQUENCE OF. The syntax offers advanced capabilities such as AUTOMATIC TAGS, the possibility to mark components as OPTIONAL, and a rich way to manage exceptions. It is clean and unambiguous. ASN.1 also builds extensibility into the syntax, which supports forward and backward compatibility for most of its types (extensibility markers need more expertise when used with unaligned PER!).
To use ASN.1, you need an ASN.1 compiler, a tool that generates language-specific classes (e.g. C, C++, Java, Python) from the definition. These generated classes, called the concrete syntax here, are nothing but the definition transformed into source files for the target language. ASN.1 compilers support a number of languages, and there are quite a few options out there.
At transfer time, ASN.1 values are serialized with one of several sets of encoding rules, such as the Basic Encoding Rules (BER), the Packed Encoding Rules (PER), the XML Encoding Rules (XER) and a few more, some more verbose than others. The encoding rules define precise bit patterns for the values in the data structures. BER is based on the concept of T (Type), L (Length) and V (Value): each value is serialized as a tag that identifies its type, followed by the length of the content and then the content itself. PER drops most tags and lengths where the schema already implies them, so it is more compact. Together, the rich syntax and the encoding rules provide run-time efficiency on the wire.
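To make the T-L-V idea concrete, here is a minimal hand-rolled sketch of BER encoding in Python. It is only an illustration, not what an ASN.1 compiler generates: the `Person` type, its fields and all helper names are assumptions made up for this example, and the sketch handles only single-octet universal tags and definite lengths.

```python
# Hypothetical ASN.1 definition used for illustration:
#   Person ::= SEQUENCE { id INTEGER, name UTF8String }

TAG_INTEGER = 0x02      # universal tag for INTEGER
TAG_UTF8STRING = 0x0C   # universal tag for UTF8String
TAG_SEQUENCE = 0x30     # universal tag for SEQUENCE (constructed)


def ber_length(n: int) -> bytes:
    """Encode a definite length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def ber_tlv(tag: int, value: bytes) -> bytes:
    """Concatenate Tag, Length and Value octets."""
    return bytes([tag]) + ber_length(len(value)) + value


def ber_integer(n: int) -> bytes:
    """Encode an INTEGER as minimal two's-complement octets."""
    magnitude = n if n >= 0 else ~n
    size = magnitude.bit_length() // 8 + 1
    return ber_tlv(TAG_INTEGER, n.to_bytes(size, "big", signed=True))


def ber_utf8string(text: str) -> bytes:
    return ber_tlv(TAG_UTF8STRING, text.encode("utf-8"))


def ber_sequence(*components: bytes) -> bytes:
    """A SEQUENCE is a TLV whose value is the concatenated component TLVs."""
    return ber_tlv(TAG_SEQUENCE, b"".join(components))


person = ber_sequence(ber_integer(300), ber_utf8string("Ada"))
print(person.hex(" "))
```

Every component carries its own tag and length, which is why BER is self-describing but larger than PER for the same value.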
Now let's look at Protocol Buffers, better known as protobuf. It arrived as the new kid from Google around 10-12 years back. Not many would know how much protobuf has in common with ASN.1; it shares quite a number of its features. Like ASN.1, protobuf is structured and offers primitive datatypes. The definition lives in a file with a .proto extension. Like ASN.1, protobuf offers extensibility and language independence, so it can carry serialized data in a forward- and backward-compatible way. A field in the .proto file can be labelled required, optional or repeated (proto3 drops required). Again, there are proto compilers that transform the definition into a concrete, language-specific structure. Like ASN.1, the data is encoded at runtime. Each field is written as a key, a varint that combines the field number with a 3-bit wire type, followed by the encoded data itself; there are only a few wire types. proto3 improves performance further because it packs repeated scalar numeric fields by default.
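The key-plus-payload layout can be sketched by hand in a few lines of Python. This is an illustration of the wire format only, not the protobuf library API: the `Person` message, its field numbers and the helper names are assumptions for this example, and only unsigned varint and length-delimited fields are covered.

```python
# Hypothetical message used for illustration:
#   message Person { uint32 id = 1; string name = 2; repeated uint32 scores = 3; }

WIRE_VARINT = 0  # wire type 0: varint
WIRE_LEN = 2     # wire type 2: length-delimited (strings, bytes, packed repeated)


def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 bits per byte, high bit set on all but the last byte."""
    if value < 0:
        raise ValueError("this sketch handles unsigned values only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_key(field_number: int, wire_type: int) -> bytes:
    """The key is (field_number << 3) | wire_type, itself written as a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_uint_field(field_number: int, value: int) -> bytes:
    return encode_key(field_number, WIRE_VARINT) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    data = text.encode("utf-8")
    return encode_key(field_number, WIRE_LEN) + encode_varint(len(data)) + data


def encode_packed_field(field_number: int, values: list[int]) -> bytes:
    """Packed repeated field: one key, one length, then all varints back to back."""
    payload = b"".join(encode_varint(v) for v in values)
    return encode_key(field_number, WIRE_LEN) + encode_varint(len(payload)) + payload


message = (
    encode_uint_field(1, 300)
    + encode_string_field(2, "Ada")
    + encode_packed_field(3, [1, 2, 3])
)
print(message.hex(" "))
```

Packing is where the saving for repeated fields comes from: the three scores share a single key and length instead of repeating the key before every element.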
I know some of you would also think of XML and JSON for serialized data. However, these are either less strongly typed or lack the performance characteristics, so I have excluded them from this article.
In my opinion, both ASN.1 and protobuf are capable of expressing highly complex message structures. The dilemma is always when to use protobuf and when to use ASN.1. I was curious myself, so I am working on a full comparison with examples, which I will publish in my next article. Until then, if you are curious about ASN.1, I can recommend these resources: https://www.oss.com/asn1/resources/books-whitepapers-pubs/asn1-books.html