How Trust Works On The Internet

I concluded the previous article in this "Gentle Introduction to Cryptography" series (solving the message authenticity problem) by highlighting the importance of trust in human existence. It is not easy to notice because, just like air, it is everywhere, we interact with it all the time, and we are biologically and psychologically so tethered to it that it only becomes obvious when it is missing. Like air for individuals, trust is something modern society could not function without, as the vast majority of the actions we perform are based on it in some shape or form. Just think of this very moment in your life, as you are reading this article, and how many things you take for granted: How do you know the battery in your device will not explode (it does contain lithium after all, a highly combustible substance)? If you are plugged in, how do you know the electricity in the socket is the right type, and that it will not change after a few moments? How do you know the building you are in will not collapse, or that the cup of coffee or tea you are enjoying is not poisoned? Now think about other trivial actions, like crossing the street or driving a car, and how many times you assume things to be exactly how you expect them to be without actually verifying them. Verifying is, in a practical sense, the opposite of trust, the verb.

Without trust, we could only do very, very elementary things. Even eating and drinking would be reduced to products that we have consumed before, found in their natural form and environment. We couldn’t eat anything that is new or that comes from other people, as that would represent a potential existential risk. As individuals, we have very limited capabilities to verify things, because sophisticated verification can only be done with complex instruments, and relying on those would inevitably require trust (trust in the manufacturers and scientists that built them, to say the least). But there is an even bigger reason. Most effects are only detectable post factum, which is why monkeys never eat the red berries unless they directly see other monkeys eating them. Bright colors in nature are a telltale sign of poison, the presence of which only becomes evident if poisoning occurs, and is thus impossible to verify by oneself (on oneself). So monkeys - being as smart as they are - will use another subject to verify it: a monkey eating the red berry means it has done that before, and given it is still around, the berries should be safe to eat. Depressed monkeys committing suicide by red berries is enough of an edge case in nature not to impact or disrupt the development of such behavior, but should this be prevalent, red berries would be eternally safe from monkeys, since they would never consume them - the berries would become unverifiable. Such unverifiable processes would grind the evolution of societies to a halt, as they are not merely difficult to overcome without trust, they are impossible to solve.

***

The internet is a digital space, that in itself is not so much a direct producer of verifiable phenomena - not yet at least. However, as a medium of communication, a forwarder of information from one place/time to another, it is essential for it to be able to forward the verified or trust status of a real-world producer to a real world consumer throughout its chain of nodes and processes, which in itself is a problem of second degree verifiability and trust. If you trust that your thermostat is correctly getting the temperature and adjusting your central heater in your home, and you are reading it across the Internet, you also need to trust/verify that the Internet itself is not altering the values while they get from the thermostat to your mobile phone. As seen in the previous chapter, message integrity, this is solvable if we trust the hashing mechanisms and digital signatures. Indeed, the algorithms we trust, but the source of the signature we still have to verify directly. This is not only inconvenient but also very, very constraining. We need a mechanism to be able to verify the authenticity of sources (producers), as well as that of the messages, indirectly, for the Internet to scale. This capability of transmitting the verified status along a chain of nodes we need not verify directly is what constitutes the concept of trust across the Internet.


Certificates and Certificate Authorities

So far we have discussed two important building blocks of cryptography, hashing and asymmetric keys (public/private), and how these two can guarantee the authenticity of a document (not privacy, just impossibility to forge) using digital signatures (a hash of the document encrypted with the private key and verifiable with the shared, public key). The caveat was that you had to know the public key of the source before you got the document, and you could not send that public key with the document because everything could be forged. But what if the public key itself could in turn be signed?

Given the public key is a piece of data itself - a document, basically - we can of course use the same principles as we use on any other document, to ensure its authenticity (impossibility to forge even if publicly accessible). The caveat is in fact also the same: by the time we receive the digitally signed public key (let’s call it object key), we have to already possess the public key that matches the private key that signed the object public key (let’s call this one, the signing key). Sounds like we haven’t really solved anything and in fact we are going down a rabbit hole of keys, that are signing keys, that are signing keys and so on, but in fact that is not the case. All we need to do is recognize that there is a multiplicity relationship at play. A key only needs to be exchanged (known ahead of time) once, and then, the authenticity of any number of documents can be verified with that key. Equally, if we know a signing key ahead of time, then we can in fact exchange any number of object keys at a later time and so, a hierarchy is born.
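
The idea of a key that signs other keys can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the helper names (`make_toy_keypair`, `sign`, `verify`) are hypothetical, the keys are textbook RSA built from small, well-known Mersenne primes, and nothing here is secure or resembles a production library. It only shows that once the signing key is known ahead of time, an object key can be delivered later and still be verified.

```python
import hashlib

def make_toy_keypair(p_exp, q_exp):
    """Toy RSA key pair from two Mersenne primes. Illustration only, NOT secure."""
    p, q = 2**p_exp - 1, 2**q_exp - 1
    n, e = p * q, 65537
    d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)
    return (n, e), (n, d)  # (public key, private key)

def sign(data: bytes, private_key) -> int:
    """Hash the data, then transform the hash with the private key."""
    n, d = private_key
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(h, d, n)

def verify(data: bytes, signature: int, public_key) -> bool:
    """Recompute the hash and compare it with the one recovered from the signature."""
    n, e = public_key
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(signature, e, n) == h

# The signing key is exchanged once, ahead of time.
signing_pub, signing_priv = make_toy_keypair(107, 127)

# Object keys can be distributed later, each one signed by the signing key.
object_pub, object_priv = make_toy_keypair(61, 89)
object_key_bytes = repr(object_pub).encode()
object_key_signature = sign(object_key_bytes, signing_priv)

# A recipient who only knows signing_pub can accept the object key...
object_key_ok = verify(object_key_bytes, object_key_signature, signing_pub)

# ...and then use it to check documents signed by its owner.
document = b"public statement"
document_signature = sign(document, object_priv)
document_ok = verify(document, document_signature, object_pub)
```

The same `signing_pub` could verify any number of further object keys, which is the multiplicity relationship the hierarchy relies on.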


The only remaining problem with this setup is that it would be impossible to know who the object keys belong to - as they are no longer directly exchanged - thus, it is necessary to also append identity data together with these keys. These two elements together form the basis of the trust mechanics of the Internet, certificates.

Certificates - Digital IDs

You have to imagine these documents as the digital equivalent of an ID card or driver’s license - the kind that some government agency issues to people - certifying that person X (whose data is in the identity document) has the personal information and accreditation stipulated in the document, and that the information has been verified by the authority that is also named on the ID card as issuer. Just like video recordings, images or formatted text documents (files), certificates have a specific format and encoding that is known to network based applications. Alongside other fields that define various modes of working, scope and limitations, certificates contain a number of important trust related elements:

  • Certificate ID: an ID number that is guaranteed to be unique within the issuing authority. It is very important to be aware that this uniqueness is relative to the issuer, otherwise severe confusion may occur, even a break in trust, due to the nature of and relationship between authorities (see later)
  • Common Name: the designation of the entity to whom the certificate is issued, like your name on your ID or driver’s license
  • Not valid before: beginning of the validity period, a date
  • Not valid after: an expiry date after which this document should no longer be considered valid
  • Public key: the public key of the entity to whom the certificate is issued, and who in turn owns the corresponding private key
  • Signing authority: the name of the authority that issued the certificate
  • Digital signature: a hash of all the above data and more, encrypted with the private key of the signing authority
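
To make the list above more tangible, here is a minimal sketch of those fields as a Python data structure. The class name `ToyCertificate`, its field names and the placeholder key strings are assumptions made up for this illustration; real certificates use a standardized binary format with many more fields. The sketch only shows the data the signature is computed over and that swapping the public key changes the hash the authority would have signed.

```python
import hashlib
from dataclasses import dataclass, replace
from datetime import datetime, timezone

@dataclass(frozen=True)
class ToyCertificate:
    certificate_id: int        # unique only within the issuing authority
    common_name: str           # who the certificate is issued to
    not_valid_before: datetime
    not_valid_after: datetime
    public_key: str            # placeholder text standing in for a real key
    signing_authority: str     # name of the issuer

    def signed_fields(self) -> bytes:
        """Serialize every field that the authority's signature covers."""
        fields = (
            str(self.certificate_id),
            self.common_name,
            self.not_valid_before.isoformat(),
            self.not_valid_after.isoformat(),
            self.public_key,
            self.signing_authority,
        )
        return "\n".join(fields).encode()

    def digest(self) -> str:
        """The hash the authority would sign with its private key."""
        return hashlib.sha256(self.signed_fields()).hexdigest()

george = ToyCertificate(
    certificate_id=1001,
    common_name="George X",
    not_valid_before=datetime(2024, 1, 1, tzinfo=timezone.utc),
    not_valid_after=datetime(2025, 1, 1, tzinfo=timezone.utc),
    public_key="george-public-key-placeholder",
    signing_authority="Toy Authority",
)

# Replacing only the public key yields a different hash, so the
# authority's original signature no longer matches the certificate.
tampered = replace(george, public_key="attacker-public-key-placeholder")
digest_changed = george.digest() != tampered.digest()
```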

Going back to the authenticity example in the previous chapter where we identified that it was impossible to send the public key along with the signed document - because a man in the middle attacker could replace both the key and the signature and subsequently forge the document - that is no longer the case when certificates are used.

Suppose a government official, let’s call him George X, makes a public statement and digitally signs it with his private key. He can now safely attach his corresponding certificate to the digitally signed document, because a man in the middle agent, who has access to the entire document and has the capability to modify anything in the message, could not do so in such a way as to fool the recipients of the message. This is true provided the recipient trusts (has knowledge of and accepts) the authority that signed George’s certificate. Let’s see why.

Suppose the attacker did intercept the message, altered it and changed the signature. The signature must be created with his or her own private key, because the attacker does not have access to the private key George signed the document with. In order to achieve perfect forgery the attacker now has to change the public key attached to the document - which is now part of the certificate - so the attacker needs to replace the certificate itself. Because it has the capability to create a certificate - anyone can - the attacker copies all the data from George’s certificate, name, id and all - basically it creates a clone, even the name of the authority can be cloned - replacing only the public key with its own. There is however one exception: it will not be able to sign the certificate with the authority’s private key as that key is not in the document, it’s being kept very, very safe with the authority. To make the certificate structurally complete - it would not be without a signature - the attacker must sign it with something, perhaps with its own private key, and herein the forgery attempt fails. The resulting certificate is not signed with a private key that corresponds to a certificate of an authority that the recipient trusts, therefore the recipient will immediately know the document and the certificate have been altered and therefore should not be relied upon.
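
The failed forgery can be played through with a toy model. Everything in this sketch is an assumption made for illustration: the helper names, the dictionary used as a certificate, the authority name "Toy Authority", and the textbook RSA keys built from small Mersenne primes (which are not secure). It only demonstrates the logic described above: the clone carries the authority's name, but not a signature made with the authority's private key.

```python
import hashlib

def make_toy_keypair(p_exp, q_exp):
    """Toy RSA key pair from two Mersenne primes. Illustration only, NOT secure."""
    p, q = 2**p_exp - 1, 2**q_exp - 1
    n, e = p * q, 65537
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

def sign(data, private_key):
    n, d = private_key
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(h, d, n)

def verify(data, signature, public_key):
    n, e = public_key
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(signature, e, n) == h

def signed_body(cert):
    """The certificate data covered by the issuer's signature."""
    return f"{cert['name']}|{cert['public_key']}|{cert['issuer']}".encode()

def issue(name, subject_public_key, issuer_name, issuer_private_key):
    cert = {"name": name, "public_key": subject_public_key, "issuer": issuer_name}
    cert["signature"] = sign(signed_body(cert), issuer_private_key)
    return cert

def is_trusted(cert, trust_store):
    """Accept only certificates signed by an authority key we already hold."""
    issuer_public_key = trust_store.get(cert["issuer"])
    if issuer_public_key is None:
        return False
    return verify(signed_body(cert), cert["signature"], issuer_public_key)

authority_pub, authority_priv = make_toy_keypair(107, 127)
george_pub, george_priv = make_toy_keypair(61, 89)
attacker_pub, attacker_priv = make_toy_keypair(89, 107)

# The recipient knows the authority's public key ahead of time.
trust_store = {"Toy Authority": authority_pub}

genuine = issue("George X", george_pub, "Toy Authority", authority_priv)

# The attacker clones every field, swaps in their own public key, and has
# to sign with their own private key because the authority's is out of reach.
forged = issue("George X", attacker_pub, "Toy Authority", attacker_priv)

genuine_ok = is_trusted(genuine, trust_store)
forged_ok = is_trusted(forged, trust_store)
```

In this toy run `genuine_ok` comes out true and `forged_ok` false, even though the two certificates differ only in the public key and the signature.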

As mentioned earlier, the prerequisite for this to work is that the recipient knows ahead of time the identity and the public key of the authority that signs certificates to potential sources, which is exactly how things are set up. These identity / public key combinations (along with the other data mentioned) are found in authority certificates that are deployed together with the operating system of your computer. Your computer in fact trusts not one, but many - in excess of 100 - authorities as their certificates are preinstalled in what is called a trust store, key chain, or key store, depending on your operating system. This closes the circle of trust.
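
If you are curious what your own machine trusts, Python's standard `ssl` module offers one way to peek. This is a small sketch, not a complete audit: what it reports depends on the operating system and on how Python was built, and on some systems the list stays empty until certificates are loaded on demand.

```python
import ssl

# Build a context that loads the platform's default authority certificates.
context = ssl.create_default_context()

# Counts of loaded certificates, CA certificates and revocation lists.
stats = context.cert_store_stats()

# Details of the authority certificates currently loaded in this context.
roots = context.get_ca_certs()

print("Loaded authority certificates:", stats["x509_ca"])
for cert in roots[:5]:
    # The subject is a sequence of name parts, each holding (key, value) pairs.
    subject = dict(part[0] for part in cert["subject"])
    name = subject.get("commonName") or subject.get("organizationName")
    print(name, "- not valid after:", cert.get("notAfter"))
```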


Chains of Trust

Structurally, authority certificates are identical to endpoint certificates - I’ll call them collectively endpoint certificates as they too will break into two categories (more about this in future chapters). The only major difference - and this is a very significant difference - is that authority certificates are self-signed, not authority signed. Sometimes authorities do sign each other’s certificates, a process that is called cross-signing, but it is more of an exception than a rule, because of the obvious chicken-and-egg situation that would emerge. Because these certificates are not signed in turn by a trusted authority, the trust model between message consumers and authority certificates is direct (as depicted in the picture below), as opposed to the trust model between the message consumers and endpoint certificates, which is indirect.

[Image: direct trust in authority certificates versus indirect trust in endpoint certificates]

It is in fact certificates that constitute the nodes in the hierarchical structure I mentioned earlier - a hierarchy of keys alone would not be very useful, as it would lack the identity elements. Like any hierarchy, this one is not limited to two levels. There can in fact be any number of levels - the only caveat being some technological and performance considerations, but not limitations. There are two fundamental principles in this hierarchy. The first we have already talked about: the authority certificates, which sit at the starting points of the hierarchy and are for that reason almost always called “Root Certificates”. The second one is the chain of trust: the path from the endpoint certificate (the one presented by George in our previous example), which the consumer does not directly trust, through the nodes in between (called intermediary certificates), similarly not directly trusted by the consumer, all the way to the root, the only one that is in fact trusted by the consumer. Although the intermediary certificates are not directly trusted, the consumer must have access to them (so basically to the entire chain), otherwise it would be impossible to reconstruct the chain back to the root by sequentially validating the authenticity of each node with the certificate that signed it, all the way up to the root. In such a case, the chain of trust would be broken, as the authenticity of the endpoint would not be verifiable against the root, the only directly trusted certificate.
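
Walking a chain back to the root can be sketched as a short loop. As before, this is a toy under stated assumptions: the helper names, the dictionary certificates, the "Toy Root" and "Toy Intermediate" names, and the insecure textbook RSA keys are all invented for illustration. Real validation also checks validity periods, permitted usage and more.

```python
import hashlib

def make_toy_keypair(p_exp, q_exp):
    """Toy RSA key pair from two Mersenne primes. Illustration only, NOT secure."""
    p, q = 2**p_exp - 1, 2**q_exp - 1
    n, e = p * q, 65537
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

def sign(data, private_key):
    n, d = private_key
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(h, d, n)

def verify(data, signature, public_key):
    n, e = public_key
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(signature, e, n) == h

def signed_body(cert):
    return f"{cert['name']}|{cert['public_key']}|{cert['issuer']}".encode()

def issue(name, subject_public_key, issuer_name, issuer_private_key):
    cert = {"name": name, "public_key": subject_public_key, "issuer": issuer_name}
    cert["signature"] = sign(signed_body(cert), issuer_private_key)
    return cert

def validate_chain(chain, trust_store):
    """chain is ordered endpoint first, root last."""
    if not chain:
        return False
    # Each certificate must be signed by the next one up the chain.
    for cert, issuer_cert in zip(chain, chain[1:]):
        if cert["issuer"] != issuer_cert["name"]:
            return False
        if not verify(signed_body(cert), cert["signature"], issuer_cert["public_key"]):
            return False
    # The last certificate must be one the consumer trusts directly.
    root = chain[-1]
    return trust_store.get(root["name"]) == root["public_key"]

root_pub, root_priv = make_toy_keypair(107, 127)
inter_pub, inter_priv = make_toy_keypair(89, 107)
george_pub, george_priv = make_toy_keypair(61, 89)

root = issue("Toy Root", root_pub, "Toy Root", root_priv)  # self-signed
intermediate = issue("Toy Intermediate", inter_pub, "Toy Root", root_priv)
endpoint = issue("George X", george_pub, "Toy Intermediate", inter_priv)

trust_store = {"Toy Root": root_pub}  # only the root is trusted directly

full_chain_ok = validate_chain([endpoint, intermediate, root], trust_store)
missing_link_ok = validate_chain([endpoint, root], trust_store)
```

With the intermediary certificate present the chain validates; without it, the endpoint cannot be tied back to the root and the check fails.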


Translated into a real life social scenario, this would look like this: I trust this person to be George X, because he presented me an ID issued by the City XYZ, which was trusted by my country’s government to issue IDs to people, and I trust my country’s government.

It should be noted - not only to avoid any debate about trust in governments - that this is just a hypothetical example, and in fact the nodes in the chain, and the root, need not trace back to a government or any official institution for that matter. The setup and structure are equally valid in any context where a chain of trust can be established between the consumer and the root - business parties, communities, or your internal infrastructure. In fact, from a security perspective, it is much better if you work with certificates, especially root certificates, that you issued or directly trust (in this case I am not referring to the direct trust model, but rather to actual direct trust - because you as a person physically trust the issuer), provided you have the knowledge and capability to securely handle and maintain your trust chain / tree. This may be a bit controversial with some security experts, and perhaps 15 years ago I would not have made this point, but nowadays there is too much tinkering with the Internet trust ecosystem for one to just blindly trust whatever OS and browser providers push into your trust store. However, this is more of a concern for business setups. The end-user (browser) ecosystem should not be overly concerned with this matter, which is good, because the end user space (unlike the business space, where pre-established relationships exist) relies almost exclusively on these common, universally trusted root certificates to overcome the lack of preexisting relationships. Needless to say, this balance is very, very fragile and any glitch can send dramatic ripples across the Internet, causing potentially huge gaps in security for a wide range of people or organizations as well as creating massive business disruptions, as nowadays the entire Internet - tens of millions of services and billions of people - relies on this.

In 2018 Google announced that the Chrome browser would no longer trust a certain Symantec root certificate that was used to issue certificates before June 2016. This meant that once Chrome’s trust store stopped containing that root certificate, no certificate issued in the distrusted certificate tree would be considered valid (intermediaries or endpoint). The preparation for this change lasted in excess of one year, to allow certificate retailers to re-issue all the certificates that had been issued to sites in that tree and to allow time for sites to update their certificates, otherwise the disruption would have been massive. In a similar push to improve the security of the Internet, in 2020 Apple announced that it would no longer trust certificates that have a validity period longer than a year. This too took a long time to implement, but such steps need to be taken, as we are currently walking a very fine line with respect to the health of the Internet trust ecosystem.


Attacking The Trust Chain

As important as the Internet trust ecosystem is, it is not foolproof. The mathematical cryptographic aspects are time and technology calibrated, which provides technical assurance and comfort. Cryptographic strength is considerably over-provisioned with respect to compute power, and this provides a relatively safe window to adjust to the evolution of technology - to phase out older and weaker algorithms without disruption. That said, the chain of trust contains non-technology aspects that are far more difficult to control and a lot easier to break, misconfigure or mishandle.

Apple’s decision to distrust endpoint certificates that are issued for longer than 1 year has a very human angle. Managing and protecting certificates is not a trivial task, and the longer a certificate’s lifetime, the higher the probability for it to be mishandled. One year is a golden middle of some sort - not too long to be mishandled and not too short to cause disruption, but you can see how this is just a made up time period, not to mention the period alone cannot guarantee anything. A lot more will depend on who the certificate belongs to, and what level of expertise they have. Root certificates generally have a lifespan of around 30 years during which time a huge number of endpoint certificates will be spawned from them. You can imagine the degree of security that needs to be guaranteed in their cases. A suspicion of gaps in the handling process of the root certificate in consideration was Google’s reason to distrust it. But why the fuss?

As noted, a lot of non-mathematical elements are present in the Internet trust ecosystem, among them the fact that your OS and browser will trust (based on the latest update) a great many authority root certificates. When I say your browser will trust, that means in fact that you will also trust whatever certificate is issued by any of the authorities in your OS. This is not a process that can be done manually, so we mostly take for granted that if it is issued by an authority in the trust store, it is good. Good for you as a person browsing the web, but also good for your OS to look up updates, for the OS to validate applications being installed on the system (hold that thought) and any other application that runs on the OS. Needless to say, there is a great deal of trust going around that hangs in that trust store.

A hugely important aspect with respect to that trust store and the authority certificates in it is that there is no procedural difference between them. From a perspective of trust, they are absolutely equivalent: a certificate issued with one of the authority certificates is trusted just as much as one issued by another, and there is no guarantee or information on the Internet with respect to which certificate authority should be the authority of a specific identity. Simply put, any authority certificate installed in your trust store can be used to forge any certificate issued by any other authority in the world, and neither you nor your OS or browser would be able to detect it. From an attack vector perspective, if an attacker controlled the private key of any of those certificates, they could perform a man in the middle attack with perfect forgery, because they could forge entire endpoint certificates. This is not a hypothetical scenario.
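
The equivalence of trusted authorities can be illustrated with the same kind of toy model. The names "Authority A", "Authority B" and "example.org", the helper functions and the insecure textbook RSA keys are all assumptions made up for this sketch. It shows that a check based only on "is the issuer in my trust store" cannot tell which authority was supposed to vouch for a given name.

```python
import hashlib

def make_toy_keypair(p_exp, q_exp):
    """Toy RSA key pair from two Mersenne primes. Illustration only, NOT secure."""
    p, q = 2**p_exp - 1, 2**q_exp - 1
    n, e = p * q, 65537
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

def sign(data, private_key):
    n, d = private_key
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(h, d, n)

def verify(data, signature, public_key):
    n, e = public_key
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(signature, e, n) == h

def signed_body(cert):
    return f"{cert['name']}|{cert['public_key']}|{cert['issuer']}".encode()

def issue(name, subject_public_key, issuer_name, issuer_private_key):
    cert = {"name": name, "public_key": subject_public_key, "issuer": issuer_name}
    cert["signature"] = sign(signed_body(cert), issuer_private_key)
    return cert

def is_trusted(cert, trust_store):
    issuer_public_key = trust_store.get(cert["issuer"])
    if issuer_public_key is None:
        return False
    return verify(signed_body(cert), cert["signature"], issuer_public_key)

a_pub, a_priv = make_toy_keypair(107, 127)
b_pub, b_priv = make_toy_keypair(89, 107)
site_pub, site_priv = make_toy_keypair(61, 107)
attacker_pub, attacker_priv = make_toy_keypair(61, 89)

# Both authorities sit side by side in the trust store, with equal standing.
trust_store = {"Authority A": a_pub, "Authority B": b_pub}

# The site obtained its certificate from Authority A.
legitimate = issue("example.org", site_pub, "Authority A", a_priv)

# An attacker holding Authority B's private key issues a certificate
# for the same name, carrying the attacker's public key.
rogue = issue("example.org", attacker_pub, "Authority B", b_priv)

legitimate_ok = is_trusted(legitimate, trust_store)
rogue_ok = is_trusted(rogue, trust_store)
```

Both checks pass in this model, which is exactly the problem: nothing in the check says that "example.org" belongs with Authority A.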

In 2017, Equifax was hacked and a lot of data was exfiltrated. What was a little less publicized - as it is less glamorous than personal information, even though I believe that it has a lot more impact - was that the Equifax root certificate was also compromised. At the time, all around the world, operating systems and browsers had the Equifax root certificate in their trust store. For a period of time - depending on the reaction of OS and browser providers and how keen people were to update - the Equifax root certificate was a potential instrument to perform perfect man in the middle attacks. The Equifax root certificate disappeared silently from trust stores and the damage caused by this temporary collapse of Internet trust remains unknown. Sadly, this was not a lesson that we learned a lot from. There are technologies and services provided by popular companies around the world which many enterprises use to sniff corporate traffic under the “data loss prevention” pretext. Most, if not all, of these are in fact direct and purposeful acts of meddling with the Internet’s trust ecosystem. Imagine yourself hacking your fuse box to install an electric heater, then leaving it open with cables dangling everywhere. While they serve a purpose, we really should consider carefully if the means really are justified by this purpose, because such acts are scarring the Internet trust ecosystem, leaving pockets where the trust breaks in ways that are difficult to detect, let alone control, and which serve as a powerful fuel to cybercrime.


Supply-Chain Attack

I mentioned earlier that operating systems use the Internet trust ecosystem to validate applications they install, which is one of the most pure and prevalent use cases for integrity and authenticity: applications should be accessible by anybody, so the installer file (the document) itself should be accessible publicly (not encrypted as privacy is not an issue, it would actually be an impediment) but at the same time, you want to make sure that you are installing a messaging application, for example, on your laptop and not a keylogger, or a ransomware infected arbitrary application (which is quite a common theme). The way this is done is quite similar to the way described above:

  • Authority issues certificate to application developer
  • Application developer signs the installer with their private key
  • You download the application and run install
  • Your OS trusts the file, because it trusts the authority that issued the certificate to the application developer company
  • Application installs and you have a guarantee that the source of the file has not been altered between the app developer company, across the internet, through download sites, CDNs and other untrusted environments it may have traveled until it got to your computer

The principle is great, unfortunately things are not as smooth as they could be. In fact, statistics show that more than 66% of malware that circulates on the Internet is digitally signed with a trusted authority issued certificate and of those, according to VirusTotal, in 87% of the cases the code signing certificates issued to the developer company were still valid (not revoked - I’ll explain below) at the time of analysis. In other words, the developer companies lost the certificate, did not know they had lost it, and as a consequence did not notify the authority that the certificate was compromised.

The problem is that even one certificate like this is enough to wreak havoc over the Internet, because anything signed with it will be trusted blindly by any OS - this is the Internet trust ecosystem promise - and as such it opens the path to perfect forgery as explained above: attackers hijack the communication flow in the supply chain, infect the executable file (change the document, essentially), forge everything, re-calculate the hash, sign it with a stolen certificate and send it on. Your computer does not know that Messenger X is supposed to be signed by the certificate of Company X, so criminals can very well sign Messenger X with the certificate of Company Y which they’ve stolen, because your OS will trust it just the same.


Some Lingering Trouble in Paradise

I hinted earlier, on several occasions, that certificates have a validity period, just like your driver’s license does and when that period ends, we say the certificate expires, and like your driver’s license, it loses validity - as in, you can no longer use it, or it is no longer acceptable for the purpose it was issued for.

At such a point in time, concerning your driver’s license, you will have to make a visit to the issuing body, and ask them to reissue your license. They in turn will perform some verification - usually not as strict as when they first issued the license - to make sure you are still OK to drive a car. This verification, together with that initial verification is part of a promise the entire concept of authority rests upon, even though it is not a technological element. When you see that somebody has a driver’s license, you have no reason to doubt that they have good enough driving skills to be able to drive on public roads. This promise was very much standing in the beginnings of the Internet as well. When certificates were issued and reissued by authorities, the latter made sure that the entities certificates were issued to, had legitimate claim to the internet domain names the certificates were issued for, and ensured that some accountability existed with respect to that web presence. Unfortunately that is no longer the case - I will explain this later when we get to the topic of securing communication for web sites. The point I’d like to make here though, is that nowadays certificates are issued to anybody on the grounds that they own a domain. That is sort of like the police issuing you a driver’s license because you own a car. That is not what you, pedestrian or driver - consumer of the roads ecosystem - expect from all the drivers that surround you. I recognize there is a complexity that comes from the sheer size of the Internet, but unfortunately, Internet users are not aware that there is this big crack in the trust ecosystem of the Internet between what is assumed to be (the promise), and what actually exists (the service). This is why phishing attacks work so well and will continue to work unfortunately, as big players on the Internet are in no hurry to plug the hole.

Another tricky situation emerges when some incident occurs such that the validity of a certificate needs to be removed during its validity period. In the case of a driver’s license, the police will simply withhold it if someone grossly violates the rules, but what happens, for example, if you lose your passport? After you declare it lost, the authority will mark it invalid in their central system, which is connected to all the customs offices, such that any time a passport is used to cross the border its validity is verified against that central database. Similarly, certificates may also suffer certain events - such as being lost or stolen - that would compromise their validity. In such a situation, we can and should notify the authority to make the certificate invalid - a process that is called revocation.

You may have noticed that there is a bit of an operational complexity here. While the validity period of the certificate is written in the certificate itself, so it can be verified against nothing else but the certificate - right there on the spot - a revocation is a much more complicated situation. The revocation cannot be written into the certificate, as the copies of the certificate are either missing or have potentially been spread across an unknown number of places. As is the case with a lost passport, at every single use of the certificate a background check is necessary to ensure the certificate is still valid. And not only that, but the validity also needs to be checked for all authority certificates in the chain - the intermediaries and all the way up to the root. This of course takes time, and while this delay is nowhere near the length you need to wait for a background check at the border, systems are often optimized for user experience and skip this verification altogether, or perform it on a sampling basis. While events such as the Equifax breach - where root certificates are compromised - happen rarely, they can have a profound effect because they are amplified by the industry not treating the Internet trust ecosystem with the care it deserves.
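
The difference between the two checks can be sketched as follows. The dictionary fields, the function names and the in-memory `revoked` set are hypothetical stand-ins chosen for this illustration; in reality the revocation data lives with the authority and has to be fetched or queried over the network.

```python
from datetime import datetime, timezone

def within_validity_period(cert, now):
    """Local check: needs nothing but the certificate itself."""
    return cert["not_valid_before"] <= now <= cert["not_valid_after"]

def is_revoked(cert, revocation_list):
    """External check: needs the authority's current list of revoked IDs."""
    return (cert["signing_authority"], cert["certificate_id"]) in revocation_list

def accept(cert, now, revocation_list=None):
    if not within_validity_period(cert, now):
        return False
    if revocation_list is None:
        # The system skipped the background check for the sake of speed.
        return True
    return not is_revoked(cert, revocation_list)

cert = {
    "certificate_id": 1001,
    "signing_authority": "Toy Authority",
    "not_valid_before": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "not_valid_after": datetime(2025, 1, 1, tzinfo=timezone.utc),
}
now = datetime(2024, 6, 1, tzinfo=timezone.utc)

# The owner reported the certificate stolen, so the authority revoked it.
# IDs are only unique per authority, hence the (authority, ID) pair.
revoked = {("Toy Authority", 1001)}

accepted_without_check = accept(cert, now)
accepted_with_check = accept(cert, now, revoked)
```

The stolen certificate still passes when the revocation lookup is skipped and is only rejected when the lookup is actually performed.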

It is important to understand that these are only procedural, not mechanical faults. What I mean is that there is nothing wrong with the technology here, the shortcoming lies simply in how we use the technology, or more precisely in how we neglect important social components of the architecture of Internet trust - the very human aspect of the broken promise.

***

I'd like to highlight as a take-away - besides how trust works across the Internet - that as we go deeper into the discussion around cryptography, a pattern is emerging that shows how things are built one on top of the other, like construction elements in a wall. This doesn't stop here. Other elements will come to further depend on these building blocks, and subsequently on the higher level elements that are built upon them (like Internet trust). The result is complex and not at all robust - it does not tolerate losing elements in its supporting structure. Architectures, in general, depend not only on the elements that comprise them, but also on the way they interact. We'd be doing ourselves a favor if we did not miss that aspect.

As always, I look forward to hearing your comments and feedback. See you in the next year on the subject of how to ensure the privacy of communication.
