Every Silver Lining Has A Cloud - Part 2
Colin Stone
CTO | FinTech Entrepreneur | Solutions & Systems Architect | Global People Leader
Infrastructure Setup
In the second part of this multi-part series on the cloud, we will continue our journey and look at the setup of the environment in which our systems will run. This is basically expanding on our checklist and turning those items into reality.
We will take the production organization from the first post and configure its setup, then replicate it for all the other organizations.
For all the tasks detailed in this section, a suitable scripting language should be employed so that the setup can be replicated; this has the added advantage that all the other organizations' infrastructure can be created with simple changes to a configuration file. Terraform is a suitable choice for automating the process: it is simple to use and there are plenty of resources on the internet to facilitate learning and training. On Azure, Azure Resource Manager templates can be used.
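To illustrate the configuration-file idea, here is a minimal Python sketch of how per-organization resource names can be derived from a single piece of configuration, so that replicating the production setup for another organization is a data change rather than a code change. The organization names, environments and CIDR ranges are hypothetical.

```python
# Hypothetical per-organization configuration: adding an organization
# means adding one entry here, not writing new setup code.
ORGS = {
    "production":  {"cidr": "10.0.0.0/16", "env": "prod"},
    "staging":     {"cidr": "10.1.0.0/16", "env": "stg"},
    "development": {"cidr": "10.2.0.0/16", "env": "dev"},
}

def resource_names(org: str) -> dict:
    """Build the standard set of resource names for one organization."""
    env = ORGS[org]["env"]
    return {
        "vpc":      f"{env}-vpc",
        "nat":      f"{env}-nat-gateway",
        "igw":      f"{env}-internet-gateway",
        "jump_box": f"{env}-jump-box",
    }

for org, cfg in ORGS.items():
    print(org, cfg["cidr"], resource_names(org)["vpc"])
```

A real Terraform setup achieves the same effect with variables and workspaces; the principle — one definition, many environments — is identical.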
Are You Redundant?
You are paid for the services you provide, so having the appropriate guarantees on uptime for the services that you consume is vital. It is therefore important to add redundancy within your setup wherever possible, but since adding redundancy takes effort (to cover that infinitesimally small window when the main systems are down), we will try to keep it as painless as possible. The goal is to have your systems in at least two physically separate data centers which, from a networking point of view, are all part of one network. We will let the cloud provider worry about the connectivity!
Virtual Private Cloud – Public But Private
A Virtual Private Cloud (VPC) is a logically isolated section of the cloud provider's infrastructure, reserved for you to launch all your resources in a private network. Think of it as a traditional network in a data center, but with the added advantage that you can add more and more services to this structure in minutes and, as we will see in later posts, create dynamic resources based on criteria.
We will create a Virtual Private Cloud to house everything, and then subnets within the VPC to separate key areas and secure them accordingly. The VPC covers two or more data centers, so our “virtual data center” is already redundant for continuity purposes, but we still have to make our systems redundant.
For the purposes of our setup we will assume we have a database subnet, a subnet for internal load balancing (allowing load to be distributed for internal services) and a subnet for deploying the front-facing application. We deploy the same setup in the second data center of the cloud provider to give us the redundancy, so we are going to create a total of six private subnets. Think of these as rooms with access control mechanisms that decide who enters and who doesn't. To make all this setup useful, we also have a public subnet which connects the private subnets, in a controlled way, to the consumer base.
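The subnet plan above can be sketched with Python's standard `ipaddress` module: one VPC range carved into three private subnets per data center, plus the public subnet. The CIDR ranges and data-center names are hypothetical; real ranges come from your own address plan.

```python
import ipaddress

# One /16 for the whole VPC, carved into /24 subnets.
vpc = ipaddress.ip_network("10.0.0.0/16")
blocks = list(vpc.subnets(new_prefix=24))  # 256 possible /24 subnets

tiers = ["database", "internal-lb", "application"]
data_centres = ["dc-a", "dc-b"]  # two physically separate data centers

plan = {}
i = 0
for dc in data_centres:
    for tier in tiers:
        plan[f"{dc}-{tier}"] = blocks[i]
        i += 1
plan["public"] = blocks[i]  # the controlled route to the consumer base

for name, net in plan.items():
    print(f"{name:18} {net}")
```

Running this prints the six private subnets (three tiers in each data center) and the public subnet, each with a non-overlapping /24 range.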
It is worthwhile pausing for a second to appreciate the need for automation here. Even though these are typically one-off exercises, we need to ensure that our other organizations (see figure above) are configured in a similar way. The figure below shows how the whole setup looks logically and physically.
You can think of the logical separation simply: anything private is not reachable from the internet; anything public can be. We now need specific routing to move information from one place to another.
Accessing the Secure Areas
The secure areas need outbound access to certain services, for example time services for clock synchronization and update mechanisms. This is achieved with network address translation (NAT) gateways, much like your home ISP router: outbound traffic can flow from inside to out, but no outside entity can initiate an inbound connection.
To support the systems we will also need highly controlled inbound traffic. We can use a static IP and a suitable virtual machine to provide a jump box (a system which, for servicing purposes, we can connect or ‘remote’ into, and which in turn can access the secure areas). This jump box needs to be secured with multi-factor authentication and should generally only be accessible from the organization hosting the systems in the cloud.
From Internet to Public
Your systems now need to be reachable from the Internet so your customers can connect. Your public subnets are primed for access; we now need to connect them so end users can consume your services. For this we use an internet gateway, which enables resources in your public subnets (provided they have a public IP address) to connect to the internet. An internet gateway ensures that all traffic goes through the gateway, so we can control the flow rather than having all public systems connect indiscriminately.
From Public To Private
From the public subnets to the private subnets, we use routing tables to define the rules for traffic movement, with separate public and private routing tables. We can, in addition, define which protocols may move information. For example, we can close ports so that only one port is open on the database subnet for traffic, and from the public to the private subnets we only allow the approved protocols. This lets us get very granular and reduce the attack surface of every system.
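How a routing table decides where traffic goes can be shown with a short sketch: the most specific (longest) matching prefix wins. The two routes here are hypothetical but echo the split described above — VPC-internal traffic stays local, everything else heads for the internet gateway.

```python
import ipaddress

# A routing table as a list of (prefix, target) pairs.
ROUTES = [
    (ipaddress.ip_network("10.0.0.0/16"), "local"),           # inside the VPC
    (ipaddress.ip_network("0.0.0.0/0"), "internet-gateway"),  # everything else
]

def next_hop(destination: str) -> str:
    """Return the target of the most specific matching route."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, target) for net, target in ROUTES if dest in net]
    # Longest prefix wins: sort by prefix length, most specific first.
    matches.sort(key=lambda m: m[0].prefixlen, reverse=True)
    return matches[0][1]

print(next_hop("10.0.3.7"))       # stays inside the VPC
print(next_hop("93.184.216.34"))  # routed to the internet gateway
```

A private routing table would simply omit (or redirect) the `0.0.0.0/0` entry, which is exactly what keeps the private subnets off the internet.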
Additional Protection Layers
In addition, we can implement network access control lists, an optional layer of security for the VPC that acts as a firewall controlling traffic in and out of one or more subnets.
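An access control list is, at heart, an ordered rule list with a default deny. The sketch below shows the idea with hypothetical rules: HTTPS allowed from anywhere, the database port allowed only from the application subnet, everything else refused.

```python
import ipaddress

# Hypothetical ACL: first matching rule wins, default is deny.
ACL = [
    {"port": 443,  "source": "0.0.0.0/0",   "action": "allow"},  # HTTPS from anywhere
    {"port": 5432, "source": "10.0.2.0/24", "action": "allow"},  # DB port, app subnet only
]

def evaluate(port: int, source: str) -> str:
    """Return the ACL decision for one inbound packet."""
    src = ipaddress.ip_address(source)
    for rule in ACL:
        if port == rule["port"] and src in ipaddress.ip_network(rule["source"]):
            return rule["action"]
    return "deny"  # nothing matched: default deny

print(evaluate(443, "203.0.113.9"))   # HTTPS from the internet: allow
print(evaluate(5432, "10.0.2.14"))    # DB port from app subnet: allow
print(evaluate(5432, "203.0.113.9"))  # DB port from the internet: deny
```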
Monitoring is also key in this area and cloud providers have logs which can be integrated into dashboards to show the aggregate flow of information in and out of subnets. Anomalies can be identified either through alerts, or constant monitoring dashboards. Remember these items will be invaluable as you grow, and your systems become more of a target.
Automatic Protection by Design
If you have followed me so far, what we have is an infrastructure environment with some great qualities: it allows us to build a solid, protected environment even when we cannot monitor day to day what staff are doing.
For example, let’s say a junior member of staff brings up a new server or other resource in a private subnet. By design (and because the security is controlled by someone else), even if that server is unprotected (i.e. the junior staff member has done no hardening or securing of the server, or has set an admin password of password1), the rules of the subnet will still be enforced. With this control and segregation of duties, keeping the critical areas like networking, firewalling and access control with senior staff, the remaining work can be done by more junior personnel without impacting the overall security of the system.
Psst! Something to Think About
During my time working with third parties who are experts in this area, I have seen a lot of work done on securing the system from the outside world. But let’s say a bad actor has gained access to your systems within the private subnets. Currently there would be (as shown above) a NAT gateway allowing traffic to go out from the system to anywhere. Think of it as a safe: it is protected against people trying to break in, but what about trying to break out? Little thought may be given to this.
Now, imagine a bad actor gains access, but your private subnet NAT gateway has additional settings on its routing (remember routing is controlled at an infrastructure level) so that the only places your internal servers can reach are the time server and the automatic update server, and that is it. If someone does get access, they cannot go anywhere; they cannot transfer content, because they cannot connect to anything except the time servers and the software update services.
Consider the database server: it has inbound connectivity to allow the application server to connect and query information, and it is configured to allow no outbound traffic except time and update services. If a bad actor gains access to the database server, they cannot connect anywhere.
Now let’s consider the application server, which connects to the database server. We perform a comparable setup: we allow incoming traffic, say on HTTPS, and on the outbound side nothing except time and update services. A bad actor cannot transfer the data anywhere because they cannot connect to anywhere. If information is being leaked through normal connectivity, i.e. via the standard interfaces or APIs, the sheer volume would trigger an alert and flag up on the dashboards. Additionally, if throttling has been implemented, the risk of a bad actor accumulating large quantities of information is greatly reduced.
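The break-out restriction described above boils down to an explicit egress allow-list: outbound connections are only permitted to the time and update services, and everything else is refused. A minimal sketch, with hypothetical hostnames:

```python
# Hypothetical egress allow-list for the private subnets: only clock
# synchronization and software updates are reachable from inside.
EGRESS_ALLOW = {
    "time.example.org",     # clock synchronization
    "updates.example.org",  # operating system / software updates
}

def outbound_allowed(destination: str) -> bool:
    """A bad actor inside the subnet can only reach allow-listed hosts."""
    return destination in EGRESS_ALLOW

print(outbound_allowed("time.example.org"))    # legitimate: allowed
print(outbound_allowed("exfil.attacker.net"))  # exfiltration attempt: refused
```

In practice this is expressed as security group or firewall egress rules rather than application code, but the logic is the same: deny by default, allow by exception.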
Firewalls
When you think about a firewall, what typically comes to mind? See which one of the following fits your view, and be honest in your selection. Also, which one of the three examples is correct?
Firewall 1:
A firewall is something that we put in place on the network to stop all incoming traffic protecting us from the outside world while still allowing us to browse the Internet, file share and communicate over collaborative tools.
Firewall 2:
A firewall is something that we put in place on the network to stop all incoming traffic except the specific protocols that we allow so that we can connect our VPN’s for work from home.
Firewall 3:
A firewall is something that we put in place on the network to stop all incoming traffic except specific protocols or specific locations. It protects applications or APIs from common web exploits and bots that may affect availability, compromise security or consume excessive resources. It also blocks common attack patterns.
So Which Is Correct?
The answer is that they are all correct. If you are a company that doesn’t host anything internally, then a simple setup, which in reality is nothing more than NAT (the Firewall 1 example), will suffice. This is typically your setup at home with your ISP: no one else on the ISP network can see your systems, but you can access all the systems on the Internet.
If you are a company that allows working from home, then you will generally fall into the Firewall 2 category, where your setup allows incoming traffic on a particular port. The setup will normally include a client software component to establish the VPN connection, but except for that port, you will be invisible on the Internet.
Firewalls for SaaS and Cloud
Moving to SaaS and cloud, Firewall 2 is still an option and, quite honestly, doesn’t really need a firewall: a simple access control list would suffice. For example, allow all traffic on the HTTPS port, but for Remote Desktop only allow a specific IP address. This limits the attack surface to either HTTPS, which is low risk, or Remote Desktop, which is high risk but requires you to be physically in the correct office to gain access.
A Real Firewall for Real Guarantees
Now, when we refer to Firewall 3, we are talking about managing a SaaS offering. We want to make sure that resources are shared reasonably equally, protections are in place, and monitoring is available with heuristics to at least flag up potentially bad incoming traffic. We would also like the traffic to be examined for cross-site scripting, SQL injection and so on. As an extension of monitoring, the firewall should be able to send statistics and events to a common dashboard to aid support personnel.
Firewall Rules!
As part of any solid infrastructure setup let’s define a few rules we would like to implement. Remember that the firewall is going to prevent excessive traffic from bad actors hitting your load balancer and crippling the entire system. It is also going to filter out potentially bad information in traffic.
Administration Protection Rules
In all systems there will be a number of administrative pages, and we would like these to be available only to certain individuals. But let’s imagine that the system which has been permissioned for those pages is itself compromised, or the password of the admin user has been hacked, or a brute-force attack is under way.
So we configure rules on the firewall that alert on, or sometimes stop, requests when certain criteria are met. The easiest way to do this is to count all requests to specific URLs or destinations. Given that potentially thousands of requests per second may be coming into the firewall, an efficient check-count-action-alert pipeline is key. Going forward, any system deemed administrative simply gets added to the managed administration rule protection set, and the same level of scrutiny follows it now and in the future.
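The counting idea can be sketched as a sliding-window rate limiter: requests to administrative URLs are counted per source, and a source is blocked once it exceeds a threshold within the window. The threshold, window and URL prefixes below are hypothetical tuning values.

```python
from collections import deque

WINDOW_SECONDS = 60
MAX_REQUESTS = 20
ADMIN_PREFIXES = ("/admin", "/manage")  # URLs deemed administrative

class AdminRateLimiter:
    """Count admin-page requests per source over a sliding window."""

    def __init__(self):
        self._hits = {}  # source IP -> deque of request timestamps

    def allow(self, source_ip: str, url: str, now: float) -> bool:
        if not url.startswith(ADMIN_PREFIXES):
            return True  # only administrative destinations are scrutinised
        hits = self._hits.setdefault(source_ip, deque())
        # Drop timestamps that have fallen outside the sliding window.
        while hits and now - hits[0] > WINDOW_SECONDS:
            hits.popleft()
        hits.append(now)
        return len(hits) <= MAX_REQUESTS

limiter = AdminRateLimiter()
for t in range(20):
    limiter.allow("203.0.113.9", "/admin/users", float(t))  # within budget
print(limiter.allow("203.0.113.9", "/admin/users", 20.0))   # 21st hit: blocked
```

In a managed firewall this would be a rate-based rule rather than hand-written code, but the decision it makes is the same.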
IP Reputation Rules
This can be thought of as a managed service, or a service built over time from actual activity on the network. It blocks requests from IP addresses that have been flagged as associated with bots or have historically been linked to bad actors. In addition, it can add a layer of manual authentication for these IP addresses, for example requiring a captcha on each request to the server from a bad-reputation address.
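A reputation rule is essentially a tiered lookup: known-bad addresses are blocked outright, suspicious ones are challenged, and the rest pass. The addresses below are hypothetical; in practice the lists come from a managed reputation feed or your own traffic history.

```python
# Hypothetical reputation data (would come from a feed or history).
BLOCKED = {"198.51.100.23"}     # confirmed bad actors: refuse outright
CHALLENGE = {"203.0.113.50"}    # bot-associated: require a captcha first

def reputation_action(source_ip: str) -> str:
    """Decide how to treat a request based on its source's reputation."""
    if source_ip in BLOCKED:
        return "block"
    if source_ip in CHALLENGE:
        return "captcha"
    return "allow"

print(reputation_action("198.51.100.23"))  # blocked
print(reputation_action("203.0.113.50"))   # challenged
print(reputation_action("192.0.2.1"))      # allowed
```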
Common Rules
This is a wide-ranging protection against a set of vulnerabilities including ones described in the OWASP (Open Web Application Security Project). A simple search on the Internet or going directly to owasp.org will show the top 10 problems. Such rules need to be in place to mitigate these risks.
Here are a few items to check:
Bad Rules
This is a set of rules which block request patterns that are known to be invalid and are associated with the exploitation or the discovery of vulnerabilities.
SQL Rules
This is a set of rules which block request patterns associated with the exploitation of SQL databases, such as SQL injection, and can help protect against remote injection of unauthorized queries. Even if you are confident in your software, a solid set of firewall rules gives extra peace of mind.
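To make the idea concrete, here is a deliberately simple sketch of the kind of pattern matching an SQL rule set performs on request parameters. Real web application firewalls use far more sophisticated, evasion-resistant detection; this only illustrates the principle.

```python
import re

# A few classic SQL injection signatures (illustrative, not exhaustive).
SQLI_PATTERNS = [
    re.compile(r"(?i)\bunion\b.+\bselect\b"),  # UNION-based injection
    re.compile(r"(?i)\bor\b\s+1\s*=\s*1"),     # classic tautology
    re.compile(r"--\s*$"),                     # trailing SQL comment
]

def looks_like_sqli(value: str) -> bool:
    """Flag a request parameter that matches a known injection pattern."""
    return any(p.search(value) for p in SQLI_PATTERNS)

print(looks_like_sqli("' OR 1=1 --"))           # flagged
print(looks_like_sqli("ordinary search term"))  # clean
```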
Operating System Rules
This is a set of rules that block attempts to exploit vulnerabilities specific to the hosting operating system, which could otherwise allow a bad actor to execute commands and, in the worst case, commands with elevated access.
Until The Next Time…
So what have we learnt? We have learnt that not all firewalls are equal, and that a secure, tight setup at the infrastructure level can ensure that over time, as systems and resources expand, they remain protected by the foundations set right at the beginning.
As technology leaders, we want to know our systems are safe, but we cannot do everything; we have to trust and delegate. The people we delegate to may not have a full grasp of the security requirements: they may be developers wanting to get their new product up and running to showcase, or scaling up successful systems. That scaling up may put them on the radar of bad actors in a way that, when small, would not have been the case. But if the house is secure, then anyone inside is also secure.
Next time we will look at storage mechanisms from databases to file systems, how these are secured and made redundant.
Until the next time…