Every Silver Lining Has A Cloud - Part 1
Cloud Architecture and Security

As technology leaders, we are always looking to increase reliability and scalability and to reduce the more mundane tasks that inevitably come with running a large datacenter or infrastructure.

From the first use of the cloud for testing software and systems to the complete migration of internal systems to large managed datacenters, the cloud as an environment continues to evolve in leaps and bounds. With key infrastructure and services provided beyond basic hosting, a key requirement of a system is now a few clicks away rather than a matter of setting up a huge bank of hardware.

With the move from an internal datacenter to the cloud, what was once a tightly locked-down system of infrastructure has changed to a shared responsibility model, where every aspect of the environment needs to be scrutinized and trust needs to be established.

In this multi-part post, the whole move to the cloud and its setup are explained and broken down. The devil is in the detail, as many companies have recently found out to their detriment.

If you haven’t read my post “SaaS Design: Are you Single, a Schema, or in the Mix?” and you want to take a look at the various configurations for a SaaS initiative at the technology level, give it a look.

It Is Simple… Isn’t It?

The concept of moving to the cloud at first seems a simple one: just move the servers and the assets to the cloud. If you are a big enough customer, the cloud provider will send a truck loaded with disks and high-speed fiber channels and copy your systems, virtualizing them in the process.

But what about security, encryption and resilience? What do I need to do? How much is enough? If there is a security hole, how do you know, and how do you plug it? Cloud providers give you the tools to protect your environments, but if you do not know about them, how can you use them? Cloud providers cover themselves; you are left exposed. Just try looking at any cloud provider console or web page and you will be completely overwhelmed.

Remember, your on-premise datacenter is likely to be of little interest to a bad actor, but now you have put your precious systems in a huge datacenter with a whole bunch of other companies. Some of those companies may be exploiting the power of the datacenter to hack the very systems running on it.

Breaking Down The Problem – Before the infrastructure.

Breaking down the problem into manageable chunks with clearly defined goals allows us to think carefully about the problem at every stage. First, let us deal with the overall setup.

Before setting up even the infrastructure, think about laying down the foundation of a long-term structure for your systems. You will want to separate your production environments from other environments, such as staging and user-acceptance testing, and have an easy mechanism for allocating people to those environments. Ideally, there is some central management of the user base to handle new joiners, departures and so on, so that as people join and leave they are permissioned appropriately.

Foundation Considerations

We want to build an environment that stands solid in terms of security. The following sections represent a list of all foundational items that should be in place to allow your systems to operate successfully and securely. These are also useful tools for the whole environment if you are securing your ‘virtualized datacenter’ within a cloud provider.

Organization structure

This is not related to the corporate structure but rather to the organization of systems at the most basic level within the environment. Remember that in your old on-premise datacenter you would have had clear labelling and layout within the physical environment. In the cloud there is no control over this. So a logical organization mechanism, where you as the business owner can start from the top and work down and have people join organizational structures, is key. The basic setup would be something like this, although further hierarchical segregation may be needed in large environments. The approach, however, is still the same.

[Image: diagram of the basic organization setup]

So now we consider the setup of an organization. Your setup may be more or less complicated than this in terms of access rights and environments, but the approach here goes a long way and, in most cases, gives you enough protection in the infrastructure for hosting.

Let’s look into the organization structure and the services required to make sure that the infrastructure is set up correctly and securely. A look at one of the organizations is all we need, since the same is replicated across each one.

The Infrastructure

Before deploying any resources into the cloud, it is worthwhile spending time on securing the infrastructure in the form of networking, load balancing, firewalls and subnets. At each level we want to make sure that the environment is secure and has clearly defined entry and exit points, so that once it is set up we can add any number of resources and they are all protected by design.

The services

All services, which include server instances, need to be secure by design so that the systems are off to a good start. Given that your systems are joining infrastructure which is itself as secure as it can be, you can sleep safely at night, and in the event that something happens, the tools are all present to identify the problem. In addition, monitoring adds to your toolset to diagnose and mitigate any problem and get the environment up and working again.

But Infrastructure needs to be accessed!

So we lock all the environments down. But if we lock them down completely there will be no access at all, which of course is a perfectly secure system. What we are trying to do with all our infrastructure setup is to put in checks at every level to make it difficult or impossible for a bad actor to access your systems. If they do gain access, we need alerts to tell us. So we need a checklist of items to implement, monitor and improve over time.

The Big Picture – Your checklist

So let me present a big picture of the overall setup for an organization. Then, in subsequent posts, I will go down into the detail as to how the various aspects of the system protect at various levels.

Here is a list that will give you a good start as to what you will need. Some items in the list may not be applicable to you but with the description given here, you can make an informed decision for your own environment.

Network Checklist

Secure DNS: Since bad actors are becoming more sophisticated, knowing that the site or service you wish to connect to is genuine is important. A secure DNS gives you a mechanism to make sure your URL resolves to the right place.

Global Content Replication: This is a service which globally replicates your static content (for example, the assets of a single-page application built with a framework like Angular) and gives a faster load time, as content is downloaded from a local cache. This may or may not be relevant to more legacy systems.

Firewall: Probably the first thing people think of when they want to secure a system. A good firewall should be able to do at least the following: log all request IPs; log the country of origin of each request; check the content of headers; detect the presence of database code (SQL injection); detect the presence of a script which could be malicious (cross-site scripting); throttle and block requests to mitigate denial-of-service attacks; and provide real-time metrics.
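To make the throttling item concrete, here is a minimal sketch of a sliding-window limiter per request IP. It is an illustration only, not how any particular cloud firewall works internally; the class name `RequestThrottle` and the limit and window values are assumptions made for the example.

```python
from collections import defaultdict, deque


class RequestThrottle:
    """Allow at most `limit` requests per IP within any `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        # Per-IP queue of timestamps for recently accepted requests.
        self._hits = defaultdict(deque)

    def allow(self, ip, now):
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # Over the limit: throttle this request.
        hits.append(now)
        return True
```

A real firewall would also log the blocked IP and feed the count into its metrics, so that repeated throttling shows up as an alert rather than passing silently.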

Routing tables: Configuration of routing tables allows the movement of traffic from one network to another provided it satisfies some criteria. For example, you may want to route HTTPS requests from a public network to a private network. Internal networks normally have an IP range that is not routable from outside, so all traffic within that network stays in that network, and the routers make sure that only specific traffic passes.
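As a rough sketch of how a routing table picks a destination, the example below uses Python's standard `ipaddress` module and chooses the most specific matching route (longest prefix). The route table, its address ranges and the next-hop labels are hypothetical values chosen for illustration.

```python
import ipaddress

# Hypothetical route table: (destination network, next-hop label).
ROUTES = [
    (ipaddress.ip_network("10.0.1.0/24"), "private-subnet"),
    (ipaddress.ip_network("10.0.0.0/16"), "local"),
    (ipaddress.ip_network("0.0.0.0/0"), "internet-gateway"),
]


def next_hop(destination):
    """Return the next hop of the most specific route that matches."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # The longest prefix is the most specific match.
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


def is_internal(address):
    """True for addresses in private (non-internet-routable) ranges."""
    return ipaddress.ip_address(address).is_private
```

With this table, an address such as 10.0.1.20 stays on the private subnet, while anything outside the 10.0.0.0/16 range falls through to the default route.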

Load Balancing: This is usually the public-facing service for your systems. The load balancer will accept traffic and route it to the least busy system (or according to some other criterion). It can also be used to spin up new instances to add more firepower to the system and handle ever-increasing loads.
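The "least busy" rule can be sketched in a few lines. This is a simplified illustration under the assumption that "busy" means the number of active connections; the `Backend` type, `pick_backend` and `needs_scale_out` are hypothetical names, and managed load balancers offer several other strategies.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    active_connections: int = 0
    healthy: bool = True


def pick_backend(backends):
    """Route to the healthy backend with the fewest active connections."""
    candidates = [b for b in backends if b.healthy]
    if not candidates:
        raise RuntimeError("no healthy backends available")
    return min(candidates, key=lambda b: b.active_connections)


def needs_scale_out(backends, max_per_backend):
    """Signal for a new instance when every healthy backend is at capacity."""
    healthy = [b for b in backends if b.healthy]
    # Note: with no healthy backends at all this is also True.
    return all(b.active_connections >= max_per_backend for b in healthy)
```

Unhealthy backends are skipped entirely, which is why health checks matter as much as the balancing rule itself.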

Infrastructure Checklist

Logical Organizations: A logical organization mechanism to manage all cloud accounts, resources, policies, configurations, security, audit and resource sharing under an umbrella. This is for our various development, staging, pre-production and production environments.

Single Sign On: Ability to interface with some existing single sign on environment which will in turn give appropriate access to staff. For example you may have Active Directory under the Microsoft Windows operating system and you want to create groups and have them give rights on the cloud environment. This is optional but makes the whole management at scale a lot easier.

Identity and Access Management: This is a mechanism that allows the principle of least privilege to be incorporated against all the services that are consumed. The approach is that this can be a ‘console’ level access for users that perform specific tasks but can also be used by systems and services to perform specific tasks. In addition, each identity then has only those permissions needed to perform the task. While this could be undertaken by one master account, the obvious problems are immediately visible. The other interesting benefit of using this kind of mechanism is that other services can cycle sensitive passwords for system to system connectivity therefore removing the human need to change passwords and maintain healthy security.
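The principle of least privilege comes down to "deny by default, allow only what is listed". Here is a minimal sketch of that idea; the identities, the action names and the `is_allowed` helper are hypothetical and do not correspond to any provider's policy language.

```python
# Hypothetical identities, each mapped to only the actions it needs.
POLICIES = {
    "backup-service": {"storage:read", "storage:write"},
    "report-viewer": {"storage:read"},
}


def is_allowed(identity, action):
    """Default deny: unknown identities and unlisted actions are refused."""
    return action in POLICIES.get(identity, set())
```

A system identity such as the backup service and a human user such as a report viewer go through the same check, which is what makes the model manageable at scale.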

Monitoring & Notifications

System Configuration Monitoring: A service that enables the assessment, audit and evaluation of the configurations of your infrastructure. The idea is to monitor all the configurations and then alert if anything changes. This is especially useful for checking that nobody has changed something which may have bad knock-on effects. Imagine someone making a change to allow Remote Desktop connections from anywhere. Unless there is monitoring, you wouldn’t necessarily be aware of the change.
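At its core, configuration monitoring compares the current state with a known-good baseline. The sketch below shows that comparison on plain dictionaries; the setting names and values are made up for the Remote Desktop example, and a real service would pull snapshots from the provider rather than from hand-written data.

```python
def config_changes(baseline, current):
    """Return {setting: (before, after)} for every setting that differs."""
    changes = {}
    for key in baseline.keys() | current.keys():
        before, after = baseline.get(key), current.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes


# Hypothetical snapshots: someone has opened Remote Desktop to the world.
baseline = {"rdp_allowed_from": "10.0.0.0/16", "storage_encrypted": True}
current = {"rdp_allowed_from": "0.0.0.0/0", "storage_encrypted": True}
drift = config_changes(baseline, current)
```

Any non-empty result is a candidate for an alert, which is how an unexpected change gets noticed instead of sitting unseen.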

Threat Detection: A service that continually monitors accounts, workloads and access to your internal systems, looks for any malicious activity and reports it for analysis and remediation.

Event auditing: A service which is used for risk, compliance and operational auditing. Any action taken by a user, a role or a service on the system is recorded. This allows you to trace any problem and take appropriate action and look at the problem over time as events can normally be logically joined or grouped together.

Security Consolidation: A consolidation service which allows the consolidation of configuration monitoring, threat detection, event auditing and other services consumed. The consolidator continually checks the systems and services based on industry best practices and generates findings which can be ordered and handled on the basis of severity, exposure etc. It is no good having lots of services protecting your environment if you have to go around multiple systems to perform the risk assessment.

Code Monitoring: Within your software code, you may want to incorporate a service which, on any error (or any severe error), communicates the information about that error to the security consolidation service or another monitoring service. Since severe errors may be the result of some bad actor trying to gain access or test your system, it is prudent to add this. The service will normally support a number of different programming languages, and it is worth putting in the extra work to get a holistic view of the health of all your systems.

Basic Monitoring: Real-time statistics for the systems, for example CPU utilization, disk monitoring and network monitoring. This should all be fed into the security consolidation system so that alerts can be raised for heavy CPU use, heavy network traffic and so on, which may indicate a problem with the systems.

Service Monitoring: Within any system there are internal services which need to be running, everything from web server services and database services to the specific services which form the essential functioning of the product. Monitoring these services is vital for the smooth running of the product. If they are not running, an alert needs to be sent.

Alarms: Threshold checks for all the services and monitoring need to be created and in place. For example, raise an alarm when CPU utilization is above 80% for more than 5 minutes.
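The CPU example above can be sketched as a check for a sustained breach rather than a single spike. This is a minimal illustration, assuming samples arrive as (timestamp in seconds, CPU percent) pairs in time order; the function name `cpu_alarm` and the sample format are assumptions.

```python
def cpu_alarm(samples, threshold=80.0, duration=300.0):
    """True if CPU stays above `threshold` for more than `duration` seconds.

    `samples` is a time-ordered list of (timestamp_seconds, cpu_percent).
    """
    breach_start = None
    for timestamp, cpu in samples:
        if cpu > threshold:
            if breach_start is None:
                breach_start = timestamp  # Start of a new breach period.
            if timestamp - breach_start > duration:
                return True
        else:
            breach_start = None  # Any dip below the threshold resets the clock.
    return False
```

Requiring the breach to be continuous avoids alarms on short bursts, which keeps notifications meaningful.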

Notifications: So you have all the monitoring tools, dashboards and so on, but you are not going to watch these all day unless it is your job. For senior staff, an efficient notification system that tells a user, or even another system, when events happen is key. Imagine getting emails only when something is out of the ordinary; in the absence of these emails, you know the system is being monitored completely automatically.

Storage Checklist

Secure storage: Sometimes you want to make some output from your system available to the outside world in a controlled manner within a secure filesystem. We need to make sure that the underlying physical storage is encrypted so that the infrastructure provider cannot access the content, and then, on top of this, create secure links (which could expire for additional protection) for your end users to consume the content.
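One common way to build an expiring secure link is to sign the path and expiry time with a secret key, then verify both on each request. The sketch below uses Python's standard `hmac` and `hashlib` modules; the key, the URL layout and the function names are assumptions for illustration, and cloud providers supply their own signed-URL mechanisms that should be preferred in practice.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would come from a secrets vault.
SECRET_KEY = b"replace-with-a-key-from-your-secrets-vault"


def _signature(path, expires_at):
    message = f"{path}:{expires_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_link(path, expires_at):
    """Build a link that carries its expiry time and a signature over both."""
    return f"{path}?expires={expires_at}&signature={_signature(path, expires_at)}"


def verify_link(path, expires_at, signature, now):
    """Reject expired links and links whose path or expiry was altered."""
    if now > expires_at:
        return False
    return hmac.compare_digest(_signature(path, expires_at), signature)
```

Because the expiry is part of the signed message, a user cannot extend a link simply by editing the `expires` value.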

Secure filesystem: For the reliable running of systems especially at scale, there will need to be a common filesystem to store content used by servers, backups etc. These can be anything from configuration to database backups, encrypted both at the physical level and also on the storage mechanism. This ensures that any access of those files would need multi-level security authentication and decryption mechanisms.

Image Repository: For promotion of systems as they are developed through to production, a repository of system images is needed. This allows specific systems to be spun up and maintained. After the software changes, new images are created and old ones retired. This is the same for virtual machines and for containers such as Docker.

Security Checklist

Encryption In Transit: You will need encryption in transit, and for that you will need to establish transport-level security. This security normally relies on certificates, but these expire and are, in some cases, cumbersome to renew. Clearly, with a large setup there needs to be some automated mechanism. Yes, there are services like Let’s Encrypt, but if your provider offers a more transparent, managed system, it is worth exploring.
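The automated part usually starts with a simple rule: renew any certificate that is close to its expiry date. Here is a minimal sketch of that rule; the 30-day window and the function name `needs_renewal` are assumptions, and the expiry date would in practice be read from the certificate itself.

```python
from datetime import datetime, timedelta, timezone


def needs_renewal(not_after, now, renew_within=timedelta(days=30)):
    """True if the certificate has expired or expires within `renew_within`."""
    return not_after - now <= renew_within


# Hypothetical check run by a scheduled job.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
expiring_soon = needs_renewal(now + timedelta(days=10), now)
```

Running a check like this on a schedule, and alerting when renewal fails, removes the reliance on someone remembering the expiry date.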

Secrets Management: No username or password should be stored on any server. For example, storing the database login credentials on a web server would be a problem if someone gained access to that server. It is better to store all the connectivity credentials in a secure vault so that only certain services or groups can consume these sensitive credentials. You may think that this is simply shifting the problem, and that if credentials are needed for the secrets management service then this defeats the purpose. However, this is not the case: no login is needed if the server or service has been granted permission to read the credentials. A person logging onto the system cannot read the credentials, but a system can.

Virtual Private Cloud: This is the collective set of functionality which allows all your resources, services and systems to be launched and maintained in an environment that is your own, as if it were your own private datacenter. It allows the creation of subnets, routing and gateways to give your services safe access to the internet and to other services.

A Lot To Think About

From the list above, where do you start, and how do you cover everything? This is the purpose of the next post, where we delve into specific aspects of the setup of these systems. By the end of this series of posts you should have a good, solid understanding and be able to apply it to your favorite cloud provider or to your own private cloud.

Until then I will let this all sink in…
