Why does creating a single or limited global AWS S3 bucket in an organization make sense?

When it comes to architecting and creating Amazon S3 (Simple Storage Service) buckets in any organization, especially in large corporations, we usually start with a single S3 bucket, or with a limited number of buckets aligned to the various Business Units (BUs) or verticals, say Finance, Manufacturing, etc., in the company.

Typical S3 Bucket Diagram


It is always good practice to limit the number of S3 buckets, for the reasons discussed below, and to keep the data centralized.

Especially for a growing company like a startup, restricting the number of buckets from the beginning makes the data much easier to manage over the long term. Eventually, it becomes one of the best practices that employees within the startup can follow.

Just think of a scenario where different teams within a startup start creating their own S3 buckets for various project-related work, until a point is reached where it becomes too difficult to move back to a single or limited-bucket architecture. This leads to higher operational and maintenance costs.

As a good practice, a company can create a single, globally unique S3 bucket per environment, say sandbox, development, QA, pre-production, and production. Within that bucket, the various engineering teams get access to create, read, and write data only under their own restricted prefixes (or sub-folders, in layman's terms). This gives the organization more control over its data, as the data is kept centralized and secured. As a further good practice, we can enable a separate centralized logging bucket to keep track of every activity performed on the centralized data bucket, and we can enable properties on the data bucket such as versioning and cross-region replication as a backup for disaster recovery in other AWS regions.
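A minimal sketch of such a prefix restriction is shown below, as a fragment that could live in the same CloudFormation template presented later in this article. The policy's logical name and the team-alpha/ prefix are hypothetical assumptions; the bucket ARN follows the naming convention used throughout this article:

  TeamAlphaPrefixPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Description: Restrict a hypothetical team to its own prefix in the central bucket
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          # Listing is allowed on the bucket, but only for keys under team-alpha/
          - Effect: Allow
            Action: s3:ListBucket
            Resource: !Sub 'arn:aws:s3:::global-bucket-${Environment}-${AWS::Region}-${AWS::AccountId}'
            Condition:
              StringLike:
                's3:prefix': 'team-alpha/*'
          # Reads and writes are allowed only on objects under team-alpha/
          - Effect: Allow
            Action:
              - s3:GetObject
              - s3:PutObject
            Resource: !Sub 'arn:aws:s3:::global-bucket-${Environment}-${AWS::Region}-${AWS::AccountId}/team-alpha/*'

Attaching this managed policy to a team's role confines that team to its own corner of the shared bucket without giving it bucket-level rights.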

A classic example would be a bucket named in the format <bucket-name>-<environment>-<region>-<AWS account id>, for example:

global-bucket-sbx-ap-southeast-1-088853283839


As shown in the picture above, the export name value is global-bucket, which can then be imported into any other CloudFormation template using !ImportValue global-bucket.

Two Buckets with Centralized Logging and Centralized Data

Note: Both MyS3Bucket and LoggingBucket, as shown in the picture above, are logical names only. The actual bucket names are resolved from your environment and account ID, in this case global-bucket-sbx-ap-southeast-1-088853283839 and global-loggings-sbx-ap-southeast-1-088853283839.

Keeping a single bucket has the following advantages:

  1. Lower operational cost
  2. Ease of maintenance
  3. Ease of restrictions

Let's discuss these points one by one.

  1. Lower operational cost - Creating S3 buckets is free, and the service scales automatically when it comes to storing large volumes of data; the operational cost comes from data transfer into and out of the network, along with other key parameters such as storage and requests. Keeping the data scattered across many buckets leads to untracked operational costs (for example, network transfer-out usage).
  2. Ease of maintenance - No need to worry about maintaining various buckets, as your DevOps team has to manage a single bucket only. As the company grows, the DevOps team can give a development or product team access to a specific prefix within the bucket to read or write data. You can write a single Infrastructure as Code (IaC) template file (JSON or YAML) for the centralized bucket creation, with an Outputs export value, and any development team can later refer to that export value in its own IaC code (AWS CloudFormation or AWS Serverless Application Model). One can also implement lifecycle management on this bucket to save cost (see the lifecycle sketch after this list).
  3. Ease of restrictions - Access is given not at the bucket (root or top) level but only at the prefix (sub-folder) level, so your teams can play around within their own prefix.
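As a sketch of the lifecycle management mentioned in point 2, the fragment below could be added under the Properties of the centralized data bucket in the template that follows. The transition windows and storage classes are illustrative assumptions, not recommendations; versioning is included here as well, since it pairs naturally with a retention policy:

      # Fragment to add under the Properties of the centralized data bucket
      VersioningConfiguration:
        Status: Enabled
      LifecycleConfiguration:
        Rules:
          - Id: archive-then-expire
            Status: Enabled
            Transitions:
              - StorageClass: STANDARD_IA   # move to Infrequent Access after 30 days
                TransitionInDays: 30
              - StorageClass: GLACIER       # archive to Glacier after 90 days
                TransitionInDays: 90
            ExpirationInDays: 365           # delete objects after one year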

A typical example of creating a global, centralized, unique AWS S3 bucket per environment is shown in the AWS CloudFormation template below:

AWSTemplateFormatVersion: '2010-09-09'
Metadata: 
  License: Unlicensed
Description: >
  This template creates a globally unique S3 bucket in a specific region.
  The bucket name is formed from the environment, account id and region


Parameters:
 #https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/parameters-section-structure.html

  Environment:
    Description: This parameter will accept the environment details from the user
    Type: String
    Default: sbx
    AllowedValues:
      - sbx
      - dev
      - qa
      - e2e
      - prod
    ConstraintDescription: Invalid environment. Please select one of the given environments only


Resources:

  #https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-s3-bucket.html

  MyS3Bucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain
    Properties:
      #https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/pseudo-parameter-reference.html
      BucketName: !Sub 'global-bucket-${Environment}-${AWS::Region}-${AWS::AccountId}'
      AccessControl: Private
      LoggingConfiguration:
        DestinationBucketName: !Ref LoggingBucket
        LogFilePrefix: 'access-logs'
      Tags:
        - Key: name
          Value: globalbucket
        - Key: department
          Value: engineering
  LoggingBucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain
    Properties:
      BucketName: !Sub 'global-loggings-${Environment}-${AWS::Region}-${AWS::AccountId}'
      AccessControl: LogDeliveryWrite      


Outputs:
  MyS3Bucket:
    Description: A private S3 bucket with deletion policy as retain and logging configuration
    Value: !Ref MyS3Bucket
    Export:
      Name: global-bucket
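
To deploy the template above with the AWS CLI, a minimal example is shown below; the file name global-bucket.yaml and the stack name are hypothetical, so adjust them to your own conventions:

aws cloudformation deploy \
  --template-file global-bucket.yaml \
  --stack-name global-bucket-sbx \
  --parameter-overrides Environment=sbx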

     

Then import the bucket value in any other CloudFormation resource. For instance, below we import it into a Lambda function as an environment variable:

AWSTemplateFormatVersion: '2010-09-09'
Metadata: 
  License: Unlicensed
Description: >
  This template creates a lambda function which gets triggered by any event occurring in the S3 global bucket


Parameters:
  
#https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/parameters-section-structure.html
  Environment:
    Description: This parameter will accept the environment details from the user
    Type: String
    Default: sbx
    AllowedValues:
      - sbx
      - dev
      - qa
      - e2e
      - prod
    ConstraintDescription: Invalid environment. Please select one of the given environments only


Resources:

  #https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-lambda-function.html
  HelloLambda:
    Type: AWS::Lambda::Function
    Properties:
      Code: 
        ZipFile: |
          var aws = require('aws-sdk')
          var response = require('cfn-response')
          exports.handler = function(event, context) {
              console.log("REQUEST RECEIVED:\n" + JSON.stringify(event))
              // For Delete requests, immediately send a SUCCESS response.
              if (event.RequestType == "Delete") {
                  response.send(event, context, "SUCCESS")
                  return
              }
              var responseStatus = "FAILED"
              var responseData = {}
              var functionName = event.ResourceProperties.FunctionName
              var lambda = new aws.Lambda()
              lambda.invoke({ FunctionName: functionName }, function(err, invokeResult) {
                  if (err) {
                      responseData = {Error: "Invoke call failed"}
                      console.log(responseData.Error + ":\n", err)
                  }
                  else responseStatus = "SUCCESS"
                  response.send(event, context, responseStatus, responseData)
              })
          }
      Description: >
          This is just a sample hello world lambda that uses prefix of an existing S3 bucket
      Environment:
        Variables:
          BUCKET_NAME: !ImportValue global-bucket  
      FunctionName: !Sub 'hellolambda-${Environment}-${AWS::Region}-${AWS::AccountId}'
      Handler: index.handler
      MemorySize: 128
      ReservedConcurrentExecutions: 5 # a reserved concurrency of 0 would block all invocations
      Role: !GetAtt LambdaExecutionRole.Arn
      Runtime: nodejs12.x
      Tags: 
        - Key: name
          Value: testlambda
      Timeout: 10
  LambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - lambda.amazonaws.com
          Action:
          - sts:AssumeRole
      Path: "/"
      Policies:
      - PolicyName: root
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - logs:*
            Resource: arn:aws:logs:*:*:*               
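
This template can be deployed the same way (again, the file name hello-lambda.yaml and the stack name are hypothetical). Note that --capabilities CAPABILITY_IAM is required because the template creates an IAM role, and the global-bucket export must already exist in the same account and region:

aws cloudformation deploy \
  --template-file hello-lambda.yaml \
  --stack-name hello-lambda-sbx \
  --parameter-overrides Environment=sbx \
  --capabilities CAPABILITY_IAM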


I hope this article has given some insight into the importance of keeping a single S3 bucket architecture in a company.

Do share your feedback and comments on this :)

Cheers
