Day - 06 | Amazon S3 | AWS Cloud Practitioner Certification CLF-C02

Anshul Agarwal

SDET + DevOps | Selenium/Appium (Java & Python) | API testing (Postman + RestAssured) | Cypress | WebdriverIO | Playwright | Robot Framework | CI/CD | Python | AWS | Docker | Linux | Terraform | Jenkins

Published: July 23, 2024

Amazon S3

- S3 Use cases
- Amazon S3 Overview - Buckets
- Amazon S3 Overview - Objects
- S3 Security
- S3 Bucket Policies
- Bucket settings for Block Public Access
- S3 Websites
- S3 - Versioning
- S3 Access Logs
- S3 Replication (CRR & SRR)
- S3 Storage Classes
- S3 Durability and Availability
- S3 Standard General Purpose
- S3 Storage Classes - Infrequent Access
- S3 Standard Infrequent Access (S3 Standard-IA)
- S3 One Zone Infrequent Access (S3 One Zone-IA)
- Amazon S3 Glacier Storage Classes
- Amazon S3 Glacier Instant Retrieval
- Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier)
- Amazon S3 Glacier Deep Archive - for long term storage
- S3 Intelligent-Tiering
- S3 Object Lock & Glacier Vault Lock
- Shared Responsibility Model for S3
- AWS Snow Family
- Data Migrations with AWS Snow Family
- Time to Transfer
- Snowball Edge (for data transfers)
- AWS Snowcone
- AWS Snowmobile
- Snow Family - Usage Process
- What is Edge Computing?
- Snow Family - Edge Computing
- AWS OpsHub
- Hybrid Cloud for Storage
- AWS Storage Gateway
- Amazon S3 - Summary

Amazon S3

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. Customers of all sizes and industries can use S3 to store and protect any amount of data for a range of use cases, such as websites, mobile applications, backup and restore, archive, enterprise applications, IoT devices, and big data analytics.

S3 Use cases

Backup and storage
Disaster Recovery
Archive
Hybrid Cloud storage
Application hosting
Media hosting
Data lakes & big data analytics
Software delivery
Static website

Amazon S3 Overview - Buckets

Buckets are the fundamental containers in Amazon S3 for storing data (objects/files). Each bucket can hold an unlimited number of objects and serves as a namespace for the objects within it. Buckets are identified by a globally unique name (unique across all regions and all accounts). Buckets are defined at the region level: S3 looks like a global service, but each bucket is created in a specific region. Naming convention:

- No uppercase letters
- No underscores
- 3-63 characters long
- Not formatted as an IP address
- Must start with a lowercase letter or a number
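The naming rules above can be checked mechanically. The following is a minimal sketch, not an official validator: the function name `is_valid_bucket_name` is hypothetical, and the sketch also assumes two rules from the general S3 naming guidelines that the list does not spell out (only lowercase letters, digits, dots, and hyphens are allowed, and the name must also end with a letter or a number).

```python
import re

# Lowercase letters, digits, dots, and hyphens only, starting and ending
# with a letter or digit. This already excludes uppercase and underscores.
# (The allowed character set and the "end" rule are assumptions beyond the list.)
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]*[a-z0-9]")

# Four dot-separated groups of digits, e.g. 192.168.1.1
_IP_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a candidate bucket name against the rules listed above."""
    if not 3 <= len(name) <= 63:
        return False
    if _IP_RE.fullmatch(name):
        return False
    return _NAME_RE.fullmatch(name) is not None


for candidate in ["my-bucket", "My-Bucket", "my_bucket", "ab", "192.168.1.1"]:
    print(candidate, is_valid_bucket_name(candidate))
```

Only the first candidate passes; each of the others breaks exactly one rule from the list. A name that passes can still be rejected by S3 if it is already taken, since names are globally unique.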

Amazon S3 Overview - Objects

Objects (files) have a key.
The key is the FULL path of the object within the bucket:

- s3://my-bucket/my_file.txt (key: my_file.txt)
- s3://my-bucket/my_folder1/another_folder/my_file.txt (key: my_folder1/another_folder/my_file.txt)

The key is composed of prefix + object name:

- s3://my-bucket/my_folder1/another_folder/my_file.txt (prefix: my_folder1/another_folder/, object name: my_file.txt)

There’s no concept of “directories” within buckets (although the UI will trick you into thinking otherwise), just keys with very long names that contain slashes ('/').
Object values are the content of the body:

- Max object size is 5 TB
- If uploading more than 5 GB, you must use “multi-part upload”

Each object can also carry:

- Metadata (list of text key / value pairs, either system or user metadata)
- Tags (Unicode key / value pairs, up to 10), useful for security / lifecycle
- Version ID (if versioning is enabled)
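Since a key is just a string, the "prefix + object name" split described above can be illustrated with plain string handling. This is a small sketch; `split_key` is a hypothetical helper name, not part of any AWS SDK.

```python
def split_key(key: str) -> tuple[str, str]:
    """Split an object key into (prefix, object name) at the last slash."""
    prefix, slash, name = key.rpartition("/")
    return prefix + slash, name


print(split_key("my_folder1/another_folder/my_file.txt"))
# ('my_folder1/another_folder/', 'my_file.txt')
print(split_key("my_file.txt"))
# ('', 'my_file.txt')
```

The second call shows that an object at the top level of a bucket simply has an empty prefix; no folder object needs to exist for either key.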

S3 Security

User-based:

- IAM policies: which API calls should be allowed for a specific user, managed from the IAM console

Resource-based:

- Bucket policies: bucket-wide rules set from the S3 console; allow cross-account access
- Object Access Control List (ACL): finer-grained
- Bucket Access Control List (ACL): less common

Note: an IAM principal can access an S3 object if

- the user's IAM permissions allow it OR the resource policy allows it
- AND there’s no explicit DENY
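The rule in the note can be written as a single boolean expression. The sketch below is a deliberately simplified model of that one rule (the function and parameter names are hypothetical); real AWS policy evaluation takes more inputs into account than these three flags.

```python
def can_access(iam_allows: bool, resource_policy_allows: bool, explicit_deny: bool) -> bool:
    """Simplified model: (IAM allow OR resource policy allow) AND no explicit deny."""
    return (iam_allows or resource_policy_allows) and not explicit_deny


# A bucket policy alone is enough to grant access...
print(can_access(iam_allows=False, resource_policy_allows=True, explicit_deny=False))  # True
# ...but an explicit deny always wins.
print(can_access(iam_allows=True, resource_policy_allows=True, explicit_deny=True))    # False
```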

Encryption: encrypt objects in Amazon S3 using encryption keys

S3 Bucket Policies

Bucket policies are written in a JSON-based access policy language that you can use to manage permissions for S3 buckets. They define which actions are allowed or denied for which principals (users) on the specified resources.

Use an S3 bucket policy to:

- Grant public access to the bucket
- Force objects to be encrypted at upload
- Grant access to another account (cross-account)
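To make the JSON structure concrete, here is a sketch of the first use case (public read access) built as a Python dictionary and serialized to JSON. The bucket name `my-bucket` and the `Sid` value are placeholders chosen for illustration; the policy only takes effect if the Block Public Access settings on the bucket permit it.

```python
import json

BUCKET_NAME = "my-bucket"  # placeholder bucket name, not from a real account

public_read_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",   # free-form label for the statement
            "Effect": "Allow",              # Allow or Deny
            "Principal": "*",               # who: everyone
            "Action": "s3:GetObject",       # what: read objects
            "Resource": f"arn:aws:s3:::{BUCKET_NAME}/*",  # where: all objects in the bucket
        }
    ],
}

policy_json = json.dumps(public_read_policy, indent=2)
print(policy_json)
```

The four fields to recognize are Effect, Principal, Action, and Resource. The other use cases in the list follow the same shape with a different principal or an added condition.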

Bucket settings for Block Public Access

Amazon S3 provides settings to block public access to your S3 resources. This feature helps prevent unintended public access and helps you adhere to best practices for securing your S3 data.

Block all public access: On
These settings were created to prevent company data leaks
If you know your bucket should never be public, leave these on
Can be set at the account level

S3 Websites

Amazon S3 can host static websites and make them accessible on the internet. You can configure your bucket to serve static web content, set up an index document, and manage error documents. If you get a 403 (Forbidden) error, make sure the bucket policy allows public reads!

S3 - Versioning

Versioning in Amazon S3 allows you to keep multiple versions of an object in the same bucket. This feature helps protect against accidental overwrites and deletions. Any object that was stored before versioning was enabled will have the version ID “null”. Suspending versioning does not delete the previous versions.
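The behavior described above can be mimicked with a small in-memory model. This is a toy sketch for intuition only: `ToyVersionedBucket` is a hypothetical class, not the S3 API, and the IDs `v1`, `v2` stand in for the opaque version IDs S3 actually generates.

```python
import itertools


class ToyVersionedBucket:
    """In-memory toy model of S3 versioning; not the real S3 API."""

    def __init__(self):
        self.versioning = "Disabled"  # "Disabled", "Enabled", or "Suspended"
        self.versions = {}            # key -> list of (version_id, body), oldest first
        self._ids = itertools.count(1)

    def put(self, key, body):
        history = self.versions.setdefault(key, [])
        if self.versioning == "Enabled":
            version_id = f"v{next(self._ids)}"  # stand-in for an opaque S3 version ID
        else:
            # Without versioning (or while it is suspended) the object gets the
            # "null" version ID and replaces any earlier "null" version.
            version_id = "null"
            history[:] = [v for v in history if v[0] != "null"]
        history.append((version_id, body))
        return version_id

    def list_version_ids(self, key):
        return [version_id for version_id, _ in self.versions.get(key, [])]


bucket = ToyVersionedBucket()
bucket.put("report.txt", "draft")          # stored before versioning: "null"
bucket.versioning = "Enabled"
bucket.put("report.txt", "final")          # v1
bucket.put("report.txt", "final-2")        # v2
bucket.versioning = "Suspended"
bucket.put("report.txt", "after-suspend")  # overwrites only the "null" version
print(bucket.list_version_ids("report.txt"))  # ['v1', 'v2', 'null']
```

The versions created while versioning was enabled survive the suspension; only the "null" version is overwritten.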

S3 Access Logs

Amazon S3 provides the capability to log all access requests made to your S3 buckets. Access logs can be analyzed to track and audit usage patterns and permissions. Any request made to S3, from any account, authorized or denied, will be logged into another S3 bucket. That data can be analyzed using data analysis tools.

S3 Replication (CRR & SRR)

Cross-Region Replication (CRR): Automatically replicates objects across different AWS regions. Use cases: compliance, lower latency access, replication across accounts
Same-Region Replication (SRR): Replicates objects within the same region. Use cases: log aggregation, live replication between production and test accounts.
Versioning must be enabled on both the source and destination buckets.
Buckets can be in different accounts.
Copying is asynchronous.
S3 must be given the proper IAM permissions.

S3 Storage Classes

Amazon S3 Standard - General Purpose
Amazon S3 Standard - Infrequent Access (IA)
Amazon S3 One Zone - Infrequent Access
Amazon S3 Glacier Instant Retrieval
Amazon S3 Glacier Flexible Retrieval
Amazon S3 Glacier Deep Archive
Amazon S3 Intelligent Tiering

Note: objects can be moved between classes manually or by using S3 Lifecycle configurations.
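As an illustration of a lifecycle configuration, the sketch below defines one rule as a Python dictionary and a small helper that reports which class an object of a given age would be in. The dictionary follows, to the best of my knowledge, the shape boto3 accepts as `LifecycleConfiguration` in `put_bucket_lifecycle_configuration`; the rule ID, the `logs/` prefix, the day thresholds, and the helper `storage_class_at` are assumptions made for the example.

```python
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-old-logs",       # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},  # hypothetical prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }
    ]
}


def storage_class_at(age_days: int, rule: dict) -> str:
    """Return the class an object of this age would be in under the rule."""
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = lifecycle_configuration["Rules"][0]
for age in (10, 45, 100, 400):
    print(age, storage_class_at(age, rule))
```

The helper only simulates the schedule locally; applying the rule to a real bucket would require an AWS SDK call with valid credentials.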

S3 Durability and Availability

Durability:

- High durability (99.999999999%, 11 9’s) of objects across multiple AZs
- If you store 10,000,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000 years
- Same for all storage classes

Availability:

- Measures how readily available a service is
- Varies depending on storage class
- Example: S3 Standard has 99.99% availability, i.e. it is unavailable for about 53 minutes a year
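Both headline numbers above follow from simple arithmetic, sketched here. The function names are hypothetical, and the sketch assumes a 365-day year and reads eleven 9's of durability as an annual loss probability of 1e-11 per object.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes per year a service may be unavailable at the given availability."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


def years_per_lost_object(object_count: int, annual_loss_rate: float = 1e-11) -> float:
    """Average number of years between single-object losses."""
    expected_losses_per_year = object_count * annual_loss_rate
    return 1 / expected_losses_per_year


print(round(downtime_minutes_per_year(99.99), 1))  # 52.6
print(round(years_per_lost_object(10_000_000)))    # 10000
```

The same function gives roughly 526 minutes a year for 99.9% availability and 2,628 minutes a year for 99.5%, which is why the availability figure matters when comparing storage classes.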

S3 Standard General Purpose

Designed for frequently accessed data that requires low latency and high throughput, with 99.99% availability. It can sustain 2 concurrent facility failures.

Use Cases: Big Data analytics, mobile & gaming applications, content distribution…

S3 Storage Classes - Infrequent Access

For data that is less frequently accessed, but requires rapid access when needed. Lower cost than S3 Standard.

S3 Standard Infrequent Access (S3 Standard-IA)

99.9% Availability

Use cases: Disaster Recovery, backups

S3 One Zone Infrequent Access (S3 One Zone-IA)

High durability (99.999999999%) in a single AZ; data lost when AZ is destroyed
99.5% Availability
Use cases: Storing secondary backup copies of on-premises data, or data you can recreate

Amazon S3 Glacier Storage Classes

Low-cost object storage meant for archiving / backup
Pricing: price for storage + object retrieval cost

Amazon S3 Glacier Instant Retrieval

Millisecond retrieval, great for data accessed once a quarter
Minimum storage duration of 90 days

Amazon S3 Glacier Flexible Retrieval (formerly Amazon S3 Glacier)

Retrieval options: Expedited (1 to 5 minutes), Standard (3 to 5 hours), Bulk (5 to 12 hours, free)
Minimum storage duration of 90 days

Amazon S3 Glacier Deep Archive - for long term storage

Retrieval options: Standard (12 hours), Bulk (48 hours)
Minimum storage duration of 180 days

S3 Intelligent-Tiering

This storage class automatically moves objects between access tiers to optimize costs based on changing access patterns.

Small monthly monitoring and auto-tiering fee
Moves objects automatically between Access Tiers based on usage
There are no retrieval charges in S3 Intelligent-Tiering
Frequent Access tier (automatic): default tier
Infrequent Access tier (automatic): objects not accessed for 30 days
Archive Instant Access tier (automatic): objects not accessed for 90 days
Archive Access tier (optional): configurable from 90 days to 700+ days
Deep Archive Access tier (optional): configurable from 180 days to 700+ days
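The automatic tiers listed above amount to a threshold lookup on the number of days since an object was last accessed. The sketch below models only the three automatic tiers; the optional Archive Access and Deep Archive Access tiers depend on per-bucket configuration and are left out. The function name `automatic_tier` is hypothetical.

```python
def automatic_tier(days_since_last_access: int) -> str:
    """Map days without access to the automatic Intelligent-Tiering tier."""
    if days_since_last_access >= 90:
        return "Archive Instant Access"
    if days_since_last_access >= 30:
        return "Infrequent Access"
    return "Frequent Access"


for days in (0, 29, 30, 89, 90, 400):
    print(days, automatic_tier(days))
```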

S3 Object Lock & Glacier Vault Lock

S3 Object Lock: Prevents objects from being deleted or overwritten for a specified retention period.
Glacier Vault Lock: Enforces compliance controls on individual S3 Glacier vaults by adopting a WORM (Write Once Read Many) model. Once the vault lock policy is locked, it can no longer be changed.

Shared Responsibility Model for S3

AWS and customers share responsibility for security and compliance:

AWS: Responsible for the infrastructure security of the cloud.
Customers: Responsible for managing their data, identity, and access management within S3.

AWS Snow Family

Highly-secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS
Data migration: Snowcone, Snowball Edge, Snowmobile
Edge computing: Snowcone, Snowball Edge

Data Migrations with AWS Snow Family

AWS Snow Family: Offline devices to perform data migrations. If it takes more than a week to transfer over the network, use Snowball devices!
Challenges:

- Limited connectivity
- Limited bandwidth
- High network cost
- Shared bandwidth (can’t maximize the line)
- Connection stability
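The "more than a week" rule of thumb is easier to apply with a quick estimate of network transfer time. The sketch below is a back-of-the-envelope calculation under stated assumptions: decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second), a constant line speed, and a hypothetical `utilization` factor for shared links. The function name is also hypothetical.

```python
def transfer_days(data_tb: float, bandwidth_mbps: float, utilization: float = 1.0) -> float:
    """Estimate days needed to move data_tb terabytes over a network link."""
    bits = data_tb * 1e12 * 8
    seconds = bits / (bandwidth_mbps * 1e6 * utilization)
    return seconds / 86_400  # seconds per day


print(round(transfer_days(100, 100), 1))     # 92.6  (100 TB over 100 Mbps)
print(round(transfer_days(100, 1_000), 1))   # 9.3   (100 TB over 1 Gbps)
print(round(transfer_days(100, 10_000), 1))  # 0.9   (100 TB over 10 Gbps)
```

Under these assumptions, 100 TB takes longer than a week on both the 100 Mbps and the 1 Gbps line, so an offline device would be the better fit there.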

Data migration

AWS Snowcone

Small, portable computing, anywhere, rugged & secure, withstands harsh environments
Light (4.5 pounds, 2.1 kg)
Device used for edge computing, storage, and data transfer
8 TB of usable storage
Use Snowcone where Snowball does not fit (space-constrained environment)
Must provide your own battery / cables
Can be sent back to AWS offline, or connect it to internet and use AWS DataSync to send data

Snowball Edge (for data transfers)

Physical data transport solution: move TBs or PBs of data in or out of AWS
Alternative to moving data over the network (and paying network fees)
Pay per data transfer job
Provide block storage and Amazon S3-compatible object storage
Snowball Edge Storage Optimized - 80 TB of HDD capacity for block volume and S3 compatible object storage
Snowball Edge Compute Optimized - 42 TB of HDD capacity for block volume and S3 compatible object storage
Use cases: large data cloud migrations, DC decommission, disaster recovery

AWS Snowmobile

Transfer exabytes of data (1 EB = 1,000 PB = 1,000,000 TB)
Each Snowmobile has 100 PB of capacity (use multiple in parallel)
High security: temperature controlled, GPS, 24/7 video surveillance
Better than Snowball if you transfer more than 10 PB
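The capacity figures quoted in these notes can be turned into a rough device picker. This is only a sketch of the rules of thumb stated above (8 TB per Snowcone, 80 TB per Snowball Edge Storage Optimized, Snowmobile above 10 PB), not official sizing guidance; the function and constant names are hypothetical, and factors such as physical space or edge computing needs are ignored.

```python
import math

SNOWCONE_TB = 8                          # usable storage per Snowcone
SNOWBALL_EDGE_STORAGE_OPTIMIZED_TB = 80  # capacity per Storage Optimized device
SNOWMOBILE_THRESHOLD_TB = 10_000         # 10 PB


def suggest_snow_device(data_tb: float) -> str:
    """Suggest a Snow Family option from data volume alone."""
    if data_tb > SNOWMOBILE_THRESHOLD_TB:
        return "Snowmobile"
    if data_tb <= SNOWCONE_TB:
        return "Snowcone"
    devices = math.ceil(data_tb / SNOWBALL_EDGE_STORAGE_OPTIMIZED_TB)
    return f"{devices} x Snowball Edge Storage Optimized"


for size_tb in (5, 200, 20_000):
    print(size_tb, "TB ->", suggest_snow_device(size_tb))
```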

Snow Family - Usage Process

Request Snowball devices from the AWS console for delivery
Install the Snowball client / AWS OpsHub on your servers
Connect the Snowball to your servers and copy files using the client
Ship back the device when you’re done (goes to the right AWS facility)
Data will be loaded into an S3 bucket
Snowball is completely wiped

What is Edge Computing?

Process data while it’s being created at an edge location: a truck on the road, a ship at sea, a mining station underground…
These locations may have limited or no internet access, or limited or no easy access to computing power
We set up a Snowball Edge / Snowcone device to do edge computing
Use cases of Edge Computing: Preprocess data, Machine learning at the edge, Transcoding media streams
Eventually (if need be) we can ship back the device to AWS (for transferring data for example)

Snow Family - Edge Computing

Snowcone (smaller) 2 CPUs, 4 GB of memory, wired or wireless access, USB-C power using a cord or the optional battery
Snowball Edge – Compute Optimized - 52 vCPUs, 208 GiB of RAM, Optional GPU (useful for video processing or machine learning), 42 TB usable storage
Snowball Edge – Storage Optimized - Up to 40 vCPUs, 80 GiB of RAM, Object storage clustering available
All: Can run EC2 Instances & AWS Lambda functions (using AWS IoT Greengrass)
Long-term deployment options: 1 and 3 years discounted pricing

AWS OpsHub

Historically, to use Snow Family devices, you needed a CLI (Command Line Interface) tool.
Today, you can use AWS OpsHub (software you install on your computer / laptop) to manage your Snow Family devices. With it you can unlock and configure single or clustered devices, transfer files, launch and manage instances running on Snow Family devices, monitor device metrics (storage capacity, active instances on your device), and launch compatible AWS services on your devices (e.g. Amazon EC2 instances, AWS DataSync, Network File System (NFS)).

Hybrid Cloud for Storage

AWS is pushing for “hybrid cloud”:

- Part of your infrastructure is on-premises
- Part of your infrastructure is in the cloud

This can be due to:

- Long cloud migrations
- Security requirements
- Compliance requirements
- IT strategy

S3 is a proprietary storage technology (unlike EFS / NFS), so how do you expose the S3 data on-premises?
AWS Storage Gateway!

AWS Storage Gateway

Bridge between on-premises data and cloud data in S3
Hybrid storage service that allows on-premises environments to seamlessly use the AWS Cloud
Use cases: disaster recovery, backup & restore, tiered storage
Types of Storage Gateway: File Gateway, Volume Gateway, Tape Gateway

Amazon S3 - Summary

Buckets vs Objects: global unique name, tied to a region
S3 security: IAM policy, S3 Bucket Policy (public access), S3 Encryption
S3 Websites: host a static website on Amazon S3
S3 Versioning: multiple versions for files, prevent accidental deletes
S3 Access Logs: log requests made within your S3 bucket
S3 Replication: same-region or cross-region, must enable versioning
S3 Storage Classes: Standard, IA, 1Z-IA, Intelligent, Glacier, Glacier Deep Archive
S3 Lifecycle Rules: transition objects between classes
S3 Glacier Vault Lock / S3 Object Lock: WORM (Write Once Read Many)
Snow Family: import data onto S3 through a physical device, edge computing
OpsHub: desktop application to manage Snow Family devices
Storage Gateway: hybrid solution to extend on-premises storage to S3

Happy Learning!
