S3 FEATURES:
Tiered Storage available.
Lifecycle Management.
Versioning.
Encryption.
MFA Delete.
Securing data using Access Control Lists & Bucket Policies.
S3 STORAGE CLASSES:
S3 Standard:
S3 IA (Infrequently Accessed):
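S3 One Zone IA: (data is kept in a single Availability Zone; takes over the low-cost role of the older RRS, Reduced Redundancy Storage)
S3 Intelligent Tiering: (monitors how frequently objects are accessed and automatically moves them between frequent and infrequent access tiers).
S3 Glacier: For data archiving. Retrieval takes minutes to hours.
S3 Glacier Deep Archive: Lowest-cost storage class. Retrieval takes up to 12 hours (standard) or up to 48 hours (bulk).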
S3 COMPARISON:
S3 CHARGING:
For Storing.
On Requests.
Storage Management pricing (Type of class for storing).
Data Transfer Pricing. (Cross Region Replication)
Transfer Acceleration. (Edge Location)
Exam Tips:
S3 is object based (i.e. it allows you to upload files).
Files can be from 0 bytes to 5 TB in size.
There is unlimited storage.
Files are stored in a bucket.
S3 uses a universal namespace, i.e. bucket names must be unique across the globe.
Since it is object based, it is not suitable for installing an OS or a database.
It is only for storing files.
On a successful upload you get an HTTP 200 OK response.
You can turn on MFA Delete so that data won't get deleted accidentally.
Control access to buckets using bucket ACLs or bucket policies.
KEY FUNDAMENTALS OF S3:
Key: The name of the object (simply the file name).
Value: The data itself, made up of a sequence of bytes.
Version ID: Important in terms of versioning.
Metadata: Information about the data you are storing (data about the data).
Sub Resources:
Access Control Lists. (permission)
Torrents.
Consistency:
Strong read-after-write consistency for PUTs of new objects, overwrite PUTs and DELETEs (since December 2020).
Older study material describes eventual consistency for overwrite PUTs and DELETEs; that model no longer applies.
IMPORTANT NOTE:
Read the FAQs.
Read the whitepapers.
S3 Pricing Tier:
What drives the price:
Storage.
Requests & Data Retrieval.
Data Transfer.
Management & Replication.
What are the different Tiers:
S3 Standard.
S3 IA
S3 One Zone IA
S3 Intelligent Tiering
S3 Glacier
S3 Glacier Deep Archive.
Understanding how to get the best out of S3:
S3 Standard -> most expensive.
S3 Intelligent Tiering.
S3 IA
S3 One Zone IA
S3 Glacier.
S3 Glacier Deep Archive.
Tip: Avoid S3 Standard wherever a cheaper class fits the access pattern.
Scenario based questions.
S3 Security & Encryption:
By default all buckets are private.
You can set up access control for your buckets using:
Access Control Lists → can be applied down to the object level
Bucket Policies → bucket level
S3 buckets can be configured to create access logs, which log all requests made to the bucket. The logs can be sent to another bucket, or to a bucket in another account.
Note: To monitor access to a bucket, enable the logging service and point it at a new bucket for the logs.
Encryption in transit: HTTPS (SSL/TLS)
Encryption at rest (server side):
S3 managed keys: SSE-S3 (server-side encryption with S3 managed keys)
AWS Key Management Service managed keys: SSE-KMS
Server-side encryption with customer-provided keys: SSE-C
Client-side encryption (you encrypt the file before uploading it).
VPC endpoint for private access to S3
MFA Delete
Pre-signed URLs (temporary URLs valid for a specific time period)
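As an illustration of how a bucket policy can enforce the SSL/TLS point above, here is a minimal sketch that builds a policy denying any request not sent over HTTPS. The function name and the bucket name "example-bucket" are made up for this example; the policy relies on the standard aws:SecureTransport condition key.

```python
import json


def deny_insecure_transport_policy(bucket_name: str) -> str:
    """Return a bucket policy (as JSON) that denies requests not sent over HTTPS."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # The policy covers the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                # aws:SecureTransport is "false" when the request is plain HTTP.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


# "example-bucket" is a placeholder name, not a real bucket.
print(deny_insecure_transport_policy("example-bucket"))
```

The resulting JSON is what you would paste into the bucket policy editor for that bucket.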
S3 Versioning
Stores all versions of an object, including all writes, even if you delete the object.
Great backup tool (e.g. keeping versions of MySQL backups).
Once enabled, versioning cannot be disabled, only suspended. To remove it completely you need to delete the bucket and recreate it.
Integrates with lifecycle rules.
Versioning supports MFA Delete for extra security.
Tips:
Stores all versions of an object (including writes and deletes).
Backup tool.
Once enabled, it cannot be disabled, only suspended.
Integrates with lifecycle rules.
Versioning has an MFA Delete capability, which requires multi-factor authentication for deletes.
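To make "stores all versions, even if you delete" concrete, here is a small in-memory toy model. It is not the S3 API: the class and its methods are invented for illustration. It only mimics the behaviour that a plain delete adds a delete marker on top of the version stack instead of erasing older versions.

```python
import itertools


class VersionedBucket:
    """In-memory toy model of a versioning-enabled bucket (not the real S3 API)."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, data); data None = delete marker
        self._ids = itertools.count(1)

    def put(self, key, data):
        """Every write adds a new version instead of replacing the old one."""
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key):
        """A plain delete erases nothing; it stacks a delete marker on top."""
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, None))
        return version_id

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            # Without a version ID you see the latest version only.
            if not versions or versions[-1][1] is None:
                raise KeyError(key)
            return versions[-1][1]
        for vid, data in versions:
            if vid == version_id and data is not None:
                return data
        raise KeyError((key, version_id))


bucket = VersionedBucket()
first = bucket.put("backup.sql", b"monday")
bucket.put("backup.sql", b"tuesday")
bucket.delete("backup.sql")
print(bucket.get("backup.sql", version_id=first))  # b'monday'
```

After the delete, a plain get fails, but the older versions are still reachable by version ID, which is what makes versioning useful as a backup tool.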
S3 Lifecycle Management:
Automates moving objects between the different storage tiers.
Can be used in conjunction with versioning.
Can be applied to current versions as well as previous versions.
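A lifecycle rule of this kind can be sketched as a plain configuration dictionary. The function name, the rule ID and the day counts below are assumptions chosen for illustration; the dictionary keys follow the shape that boto3's put_bucket_lifecycle_configuration accepts as LifecycleConfiguration.

```python
def lifecycle_configuration(prefix, ia_after_days=30, glacier_after_days=90,
                            noncurrent_glacier_after_days=30,
                            rule_id="tier-down"):
    """Build a lifecycle configuration that tiers objects down over time.

    All default values and the rule ID are illustrative, not recommendations.
    """
    if glacier_after_days <= ia_after_days:
        raise ValueError("Glacier transition must come after the IA transition")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Applies to the current version of each object.
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                # Applies to previous (noncurrent) versions when versioning is on.
                "NoncurrentVersionTransitions": [
                    {
                        "NoncurrentDays": noncurrent_glacier_after_days,
                        "StorageClass": "GLACIER",
                    },
                ],
            }
        ]
    }


config = lifecycle_configuration("logs/")
print(config["Rules"][0]["Transitions"])
```

The two transition lists show how one rule can treat the current version and the previous versions differently.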
Useful Links:
https://d1.awsstatic.com/whitepapers/Security/AWS_Security_Best_Practices.pdf
S3 Object Lock:
Prevents your S3 objects from being overwritten or deleted.
It uses a write once, read many (WORM) model.
S3 Object Lock Governance mode:
Users can't overwrite, delete or alter an object version unless they have special permissions to do so.
In Governance mode you protect your objects against being deleted by most users, but you can still grant some users permission to alter the retention settings or delete the object if necessary.
S3 Object Lock Compliance mode:
In this mode a protected object version can't be overwritten or deleted by any user, including the root user.
The retention mode can't be changed and the retention period can't be shortened.
Compliance mode ensures that the object version can't be overwritten or deleted for the duration of the retention period.
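The difference between the two modes can be summed up in a few lines of logic. This is a toy decision function written for these notes, not how S3 is implemented; the function and parameter names are hypothetical, and has_bypass_permission stands in for the special permission mentioned under Governance mode.

```python
from datetime import datetime, timezone


def can_delete_version(mode, retain_until, now, has_bypass_permission=False):
    """Toy model: may a locked object version be deleted at time `now`?"""
    if now >= retain_until:
        return True  # the retention period has expired
    if mode == "GOVERNANCE":
        return has_bypass_permission  # only users with the special permission
    if mode == "COMPLIANCE":
        return False  # nobody, not even the root user
    raise ValueError(f"unknown mode: {mode}")


retain_until = datetime(2030, 1, 1, tzinfo=timezone.utc)
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(can_delete_version("GOVERNANCE", retain_until, now, has_bypass_permission=True))  # True
print(can_delete_version("COMPLIANCE", retain_until, now, has_bypass_permission=True))  # False
```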
Retention Period:
Legal Hold:
Glacier Vault:
CLI -> /folder1/sub1/1
S3 performance:
What are S3 Prefixes?:
https://<bucket_name>/folder1/subfolder1/test.jpeg
Prefix is folder1/subfolder1
https://<bucket_name>/folder2/subfolder1/test2.jpeg
Prefix is folder2/subfolder1
https://<bucket_name>/folder3/abhay.jpeg
Prefix is folder3
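The rule behind these examples is that the prefix is everything between the bucket name and the object name. A minimal sketch of that rule follows; the helper name prefix_of and the host "example-bucket" are placeholders, not part of any AWS library.

```python
from urllib.parse import urlparse


def prefix_of(url):
    """Return the prefix: the path between the bucket name and the object name."""
    key = urlparse(url).path.lstrip("/")      # e.g. "folder1/subfolder1/test.jpeg"
    prefix, _, _object_name = key.rpartition("/")
    return prefix                             # "" when the object sits at the top level


# "example-bucket" is a placeholder host standing in for <bucket_name>.
print(prefix_of("https://example-bucket/folder1/subfolder1/test.jpeg"))  # folder1/subfolder1
print(prefix_of("https://example-bucket/folder3/abhay.jpeg"))            # folder3
```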
S3 Performance:
S3 has extremely low latency.
You can get the first byte out of S3 in 100-200 milliseconds.
You can also achieve a high number of requests:
3,500 PUT/COPY/POST/DELETE requests per second per prefix.
5,500 GET/HEAD requests per second per prefix.
You can get better performance by spreading your reads across different prefixes.
Eg: If you are using 2 prefixes you can achieve 11,000 GET requests per second.
Eg: If you are using 4 prefixes you can achieve 22,000 GET requests per second.
Note: The more prefixes you use, the better the performance.
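The arithmetic behind the two examples is a simple multiplication of the per-prefix GET/HEAD limit by the number of prefixes. A quick sketch (the function name is made up for this example):

```python
GET_REQUESTS_PER_PREFIX = 5500  # GET/HEAD requests per second per prefix


def max_get_rate(prefix_count):
    """Upper bound on GET/HEAD requests per second when reads are spread evenly."""
    if prefix_count < 1:
        raise ValueError("need at least one prefix")
    return GET_REQUESTS_PER_PREFIX * prefix_count


for n in (1, 2, 4):
    print(n, "prefix(es):", max_get_rate(n), "GET requests per second")
```

This assumes the reads really are spread evenly; if most requests hit one prefix, that prefix's limit is still the bottleneck.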
Multipart uploads:
Splits a file into parts that are uploaded in parallel.
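To show what a multipart upload works with, here is a rough sketch that plans the parts for a file. The function name and the 8 MiB default part size are assumptions; the 5 MiB minimum part size (for every part except the last) and the 10,000 part maximum are S3 limits. Each planned part could then be uploaded by its own worker.

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum part size (all parts except the last)
MAX_PARTS = 10_000               # S3 maximum number of parts per upload


def plan_parts(file_size, part_size=8 * 1024 * 1024):
    """Return a list of (part_number, offset, length) tuples covering the file."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    parts = []
    offset = 0
    while offset < file_size:
        length = min(part_size, file_size - offset)  # the last part may be smaller
        parts.append((len(parts) + 1, offset, length))  # part numbers start at 1
        offset += length
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; use a larger part_size")
    return parts


MiB = 1024 * 1024
for part in plan_parts(20 * MiB):
    print(part)
```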
S3 byte-range fetches (download):
Speed up downloads by fetching byte ranges of a file in parallel.
You can also choose to download only part of a file (a partial download).
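A byte-range fetch is an ordinary GET with an HTTP Range header. The sketch below only computes the header values for splitting an object into chunks; the function name is made up, and actually sending the requests is left out.

```python
def range_headers(object_size, chunk_size):
    """Return HTTP Range header values that together cover the whole object."""
    if object_size <= 0 or chunk_size <= 0:
        raise ValueError("sizes must be positive")
    headers = []
    for start in range(0, object_size, chunk_size):
        end = min(start + chunk_size, object_size) - 1  # Range ends are inclusive
        headers.append(f"bytes={start}-{end}")
    return headers


print(range_headers(10, 4))  # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

Sending only the first header downloads just the first chunk, which is the partial download case; sending all of them in parallel speeds up a full download.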
S3 limitation while using KMS:
Quota issue: uploading a file calls KMS to encrypt and downloading a file calls KMS to decrypt, and both count towards the KMS request quota.
Limit
S3 Select:
Lets you run a SQL SELECT query against the contents of an object.
Eg: a compressed CSV file in a bucket. The select pulls the matching data directly, without downloading the whole file.
Get data by rows & columns.
Retrieve only a subset of the data.
Simple SQL expressions.
Saves money on data transfer and increases speed.
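As a sketch of what such a query looks like, the helper below builds the request parameters for an S3 Select call on a gzip-compressed CSV file with a header row. The helper name, bucket, key and column names are invented for illustration; the parameter names follow boto3's select_object_content.

```python
def select_request(bucket, key, columns, where=None):
    """Build S3 Select parameters for a gzip-compressed CSV object with a header row."""
    column_list = ", ".join("s." + column for column in columns)
    expression = f"SELECT {column_list} FROM S3Object s"
    if where:
        expression += f" WHERE {where}"
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        "InputSerialization": {
            "CSV": {"FileHeaderInfo": "USE"},  # first row holds the column names
            "CompressionType": "GZIP",
        },
        "OutputSerialization": {"CSV": {}},
    }


# Bucket, key and columns are placeholders for this example.
params = select_request("example-bucket", "people.csv.gz",
                        ["name", "city"], where="s.city = 'Pune'")
print(params["Expression"])
```

Only the rows and columns named in the expression come back, which is where the saving on data transfer comes from.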
Glacier Select:
Works like S3 Select but can be used on Glacier.