Friday, August 28, 2020

AWS Simple Storage Service (S3)

 S3 FEATURES:

  1. Tiered Storage available.

  2. Lifecycle Management.

  3. Versioning.

  4. Encryption.

  5. MFA For Deletion

  6. Securing data using Access Control Lists & Bucket Policies.


S3 STORAGE CLASSES:

  1. S3 Standard:

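  2. S3 IA (Infrequent Access):

  3. S3 One Zone IA: Infrequent-access data kept in a single Availability Zone (takes the place of the older RRS service).

  4. S3 Intelligent Tiering: Monitors how frequently objects are accessed and automatically moves them between the frequent-access and infrequent-access tiers.

  5. S3 Glacier: For data archiving. Retrieval takes minutes to hours.

  6. S3 Glacier Deep Archive: Lowest-cost storage class. Retrieval takes up to 12 hours (standard) or 48 hours (bulk).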


S3 COMPARISON:


S3 CHARGING:

  1. For Storing.

  2. On Requests.

  3. Storage management pricing (features such as inventory, analytics, and object tagging).

  4. Data Transfer Pricing. (Cross Region Replication)

  5. Transfer Acceleration. (Edge Location)


Exam Tips:

  1. S3 is object-based (i.e. it allows you to upload files).

  2. Files can be from 0 bytes to 5 TB in size.

  3. There is unlimited storage.

  4. Files are stored in a bucket.

  5. S3 uses a universal namespace, i.e. bucket names must be unique across the globe.

    1. https://abhay.s3.amazonaws.com

    2. https://abhay.s3.eu-west-1.amazonaws.com

  6. Since it is object-based, S3 is not suitable for installing an OS or a database.

  7. Use it only to store files.

  8. On a successful upload you get an HTTP 200 OK response.

  9. You can turn on MFA for deletion so that data won’t get deleted accidentally.

  10. Control access to buckets using bucket ACLs or bucket policies.
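
To make the bucket URL forms from tip 5 concrete, here is a minimal sketch that builds a virtual-hosted-style URL from a bucket name, an optional object key, and an optional region. The helper name `bucket_url` is hypothetical, and the regional host pattern `<bucket>.s3.<region>.amazonaws.com` is the form assumed here.

```python
from typing import Optional
from urllib.parse import quote


def bucket_url(bucket: str, key: str = "", region: Optional[str] = None) -> str:
    """Build a virtual-hosted-style HTTPS URL for a bucket or an object in it."""
    if region:
        host = f"{bucket}.s3.{region}.amazonaws.com"
    else:
        host = f"{bucket}.s3.amazonaws.com"
    # Percent-encode the key but keep "/" so the folder-like structure stays readable.
    return f"https://{host}/{quote(key, safe='/')}"


print(bucket_url("abhay"))
print(bucket_url("abhay", "folder1/test.jpeg", region="eu-west-1"))
```

Because the bucket name is part of the DNS host name, two accounts cannot own the same bucket name, which is why the namespace is global.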


KEY FUNDAMENTALS OF S3:

  1. Key: Filename (Simply name of the file).

  2. Value: The data itself, made up of a sequence of bytes.

  3. Version ID: Important in terms of versioning.

  4. Metadata: Information about the data you are storing (data about the data).

  5. Sub Resources:

    1. Access Control Lists. (permission) 

    2. Torrents.


Consistency:

  1. Read-after-write consistency for PUTs of new objects.

  2. Eventual consistency for overwrite PUTs and DELETEs (changes can take some time to propagate).

  3. Note: Since December 2020, S3 provides strong read-after-write consistency for all PUTs and DELETEs, so the two rules above describe the earlier behavior.


IMPORTANT NOTE: 

  • Read the FAQs.

  • Read the whitepapers.


S3 Pricing Tier:

  • What drives the price:

    • Storage.

    • Requests & Data Retrieval.

    • Data Transfer.

    • Management & Replication.

  • What are the different Tiers:

    • S3 Standard.

    • S3 IA

    • S3 One Zone IA

    • S3 Intelligent Tiering

    • S3 Glacier

    • S3 Glacier Deep Archive.

  • Understanding how to get the best out of S3.

    • S3 Standard -> most expensive.

    • S3 Intelligent Tiering.

    • S3 IA 

    • S3 One Zone IA

    • S3 Glacier.

    • S3 Glacier Deep Archive.

  • Tip: Prefer a cheaper storage class over S3 Standard whenever the access pattern allows it.

  • Scenario based questions.


S3 Security & Encryption:

  1. By default, all newly created buckets are private.

  2. You can setup access control to your bucket using:

    1. Access Control Lists → Object Level

    2. Bucket Policies → Bucket Level

  3. S3 buckets can be configured to create access logs, which log all requests made to the bucket. The logs can be sent to another bucket, including a bucket in another account.

  4. Note: To monitor access to a bucket, enable the logging service on it (use a new bucket for the logs).

  5. Encryption in transit: SSL/TLS (HTTPS).

  6. Encryption at rest (server side):

    1. S3-managed keys: SSE-S3 (server-side encryption with S3).

    2. AWS Key Management Service managed keys: SSE-KMS.

    3. Customer-provided keys: SSE-C.

  7. Client Side Encryption (upload encrypted file).

  8. VPC endpoints for private access to S3.

  9. MFA on delete.

  10. Pre-signed URLs (temporary URLs that are valid for a specific time period).
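
As an illustration of how a bucket policy (item 2) can enforce encryption in transit (item 5), the sketch below builds a policy document that denies every request that does not arrive over HTTPS. The helper name `https_only_policy` and the statement ID are made up for this example; the bucket name `abhay` is reused from the exam tips. Treat it as a starting point rather than a complete security policy.

```python
import json


def https_only_policy(bucket: str) -> dict:
    """Return a bucket policy document that denies any request not sent over TLS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",  # illustrative statement ID
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object inside it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # aws:SecureTransport is "false" for plain-HTTP requests.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


print(json.dumps(https_only_policy("abhay"), indent=2))
```

The printed JSON is what you would paste into the bucket's policy editor. An explicit Deny wins over any Allow, so this statement can sit alongside other statements that grant access.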


S3 Versioning

  1. Stores all versions of an object, including every write, and keeps them even if you delete the object.

  2. Great backup tool (e.g. versioned MySQL backups).

  3. Once versioning is enabled it cannot be disabled, only suspended. To remove it completely you need to delete the bucket and recreate it.

  4. Integrates with lifecycle rules.

  5. Versioning supports MFA for delete (extra security).

Tips:

  1. Stores all versions of an object (including writes and deletes).

  2. Backup tool.

  3. Once enabled it cannot be disabled, only suspended.

  4. Integrates with lifecycle rules.

  5. Versioning has an MFA Delete capability, which requires multi-factor authentication to delete.
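
The "stores all versions, even after a delete" behavior is easier to see in a toy model. The class below is not the AWS API; it is a hypothetical in-memory sketch (`ToyVersionedBucket`, with invented version IDs like `v1`) of how a versioned bucket keeps every write and treats a plain delete as one more entry, a delete marker, on top of the history.

```python
import itertools
from typing import Dict, List, Optional, Tuple

DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class ToyVersionedBucket:
    """In-memory model of a versioned bucket (illustration only)."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Tuple[str, object]]] = {}
        self._ids = itertools.count(1)

    def _append(self, key: str, value: object) -> str:
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, value))
        return version_id

    def put(self, key: str, data: bytes) -> str:
        # Every write adds a new version; nothing is overwritten.
        return self._append(key, data)

    def delete(self, key: str) -> str:
        # A plain delete erases nothing; it stacks a delete marker on top.
        return self._append(key, DELETE_MARKER)

    def get(self, key: str, version_id: Optional[str] = None) -> Optional[bytes]:
        history = self._versions.get(key, [])
        if version_id is None:
            # Latest version wins; a delete marker makes the object look gone.
            if not history or history[-1][1] is DELETE_MARKER:
                return None
            return history[-1][1]
        for vid, value in history:
            if vid == version_id:
                return None if value is DELETE_MARKER else value
        return None


bucket = ToyVersionedBucket()
v1 = bucket.put("backup.sql", b"dump-1")
v2 = bucket.put("backup.sql", b"dump-2")
bucket.delete("backup.sql")
print(bucket.get("backup.sql"))      # None: the latest entry is a delete marker
print(bucket.get("backup.sql", v1))  # b'dump-1': older versions are still there
```

This is why versioning works as a backup tool: an accidental delete or overwrite only hides the data, and an earlier version can still be fetched by its version ID.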


S3 Lifecycle Management:

  1. Automates moving objects between the different storage tiers.

  2. Can be used in conjunction with versioning.

  3. Can be applied to the current version as well as to previous versions.
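
A lifecycle rule is just a small configuration document. The sketch below builds one in the shape that boto3's `put_bucket_lifecycle_configuration` call takes as its `LifecycleConfiguration` argument. The helper name, the rule ID, and the day counts (30 and 90) are assumptions chosen for illustration, not recommendations.

```python
import json


def lifecycle_configuration(ia_after_days: int = 30,
                            glacier_after_days: int = 90,
                            noncurrent_glacier_after_days: int = 30) -> dict:
    """Build a lifecycle configuration that tiers objects down as they age."""
    return {
        "Rules": [
            {
                "ID": "tier-down-then-archive",  # illustrative rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: apply to every object
                # Current versions: Standard -> IA -> Glacier.
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                # Previous versions (needs versioning): straight to Glacier.
                "NoncurrentVersionTransitions": [
                    {
                        "NoncurrentDays": noncurrent_glacier_after_days,
                        "StorageClass": "GLACIER",
                    },
                ],
            }
        ]
    }


print(json.dumps(lifecycle_configuration(), indent=2))
```

The `Transitions` list covers the current version and `NoncurrentVersionTransitions` covers previous versions, which is how one rule applies to both.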

Useful Links:

https://d1.awsstatic.com/whitepapers/Security/AWS_Security_Best_Practices.pdf


S3 Object Lock:

  • Protects your S3 objects from being overwritten or deleted.

  • It uses a write once, read many (WORM) model.


S3 Object Lock Governance mode:

  • Users can’t overwrite, delete, or alter an object unless they have special permission to do so.

  • In Governance mode you protect your objects against being deleted by most users, but you can still grant some users permission to alter the retention settings or delete the object if necessary.


S3 Object Lock Compliance mode:

  • In this mode the object can’t be overwritten or deleted by any user, including the root user.

  • The retention mode can’t be changed and the retention period can’t be shortened.

  • Compliance mode ensures that the object version can’t be overwritten or deleted for the duration of the retention period.


Retention Period:


Legal Hold:

Glacier Vault:


CLI -> /folder1/sub1/1



S3 performance:

  • What are S3 prefixes?

    • https://<bucket_name>/folder1/subfolder1/test.jpeg 

      • Prefix is folder1/subfolder1

    • https://<bucket_name>/folder2/subfolder1/test2.jpeg

      • Prefix  is folder2/subfolder1

    • https://<bucket_name>/folder3/abhay.jpeg

      • Prefix is folder3

  • S3 Performance:

    • S3 has extremely low latency.

    • You can get the first byte out of S3 in 100-200 milliseconds.

    • You can also achieve a high number of requests:

      • 3,500 PUT/COPY/POST/DELETE requests per second per prefix.

      • 5,500 GET/HEAD requests per second per prefix.

    • You can get better performance by spreading your reads across different prefixes.

      • E.g. with 2 prefixes you can achieve 11,000 GET/HEAD requests per second.

      • E.g. with 4 prefixes you can achieve 22,000 GET/HEAD requests per second.

        • Note: The more prefixes you use, the better the performance.

    • Multipart uploads:

      • Upload the parts of a large file in parallel.

    • S3 byte-range fetches (download):

      • Speed up downloads by fetching byte ranges in parallel.

      • You can also choose to download only part of a file (partial download).
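
The prefix arithmetic and the byte-range idea above can be sketched in a few lines. The helper names are hypothetical. The 5,500 GET/HEAD figure is the per-prefix number from these notes, and the range strings follow the standard HTTP `Range` header format (`bytes=start-end`, both ends inclusive), which is what a byte-range fetch sends.

```python
from typing import List

GET_RATE_PER_PREFIX = 5500  # GET/HEAD requests per second per prefix, from the notes


def key_prefix(key: str) -> str:
    """Everything before the final '/' in an object key ('' if there is none)."""
    return key.rpartition("/")[0]


def aggregate_get_rate(keys: List[str]) -> int:
    """GET/HEAD requests per second available if reads spread over the keys' prefixes."""
    return len({key_prefix(k) for k in keys}) * GET_RATE_PER_PREFIX


def byte_ranges(size: int, part_size: int) -> List[str]:
    """HTTP Range header values that together cover an object of `size` bytes."""
    return [
        f"bytes={start}-{min(start + part_size, size) - 1}"
        for start in range(0, size, part_size)
    ]


keys = [
    "folder1/subfolder1/test.jpeg",
    "folder2/subfolder1/test2.jpeg",
    "folder3/abhay.jpeg",
]
print([key_prefix(k) for k in keys])  # ['folder1/subfolder1', 'folder2/subfolder1', 'folder3']
print(aggregate_get_rate(keys))       # 16500 (3 distinct prefixes x 5500)
print(byte_ranges(10, 4))             # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

Each range can be requested on its own connection and the pieces stitched back together, and asking for only the first range is how you download just the start of a file.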


S3 Limitation while using KMS

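  1. Quota issue: every upload of an SSE-KMS object calls KMS to encrypt (GenerateDataKey) and every download calls KMS to decrypt (Decrypt), and these calls count toward the KMS request quota.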

  2. Limit 

S3 Select:

  1. Lets you run a SQL SELECT query against the contents of an object.

    1. E.g. for a zipped CSV file in a bucket, the select pulls the matching data directly out of the file.

  2. Get data by rows & columns.

  3. Retrieve only a subset of the data.

  4. Uses simple SQL expressions.

  5. Save money on data transfer and increase speed.
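
To show what such a query looks like, the sketch below assembles the request parameters in the shape that boto3's `select_object_content` call accepts. Everything specific here is an assumption for illustration: the helper name, the object key `people.csv.gz`, the column names, and the choice of GZIP as the compression type for the "zipped CSV" case.

```python
def select_request(bucket: str, key: str, expression: str) -> dict:
    """Build keyword arguments for an S3 Select query over a gzipped CSV object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        "InputSerialization": {
            # "USE" treats the first row as column names usable in the query.
            "CSV": {"FileHeaderInfo": "USE"},
            "CompressionType": "GZIP",
        },
        # Return the matching rows as CSV text.
        "OutputSerialization": {"CSV": {}},
    }


request = select_request(
    bucket="abhay",
    key="people.csv.gz",  # hypothetical object
    expression="SELECT s.name, s.city FROM S3Object s WHERE s.city = 'Pune'",
)
print(request["Expression"])
```

Only the rows and columns named in the expression leave S3, which is where the data-transfer saving and the speed-up come from.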

Glacier Select:

  1. Same as S3 Select, but can be used on Glacier.





