References
Introduction
Storage Tiers Or Classes
Charges
Security
Policies
- Bucket Public Settings
- Bucket Policy
Encryption
CORS Configuration
Performance Optimization

References

S3 FAQ - AWS

Introduction

Simple Storage Service
Provides developers and IT teams with secure, durable, highly-scalable object storage.
For files, images etc. not for Operating systems or databases.
Object based storage.
Data is spread across multiple devices and facilities.
Files can be from 0 Bytes to 5 TB.
Unlimited Storage.
Files are stored in Buckets(similar to folder).
S3 is a universal namespace ie. names must be unique globally same as DNS.
ex. https://s3-eu-west-1.amazonaws.com/acloudguru
HTTP 200 on successful upload for cli.
Read after write consistency for PUTS of new objects.
Eventual consistency for overwrite PUTS and DELETES.(can take some time to propagate).
S3 is Object based
- Key - name of the object
- Value - the data, made up of sequence of bytes.
- Version ID
- Metadata - Data about data being stored
- Subresources - bucket specific configuration
  - bucket policies, Access control list.
  - Cross origin resource sharing(CORS).
  - Transfer acceleration
Built for 99.99% Availability - time for which service is available
- Amazon guarantees 99.9% availability.
Amazon guarantees 99.999999999% durability - Amount of data that will not corrupt.
Tiered Storage
Lifecycle management
Versioning
Encryption
Secure your data - Access Control Lists and Bucket Policies

Storage Tiers Or Classes

S3

99.99% Availability
99.999999999% Durability
Stored redundantly across multiple devices in multiple facilities.
Designed to sustain loss of 2 facilities concurrently.

S3 IA (Infrequently Accessed)

For data that is accessed less frequently, but requires rapid access when needed.
Lower price than S3 but you are charged a retrieval fee.

S3 One Zone IA

Same as IA however data is stored in a single Availability zone.
99.5% Availability
99.999999999% Durability
Costs 20% lesser than regular S3-IA.

Reduced Redundancy Storage

99.99% availability and durability
Used for data that can be created if lost, eg. thumbnails.
Amazon suggesting to use normal S3 rather than Reduced Redundancy Storage, adjustments made in pricing
May phase out in future

Glacier

Very cheap
Used for archival only
Optimised for data that is infrequently accessed
It takes 3-5 hours to restore from glacier.

s3-tiers

S3 Intelligent Tiering

Unknown or unpredictable access patterns
2 tiers - frequent and infrequent access
Automatically moves your data to most cost-effective tier based on how frequently you access each object.
frequent access tier object not accessed for 30 days –> move to infrequent tier.
as soon as infrequent tier object is accessed, it is moved to frequent access tier.
99.999999999% durability
99.9% availability
Optimizes cost
No fees for accessing your data but small monthly fee for monitoring/automation $0.0025 per 1000 objects.

Charges

Storage per GB
Requests (Get, Put, Copy etc.)
Storage management pricing
- INventory, Analytics, and Object tags
Data Management Pricing
- Data transferred out of S3
Transfer Acceleration

Security

By default, all newly created buckets are private.
Can setup access control to buckets using
- Bucket Policies - Applied at bucket level
- Access Control List - Applied at an object level
S3 buckets can be configured to create access logs, which log all the requests made to the S3 bucket. These logs can be written to another bucket.

Policies

AWS --> S3 --> Shows global view of all regions --> select region --> Create bucket --> can create folders

Enable Versioning Checkbox
Enable Server Access Logging
Can use tags to track project costs.
Enable object level logging using AWS cloudTrail
Enable Default Encryption
- AES-256 : Uses server side encryption with Amazon S3-Managed Keys (SSE-S3)
- AWS-KMS : Uses Server Side Encryption with AWS KMA-Managed Keys(SSE-KMS)
CloudWatch request metrics

Select Folder --> upload files -->

Set permissions
can set encryption

Bucket Public Settings

select bucket --> Permissions -->

bucket-permissions-public-settings

If bucket policy is set as private, you cannot upload public objects to it.

Bucket Policy

Use policy generator to create document.
Policy Type - SQS, S3, VPC, IAM, SNS Topic
Effect - Allow, Deny
Principal - User (ARN), S3 Service, AWS account number etc. (Entity for policy)
Action - action for AWS Service
ARN (Amazon Resource Name) - bucket ARN (target object)
Click Add Statement to add policy statement, can add multiple statements
Click Generate policy
Copy paste the policy to bucket –> permissions –> bucket policy

bucket-permissions-policies

Encryption

Types of Encryption

Encryption In Transit
- SSL/TLS ie. https
- For data you are transferring to and from your bucket
Encryption At Rest - Server Side Encryption
- S3 Managed Keys - SSE-S3
  - AES-256, Advanced encryption standard 256 bits
  - Each object encrypted with its own key using strong multifactor encryption
  - Additional Step - Also encrypt the key with master key, AWS manages and rotates the keys for you.
- AWS Key Management Service, Managed Keys, SSE-KMS
  - AWS managed keys with additional benefits.
  - You get a Separate permission for use of an additional key called Envelope Key.
  - Envelope Key encrypts yours data’s encryption key.
  - You also get an audit trail for the use of your encryption keys, so you can see who used it and why.
- Server Side Encryption with customer provided keys - SSE-C
  - AWS manages encryption and decryption but you manage your own keys.
  - You are incharge of rotating those keys and lifecycle of those keys.
Client Side Encryption
- You encrypt the files yourself before uploading to S3.

Enforcing Encryption on S3 Buckets

Everytime a file is uploaded to S3, a PUT request is initiated.

encrypt-s3

PUT /filename HTTP/1.1
HOst bucket dns
Authorization
x-amz-meta-author: Faye
Expect: 100-continue
- Tells S3 not to send your RequestBody until recieves an acknowledgement.
- S3 can reject your request based on the contents of your header.
If file is to be encrypted at upload time, header will have x-amz-server-side-encryption parameter.
- x-amz-server-side-encryption: AES256 (SSE-S3 - S3 managed keys)
- x-amz-server-side-encryption: ams:kms (SSE-KMS - KMS managed keys)
- Can enforce use of SSE by using bucket policy which denies any s3 PUT request which doesn’t include header param x-amz-server-side-encryption

HTTP 100 Continue

Read more about 100 Continue

Informational status response code indicates that everything so far is OK and that the client should continue with the request or ignore it if it is already finished.

To have a server check the request’s headers, a client must send Expect: 100-continue as a header in its initial request and receive a 100 Continue status code in response before sending the body.

encrypt-s3-policy

encrypt-s3-policy-json

Here add “/*” for Resource if error shows up “Action does not apply to any resource in statement”

CORS Configuration

Cross origin resource sharing
S3 buckets could be used to host a static website.
- create public bucket
- properties –> static website hosting
  - index.html
  - error.html
to enable cors
- go to target bucket being accessed
- permissions –> CORS configuration –> update Allowed Origin
Use Case - Static Website
- put images, js, html in different buckets

Performance Optimization

By default, designed to support very high request rates.
However, if S3 buckets are routinely receiving >100 PUTS/LIST/DELETE or >100 GET requests per second, then there are some best practice guidelines to optimize S3 performance.
Guidance is based on type of workload you are running
- GET-Intensive Workloads - Use CloudFront content delivery service to get best performance. CloudFront will cache your most frequently accessed objects and will reduce latency for your GET requests.
- Mixed Request Type Workloads - a mix of GET, PUT, DELETE
  - the key names being used for your objects impact performance for intensive workloads.
  - S3 uses keyname to determine which partition an object will be stored in.
  - Use of sequencial keynames eg. name prefixed with timestamp or alphabetical sequence increases the likelihood of having many objects stored on same partition.
  - For heavy workloads this can cause I/O issues and contention.
  - By using random prefix (like hex hash) to the key names, you can force S3 to distribute your keys accross multiple partitions, distributing the I/O workload.
S3 performance updates
- 3,500 PUT requests per second
- 5.500 GET requests
- This negates the previous guidance to randomize your object names, now sequencial and logical naming patterns can used without any performance implications.