Amazon S3
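- object-based storage
- max object size 5 TB (total bucket storage is unlimited)
- a single PUT is limited to 5 GB; larger objects must use multipart upload
- objects also have metadata, tags and an optional version ID
- strong read-after-write consistency
- 3,500 writes (PUT/COPY/POST/DELETE) and 5,500 reads (GET/HEAD) per second per prefix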
Requester Pays
- the requester pays the request and download (data transfer) cost; the bucket owner still pays for storage
- use to share large data sets with other accounts
- the requester must be authenticated in AWS (no anonymous access)
S3 event notifications
- trigger an action (e.g. Lambda) on API requests to the bucket, such as object created or removed
- can also use EventBridge for more targets and filters
Targets
- SNS topic, SQS queue, Lambda function
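A minimal sketch of what a notification configuration for one Lambda target could look like. The dict follows the shape that boto3's `put_bucket_notification_configuration` accepts as `NotificationConfiguration`; the helper name, the function ARN and the prefix/suffix values are made up for illustration.

```python
import json


def lambda_notification(function_arn, events=("s3:ObjectCreated:*",),
                        prefix=None, suffix=None):
    """Build a notification configuration dict for a single Lambda target."""
    rules = []
    if prefix is not None:
        rules.append({"Name": "prefix", "Value": prefix})
    if suffix is not None:
        rules.append({"Name": "suffix", "Value": suffix})

    target = {"LambdaFunctionArn": function_arn, "Events": list(events)}
    if rules:
        # Key filters narrow the events down to matching object names.
        target["Filter"] = {"Key": {"FilterRules": rules}}
    return {"LambdaFunctionConfigurations": [target]}


# Hypothetical ARN and filter values, for illustration only.
notification = lambda_notification(
    "arn:aws:lambda:eu-west-1:111122223333:function:resize-image",
    prefix="uploads/",
    suffix=".jpg",
)
print(json.dumps(notification, indent=2))
```

SNS and SQS targets would use the sibling keys `TopicConfigurations` and `QueueConfigurations` in the same dict.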
S3 Access Logs
- monitor traffic requests to an S3 bucket
- can be used for access and security audits
- helps to understand the S3 bill
MFA-protected API Access
- enforce MFA for access to S3 resources
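One way to enforce this is a bucket policy that denies requests made without MFA. The sketch below builds such a policy as a plain dict; it uses the `aws:MultiFactorAuthAge` condition key, and the helper name and the `confidential/` prefix are assumptions for illustration.

```python
import json


def require_mfa_policy(bucket, prefix=""):
    """Deny all S3 actions under a prefix when the request carries no MFA context."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyWithoutMFA",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
                # The key is absent (null) when the caller did not authenticate with MFA.
                "Condition": {"Null": {"aws:MultiFactorAuthAge": "true"}},
            }
        ],
    }


# Bucket name taken from the sync example in these notes; the prefix is hypothetical.
mfa_policy = require_mfa_policy("DOC-EXAMPLE-BUCKET-SOURCE", "confidential/")
print(json.dumps(mfa_policy, indent=2))
```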
MFA Delete
- enforce MFA for permanently deleting object versions (requires versioning)
Versioning
- if an object is updated the old version still persists within the bucket
- if an object is deleted the new version will be a delete marker, and the old version will still persist within the bucket
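The versioning behaviour above can be illustrated with a toy in-memory model. This is not the S3 API, only a sketch of the semantics; the class name and the `v1`, `v2` style version IDs are invented.

```python
import itertools


class VersionedBucket:
    """Toy model of a versioned bucket: every write adds a version, nothing is overwritten."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, body); body None = delete marker
        self._ids = itertools.count(1)

    def put(self, key, body):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key):
        # A delete without a version ID only adds a delete marker on top.
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, None))
        return version_id

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            # The newest entry is the current version; a delete marker reads as "not found".
            return versions[-1][1] if versions else None
        for vid, body in versions:
            if vid == version_id:
                return body
        return None


bucket = VersionedBucket()
v1 = bucket.put("report.csv", b"old")
v2 = bucket.put("report.csv", b"new")
bucket.delete("report.csv")
print(bucket.get("report.csv"))      # None (current version is a delete marker)
print(bucket.get("report.csv", v1))  # b'old' (old version still persists)
```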
Commands
S3 Sync
- copies objects between buckets
- does a diff and only copies new or changed objects
- uses size and last-modified time to decide what to update
- if the bucket is versioned, only the current version is copied
aws s3 sync s3://DOC-EXAMPLE-BUCKET-SOURCE s3://DOC-EXAMPLE-BUCKET-TARGET
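A simplified sketch of the diff step, assuming the default comparison on size and last-modified time. It models the decision only and does not call AWS; the function name and the sample listings are invented.

```python
from datetime import datetime, timezone


def keys_to_copy(source, target):
    """Return keys that a sync would copy.

    source/target map key -> (size_bytes, last_modified).
    """
    to_copy = []
    for key, (size, modified) in sorted(source.items()):
        if key not in target:
            to_copy.append(key)  # missing in the target
            continue
        target_size, target_modified = target[key]
        if size != target_size or modified > target_modified:
            to_copy.append(key)  # changed size or newer in the source
    return to_copy


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t1 = datetime(2024, 2, 1, tzinfo=timezone.utc)
source = {"a.txt": (10, t0), "b.txt": (20, t1), "c.txt": (5, t0)}
target = {"a.txt": (10, t1), "b.txt": (20, t0)}
print(keys_to_copy(source, target))  # ['b.txt', 'c.txt']
```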
Storage Tiers
- objects can transition to Infrequent Access only after 30 days in the default tier
Glacier
- some applications can save data directly into Glacier (e.g. Tape Gateway, DataSync); you can also do this manually with the CLI
- Snowball data cannot be imported directly into Glacier; it lands in the default tier and is transitioned from there
- low storage cost but higher retrieval cost and time
- retrieve within 12 hours (standard retrieval 3 to 5 hours, bulk retrieval 5 to 12 hours)
Expedited Retrieval
- 1 to 5 minutes of retrieval time
Provisioned Retrieval Capacity
- guaranteed up to 150 MB/s retrieval throughput per provisioned unit
Glacier Deep Archive
- about 75% cheaper than Glacier
- higher retrieval fee than Glacier
- no expedited retrieval
- bulk retrieval has a lower fee but takes up to 48 hours
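The retrieval options can be sketched as a small helper that builds a restore request and rejects expedited retrieval for Deep Archive. The dict has the shape boto3's `restore_object` takes as `RestoreRequest`; the helper name and the 7-day default are assumptions.

```python
# Retrieval tiers per storage class, as described in the notes above.
ALLOWED_TIERS = {
    "GLACIER": {"Expedited", "Standard", "Bulk"},
    "DEEP_ARCHIVE": {"Standard", "Bulk"},  # no expedited retrieval
}


def restore_request(storage_class, tier, days=7):
    """Build a restore request; `days` is how long the restored copy stays available."""
    if storage_class not in ALLOWED_TIERS:
        raise ValueError(f"not an archive storage class: {storage_class}")
    if tier not in ALLOWED_TIERS[storage_class]:
        raise ValueError(f"{tier} retrieval is not available for {storage_class}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


print(restore_request("GLACIER", "Expedited"))
print(restore_request("DEEP_ARCHIVE", "Bulk", days=2))
```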
Infrequent Access
- for less frequently accessed data that must still be instantly available
- low per-GB storage price
- per-GB retrieval fee
One Zone / One Zone IA
- single-AZ storage, about 20% cheaper than Standard-IA
Lifecycle Rules
Transition Actions
- move objects to a different storage tier after a set time
- can transition to Infrequent Access only after 30 days
Transfer Limitations for Lifecycle Rules
Expiration Actions
- delete access logs
- delete old versions of files
- delete incomplete multipart uploads
- can apply to a prefix (path) or the full bucket
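A sketch of one lifecycle rule that combines transition and expiration actions. The dict has the shape boto3's `put_bucket_lifecycle_configuration` takes as `LifecycleConfiguration`; the helper name, the rule ID, the `logs/` prefix and all day counts are assumptions.

```python
import json

MIN_DAYS_BEFORE_IA = 30  # objects must stay 30 days before moving to Infrequent Access


def lifecycle_rule(rule_id, prefix, ia_after_days, glacier_after_days,
                   expire_old_versions_after_days, abort_multipart_after_days):
    """Build a single lifecycle rule scoped to a key prefix."""
    if ia_after_days < MIN_DAYS_BEFORE_IA:
        raise ValueError("transition to Infrequent Access needs at least 30 days")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Transition actions: move to cheaper tiers over time.
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        # Expiration actions: clean up old versions and unfinished uploads.
        "NoncurrentVersionExpiration": {"NoncurrentDays": expire_old_versions_after_days},
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": abort_multipart_after_days},
    }


lifecycle = {"Rules": [lifecycle_rule("archive-logs", "logs/", 30, 90, 365, 7)]}
print(json.dumps(lifecycle, indent=2))
```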
Transfer Acceleration
- fast transfer over long distances
- uses cloudfront edge locations
- routes via optimized path
Multipart uploads
- transmit the object in chunks (parts)
- if one chunk fails, only that chunk needs to be retransmitted
- should be considered for objects over 100 MB
- improved throughput
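The chunking can be sketched as a part planner. It assumes the documented S3 limits of a 5 MiB minimum part size (except the last part) and at most 10,000 parts per upload; the function name and the 8 MiB default are invented.

```python
import math

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB  # every part except the last must be at least 5 MiB
MAX_PARTS = 10_000       # upper bound on parts per multipart upload


def plan_parts(object_size, part_size=8 * MIB):
    """Return a list of (part_number, offset, length) covering the whole object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    part_size = max(part_size, MIN_PART_SIZE)
    # Grow the part size if the object would otherwise need more than 10,000 parts.
    part_size = max(part_size, math.ceil(object_size / MAX_PARTS))

    parts = []
    offset = 0
    number = 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


parts = plan_parts(100 * MIB, part_size=32 * MIB)
print(len(parts))  # 4
print(parts[-1])   # (4, 100663296, 4194304)
```

Each tuple can be uploaded independently and in parallel, so a failed part is retried on its own.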
S3 Bucket Policies
- allow or deny permissions on buckets and objects
- bucket policies are attached to buckets; IAM policies are attached to users, groups or roles
- can grant access to other AWS accounts
- can restrict based on multiple conditions of the request (IP, time, SSL)
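A sketch of a bucket policy that grants another account read access from one IP range and denies any request not sent over SSL/TLS. The helper name, the account ID and the CIDR range are placeholders for illustration.

```python
import json


def cross_account_read_policy(bucket, account_id, allowed_cidr):
    """Allow GetObject for another account from one CIDR; deny all non-TLS requests."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowOtherAccountRead",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {"IpAddress": {"aws:SourceIp": allowed_cidr}},
            },
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }


# Placeholder account ID and documentation-only IP range.
policy = cross_account_read_policy("DOC-EXAMPLE-BUCKET-SOURCE", "111122223333", "203.0.113.0/24")
print(json.dumps(policy, indent=2))
```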
ACL
- can also use ACL to grant another account access
Encryption
SSE-S3
- server-side encryption of objects
- keys are AWS managed
- objects are replicated by default
SSE-C
- server-side encryption
- customer-provided keys
- objects were originally not replicated (S3 replication now supports SSE-C objects)
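With SSE-C the key travels with every request. A sketch of the three request headers involved, assuming a 256-bit key: the key is sent base64-encoded together with a base64-encoded MD5 digest used as an integrity check. The helper name is invented.

```python
import base64
import hashlib
import os


def sse_c_headers(key):
    """Build the SSE-C request headers for a 32-byte customer-provided key."""
    if len(key) != 32:
        raise ValueError("SSE-C needs a 256-bit (32-byte) key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode("ascii"),
        # The digest lets S3 check that the key arrived intact.
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
            hashlib.md5(key).digest()
        ).decode("ascii"),
    }


customer_key = os.urandom(32)  # in practice the key must be stored safely by the customer
headers = sse_c_headers(customer_key)
print(sorted(headers))
```

The same key has to be supplied again on every GET, since S3 does not store it.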
SSE-KMS
- need to specify the KMS key for objects in the new (destination) bucket
- need an IAM role that can decrypt with the source key and encrypt with the new key
- might get KMS throttling errors; request a service quota increase if so
- can use multi-region KMS keys
Client Side Encryption
- Encrypt before upload and decrypt after download using own mechanism
Replication
- must have versioning in both buckets
- can be cross region or same region
- can be bucket in other account
- async
- use Batch Replication to replicate existing objects
- replication is not chained (objects replicated from A to B are not replicated on to C)
Use cases
- compliance
- latency
- replication across accounts
- log aggregation
- sync data to test account from prod
Performance
- 100-200 ms latency
- 3,500 PUT/COPY/POST/DELETE requests per second per prefix
- 5,500 read (GET/HEAD) requests per second per prefix
- use multipart upload for parallel upload
- use Transfer Acceleration to upload to an edge location instead of directly to S3; also use it for downloads if files are bigger than 1 GB
- byte-range fetches -> multipart download
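Byte-range fetches split one download into several ranged GET requests. A sketch of computing the `Range` header values, assuming fixed-size chunks; the function name is invented.

```python
def range_headers(object_size, chunk_size):
    """Return HTTP Range header values that cover the object in chunk_size pieces."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    headers = []
    for start in range(0, object_size, chunk_size):
        end = min(start + chunk_size, object_size) - 1  # HTTP ranges are inclusive
        headers.append(f"bytes={start}-{end}")
    return headers


print(range_headers(10, 4))  # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

Each range can be fetched in parallel and a failed range retried on its own, mirroring multipart upload.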
Select
- filter data server side to reduce network cost (100 lines out of 1 mil lines csv)
Batch operations
- modify all metadata and properties
- copy objects between buckets
- encrypt all unencrypted objects
- modify acl and tags
- restore objects from glacier
- invoke lambda for each object
- use s3 select to filter then batch
Glacier Vault Lock
- Objects can not be modified or deleted
- used for compliance
- Can lock the vault policy from future edits
Server Access Logging
- use this if CloudTrail does not give enough information
Features
- includes the referer
- includes the turnaround time