Skip to content

Pasted image 20221101113016.png

Amazon S3

  • object based storage
  • max storage 5 TB
  • if uploading more than 5gb use multi upload
  • also has metadata, tags and optional version id
  • read after write consistency
  • 3,5k writes and 5k puts per second

Requester Pays

  • Requester pays network and downlaod cost
  • use to share large data sets with other accounts
  • user must be autherized in aws

S3 event notifications

  • trigger an action (lambda) on api request to the bucket
  • can also use event bride for more targets and filters

Targtes

S3 Access Logs

  • monitor traffic requests to an S3 bucket
  • can be used for access and security audits
  • understand s3 bill

MFA-protected API Access

  • enforce MFA for access to S3 ressources

MFA delte

  • enforce MFA for delte object action

Versioning

  • if an object is updated the old version still persists within the bucket
  • if an object is deleted the new version will be a delete marker, and the old version will still persist within the bucket

Commands

S3 Sync

  • copies objects between buckets
  • does diff
  • used last modified to update
  • if versioned only copies current version
aws s3 sync s3://DOC-EXAMPLE-BUCKET-SOURCE s3://DOC-EXAMPLE-BUCKET-TARGET

Storage Tiers

  • can transition to infrequent access only after 30 days

Glacier

  • some applications can save data directly into Glacier (e.g. Tape gateway, data sync) , you can also do this manually by using cli
  • snowball transfer cannot be directly inserted only transfered from default tier
  • low cost but higher retrieval cost and time
  • retrieve within 12 hours

Expedited Retrival

  • 1 to 5 minutes of retrival time

Provisoned Retrival Capacity

  • guranteed up to 150mbs retrival speed

Glacier Deep Archive

  • 75% cheeper than Glacier
  • higher retriveal feed than glacier
  • no expedited retrival
  • can retrieve at lower fee using bulk retrieve but this will take 48 hours

Infreqnet Access

  • less freq access, but instantly available
  • low per GB Storage price
  • retrival fee

One Zone / One Zone IA

  • single AZ storage 20% less cost

Livecycle Rules

Transiton Actions

  • move to diffrent storage tier after time
  • can transition to infrequent access only after 30 days

Transfer Limitations for Livecycle Rules

Pasted image 20221101123650.png

Expiration Actions

  • delte access logs
  • delete old versions of files
  • delte incomplete multi part uploads
  • can apply to paths or full bucket

Tranfer Accelerator

  • fast transfer over long distances
  • uses cloudfront edge locations
  • routes via optimized path

Multipard uploaads

  • transmit object in chuncks
  • if one chunk fails, only that cuck needs to be retransmitted
  • over 100mb should be considered
  • improved throughput

S3 Bucket Policies

  • add or deny permissions to objects
  • can be attached to users groups or buckets
  • can grant access to other aws accounts
  • can restrict based on multi coditions for the requets (ip, time, ssl )

ACL

  • can also use ACL to grant another account access

Encryption

SEE S3

  • Server Side encryption of objects
  • aws manged
  • replication by default

SEE C

  • Server Side encryption
  • customer managed keys
  • obhects can not be replicated

SSE KMS

  • Need to specify need key for object in new bucket
  • need iam role to decrypt with source key and encrypt with new key
  • might get KMS throttling errors, need to ask for service quotas increase
  • can use multi region kms keys

Client Side Encryption

  • Encrypt before upload and decrypt after download using own mechanism

Replication

  • must have versioning in both buckets
  • can be cross region or same region
  • can be bucket in other account
  • async
  • use batch replication to replicate existing ones
  • cant chain replication

Use cases

  • compliance
  • latency
  • replication across accounbts
  • log aggregation
  • sync data to test account from prod

Performance

  • 100-200ms
  • 3,5k put copy post delte per prefix
  • 5,5k read per prefix
  • use multi part upload for parallell upload
  • transfer accel to upload to edge location instead of directyl to s3, also use for dl if files are bigger than 1GB
  • byte range fetches -> multi part download

Select

  • filter data server side to reduce network cost (100 lines out of 1 mil lines csv)

Batch operations

  • modify all metadata and properies
  • copy objects between buckets
  • encrypt all unencrypted objects
  • modify acl and tags
  • restore objects from glacier
  • invoke lambda for each object
  • use s3 select to filter then batch

Vault

  • Objects can not be modified or deleted
  • used for compliance
  • Can lock the vault policy from future edits

Server Access Logging

  • If CloudTrail is not enough information you use this

Features

  • including referer
  • including turn around time