AWS S3
Amazon S3 (Simple Storage Service) provides a simple web-services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, secure, fast, and inexpensive infrastructure that Amazon uses to run its own global network of websites. The service aims to maximize the benefits of scale and to pass those benefits on to developers.
INFO:
AWS Free Tier
Includes 5GB storage, 20,000 Get Requests, and 2,000 Put Requests with Amazon S3.
Introduction
Pricing
Standard | $0.02 per GB/month |
Infrequent Access | $0.01 per GB/month |
Glacier | $0.005 per GB/month |
Glacier Deep Archive | $0.00099 per GB/month |
http://aws.amazon.com/s3/pricing/
http://calculator.s3.amazonaws.com/index.html
https://www.skeddly.com/tools/ebs-snapshot-cost-calculator/
Optimising Costs
How to analyze and reduce S3 storage usage
Costs Example
Monthly cost of storing in S3 Standard: 753.27 GB x $0.024/GB = $18.08
Monthly cost of storing in S3 Glacier: 753.27 GB x $0.0045/GB = $3.39
Glacier overhead: 855,953 objects x 32 KB x $0.0045/GB = $0.12
S3 overhead: 855,953 objects x 8 KB x $0.024/GB = $0.16
Total monthly cost in Glacier (storage plus both overheads): $3.39 + $0.12 + $0.16 = $3.67
https://alestic.com/2012/12/s3-glacier-costs/
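A quick shell sketch to reproduce the arithmetic above (the sizes, object count, and per-GB rates are the example's figures, not live pricing):

# Compute the example's monthly costs with awk
GB=753.27
OBJECTS=855953
awk -v gb="$GB" -v n="$OBJECTS" 'BEGIN {
    printf "S3 Standard:      $%.2f\n", gb * 0.024
    printf "S3 Glacier:       $%.2f\n", gb * 0.0045
    printf "Glacier overhead: $%.2f\n", n * 32 / 1024 / 1024 * 0.0045
    printf "S3 overhead:      $%.2f\n", n * 8 / 1024 / 1024 * 0.024
}'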
s3cmd
S3cmd is a tool for managing objects in Amazon S3 storage. It allows for making and removing "buckets" and uploading, downloading and removing "objects" from these buckets. It runs on Linux and Mac.
Installation
sudo -i
apt-get install python3-setuptools
wget https://netcologne.dl.sourceforge.net/project/s3tools/s3cmd/2.0.2/s3cmd-2.0.2.tar.gz
tar -xzvf s3cmd-2.0.2.tar.gz
cd s3cmd-2.0.2/
python setup.py install
which s3cmd
s3cmd --version
or
sudo pip3 install s3cmd
Configuration
s3cmd --configure
Stored in...
/root/.s3cfg
Tweaks...
host_base = s3-eu-west-2.amazonaws.com
host_bucket = %(bucket)s.s3-eu-west-2.amazonaws.com
Usage
Help...
s3cmd --help
How to sync a folder but only include certain files or folders...
s3cmd --exclude '*' --include '*200901*' --no-progress --stats sync /home/MailScanner/archive/ s3://my.bucket.name/MailScanner/archive/
How to move folders and files but using BASH brace expansion...
s3cmd --dry-run mv --recursive s3://my.bucket.name/MailScanner/archive/200901{22..31} s3://my.bucket.name/MailScanner/archive/2009/
List failed multipart uploads...
s3cmd multipart s3://my.bucket.name/
Delete failed multipart upload...
s3cmd abortmp s3://BUCKET/OBJECT Id
Encrypted Sync
HOWTO:
Static Web Site
Bucket Policy ...
{ "Version": "2012-10-17", "Statement": [ { "Sid": "PublicReadGetObject", "Effect": "Allow", "Principal": "*", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::mybucketname/*" } ] }
https://docs.aws.amazon.com/AmazonS3/latest/userguide/website-hosting-custom-domain-walkthrough.html
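If you prefer the CLI to the console, the policy above can be applied and static hosting enabled like this (a sketch; the policy file name is hypothetical and the bucket name matches the policy above):

aws s3api put-bucket-policy --bucket mybucketname --policy file://public-read-policy.json
aws s3 website s3://mybucketname/ --index-document index.html --error-document error.html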
Convert a WordPress Web Site to Static HTML hosted on Amazon S3
Part I - S3 with no SSL
- Install the Simply Static plug-in
- Simply Static › Generate
- Download the ZIP file
- Extract the ZIP to your computer
- Create your S3 Bucket with the same name as the web site domain name you want using the instructions from the link above
- Change the permissions as instructed to 'Allow Public Access' and enable 'Static website hosting' using the instructions from the link above
- Upload the index.html file as a 'File' and then any folders as 'Folder' (tip - you have to go into the folder to select it!) - or use the CLI sketch after this list
- Complete the Route 53 part to add an A Record with an ALIAS to the S3 endpoint using the instructions from the link above
- Load the domain name in your web browser and check
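As an alternative to uploading through the console, a minimal CLI sketch (the bucket name www.example.com and the extracted ZIP folder ./simply-static-output/ are hypothetical; assumes public access and static website hosting are already configured as above):

aws s3 sync ./simply-static-output/ s3://www.example.com/ --delete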
Part II - CloudFront with Let's Encrypt SSL
Recover Deleted Objects
How can I retrieve an Amazon S3 object that was deleted in a versioning-enabled bucket?
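In a versioning-enabled bucket a delete only adds a delete marker, so removing that marker restores the object. A minimal sketch (bucket and key names are hypothetical; the VersionId comes from the first command's DeleteMarkers output):

# Find the delete marker's VersionId for the object
aws s3api list-object-versions --bucket mybucket --prefix path/file.txt

# Delete the delete marker itself; the previous version becomes current again
aws s3api delete-object --bucket mybucket --key path/file.txt --version-id DELETE_MARKER_VERSION_ID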
Disaster Prevention and Recovery
https://blog.cadena-it.com/linux-tips-how-to/how-to-backup-aws-s3-buckets/
Set Lifecycle Rule to EMPTY Bucket
https://repost.aws/knowledge-center/s3-empty-bucket-lifecycle-rule
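A sketch of the kind of lifecycle configuration the article describes - expire everything, clean up old versions, and abort stalled uploads (the 1-day values are illustrative):

{
  "Rules": [
    {
      "ID": "empty-bucket",
      "Filter": {},
      "Status": "Enabled",
      "Expiration": { "Days": 1 },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 1 },
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 1 }
    }
  ]
}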
Set Lifecycle Rule To Delete Failed Multipart Uploads
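A minimal rule sketch for this (the 7-day value is an arbitrary choice), applied with put-bucket-lifecycle-configuration as shown further down this page:

{
  "Rules": [
    {
      "ID": "abort-incomplete-multipart-uploads",
      "Filter": {},
      "Status": "Enabled",
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 }
    }
  ]
}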
Versioning
How to remove old versions of objects in S3 when using S3 versioning
CLI aws Install
This is the official AWS command line tool.
Version 2
Uninstall version 1 ...
sudo -H pip3 uninstall awscli
Install version 2 ...
cd /tmp/
curl -s "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip -q awscliv2.zip
sudo ./aws/install --bin-dir /usr/local/bin --install-dir /usr/local/aws-cli
aws --version
https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
Version 1
sudo -i
python3 --version
apt install python3-distutils
curl -s -O https://bootstrap.pypa.io/get-pip.py
python3 get-pip.py
pip3 install awscli
BASH Completion - http://docs.aws.amazon.com/cli/latest/userguide/cli-command-completion.html
CLI aws Upgrade
Version 2
cd /tmp/
curl --silent "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip -q awscliv2.zip
sudo ./aws/install --bin-dir /usr/local/bin --install-dir /usr/local/aws-cli --update
aws --version
cd
Version 1
sudo -H pip3 install --upgrade awscli
CLI aws Usage
aws configure
aws configure set default.s3.max_concurrent_requests 20
aws help
aws s3 help
aws s3 ls
aws s3 sync /tmp/foo s3://bucketname/foo
aws ec2 authorize-security-group-ingress --group-name launch-wizard-1 --protocol tcp --port 22 --cidr xxx.xxx.xx.xx/32
CLI Examples
https://www.thegeekstuff.com/2019/04/aws-s3-cli-examples/
1. To backup photos in your Syncthing directory (dryrun option added for testing)...
aws s3 sync --dryrun --exclude "*" --include "*201701*" /home/user/Syncthing/User/phone/photos/ s3://user.name.bucket/Photos/2017/01/
#!/bin/bash
#
# script to backup photos (taken the day before) to aws s3
#
YEAR=$( date +'%Y' -d "yesterday" )
MONTH=$( date +'%m' -d "yesterday" )
/usr/local/bin/aws s3 sync --exclude "*" --include "*${YEAR}${MONTH}*" /home/user/Syncthing/User/phone/photos/ s3://user.name.bucket/Photos/${YEAR}/${MONTH}/
exit
2. To move objects from one bucket to another bucket, or same bucket but different folder...
aws s3 mv s3://source/file1.txt s3://destination/file2.txt
aws s3 mv s3://source/file1.txt s3://source/folder/file1.txt
aws --profile profile2 s3 mv --dryrun --recursive --exclude "*" --include "archive-nfs/201502*" s3://source/ s3://destination/archive-nfs/MailArchive/
3. To use a different profile (for different customers)...
nano ~/.aws/credentials

[default]
aws_access_key_id = XXXXXX
aws_secret_access_key = XXXXXXXXXXXXX

[customer2]
aws_access_key_id = XXXXXX
aws_secret_access_key = XXXXXXXXXXXXX
aws --profile customer2 s3 ls
Thanks - http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html#cli-multiple-profiles
4. Delete multiple files...
aws --profile customer2 s3 rm --dryrun --recursive --exclude "*" --include "messages" s3://mybucket/folder/

(dryrun) delete: s3://mybucket/folder/subfolder/messages
5. Make bucket...
aws s3 mb s3://mybucket --region eu-west-1
6. Create folder... (the key here is the forward slash / at the end)
aws s3api put-object --bucket mybucket --key dir-test/
7. Create folder and sub folders... (the key here is the forward slash / at the end)
aws s3api put-object --bucket mybucket --key dir-test/subfolder1/subfolder2/
8. Size and Number of Files...
aws s3api list-objects --bucket mybucket --output json --query "[sum(Contents[].Size), length(Contents[])]"
9. Copy files from remote storage to local...
aws s3 cp --recursive s3://mybucket/myFolder/ ./path/to/folder/
10. Restore file from Glacier storage...
aws --profile=default s3api restore-object --bucket mybucket --key foldername/file.zip --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Standard"}}'
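The restore is asynchronous; progress can be checked with head-object, which reports an ongoing-request flag in its Restore field (same bucket and key as above):

aws --profile=default s3api head-object --bucket mybucket --key foldername/file.zip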
11. Upload folder to S3 as a tar.gz file without compressing locally first using pipes...
tar cvfz - /var/test | aws s3 cp - s3://mybucket/test1.tar.gz
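The same pipe trick works in reverse, extracting straight from S3 without a local copy of the archive (the destination folder is hypothetical and must already exist):

aws s3 cp s3://mybucket/test1.tar.gz - | tar xzf - -C /tmp/restore/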
12. Search for a filename...
aws --profile myprofile s3api list-objects --bucket files.mydomain.com --query "Contents[?contains(Key, 'S1255')]"
aws --profile myprofile s3 ls s3://files.mydomain.com/ --recursive | grep 'S1255'
13. Empty a bucket while keeping CPU usage to about 30% ...

sudo apt install cpulimit
/usr/bin/cpulimit -q -b -c 1 -e aws -l 30
nice aws s3 rm --quiet s3://my.bucket.name --recursive
https://github.com/aws/aws-cli/issues/3163
Official Guides
User Guide - http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html
Reference - http://docs.aws.amazon.com/cli/latest/index.html
Point in Time Restore
https://github.com/madisoft/s3-pit-restore
CLI Backup and File Versioning
Make S3 bucket...
aws s3 mb s3://bucketname.archive-test
Create S3 bucket archiving policy...
Example 1 = move all objects to Infrequent Access after 30 days, and then to Glacier after 60 days ...
nano aws_lifecycle_30_IA_60_GLACIER.json

{
  "Rules": [
    {
      "ID": "30 Days Transfer to IA, 60 Days Transfer to Glacier",
      "Filter": { "Prefix": "" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 60, "StorageClass": "GLACIER" }
      ],
      "NoncurrentVersionTransitions": [
        { "NoncurrentDays": 30, "StorageClass": "STANDARD_IA" },
        { "NoncurrentDays": 60, "StorageClass": "GLACIER" }
      ]
    }
  ]
}
Example 2 = only keep 1 current version and 1 non-current version of a file after 1 day ...
{ "Rules": [ { "Expiration": { "ExpiredObjectDeleteMarker": true }, "ID": "1 Days Versions", "Filter": {}, "Status": "Enabled", "NoncurrentVersionExpiration": { "NoncurrentDays": 1, "NewerNoncurrentVersions": 1 }, "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 1 } } ] }
Set S3 bucket archiving policy...
aws s3api put-bucket-lifecycle-configuration --bucket bucketname.archive-test --lifecycle-configuration file://aws_lifecycle_30_IA_60_GLACIER.json
Check S3 bucket archiving policy...
aws s3api get-bucket-lifecycle-configuration --bucket bucketname.archive-test
Set S3 bucket versioning...
aws s3api put-bucket-versioning --bucket bucketname.archive-test --versioning-configuration Status=Enabled
Check S3 bucket versioning...
aws s3api get-bucket-versioning --bucket bucketname.archive-test
Make S3 bucket folder...
aws --profile default s3api put-object --bucket bucketname.archive-test --key Test/
Create local folder and files...
mkdir ~/Test
echo "This is the first line and first version of the file." >>~/Test/test.txt
Sync local folder with S3 bucket folder...
aws s3 sync ~/Test/ s3://bucketname.archive-test/Test/
List S3 bucket contents...
aws s3 ls --human-readable --recursive --summarize s3://bucketname.archive-test/Test/
Check S3 bucket versioning...
aws s3api list-object-versions --bucket bucketname.archive-test
Edit the same file again...
echo "This is the second line and second version of the file." >>~/Test/test.txt
Sync local folder with S3 bucket folder again...
aws s3 sync ~/Test/ s3://bucketname.archive-test/Test/
Check S3 bucket versioning...
aws s3api list-object-versions --bucket bucketname.archive-test
Restore particular version of file...
aws s3api get-object --bucket bucketname.archive-test --key Test/test.txt --version-id hsu8723wgjhas ~/Test/test.txt
https://research-it.wharton.upenn.edu/news/aws-s3-glacier-archiving/
CLI s3 sync options
DESCRIPTION
Syncs directories and S3 prefixes. Recursively copies new and updated files from the source directory to the destination. Only creates folders in the destination if they contain one or more files.
See 'aws help' for descriptions of global parameters.
SYNOPSIS
sync <LocalPath> <S3Uri> or <S3Uri> <LocalPath> or <S3Uri> <S3Uri>
[--dryrun] [--quiet] [--include <value>] [--exclude <value>]
[--acl <value>] [--follow-symlinks | --no-follow-symlinks]
[--no-guess-mime-type] [--sse <value>] [--sse-c <value>]
[--sse-c-key <value>] [--sse-kms-key-id <value>]
[--sse-c-copy-source <value>] [--sse-c-copy-source-key <value>]
[--storage-class <value>] [--grants <value> [<value>...]]
[--website-redirect <value>] [--content-type <value>]
[--cache-control <value>] [--content-disposition <value>]
[--content-encoding <value>] [--content-language <value>]
[--expires <value>] [--source-region <value>] [--only-show-errors]
[--no-progress] [--page-size <value>] [--ignore-glacier-warnings]
[--force-glacier-transfer] [--request-payer <value>]
[--metadata <value>] [--metadata-directive <value>]
[--size-only] [--exact-timestamps] [--delete]
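A sketch of a typical combination of these options (bucket name hypothetical): sync to Infrequent Access storage, delete remote files that no longer exist locally, and compare by size only:

aws s3 sync ~/folder s3://mybucket/folder/ --storage-class STANDARD_IA --delete --size-only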
CLI s3cmd Install
sudo -i
cd /root/misc
git clone https://github.com/s3tools/s3cmd.git
cd s3cmd
python setup.py install
s3cmd --version
exit
Simple Backup Procedure With Retention Policy
https://cloudacademy.com/blog/data-backup-s3cmd/
Encrypted Incremental Backups with S3cmd
Install Error: No module named setuptools
If you receive the following error...
Traceback (most recent call last):
  File "setup.py", line 7, in <module>
    from setuptools import setup
ImportError: No module named setuptools
...then install the setuptools python module using pip...
sudo -i
cd /root/misc
wget https://bootstrap.pypa.io/get-pip.py
python get-pip.py
pip install --upgrade setuptools
Update
sudo -i
cd /root/misc/s3cmd
git pull
python setup.py install
s3cmd --version
exit
Configure
s3cmd --configure
Tweak Settings
nano ~/.s3cfg
bucket_location = EU
host_bucket = %(bucket)s.s3-external-3.amazonaws.com
Create A Bucket
s3cmd mb s3://uniquename.subname.whatever
List Buckets
s3cmd ls
List Contents Of Buckets
s3cmd ls s3://uniquename.subname.whatever/
Create Directory
This is a bit strange: S3 has no real directories, so you upload a file to the full folder path that does not yet exist, and the folders and subfolders are created as part of the upload.
s3cmd put /tmp/test.txt s3://uniquename.subname.whatever/folder/subfolder/test.txt
Upload Files (Test)
s3cmd put --recursive --dry-run ~/folder s3://uniquename.subname.whatever/
Upload Files
s3cmd put --recursive ~/folder s3://uniquename.subname.whatever/
Sync Files
s3cmd sync --verbose ~/folder s3://uniquename.subname.whatever/
Sync Files WITH DELETE
This will delete files from S3 that do not exist on your local drive.
This will allow you to clear up your local drive and S3 bucket at the same time.
!! USE WITH CAUTION !!
s3cmd sync --verbose ~/folder --delete-removed s3://uniquename.subname.whatever/
Example: Backup Dovecot Emails Script
#!/bin/bash
cd /var/vmail/ && \
/bin/tar -cpf domain.co.uk.tar domain.co.uk && \
/usr/local/bin/s3cmd --quiet put /var/vmail/domain.co.uk.tar s3://domain.co.uk.aws2/var/vmail/ && \
/usr/local/bin/s3cmd ls -H s3://domain.co.uk.aws2/var/vmail/domain.co.uk.tar
Calculate Size of Bucket
Console
Log in > S3 > Click on Bucket > Select All (tickboxes) > More > Get Size
CLI
aws s3 ls --summarize --human-readable --recursive s3://bucket/
s3cmd
s3cmd du s3://bucket/ --human-readable
Restrict Access From An IP Address
{ "Version": "2012-10-17", "Id": "S3PolicyIPRestrict", "Statement": [ { "Sid": "IPAllow", "Effect": "Allow", "Principal": { "AWS": "*" }, "Action": "s3:*", "Resource": "arn:aws:s3:::bucket/*", "Condition" : { "IpAddress" : { "aws:SourceIp": "192.168.143.0/24" }, "NotIpAddress" : { "aws:SourceIp": "192.168.143.188/32" } } } ] }
http://s3tools.org/kb/item10.htm
Android App
http://www.lysesoft.com/products/s3anywhere/
Bucket Policy Examples
https://aws.amazon.com/premiumsupport/knowledge-center/iam-s3-user-specific-folder/
http://s3browser.com/working-with-amazon-s3-bucket-policies.php
Read Only Access To Single Bucket
{ "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": [ "s3:List*", "s3:Get*" ], "Resource": "arn:aws:s3:::bucketname" } ] }
or, longer full version...
{ "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": [ "s3:GetBucketPolicyStatus", "s3:GetBucketPublicAccessBlock", "s3:ListBucketByTags", "s3:GetLifecycleConfiguration", "s3:ListBucketMultipartUploads", "s3:GetBucketTagging", "s3:GetInventoryConfiguration", "s3:GetBucketWebsite", "s3:ListBucketVersions", "s3:GetBucketLogging", "s3:ListBucket", "s3:GetAccelerateConfiguration", "s3:GetBucketVersioning", "s3:GetBucketAcl", "s3:GetBucketNotification", "s3:GetBucketPolicy", "s3:GetReplicationConfiguration", "s3:GetEncryptionConfiguration", "s3:GetBucketRequestPayment", "s3:GetBucketCORS", "s3:GetAnalyticsConfiguration", "s3:GetMetricsConfiguration", "s3:GetBucketLocation" ], "Resource": "arn:aws:s3:::mybucketname" } ] }
Read Only Access To Single Bucket For Specific User
{ "Version": "2012-10-17", "Statement": [ { "Sid": "statement1", "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::AccountB-ID:user/Dave" }, "Action": "s3:GetObject", "Resource": "arn:aws:s3:::mybucketname/*" } ] }
Restrict To Single Bucket
Example 1
{ "Version": "2012-10-17", "Statement": [ { "Sid": "ListObjectsInBucket", "Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": ["arn:aws:s3:::bucket-name"] }, { "Sid": "AllObjectActions", "Effect": "Allow", "Action": "s3:*Object", "Resource": ["arn:aws:s3:::bucket-name/*"] } ] }
Example 2
{ "Statement": [ { "Action": "s3:ListAllMyBuckets", "Effect": "Allow", "Resource": "arn:aws:s3:::*" } { "Action": "s3:*", "Effect": "Allow", "Resource": "arn:aws:s3:::mybucketname" } ] }
or
{ "Statement": [ { "Effect": "Allow", "Action": [ "s3:ListBucket", "s3:GetBucketLocation", "s3:ListBucketMultipartUploads" ], "Resource": "arn:aws:s3:::mybucketname", "Condition": {} }, { "Effect": "Allow", "Action": [ "s3:AbortMultipartUpload", "s3:DeleteObject", "s3:DeleteObjectVersion", "s3:GetObject", "s3:GetObjectAcl", "s3:GetObjectVersion", "s3:GetObjectVersionAcl", "s3:PutObject", "s3:PutObjectAcl", "s3:PutObjectAclVersion" ], "Resource": "arn:aws:s3:::mybucketname/*", "Condition": {} }, { "Effect": "Allow", "Action": "s3:ListAllMyBuckets", "Resource": "*", "Condition": {} } ] }
Event Notifications
SNS
Edit your Topic Policy to allow S3 to publish events to SNS...
{ "Version": "2008-10-17", "Id": "example-ID", "Statement": [ { "Sid": "example-statement-ID", "Effect": "Allow", "Principal": { "AWS":"*" }, "Action": [ "SNS:Publish" ], "Resource": "Topic-ARN", "Condition": { "ArnLike": { "aws:SourceArn": "arn:aws:s3:*:*:bucket.name" } } } ] }
Lifecycle Storage Management with Glacier
Because Amazon S3 maintains the mapping between your user-defined object name and Amazon Glacier’s system-defined identifier, Amazon S3 objects that are stored using the Amazon Glacier option are only accessible through the Amazon S3 APIs or the Amazon S3 Management Console.
To put this in slightly simpler terms, S3 doesn't create Glacier archives that you own or can manipulate. S3 creates Glacier archives that S3 owns and manages.
Your only interface to these objects is through S3, which makes requests to Glacier on your behalf. The archives are managed by S3 and are not user-accessible via the Glacier API or console.
https://aws.amazon.com/s3/faqs/#glacier
AWS S3 Lifecycle Storage Management with Glacier
HOWTO:
FIX: S3 CloudFront Error Access Denied
https://serverfault.com/questions/581268/amazon-cloudfront-with-s3-access-denied
WARNING: Redirected To
Replace the host_bucket value in the .s3cfg file with the one from the warning.
~/.s3cfg
host_bucket = %(bucket)s.s3-external-3.amazonaws.com
Thanks to ServerFault.com.
THIRD PARTY SOFTWARE
rclone - rsync for cloud storage
CloudBerry Backup
Cloudberry - Simple backup software that stores the data in its simple folder structure.
It has a web-based interface available on port 43210 - http://127.0.0.1:43210
Cyberduck
Cyberduck - Mounts the S3 storage in your desktop (Windows or Mac) file browser.
S3 Sync (Windows)
$29.99 USD
Bonkey (The Backup Monkey) (Mac & Windows)
Duplicati (Crossplatform)
Once installed, Duplicati will open a web interface:
http://127.0.0.1:8200/ngax/index.html