AWS S3

From Indie IT Wiki

S3 or Amazon Simple Storage Service provides a simple web-services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, secure, fast, inexpensive infrastructure that Amazon uses to run its own global network of web sites. The service aims to maximize benefits of scale and to pass those benefits on to developers.

INFO:

AWS Free Tier

Includes 5GB storage, 20,000 Get Requests, and 2,000 Put Requests with Amazon S3.

http://aws.amazon.com/s3/

Introduction

http://amzn.to/1rlFqoH

Pricing

Approximate storage prices in USD per GB per month (they vary by region)...

Standard $0.02 per GB
Infrequent Access $0.01 per GB
Glacier $0.005 per GB
Glacier Deep Archive $0.00099 per GB

http://aws.amazon.com/s3/pricing/

http://calculator.s3.amazonaws.com/index.html

https://www.skeddly.com/tools/ebs-snapshot-cost-calculator/

Optimising Costs

Optimise Your Amazon S3 Costs

How to analyze and reduce S3 storage usage

Costs Example

Monthly cost of storing in S3 Standard: 753.27 GB x $0.024/GB = $18.08

Monthly cost of storing in S3 Glacier: $3.67
  753.27 GB x $0.0045/GB = $3.39
  Glacier overhead: 855953 objects * 32 KB * $0.0045/GB = $0.12
  S3 overhead: 855953 objects * 8 KB * $0.024/GB = $0.16

https://alestic.com/2012/12/s3-glacier-costs/
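
The arithmetic in the costs example can be rechecked with a short script. This is a minimal sketch: the function name is made up for illustration, and every figure (size, object count, prices, the 32 KB and 8 KB per-object overheads) is copied from the example rather than from current AWS pricing. Each line item is rounded to cents before the total is summed, which is how the example arrives at its total.

```shell
#!/bin/bash
# Recompute the cost example. Arguments: size in GB, object count,
# S3 Standard price per GB, Glacier price per GB (all from the example).
glacier_vs_standard() {
    awk -v gb="$1" -v objects="$2" -v s3_price="$3" -v glacier_price="$4" 'BEGIN {
        kb_per_gb = 1024 * 1024
        # Round each line item to cents first, as the example does.
        standard     = sprintf("%.2f", gb * s3_price)
        glacier_data = sprintf("%.2f", gb * glacier_price)
        glacier_over = sprintf("%.2f", objects * 32 / kb_per_gb * glacier_price)
        s3_over      = sprintf("%.2f", objects * 8 / kb_per_gb * s3_price)
        printf "standard=%s\n", standard
        printf "glacier_data=%s\n", glacier_data
        printf "glacier_overhead=%s\n", glacier_over
        printf "s3_overhead=%s\n", s3_over
        # Total of the three rounded Glacier line items.
        printf "glacier_total=%.2f\n", glacier_data + glacier_over + s3_over
    }'
}

glacier_vs_standard 753.27 855953 0.024 0.0045
```

Swap in your own size, object count and prices to see how the per-object overhead changes the result for buckets with many small files.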

s3cmd

S3cmd is a tool for managing objects in Amazon S3 storage. It allows for making and removing "buckets" and uploading, downloading and removing "objects" from these buckets. It runs on Linux and Mac.

http://s3tools.org

Installation

sudo -i
apt-get install python3-setuptools
wget https://netcologne.dl.sourceforge.net/project/s3tools/s3cmd/2.0.2/s3cmd-2.0.2.tar.gz
tar -xzvf s3cmd-2.0.2.tar.gz
cd s3cmd-2.0.2/
python3 setup.py install
which s3cmd
s3cmd --version

or

sudo pip3 install s3cmd

Configuration

s3cmd --configure

Stored in...

~/.s3cfg (for example /root/.s3cfg when run as root)

Tweaks...

host_base = s3-eu-west-2.amazonaws.com
host_bucket = %(bucket)s.s3-eu-west-2.amazonaws.com

https://awsregion.info/

Usage

Help...

s3cmd --help

How to sync a folder but only include certain files or folders...

s3cmd --exclude '*' --include '*200901*' --no-progress --stats sync /home/MailScanner/archive/ s3://my.bucket.name/MailScanner/archive/

How to move folders and files using Bash brace expansion...

s3cmd --dry-run mv --recursive s3://my.bucket.name/MailScanner/archive/200901{22..31} s3://my.bucket.name/MailScanner/archive/2009/
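
Brace expansion is done by Bash itself before s3cmd starts, so it helps to preview the expanded arguments first. A minimal sketch, assuming Bash (not plain sh) and using the same placeholder bucket and path as the command above; the function name is made up for illustration:

```shell
#!/bin/bash
# Print the source URIs that {22..31} expands to, one per line.
# Bash performs the expansion; s3cmd only ever sees the resulting list.
expand_sources() {
    printf '%s\n' s3://my.bucket.name/MailScanner/archive/200901{22..31}
}

expand_sources
```

Each printed line is one source argument that the mv command would receive ahead of the destination.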

List failed multipart uploads...

s3cmd multipart s3://my.bucket.name/

Delete failed multipart upload...

s3cmd abortmp s3://BUCKET/OBJECT Id

Encrypted Sync

https://www.bentasker.co.uk/documentation/linux/285-implementing-encrypted-incremental-backups-with-s3cmd

HOWTO:

Static Web Site

Bucket Policy ...

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::mybucketname/*"
        }
    ]
}

https://docs.aws.amazon.com/AmazonS3/latest/userguide/website-hosting-custom-domain-walkthrough.html
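
The public-read policy above has to carry the real bucket name in its Resource line. A minimal sketch of filling it in from a variable; the function name and the bucket name example.com are placeholders for illustration, not values from this page:

```shell
#!/bin/bash
# Print the public-read bucket policy for the bucket named in $1.
public_read_policy() {
    bucket="$1"
    cat <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::${bucket}/*"
        }
    ]
}
EOF
}

public_read_policy example.com
```

Redirect the output to a file such as policy.json and it can then be applied with aws s3api put-bucket-policy --bucket example.com --policy file://policy.json, assuming the bucket's public access settings allow public policies.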

Convert a WordPress Web Site to Static HTML hosted on Amazon S3

Part I - S3 with no SSL

  • Install the Simply Static plug-in
  • Simply Static › Generate
  • Download the ZIP file
  • Extract the ZIP to your computer
  • Create your S3 Bucket with the same name as the web site domain name you want using the instructions from the link above
  • Change the permissions as instructed to 'Allow Public Access' and enable 'Static website hosting' using the instructions from the link above
  • Upload the index.html file as a 'File' and then any folders as 'Folder' (tip: you have to go into the folder to select it!)
  • Complete the Route 53 part to add an A Record with an ALIAS to the S3 endpoint using the instructions from the link above
  • Load the domain name in your web browser and check

Part II - CloudFront with Let's Encrypt SSL

https://trycatchfinally.dev/posts/how-to-use-letsencrypt-ssl-cert-to-secure-custom-domain-with-aws-cloudfront/

Recover Deleted Objects

How can I retrieve an Amazon S3 object that was deleted in a versioning-enabled bucket?

Disaster Prevention and Recovery

https://blog.cadena-it.com/linux-tips-how-to/how-to-backup-aws-s3-buckets/

Set Lifecycle Rule to EMPTY Bucket

https://repost.aws/knowledge-center/s3-empty-bucket-lifecycle-rule

Set Lifecycle Rule To Delete Failed Multipart Uploads

https://stackoverflow.com/questions/39457458/howto-abort-all-incomplete-multipart-uploads-for-a-bucket

Versioning

Versioning in S3

How to remove old versions of objects in S3 when using S3 versioning

CLI aws Install

This is the official AWS command line tool.

CHANGELOG

Version 2

cd /tmp/
curl -s "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip -q awscliv2.zip
sudo ./aws/install --bin-dir /usr/local/bin --install-dir /usr/local/aws-cli
aws --version

https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html

Version 1

sudo -i
python3 --version
apt install python3-distutils
curl -s -O https://bootstrap.pypa.io/get-pip.py
python3 get-pip.py
pip3 install awscli

BASH Completion - http://docs.aws.amazon.com/cli/latest/userguide/cli-command-completion.html

CLI aws Upgrade

CHANGELOG

Version 2

cd /tmp/
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip -q awscliv2.zip
sudo ./aws/install --bin-dir /usr/local/bin --install-dir /usr/local/aws-cli --update
aws --version
cd

Version 1

sudo -H pip3 install --upgrade awscli

CLI aws Usage

aws configure
aws configure set default.s3.max_concurrent_requests 20
aws help
aws s3 help
aws s3 ls
aws s3 sync /tmp/foo s3://bucketname/foo
aws ec2 authorize-security-group-ingress --group-name launch-wizard-1 --protocol tcp --port 22 --cidr xxx.xxx.xx.xx/32

CLI Examples

https://www.thegeekstuff.com/2019/04/aws-s3-cli-examples/

1. To backup photos in your Syncthing directory (dryrun option added for testing)...

aws s3 sync --dryrun --exclude "*" --include "*201701*" /home/user/Syncthing/User/phone/photos/ s3://user.name.bucket/Photos/2017/01/

As a script...

#!/bin/bash
#
# script to back up photos (taken the day before) to aws s3
#
# work out yesterday's year and month (GNU date syntax)
YEAR=$( date +'%Y' -d "yesterday" )
MONTH=$( date +'%m' -d "yesterday" )
# upload only files whose names contain YYYYMM, into the matching Photos/YYYY/MM/ folder
/usr/local/bin/aws s3 sync --exclude "*" --include "*${YEAR}${MONTH}*" /home/user/Syncthing/User/phone/photos/ s3://user.name.bucket/Photos/${YEAR}/${MONTH}/
exit
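
To see which include pattern and destination the backup script would pick on a given day, the date logic can be tried on its own. A minimal sketch, assuming GNU date (for the -d option); the function name and the reference-date argument are additions for illustration, since the script itself always works from the current date:

```shell
#!/bin/bash
# Print the --include pattern and S3 destination for the day before $1.
# $1 is a reference date in YYYY-MM-DD form (hypothetical helper argument).
photo_sync_target() {
    ref="$1"
    year=$(date -d "$ref 1 day ago" +'%Y')
    month=$(date -d "$ref 1 day ago" +'%m')
    echo "include=*${year}${month}* dest=s3://user.name.bucket/Photos/${year}/${month}/"
}

# On the first day of a month or year, "yesterday" rolls back to the previous one.
photo_sync_target 2017-02-01
photo_sync_target 2017-01-01
```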

2. To move objects from one bucket to another bucket, or same bucket but different folder...

aws s3 mv s3://source/file1.txt s3://destination/file2.txt

aws s3 mv s3://source/file1.txt s3://source/folder/file1.txt

aws --profile profile2 s3 mv --dryrun --recursive --exclude "*" --include "archive-nfs/201502*" s3://source/ s3://destination/archive-nfs/MailArchive/

3. To use a different profile (for different customers)...

nano ~/.aws/credentials

[default]
aws_access_key_id = XXXXXX
aws_secret_access_key = XXXXXXXXXXXXX

[customer2]
aws_access_key_id = XXXXXX
aws_secret_access_key = XXXXXXXXXXXXX

Then pass the profile name on the command line...

aws --profile customer2 s3 ls

Thanks - http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html#cli-multiple-profiles

4. Delete multiple files...

aws --profile customer2 s3 rm --dryrun --recursive --exclude "*" --include "messages" s3://mybucket/folder/
(dryrun) delete: s3://mybucket/folder/subfolder/messages

5. Make bucket...

aws s3 mb s3://mybucket --region eu-west-1

6. Create folder... (the key here is the forward slash / at the end)

aws s3api put-object --bucket mybucket --key dir-test/

7. Create folder and sub folders... (the key here is the forward slash / at the end)

aws s3api put-object --bucket mybucket --key dir-test/subfolder1/subfolder2/

8. Size and Number of Files...

aws s3api list-objects --bucket mybucket --output json --query "[sum(Contents[].Size), length(Contents[])]"

9. Copy files from remote storage to local...

aws s3 cp --recursive s3://mybucket/myFolder/ ./path/to/folder/

10. Restore file from Glacier storage...

aws --profile=default s3api restore-object --bucket mybucket --key foldername/file.zip --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Standard"}}'

11. Upload folder to S3 as a tar.gz file without compressing locally first using pipes...

tar cvfz - /var/test | aws s3 cp - s3://mybucket/test1.tar.gz

12. Search for a filename...

aws --profile myprofile s3api list-objects --bucket files.mydomain.com --query "Contents[?contains(Key, 'S1255')]"
aws --profile myprofile s3 ls s3://files.mydomain.com/ --recursive |grep 'S1255'

13. Empty a bucket while keeping CPU usage low (about 30%)...

sudo apt install cpulimit
/usr/bin/cpulimit -q -b -c 1 -e aws -l 30
nice aws s3 rm --quiet s3://my.bucket.name --recursive

https://github.com/aws/aws-cli/issues/3163

Official Guides

User Guide - http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html

Reference - http://docs.aws.amazon.com/cli/latest/index.html

https://www.thegeekstuff.com/2019/04/aws-s3-cli-examples/

Point in Time Restore

https://github.com/madisoft/s3-pit-restore

CLI Backup and File Versioning

Make S3 bucket...

aws s3 mb s3://bucketname.archive-test

Create S3 bucket archiving policy...

Example 1 = move all objects to Infrequent Access after 30 days, and then to Glacier after 60 days ...

nano aws_lifecycle_30_IA_60_GLACIER.json

{
    "Rules": [
        {
            "Filter": {
                "Prefix": ""
            },
            "Status": "Enabled",
            "Transitions": [
                {
                    "Days": 30,
                    "StorageClass": "STANDARD_IA"
                },
                {
                    "Days": 60,
                    "StorageClass": "GLACIER"
                }
            ],
            "NoncurrentVersionTransitions": [
                {
                    "NoncurrentDays": 30,
                    "StorageClass": "STANDARD_IA"
                },
                {
                    "NoncurrentDays": 60,
                    "StorageClass": "GLACIER"
                }
            ],
            "ID": "30 Days Transfer to IA, 60 Days Transfer to Glacier"
        }
    ]
}
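
If you need the same rule with different day counts, the JSON can be generated instead of edited by hand. A minimal sketch: the function name and its two arguments are made up for illustration, and the output has the same structure as Example 1 in a more compact layout.

```shell
#!/bin/bash
# Print a lifecycle configuration like Example 1.
# $1 = days before moving to STANDARD_IA, $2 = days before moving to GLACIER.
lifecycle_ia_glacier() {
    ia_days="$1"
    glacier_days="$2"
    cat <<EOF
{
    "Rules": [
        {
            "ID": "${ia_days} Days Transfer to IA, ${glacier_days} Days Transfer to Glacier",
            "Filter": { "Prefix": "" },
            "Status": "Enabled",
            "Transitions": [
                { "Days": ${ia_days}, "StorageClass": "STANDARD_IA" },
                { "Days": ${glacier_days}, "StorageClass": "GLACIER" }
            ],
            "NoncurrentVersionTransitions": [
                { "NoncurrentDays": ${ia_days}, "StorageClass": "STANDARD_IA" },
                { "NoncurrentDays": ${glacier_days}, "StorageClass": "GLACIER" }
            ]
        }
    ]
}
EOF
}

lifecycle_ia_glacier 30 60
```

Redirect the output to aws_lifecycle_30_IA_60_GLACIER.json to get a file equivalent to the one created with nano above.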

Example 2 = only keep 1 current version and 1 non-current version of a file after 1 day ...

{
    "Rules": [
        {
            "Expiration": {
                "ExpiredObjectDeleteMarker": true
            },
            "ID": "1 Days Versions",
            "Filter": {},
            "Status": "Enabled",
            "NoncurrentVersionExpiration": {
                "NoncurrentDays": 1,
                "NewerNoncurrentVersions": 1
            },
            "AbortIncompleteMultipartUpload": {
                "DaysAfterInitiation": 1
            }
        }
    ]
}

Set S3 bucket archiving policy...

aws s3api put-bucket-lifecycle-configuration --bucket bucketname.archive-test --lifecycle-configuration file://aws_lifecycle_30_IA_60_GLACIER.json

Check S3 bucket archiving policy...

aws s3api get-bucket-lifecycle-configuration --bucket bucketname.archive-test

Set S3 bucket versioning...

aws s3api put-bucket-versioning --bucket bucketname.archive-test --versioning-configuration Status=Enabled

Check S3 bucket versioning...

aws s3api get-bucket-versioning --bucket bucketname.archive-test

Make S3 bucket folder...

aws --profile default s3api put-object --bucket bucketname.archive-test --key Test/

Create local folder and files...

mkdir ~/Test
echo "This is the first line and first version of the file." >>~/Test/test.txt

Sync local folder with S3 bucket folder...

aws s3 sync ~/Test/ s3://bucketname.archive-test/Test/

List S3 bucket contents...

aws s3 ls --human-readable --recursive --summarize s3://bucketname.archive-test/Test/

Check S3 bucket versioning...

aws s3api list-object-versions --bucket bucketname.archive-test

Edit the same file again...

echo "This is the second line and second version of the file." >>~/Test/test.txt

Sync local folder with S3 bucket folder again...

aws s3 sync ~/Test/ s3://bucketname.archive-test/Test/

Check S3 bucket versioning...

aws s3api list-object-versions --bucket bucketname.archive-test

Restore particular version of file...

aws s3api get-object --bucket bucketname.archive-test --key Test/test.txt --version-id hsu8723wgjhas ~/Test/test.txt

https://research-it.wharton.upenn.edu/news/aws-s3-glacier-archiving/

CLI s3 sync options

DESCRIPTION
      Syncs directories and S3 prefixes. Recursively copies new and updated
      files from the source directory to the destination. Only creates
      folders in the destination if they contain one or more files.
      See 'aws help' for descriptions of global parameters.
SYNOPSIS
           sync
         <LocalPath> <S3Uri> or <S3Uri> <LocalPath> or <S3Uri> <S3Uri>
         [--dryrun]
         [--quiet]
         [--include <value>]
         [--exclude <value>]
         [--acl <value>]
         [--follow-symlinks | --no-follow-symlinks]
         [--no-guess-mime-type]
         [--sse <value>]
         [--sse-c <value>]
         [--sse-c-key <value>]
         [--sse-kms-key-id <value>]
         [--sse-c-copy-source <value>]
         [--sse-c-copy-source-key <value>]
         [--storage-class <value>]
         [--grants <value> [<value>...]]
         [--website-redirect <value>]
         [--content-type <value>]
         [--cache-control <value>]
         [--content-disposition <value>]
         [--content-encoding <value>]
         [--content-language <value>]
         [--expires <value>]
         [--source-region <value>]
         [--only-show-errors]
         [--no-progress]
         [--page-size <value>]
         [--ignore-glacier-warnings]
         [--force-glacier-transfer]
         [--request-payer <value>]
         [--metadata <value>]
         [--metadata-directive <value>]
         [--size-only]
         [--exact-timestamps]
         [--delete]

CLI s3cmd Install

sudo -i
cd /root/misc
git clone https://github.com/s3tools/s3cmd.git
cd s3cmd
python setup.py install
s3cmd --version
exit

Simple Backup Procedure With Retention Policy

https://cloudacademy.com/blog/data-backup-s3cmd/

Encrypted Incremental Backups with S3cmd

https://www.bentasker.co.uk/documentation/linux/285-implementing-encrypted-incremental-backups-with-s3cmd

Install Error: No module named setuptools

If you receive the following error...

Traceback (most recent call last):
  File "setup.py", line 7, in <module>
    from setuptools import setup
ImportError: No module named setuptools

...then install the setuptools python module using pip...

sudo -i
cd /root/misc
wget https://bootstrap.pypa.io/get-pip.py
python get-pip.py
pip install --upgrade setuptools

Update

sudo -i
cd /root/misc/s3cmd
git pull
python setup.py install
s3cmd --version
exit

Configure

s3cmd --configure

Tweak Settings

nano ~/.s3cfg

bucket_location = EU
host_bucket = %(bucket)s.s3-external-3.amazonaws.com

Create A Bucket

s3cmd mb s3://uniquename.subname.whatever

List Buckets

s3cmd ls

List Contents Of Buckets

s3cmd ls s3://uniquename.subname.whatever/

Create Directory

S3 has no real directories; a "folder" is just a shared prefix in the object keys. To create one, upload a file to the full path you want, even if none of it exists yet. The folders and subfolders are created as part of the upload.

s3cmd put /tmp/test.txt s3://uniquename.subname.whatever/folder/subfolder/test.txt

Upload Files (Test)

s3cmd put --recursive --dry-run ~/folder s3://uniquename.subname.whatever/

Upload Files

s3cmd put --recursive ~/folder s3://uniquename.subname.whatever/

Sync Files

s3cmd sync --verbose ~/folder s3://uniquename.subname.whatever/

Sync Files WITH DELETE

This will delete files from S3 that do not exist on your local drive.

This will allow you to clear up your local drive and S3 bucket at the same time.

!! USE WITH CAUTION !!

s3cmd sync --verbose ~/folder --delete-removed s3://uniquename.subname.whatever/

Example: Backup Dovecot Emails Script

#!/bin/bash
cd /var/vmail/ && \
/bin/tar -cpf domain.co.uk.tar domain.co.uk && \
/usr/local/bin/s3cmd --quiet put /var/vmail/domain.co.uk.tar s3://domain.co.uk.aws2/var/vmail/ && \
/usr/local/bin/s3cmd ls -H s3://domain.co.uk.aws2/var/vmail/domain.co.uk.tar

Calculate Size of Bucket

Console

Log in > S3 > Click on Bucket > Select All (tickboxes) > More > Get Size

CLI

aws s3 ls --summarize --human-readable --recursive s3://bucket/

s3cmd

s3cmd du s3://bucket/ --human-readable
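
The totals can also be computed from a plain recursive listing, which is handy when you want to filter the lines first (with grep, for example). A minimal sketch, assuming the listing was produced by aws s3 ls --recursive without --human-readable, so that each line reads DATE TIME SIZE KEY; the function name and the sample listing are made up for illustration:

```shell
#!/bin/bash
# Sum the size column (third field, in bytes) and count the lines
# of an "aws s3 ls --recursive" listing read from standard input.
sum_listing() {
    awk '{ bytes += $3; files++ } END { printf "objects=%d bytes=%d\n", files, bytes }'
}

# Sample listing (hypothetical data) standing in for real output.
sum_listing <<'EOF'
2019-04-07 11:38:19       2777 config/init.xml
2019-04-07 11:38:20         52 config/support.txt
2019-04-07 11:38:20    1758160 data/database.txt
EOF
```

Against a real bucket the same function would be used as aws s3 ls --recursive s3://bucket/ | sum_listing.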

Restrict Access From An IP Address

{
   "Version": "2012-10-17",
   "Id": "S3PolicyIPRestrict",
   "Statement": [
       {
           "Sid": "IPAllow",
           "Effect": "Allow",
           "Principal": {
               "AWS": "*"
           },
           "Action": "s3:*",
           "Resource": "arn:aws:s3:::bucket/*",
           "Condition" : {
               "IpAddress" : {
                   "aws:SourceIp": "192.168.143.0/24"
               },
               "NotIpAddress" : {
                   "aws:SourceIp": "192.168.143.188/32"
               }
           }
       }
   ]
}
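
The two conditions in this policy combine as "inside 192.168.143.0/24 and not 192.168.143.188". A minimal sketch that mimics that matching locally for IPv4 addresses; it does not call AWS, the function names are made up for illustration, and a "no match" result only means this Allow statement does not apply:

```shell
#!/bin/bash
# Pack a dotted IPv4 address (a.b.c.d) into one 32-bit integer.
ip_to_int() {
    old_ifs="$IFS"; IFS=.
    set -- $1
    IFS="$old_ifs"
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 falls inside CIDR block $2 (for example 10.0.0.0/8).
in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits="${2#*/}"
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Succeed if the Allow statement above would match address $1:
# inside the IpAddress range and outside the NotIpAddress range.
policy_allows() {
    in_cidr "$1" 192.168.143.0/24 && ! in_cidr "$1" 192.168.143.188/32
}

for addr in 192.168.143.10 192.168.143.188 10.0.0.1; do
    if policy_allows "$addr"; then
        echo "$addr allow"
    else
        echo "$addr no match"
    fi
done
```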

https://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html#example-bucket-policies-use-case-3

http://s3tools.org/kb/item10.htm

Android App

http://www.lysesoft.com/products/s3anywhere/

Bucket Policy Examples

https://aws.amazon.com/premiumsupport/knowledge-center/iam-s3-user-specific-folder/

http://s3browser.com/working-with-amazon-s3-bucket-policies.php

Read Only Access To Single Bucket

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:List*",
                "s3:Get*"
            ],
            "Resource": [
                "arn:aws:s3:::bucketname",
                "arn:aws:s3:::bucketname/*"
            ]
        }
    ]
}

or, longer full version...

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketPolicyStatus",
                "s3:GetBucketPublicAccessBlock",
                "s3:ListBucketByTags",
                "s3:GetLifecycleConfiguration",
                "s3:ListBucketMultipartUploads",
                "s3:GetBucketTagging",
                "s3:GetInventoryConfiguration",
                "s3:GetBucketWebsite",
                "s3:ListBucketVersions",
                "s3:GetBucketLogging",
                "s3:ListBucket",
                "s3:GetAccelerateConfiguration",
                "s3:GetBucketVersioning",
                "s3:GetBucketAcl",
                "s3:GetBucketNotification",
                "s3:GetBucketPolicy",
                "s3:GetReplicationConfiguration",
                "s3:GetEncryptionConfiguration",
                "s3:GetBucketRequestPayment",
                "s3:GetBucketCORS",
                "s3:GetAnalyticsConfiguration",
                "s3:GetMetricsConfiguration",
                "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::mybucketname"
        }
    ]
}

Read Only Access To Single Bucket For Specific User

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "statement1",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::AccountB-ID:user/Dave"
            },
            "Action":   "s3:GetObject",
            "Resource": "arn:aws:s3:::mybucketname/*"
        }
    ]
}

Restrict To Single Bucket

Example 1


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ListObjectsInBucket",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": ["arn:aws:s3:::bucket-name"]
        },
        {
            "Sid": "AllObjectActions",
            "Effect": "Allow",
            "Action": "s3:*Object",
            "Resource": ["arn:aws:s3:::bucket-name/*"]
        }
    ]
}

Example 2

{
   "Statement": [
       {
           "Action": "s3:ListAllMyBuckets",
           "Effect": "Allow",
           "Resource": "arn:aws:s3:::*"
       },
       {
           "Action": "s3:*",
           "Effect": "Allow",
           "Resource": [
               "arn:aws:s3:::mybucketname",
               "arn:aws:s3:::mybucketname/*"
           ]
       }
   ]
}

or

{
 "Statement": [
   {
     "Effect": "Allow",
     "Action": [
       "s3:ListBucket",
       "s3:GetBucketLocation",
       "s3:ListBucketMultipartUploads"
     ],
     "Resource": "arn:aws:s3:::mybucketname",
     "Condition": {}
   },
   {
     "Effect": "Allow",
     "Action": [
       "s3:AbortMultipartUpload",
       "s3:DeleteObject",
       "s3:DeleteObjectVersion",
       "s3:GetObject",
       "s3:GetObjectAcl",
       "s3:GetObjectVersion",
       "s3:GetObjectVersionAcl",
       "s3:PutObject",
       "s3:PutObjectAcl",
       "s3:PutObjectVersionAcl"
     ],
     "Resource": "arn:aws:s3:::mybucketname/*",
     "Condition": {}
   },
   {
     "Effect": "Allow",
     "Action": "s3:ListAllMyBuckets",
     "Resource": "*",
     "Condition": {}
   }
 ]
}

Event Notifications

SNS

Edit your Topic Policy to allow S3 to publish events to SNS...

{
    "Version": "2008-10-17",
    "Id": "example-ID",
    "Statement": [
        {
            "Sid": "example-statement-ID",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": [
                "SNS:Publish"
            ],
            "Resource": "Topic-ARN",
            "Condition": {
                "ArnLike": {
                    "aws:SourceArn": "arn:aws:s3:*:*:bucket.name"
                }
            }
        }
    ]
}

Lifecycle Storage Management with Glacier

Because Amazon S3 maintains the mapping between your user-defined object name and Amazon Glacier’s system-defined identifier, Amazon S3 objects that are stored using the Amazon Glacier option are only accessible through the Amazon S3 APIs or the Amazon S3 Management Console.

To put this in slightly simpler terms, S3 doesn't create Glacier archives that you own or can manipulate. S3 creates Glacier archives that S3 owns and manages.

Your only interface to these objects is through S3, which makes requests to Glacier on your behalf. The archives are managed by S3 and are not user-accessible via the Glacier API or console.

https://aws.amazon.com/s3/faqs/#glacier

Thanks

AWS S3 Lifecycle Storage Management with Glacier

HOWTO: FIX:

S3 CloudFront Error Access Denied

https://serverfault.com/questions/581268/amazon-cloudfront-with-s3-access-denied

WARNING: Redirected To

Replace the host_bucket value in the .s3cfg file with the host name from the warning.

~/.s3cfg

host_bucket = %(bucket)s.s3-external-3.amazonaws.com

Thanks to ServerFault.com.

THIRD PARTY SOFTWARE

rclone - rsync for cloud storage

Archiving to Cloud Storage

CloudBerry Backup

Cloudberry - Simple backup software that stores the data in its simple folder structure.

It has a web based interface available at port 43210 - http://127.0.0.1:43210

Cyberduck

Cyberduck - Mounts the S3 storage in your desktop (Windows or Mac) file browser.

S3 Sync (Windows)

Sprightly Soft S3Sync

$29.99 USD

Bonkey (The Backup Monkey) (Mac & Windows)

Home Page.

Duplicati (Cross-platform)

http://www.duplicati.com/

Once installed, Duplicati opens a web interface:

http://127.0.0.1:8200/ngax/index.html

ARQ Backup (MAC)

https://www.arqbackup.com/