How do I access a file on Amazon S3 from the command line?

Question:

Is there an easy way to access a data file stored on Amazon S3 directly from the command line?

Motivation:

I am loosely following an online tutorial in which the author links to the following URL:

s3://bml-data/churn-bigml-80.csv

      

This is a simple CSV file, but I cannot open it with my web browser or with curl. The tutorial opens it with BigML, but I want to load the data for myself. Some searching tells me that there are many Python and Scala libraries available for S3 access, but it would be nice to open or load the file more directly.

I use a Mac and am a big fan of Homebrew, so the perfect solution (for me) would work on this system.

Bonus question:

Is there a good way to see the contents of an Amazon S3 bucket (one that I don't own)?

The nature of the file (80% of the specific dataset) makes me suspect that the file churn-bigml-20.csv may be hiding out there somewhere. My brute-force approach would be to try to fetch/open the expected file. Solving the first question would allow me to test this guess, but that is ugly. If anyone knows of a way to remotely examine the contents of a particular S3 bucket, that would be very helpful. Again, searching Google and SO tells me there are libraries for this, but a more direct approach would be helpful.

+3


3 answers


The AWS Command Line Interface (CLI) is a single tool for managing AWS services, including accessing stored data in Amazon S3.

The AWS CLI is available for Windows, Mac, and Linux.

If the bucket owner has granted public permissions for ListBucket, you can list the contents of the bucket, for example:

aws s3 ls s3://bml-data

      

If the bucket owner has granted public permissions for GetObject, you can copy the object:



aws s3 cp s3://bml-data/churn-bigml-80.csv churn-bigml-80.csv

      

Both of these commands work successfully for me.
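As a hedged sketch of why the browser and curl fail in the question: they do not understand the s3:// scheme, but a publicly readable object is normally also reachable over HTTPS. The helper below (s3_to_https is a name invented here, not part of the AWS CLI) rewrites an s3:// URI into the virtual-hosted-style URL on the global endpoint. It assumes the bucket allows anonymous reads; depending on the bucket's region the request may be redirected.

```shell
# Convert an s3://bucket/key URI into a virtual-hosted-style HTTPS URL.
# s3_to_https is a hypothetical helper name used only for this sketch.
s3_to_https() {
  _rest="${1#s3://}"       # drop the s3:// scheme
  _bucket="${_rest%%/*}"   # text before the first slash is the bucket
  _key="${_rest#*/}"       # text after the first slash is the object key
  printf 'https://%s.s3.amazonaws.com/%s\n' "$_bucket" "$_key"
}

s3_to_https s3://bml-data/churn-bigml-80.csv

# Network commands, left commented out so the sketch runs offline:
# curl -O "$(s3_to_https s3://bml-data/churn-bigml-80.csv)"
# Without configured credentials, the AWS CLI can send unsigned requests:
# aws s3 ls s3://bml-data --no-sign-request
```

Whether either network command succeeds still depends on the permissions the bucket owner has granted.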


+7




There is a neat tool called s3cmd that will do this.

  • It works on Mac (installable with the Homebrew package manager).
  • It lets you download files from Amazon S3 to your local machine.
  • It lets you browse Amazon S3 buckets (even if you don't own them).

Installation and configuration

brew install s3cmd

      

Setting up s3cmd requires an Amazon S3 (AWS) account. It's free, but you need to sign up for it.

s3cmd --configure

      

The configuration includes specifying an access / secret key pair and a few other details (I used the defaults for everything). If you want s3cmd to encrypt files, you can install gpg with brew and set a few more configuration options at this point (HTTPS is a separate option in the same dialog). Be careful - the gpg_passphrase you are using is stored in your local plain text config file!

Usage:

Now for the exciting bit: downloading the file to your desktop!



s3cmd get s3://bml-data/churn-bigml-80.csv ~/Desktop

      

Listing the contents of the remote bucket:

s3cmd ls s3://bml-data/

      

Additional functionality:

This is beyond the scope of the question, but it seems worth mentioning: s3cmd can do other things like put data to the bucket (and publish it using the -P flag), delete files, and show a manual for more information:

s3cmd -P put ~/Desktop/my-file.png  s3://mybucket/
s3cmd del s3://mybucket/my-file-to-delete.png
man s3cmd

      

Credit:

Thanks to Neil Gee for his s3cmd tutorial.

+1




If you just want to download the file from a Linux terminal, you must first make the file publicly available.

FYI: once the file is public, everyone will have access to one or all of the following: read this object, read and write permissions.

Once this is done, right-click the file in the S3 console and choose Download as; a popup appears.

Right-click the download link in the popup and select "Copy Link Location", then paste it into a text editor. Then select the part of the link before the question mark.

https://s3-ap-northeast-1.amazonaws.com/backup/pan.hosting/2017-01-15/earth.tar.gz?response-content-disposition=attachment&X-Amz-Security-Token=%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwTEigAJ4GvimzYt3gQegUHaRSe%2BnLWeND%

Then enter the command below in your terminal, using only the part of the link before the question mark.

wget https://s3-ap-northeast-1.amazonaws.com/backup/pan.hosting/2017-01-15/earth.tar.gz
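The "select the link before the question mark" step can also be done in the shell rather than by hand. This is a minimal sketch: strip_query is a helper name made up for illustration, and the URL is a placeholder, not a real object. It only helps when the object is publicly readable, as this answer assumes.

```shell
# Drop everything from the first "?" onward, leaving the plain object URL.
# strip_query is a hypothetical helper name used only for this sketch.
strip_query() {
  printf '%s\n' "${1%%[?]*}"
}

# Placeholder URL for illustration only.
signed='https://example-bucket.s3.amazonaws.com/path/earth.tar.gz?response-content-disposition=attachment&X-Amz-Security-Token=abc'
plain="$(strip_query "$signed")"
echo "$plain"

# Left commented out so the sketch runs offline:
# wget "$plain"
```

Quoting the URL matters here: an unquoted & would otherwise send the command to the background.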

0

