Download Large Number of Files Using the Amazon S3 Bucket Java SDK

Question

I have a large number of files to download from an S3 bucket. My problem is similar to this article, except that I am trying to do it in Java.

public static void main(String[] args) {
    AWSCredentials myCredentials = new BasicAWSCredentials("key", "secret");
    TransferManager tx = new TransferManager(myCredentials);
    File file = <thefile>
    try {
        MultipleFileDownload myDownload = tx.downloadDirectory("<bucket>", null, file);
        System.out.println("Transfer: " + myDownload.getDescription());
        System.out.println("  - State: " + myDownload.getState());
        System.out.println("  - Progress: " + myDownload.getProgress().getBytesTransfered());

        // Poll until the transfer finishes, printing its status on each pass.
        while (!myDownload.isDone()) {
            System.out.println("Transfer: " + myDownload.getDescription());
            System.out.println("  - State: " + myDownload.getState());
            System.out.println("  - Progress: " + myDownload.getProgress().getBytesTransfered());
            try {
                // Do work while we wait for our download to complete...
                Thread.sleep(500);
            } catch (InterruptedException ex) {
                ex.printStackTrace();
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

This has been adapted from the TransferManager class example for multiple uploads. This bucket contains over 100,000 objects. Any help would be great.

java amazon amazon-s3 amazon-web-services download

derigible · Jan 26 at 17:21

2 answers

user846969 · Answer 1 · 2013-09-03T20:37:41+0000

Use the list() method to get the keys of your files, then call the get() method on each key to fetch that file's contents.

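import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.S3ObjectSummary;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class S3 extends AmazonS3Client {

    final String bucket;

    S3(String accessKey, String secretKey, String bucket) {
        super(new BasicAWSCredentials(accessKey, secretKey));
        this.bucket = bucket;
    }

    // Returns the content of the object at key as text, or null on failure.
    String get(String key) {
        try (InputStream in = getObject(bucket, key).getObjectContent()) {
            final ByteArrayOutputStream out = new ByteArrayOutputStream();
            final byte[] buffer = new byte[1024];
            for (int n = in.read(buffer); n != -1; n = in.read(buffer)) {
                out.write(buffer, 0, n);
            }
            // Decode once at the end so characters split across reads stay intact.
            return out.toString();
        } catch (Exception e) {
            System.err.println("Cannot get " + bucket + "/" + key + " from S3 because " + e);
        }
        return null;
    }

    // Returns every key under prefix, or an empty array on failure.
    String[] list(String prefix) {
        try {
            final List<String> keys = new ArrayList<String>();
            ObjectListing listing = listObjects(bucket, prefix);
            while (true) {
                for (S3ObjectSummary summary : listing.getObjectSummaries()) {
                    keys.add(summary.getKey());
                }
                // Each listing holds at most 1,000 keys; keep fetching while truncated.
                if (!listing.isTruncated()) {
                    break;
                }
                listing = listNextBatchOfObjects(listing);
            }
            return keys.toArray(new String[0]);
        } catch (Exception e) {
            System.err.println("Cannot list " + bucket + "/" + prefix + " on S3 because " + e);
        }
        return new String[]{};
    }
}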

shivarajan · Answer 2 · 2015-04-20T01:31:23+0000

TransferManager uses a CountDownLatch internally, which leads me to believe that it downloads files concurrently (which seems like the right way to do it). Wouldn't it make more sense to use it rather than fetching one file after another in sequence?
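
To make the concurrent-versus-sequential point concrete, here is a minimal sketch of the general pattern: a fixed thread pool plus a CountDownLatch that the caller waits on. It is not TransferManager's actual implementation. The Fetcher interface, the fetchAll method, and the class name are hypothetical names made up for this illustration; Fetcher stands in for whatever call retrieves one object (for example, a per-key S3 get), and the main method uses a fake fetcher so the example runs without AWS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConcurrentFetchSketch {

    /** Hypothetical stand-in for a per-key download (e.g. one S3 get call). */
    public interface Fetcher {
        byte[] fetch(String key) throws Exception;
    }

    /**
     * Fetches every key on a fixed-size thread pool and blocks until all tasks finish.
     * Returns the keys that succeeded mapped to their content; failed keys are omitted.
     */
    public static Map<String, byte[]> fetchAll(List<String> keys, final Fetcher fetcher, int threads)
            throws InterruptedException {
        final Map<String, byte[]> results = new ConcurrentHashMap<>();
        final CountDownLatch done = new CountDownLatch(keys.size());
        final ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (final String key : keys) {
                pool.execute(() -> {
                    try {
                        results.put(key, fetcher.fetch(key));
                    } catch (Exception e) {
                        System.err.println("Cannot fetch " + key + " because " + e);
                    } finally {
                        // Count down even on failure so await() below always returns.
                        done.countDown();
                    }
                });
            }
            // Blocks until every submitted task has counted down.
            done.await();
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            keys.add("file-" + i);
        }
        // Fake fetcher: sleeps to simulate network latency, then returns the key's bytes.
        Fetcher fake = key -> {
            Thread.sleep(50);
            return key.getBytes();
        };
        long start = System.nanoTime();
        Map<String, byte[]> fetched = fetchAll(keys, fake, 8);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Fetched " + fetched.size() + " objects in " + elapsedMs + " ms");
    }
}
```

With a sequential loop the simulated latencies would add up one after another; with the pool, up to `threads` fetches overlap. For a bucket with many small objects, that overlap is usually where most of the time is saved.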
