How can I identify local files that have already been uploaded to S3 and have not changed since then?

The script below copies all files from a folder structure and uploads them to an S3 bucket. However, I want it to skip files that have not changed since the last upload, to avoid duplicate uploads. How can I check whether a file already exists in the bucket, and when it was last modified?

Import-Module "C:\Program Files (x86)\AWS Tools\PowerShell\AWSPowerShell\AWSPowerShell.psd1"
$bucket="bucketname"
$source="e:\dfs\*"
$outputpath="C:\temp\log.txt"
$AKey="xxxx"
$SKey="xxxx"

Set-AWSCredentials -AccessKey $AKey -SecretKey $SKey -StoreAs For_Move
Initialize-AWSDefaults -ProfileName For_Move -Region eu-west-1

Start-Transcript -path $outputpath -Force
foreach ($i in Get-ChildItem $source -include *.* -recurse)
{
    if ($i.CreationTime -lt (Get-Date))
    {
        $fileName = $i.Name
        $parentFolderName = Split-Path $i -Parent

        Write-S3Object -BucketName $bucket -Key dfs/$parentFolderName/$fileName -File $i
    }
}

      

2 answers


For a simple "does the file exist?" check, you can use Get-S3Object with the same key and test each file before trying to upload it:

if (!(Get-S3Object -BucketName $bucket -Key dfs/$parentFolderName/$filename)) {
    Write-S3Object -BucketName $bucket -Key dfs/$parentFolderName/$filename -File $i
}
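Depending on the module version, a missing key may surface an error instead of simply returning nothing, so it can be worth suppressing errors explicitly. A sketch, wrapping the check in a helper (Test-S3Key is a hypothetical name, not part of the AWS module):

```powershell
# Hypothetical helper: returns $true if the key already exists in the bucket.
function Test-S3Key {
    param([string]$Bucket, [string]$Key)
    $obj = Get-S3Object -BucketName $Bucket -Key $Key -ErrorAction SilentlyContinue
    return ($null -ne $obj)
}

if (-not (Test-S3Key -Bucket $bucket -Key "dfs/$parentFolderName/$fileName")) {
    Write-S3Object -BucketName $bucket -Key "dfs/$parentFolderName/$fileName" -File $i
}
```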

      



Comparing the local modified date with the time of the last upload is a bit trickier, but you can extend the test:

$localModified = (Get-Item $i).LastWriteTime
$s3Modified = (Get-S3Object -BucketName $bucket -Key dfs/$parentFolderName/$fileName).LastModified | Get-Date

if ($s3Modified -lt $localModified) {
    Write-S3Object -BucketName $bucket -Key dfs/$parentFolderName/$fileName -File $i
}
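One caveat: LastModified is the upload time (in UTC), so a restored or re-copied local file can look "newer" even though its content is unchanged. If you want to compare content rather than timestamps, one option is to compare the local MD5 hash with the object's ETag. A sketch, assuming the object was uploaded in a single part (multipart ETags are not plain MD5 hashes):

```powershell
# Compare local MD5 with the S3 ETag (valid for single-part uploads only).
$localHash  = (Get-FileHash -Path $i -Algorithm MD5).Hash.ToLower()
$remoteHash = $remoteObject.ETag.Trim('"').ToLower()

if ($localHash -ne $remoteHash) {
    Write-S3Object -BucketName $bucket -Key $key -File $i
}
```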

      



Combining this, I got the following:

Import-Module "C:\Program Files (x86)\AWS Tools\PowerShell\AWSPowerShell\AWSPowerShell.psd1"
$bucket="<my bucket name>"
$source="C:\dev\project\*"
$outputpath="C:\dev\log.txt"
$AKey="<key>"
$SKey="<secret>"
$region="<my AWS region>"

Set-AWSCredentials -AccessKey $AKey -SecretKey $SKey -StoreAs For_Move
Initialize-AWSDefaults -ProfileName For_Move -Region $region

Start-Transcript -path $outputpath -Force
foreach ($i in Get-ChildItem $source -include *.* -recurse)
{
    if ($i.CreationTime -lt (Get-Date))
    {
        $fileName = $i.Name
        $parentFolderName = Split-Path $i -Parent
        $key = "$i"   # the full local path is used as the S3 key

        $localModified = (Get-Item $i).LastWriteTime
        $remoteObject = Get-S3Object -BucketName $bucket -Key $key -Region $region
        if ($null -eq $remoteObject) {
            Write-S3Object -BucketName $bucket -Key $key -File $i
            "Added new file $i"
        } else {
            $s3Modified = $remoteObject.LastModified | Get-Date

            if ($s3Modified -lt $localModified) {
                Write-S3Object -BucketName $bucket -Key $key -File $i
                "Updated $i"
            }
        }
    }
}
Stop-Transcript
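Calling Get-S3Object once per file gets slow for large trees. An alternative is to list the bucket once up front and build a lookup table, then test each local file against it in memory. A sketch, assuming the keys share a common prefix such as dfs/ (the key scheme below is illustrative; adjust it to match how your files were actually uploaded):

```powershell
# One listing call instead of one API call per file.
$remote = @{}
Get-S3Object -BucketName $bucket -KeyPrefix "dfs/" |
    ForEach-Object { $remote[$_.Key] = $_.LastModified }

foreach ($i in Get-ChildItem $source -include *.* -recurse) {
    $key = "dfs/$($i.Name)"   # hypothetical key scheme
    $localModified = $i.LastWriteTime

    if (-not $remote.ContainsKey($key) -or
        (($remote[$key] | Get-Date) -lt $localModified)) {
        Write-S3Object -BucketName $bucket -Key $key -File $i
    }
}
```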

      



Note: this is the first PowerShell script I have ever written, so forgive me if the style and approach are wrong.
