Can I make this script even faster?

I wrote a simple script for my internship that walks through a given directory and deletes any file older than a specified number of days. I spent all my free time today trying to tighten it up. Here's what I have so far:

function delOld($dir, $numDays){
    $timespan = new-timespan -days $numDays
    $curTime = get-date
    get-childItem $dir -Recurse -file | 
    where-object {(($curTime)-($_.LastWriteTime)) -gt $timespan} | 
    remove-Item -whatif
}

      

Here's an example of a function call:

delOld -dir "C:\Users\me\Desktop\psproject" -numDays 5

      

Sorry for the poor readability; I found that condensing the operations into one pipeline was more efficient than reassigning them to legible variables on each iteration. Remove-Item is currently called with -WhatIf, so nothing is actually deleted yet. I know that at this point I probably cannot speed it up much; however, I am running it on over a TB of files, so every operation counts.

Thanks in advance for any advice you can offer!

+2




3 answers


Staying within the realm of PowerShell and .NET methods, here's how you can speed up your function:

  • Calculate the cutoff timestamp once, up front.

  • Use the EnumerateFiles() method of the [IO.DirectoryInfo] type (PSv3+ / .NET 4+) in combination with the foreach statement. Hat tip to WOxxOm.

    • EnumerateFiles() enumerates files one at a time, keeping memory usage constant; it is similar to, but faster than, Get-ChildItem.

      • Caveats:

        • EnumerateFiles() invariably includes hidden files, whereas Get-ChildItem excludes them by default and includes them only if -Force is specified.

        • EnumerateFiles() is not suitable if there is a chance of encountering inaccessible directories due to lack of permissions: even if you enclose the entire foreach statement in a try / catch block, you will only get partial output, because iteration stops when the first inaccessible directory is encountered.

        • The enumeration order may differ from that of Get-ChildItem.

    • PowerShell's foreach statement is much faster than the ForEach-Object cmdlet, and also faster than the PSv4+ .ForEach() collection method.

  • Call the .Delete() method directly on each [System.IO.FileInfo] instance inside the loop body.

Note: For brevity, the function below does no error checking, such as whether $numDays has a valid value and whether $dir refers to an existing directory (if it is a path based on a custom PS drive, you would have to resolve it with Convert-Path first).



function delOld($dir, $numDays) {
    $dtCutoff = [datetime]::now - [timespan]::FromDays($numDays)
    # Make sure that the .NET framework current dir. is the same as PS's:
    [System.IO.Directory]::SetCurrentDirectory($PWD.ProviderPath)
    # Enumerate all files recursively.
    # Replace $file.FullName with $file.Delete() to perform actual deletion.
    foreach ($file in ([IO.DirectoryInfo] $dir).EnumerateFiles('*', 'AllDirectories')) { 
        if ($file.LastWriteTime -lt $dtCutoff) { $file.FullName }
    }
}

      

Note: the above just prints the paths of the files to be deleted; replace $file.FullName with $file.Delete() for actual removal.
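One way to work around the inaccessible-directory caveat is to walk the tree yourself and wrap each directory in its own try / catch, so that one unreadable folder does not end the whole enumeration. The following is only a sketch of that idea, not part of the function above: the name Get-OldFile and its parameters are hypothetical, and like the function above it only prints paths instead of deleting.

```powershell
# Hypothetical helper: lists files older than the cutoff, skipping
# directories that cannot be read instead of aborting the whole walk.
function Get-OldFile {
    param(
        [Parameter(Mandatory)] [string] $Dir,
        [Parameter(Mandatory)] [double] $NumDays
    )
    $cutoff = [datetime]::Now - [timespan]::FromDays($NumDays)
    # Resolve to a full file-system path so .NET and PowerShell agree on the location.
    $root = (Resolve-Path -LiteralPath $Dir -ErrorAction Stop).ProviderPath
    $pending = New-Object 'System.Collections.Generic.Stack[string]'
    $pending.Push($root)
    while ($pending.Count -gt 0) {
        $current = [IO.DirectoryInfo] $pending.Pop()
        try {
            # Only the current directory is enumerated here; subdirectories are queued.
            foreach ($file in $current.EnumerateFiles()) {
                if ($file.LastWriteTime -lt $cutoff) { $file.FullName }
            }
            foreach ($sub in $current.EnumerateDirectories()) {
                $pending.Push($sub.FullName)
            }
        }
        catch {
            # Any failure (typically access denied) affects this directory only.
            Write-Warning "Skipping '$($current.FullName)': $($_.Exception.Message)"
        }
    }
}
```

The extra bookkeeping costs some speed compared with a single EnumerateFiles('*', 'AllDirectories') call, so it is probably only worth it when permission errors are actually expected.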

+5




Many of the PowerShell cmdlets are slower than their .NET equivalents. For example, you could call [System.IO.File]::Delete($_.FullName) instead of Remove-Item and see if there is a performance difference. The same goes for Get-ChildItem => [System.IO.Directory]::GetFiles(...).

To measure this, I would write a small script that creates two temporary folders with, say, 100,000 empty test files in each. Then call each version of the function wrapped in a [System.Diagnostics.Stopwatch].

Sample code:



$stopwatch = New-Object 'System.Diagnostics.StopWatch'
$stopwatch.Start()

Remove-OldItems1 ...

$stopwatch.Stop()
Write-Host $stopwatch.ElapsedMilliseconds

$stopwatch.Reset()
$stopwatch.Start()

Remove-OldItems2 ...

$stopwatch.Stop()
Write-Host $stopwatch.ElapsedMilliseconds
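To make the comparison concrete, here is a rough sketch of such a test harness. Everything in it is an assumption for illustration: the helper name New-TestFolder, the default of 1,000 files (raise it toward 100,000 for a more meaningful run), and the use of Measure-Command as a shorthand for the stopwatch pattern shown above.

```powershell
# Hypothetical helper: creates a scratch folder with $Count empty files
# whose LastWriteTime is set $AgeDays days in the past.
function New-TestFolder {
    param([int] $Count = 1000, [int] $AgeDays = 30)
    $path = Join-Path ([IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
    [void] [IO.Directory]::CreateDirectory($path)
    $stamp = (Get-Date).AddDays(-$AgeDays)
    foreach ($i in 1..$Count) {
        $file = Join-Path $path "file$i.tmp"
        [IO.File]::WriteAllBytes($file, [byte[]] @())
        [IO.File]::SetLastWriteTime($file, $stamp)
    }
    $path
}

$folderA = New-TestFolder
$folderB = New-TestFolder
$cutoff  = (Get-Date).AddDays(-5)

# Candidate 1: cmdlet pipeline.
$t1 = Measure-Command {
    Get-ChildItem -LiteralPath $folderA -Recurse -File |
        Where-Object { $_.LastWriteTime -lt $cutoff } |
        Remove-Item
}

# Candidate 2: direct .NET calls.
$t2 = Measure-Command {
    foreach ($f in [IO.Directory]::GetFiles($folderB, '*', 'AllDirectories')) {
        if ([IO.File]::GetLastWriteTime($f) -lt $cutoff) { [IO.File]::Delete($f) }
    }
}

# Sanity check: both candidates should have deleted every test file.
$left = @(Get-ChildItem -LiteralPath $folderA, $folderB -File).Count
'{0:N0} ms (cmdlets) vs {1:N0} ms (.NET), {2} files left' -f $t1.TotalMilliseconds, $t2.TotalMilliseconds, $left

# Clean up the scratch folders.
Remove-Item -LiteralPath $folderA, $folderB -Recurse -Force
```

Timings on empty files in a temp folder will not match a real multi-terabyte share, so treat the numbers as a relative comparison only.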

      

Additional brownie points for PowerShell: run Get-Verb in a PowerShell window and you will see a list of approved verbs. Functions in PowerShell are supposed to be named Verb-Noun, so something like Remove-OldItems would fit the bill.
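As an illustration of the naming advice, a Verb-Noun version could look like the sketch below. The name Remove-OldItem (singular noun, as PowerShell's naming guidelines suggest) and its parameters are hypothetical; the SupportsShouldProcess attribute is an addition that gives the function its own -WhatIf switch, mirroring the -whatif used in the question.

```powershell
# Hypothetical Verb-Noun rewrite; supports -WhatIf and -Confirm via ShouldProcess.
function Remove-OldItem {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)] [string] $Dir,
        [Parameter(Mandatory)] [double] $NumDays
    )
    $cutoff = (Get-Date).AddDays(-$NumDays)
    # Resolve to a full file-system path; stop if the directory does not exist.
    $root = (Resolve-Path -LiteralPath $Dir -ErrorAction Stop).ProviderPath
    foreach ($file in ([IO.DirectoryInfo] $root).EnumerateFiles('*', 'AllDirectories')) {
        # ShouldProcess returns $false under -WhatIf, so nothing is deleted then.
        if ($file.LastWriteTime -lt $cutoff -and
            $PSCmdlet.ShouldProcess($file.FullName, 'Delete')) {
            $file.Delete()
        }
    }
}
```

Calling it as Remove-OldItem -Dir "C:\Users\me\Desktop\psproject" -NumDays 5 -WhatIf only previews the deletions; dropping -WhatIf performs them.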

+1




This will delete everything in parallel, using a workflow (available in Windows PowerShell 3.0 through 5.1 only):

workflow delOld([string]$dir, [int]$numDays){
    $timespan = new-timespan -days $numDays
    $curTime = get-date
    $Files = get-childItem $dir -Recurse -file | where-object {(($curTime)-($_.LastWriteTime)) -gt $timespan}
    foreach -parallel ($file in $files){
        Remove-Item $File
    }

}

delOld -dir "C:\Users\AndrewD\Downloads" -numDays 8

      

Now, if it's many folders, try this

+1

