Can I make this script even faster?
I wrote a simple script for my internship that walks through a provided directory and deletes any file older than a specified number of days. I spent all my free time today trying to optimize it. Here's what I have so far:
function delOld($dir, $numDays){
    $timespan = New-TimeSpan -Days $numDays
    $curTime = Get-Date
    Get-ChildItem $dir -Recurse -File |
        Where-Object {(($curTime)-($_.LastWriteTime)) -gt $timespan} |
        Remove-Item -WhatIf
}
Here's an example of a function call:
delOld -dir "C:\Users\me\Desktop\psproject" -numDays 5
Sorry for the poor readability; I found that compressing the operations into one line was more efficient than reassigning them to legible variables on each iteration. For now, Remove-Item is called with -WhatIf, so nothing is actually deleted yet. I know that at this point I probably cannot speed it up much; however, I am running it over terabytes of files, so every operation counts.
Thanks in advance for any advice you can offer!
Staying within the realm of PowerShell and .NET methods, here's how you can speed up your function:
- Calculate the cutoff timestamp once, up front.
- Use the [IO.DirectoryInfo] type's EnumerateFiles() method (PSv3+ / .NET 4+) in combination with a foreach statement. Hat tip to WOxxOm.
  - EnumerateFiles() enumerates files one at a time, keeping memory usage constant, similar to but faster than Get-ChildItem. Caveats:
    - EnumerateFiles() invariably includes hidden files, whereas Get-ChildItem excludes them by default and only includes them if -Force is specified.
    - EnumerateFiles() is not suitable if there is a chance of encountering inaccessible directories due to lack of permissions, because even if you enclose the entire foreach statement in a try / catch block, you will only get partial output, given that iteration stops when the first inaccessible directory is encountered.
    - The enumeration order may differ from that of Get-ChildItem.
  - PowerShell's foreach statement is much faster than the ForEach-Object cmdlet, and also faster than the PSv4+ .ForEach() collection method.
- Call the .Delete() method directly on each [System.IO.FileInfo] instance inside the loop body.
Note: For brevity, the function below performs no error checking, such as whether $numDays has a valid value and whether $dir refers to an existing directory (if it is a path based on a custom PS drive, you would have to resolve it with Convert-Path first).
function delOld($dir, $numDays) {
  $dtCutoff = [datetime]::Now - [timespan]::FromDays($numDays)
  # Make sure that the .NET framework's current dir. is the same as PS's:
  [System.IO.Directory]::SetCurrentDirectory($PWD.ProviderPath)
  # Enumerate all files recursively.
  # Replace $file.FullName with $file.Delete() to perform actual deletion.
  foreach ($file in ([IO.DirectoryInfo] $dir).EnumerateFiles('*', 'AllDirectories')) {
    if ($file.LastWriteTime -lt $dtCutoff) { $file.FullName }
  }
}
Note: The above only prints the paths of the files to be deleted; replace $file.FullName with $file.Delete() to perform the actual deletion.
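If inaccessible directories are a real possibility, one way to work around the EnumerateFiles() caveat mentioned earlier is to walk the tree one directory at a time, so that a failure affects only the directory that caused it. The following is a minimal sketch under that assumption and is not part of the function above: the name Get-OldFile, its parameters, and the warning text are made up for illustration. It emits the matching [System.IO.FileInfo] objects rather than deleting anything.

```powershell
# Hypothetical helper (name and parameters are assumptions for illustration).
# Walks the tree one directory at a time so that an unreadable directory is
# skipped instead of aborting the whole enumeration.
function Get-OldFile {
    param(
        [Parameter(Mandatory)] [string] $Dir,
        [Parameter(Mandatory)] [double] $NumDays
    )
    # Compute the cutoff once, up front.
    $cutoff = [datetime]::Now - [timespan]::FromDays($NumDays)

    # Explicit stack of directories still to visit; Convert-Path turns the
    # PowerShell path into a full file-system path that .NET understands.
    $pending = New-Object 'System.Collections.Generic.Stack[System.IO.DirectoryInfo]'
    $pending.Push([System.IO.DirectoryInfo] (Convert-Path -LiteralPath $Dir))

    while ($pending.Count -gt 0) {
        $current = $pending.Pop()
        try {
            # Non-recursive: only the files directly inside $current.
            foreach ($file in $current.EnumerateFiles()) {
                if ($file.LastWriteTime -lt $cutoff) { $file }
            }
            # Queue the subdirectories for later visits.
            foreach ($sub in $current.EnumerateDirectories()) {
                $pending.Push($sub)
            }
        }
        catch {
            # Any failure while reading this directory: warn and move on.
            Write-Warning "Skipping directory: $($current.FullName)"
        }
    }
}
```

For a dry run, Get-OldFile -Dir 'C:\Users\me\Desktop\psproject' -NumDays 5 | ForEach-Object FullName lists the candidates; calling .Delete() on each emitted object would perform the removal. As written, the sketch follows directory junctions and symbolic links like any other directory, and it is likely somewhat slower than a single recursive EnumerateFiles() call.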
Many PowerShell cmdlets are slower than their .NET equivalents. For example, you could call [System.IO.File]::Delete($_.FullName) instead of Remove-Item and see if there is a performance difference. The same goes for replacing Get-ChildItem with [System.IO.Directory]::GetFiles(...).
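As a rough illustration of that substitution, a version built only on the static [System.IO.Directory] and [System.IO.File] methods might look like the sketch below. The function name Remove-OldItems2, its parameters, and the -Delete switch are assumptions made for this example; without -Delete it only lists the paths it would remove. Whether it is actually faster on a given data set is something to measure rather than assume.

```powershell
# Hypothetical variant (name, parameters and -Delete switch are assumptions).
function Remove-OldItems2 {
    param(
        [Parameter(Mandatory)] [string] $Dir,
        [Parameter(Mandatory)] [double] $NumDays,
        [switch] $Delete   # without this switch, paths are only listed
    )
    # Compute the cutoff once instead of per file.
    $cutoff = [datetime]::Now - [timespan]::FromDays($NumDays)
    # Resolve the PowerShell path to a full file-system path for .NET.
    $root = Convert-Path -LiteralPath $Dir

    # GetFiles() returns all paths as one string array, recursing into
    # subdirectories because of the 'AllDirectories' search option.
    foreach ($path in [System.IO.Directory]::GetFiles($root, '*', 'AllDirectories')) {
        if ([System.IO.File]::GetLastWriteTime($path) -lt $cutoff) {
            if ($Delete) { [System.IO.File]::Delete($path) } else { $path }
        }
    }
}
```

Note that GetFiles() builds the complete array of paths before the loop starts, so memory use grows with the number of files; on very large trees that may matter.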
To test this, I would write a small script that creates two temporary folders with, say, 100,000 empty test files in each, and then call each version of the function wrapped in a [System.Diagnostics.Stopwatch].
Sample code:
$stopwatch = New-Object 'System.Diagnostics.StopWatch'
$stopwatch.Start()
Remove-OldItems1 ...
$stopwatch.Stop()
Write-Host $stopwatch.ElapsedMilliseconds
$stopwatch.Reset()
$stopwatch.Start()
Remove-OldItems2 ...
$stopwatch.Stop()
Write-Host $stopwatch.ElapsedMilliseconds
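The test folders themselves could be created along these lines. This is only a sketch: the helper name New-TestFolder and its parameters are invented for the example, and the files are back-dated so that they actually fall behind the cutoff being tested.

```powershell
# Hypothetical helper (name and parameters are assumptions for illustration).
# Creates a fresh temp folder filled with empty, back-dated files and
# returns the folder's path.
function New-TestFolder {
    param(
        [int] $FileCount = 100000,
        [int] $AgeInDays = 30
    )
    $root = [System.IO.Path]::Combine(
        [System.IO.Path]::GetTempPath(),
        [System.IO.Path]::GetRandomFileName())
    [void] [System.IO.Directory]::CreateDirectory($root)

    $stamp = [datetime]::Now.AddDays(-$AgeInDays)
    for ($i = 0; $i -lt $FileCount; $i++) {
        $path = [System.IO.Path]::Combine($root, "file$i.tmp")
        # Create an empty file and release the handle right away.
        [System.IO.File]::Create($path).Dispose()
        # Back-date the file so the age filter will select it.
        [System.IO.File]::SetLastWriteTime($path, $stamp)
    }
    $root
}
```

Calling it once per candidate, for example $folder1 = New-TestFolder and $folder2 = New-TestFolder, gives each version of the function its own identical set of files, so one run cannot affect the other's timing by deleting its input.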
Extra brownie points for PowerShell style: run Get-Verb in a PowerShell window and you will see a list of approved verbs. Functions in PowerShell are supposed to be named Verb-Noun, so something like Remove-OldItems seems to fit the bill.
This will delete everything in parallel.
workflow delOld([string]$dir, [int]$numDays){
$timespan = new-timespan -days $numDays
$curTime = get-date
$Files = get-childItem $dir -Recurse -file | where-object {(($curTime)-($_.LastWriteTime)) -gt $timespan}
foreach -parallel ($file in $files){
Remove-Item $File
}
}
delOld -dir "C:\Users\AndrewD\Downloads" -numDays 8
Now, if it's many folders, try this