How to use Directory.EnumerateFiles
From MSDN (https://msdn.microsoft.com/en-us/library/dd383458(v=vs.110).aspx):
The EnumerateFiles and GetFiles methods differ in the following way: when you use EnumerateFiles, you can start enumerating the collection of names before returning the entire collection; when you use GetFiles you have to wait for the entire array of names to return before you can access the array. Therefore, when you are working with many files and directories, EnumerateFiles can be more efficient.
How can I start using a collection before returning the entire collection?
The following code takes over 3 minutes to run for a directory with approximately 45,000 files:
Dim TIme1, TIme2 As String
TIme1 = TimeString
Dim DirFiles As Generic.List(Of String) = New Generic.List(Of String)(Directory.EnumerateFiles(SourceDirectory))
Dim NumberOfFiles As Integer
NumberOfFiles = DirFiles.Count()
TIme2 = TimeString
MsgBox("Begin time " & TIme1 & ". There are " & NumberOfFiles & " photos in the directory " & SourceDirectory & ". End time " & TIme2)
Can I use the entries in DirFiles before the entire collection is read? How?
I used to be a professional programmer before Microsoft launched Windows. My experience with Windows programming is minimal.
While EnumerateFiles can't give you the number of files up front, you can start working with the individual files in the collection without any delay by using a For Each loop or any similar construct that doesn't need the item count in order to work.
So, for example, you can do:
Dim FileCount As Integer
Dim files = Directory.EnumerateFiles(srcDir)
For Each file in files
'Do something with this file
' e.g.
TextBox1.AppendText(file & vbCrLf)
FileCount += 1
Next
MsgBox ( FileCount.ToString & " files processed.")
So, can you see how you can use it?
[NB: this code was typed by hand and may contain typos. It is intended only to explain the concept.]
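Because the names arrive one at a time, the loop can also stop early. The sketch below is only an illustration of that idea: the folder path and the limit of 10 are assumptions chosen for the example, not values from the question.

```vb
Imports System
Imports System.IO

Module EarlyExitDemo
    Sub Main()
        ' Hypothetical folder, used only for illustration.
        Dim srcDir As String = "C:\Photos"
        Dim shown As Integer = 0

        For Each filePath As String In Directory.EnumerateFiles(srcDir)
            Console.WriteLine(filePath)
            shown += 1
            ' Arbitrary limit: leave the loop after 10 names,
            ' so the rest of the directory is not enumerated.
            If shown >= 10 Then Exit For
        Next
    End Sub
End Module
```

With GetFiles, the same loop could not start until the whole array had been built.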
The GetFiles method materializes the entire list of files in the directory. The preferred method is now Directory.EnumerateFiles, as it streams file names back (via the yield mechanism) as the underlying OS call returns results.
Solutions using GetFiles / GetDirectories are slow because the complete array has to be built before anything is returned. Enumerating avoids this: no intermediate array is created.
Either way, if you need a count, the full iteration still has to happen in the end.
Example of counting the files:
Directory.EnumerateFiles(directory, filetype, SearchOption.AllDirectories).Count()
I now use the following function to get the file count before running EnumerateFiles:
Public Function FileCount(PathName As String) As Long
Dim fso As Scripting.FileSystemObject
Dim fld As Scripting.Folder
fso = CreateObject("Scripting.FileSystemObject")
If fso.FolderExists(PathName) Then
fld = fso.GetFolder(PathName)
FileCount = fld.Files.Count
End If
End Function
This requires the Microsoft Scripting Runtime (add a reference to the Microsoft Scripting Runtime library in your project).
The signature for GetFiles is Directory.GetFiles(path As String) As String(). In order for it to return results, it must hit the hard drive and collect the entire array first. If there are 45,000 files, then it must build an array of 45,000 elements before it can give you a result.
The signature for EnumerateFiles is Directory.EnumerateFiles(path As String) As IEnumerable(Of String). In this case, the method does not have to read the whole directory before it returns. Thus, the call itself comes back almost instantly, regardless of the number of files.
Take this test code:
Dim sw = Stopwatch.StartNew()
Dim files = Directory.GetFiles("C:\Windows\System32")
sw.Stop()
Console.WriteLine(sw.Elapsed.TotalMilliseconds)
I get a result of about 6.5 milliseconds to get the files back.
But if I change GetFiles to EnumerateFiles, I get the result in 0.07 milliseconds. It is almost 100 times slower to call GetFiles for this folder!
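Note that the 0.07 millisecond figure covers only the call itself, not the work of walking the directory. A rough sketch that separates the two measurements, assuming the same folder as the test above (timings will vary by machine, so none are shown):

```vb
Imports System
Imports System.Diagnostics
Imports System.IO
Imports System.Linq

Module TimingDemo
    Sub Main()
        Dim folder As String = "C:\Windows\System32"

        ' Time only the call; no file names have been consumed yet.
        Dim sw = Stopwatch.StartNew()
        Dim files = Directory.EnumerateFiles(folder)
        sw.Stop()
        Console.WriteLine("Call only: {0} ms", sw.Elapsed.TotalMilliseconds)

        ' Time a full pass; Count() has to walk every entry.
        sw.Restart()
        Dim total As Integer = files.Count()
        sw.Stop()
        Console.WriteLine("Full pass over {0} files: {1} ms", total, sw.Elapsed.TotalMilliseconds)
    End Sub
End Module
```

The second measurement is the one to compare against GetFiles when every file name is actually needed.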
This is because EnumerateFiles returns IEnumerable(Of String). The interface for IEnumerable(Of T) is:
Public Interface IEnumerable(Of Out T)
Inherits IEnumerable
Function GetEnumerator() As IEnumerator(Of T)
End Interface
Whenever we use For Each, .Count(), or .ToArray() on an enumerable, GetEnumerator() is called under the hood, which in turn returns an object of another type, IEnumerator(Of T), with this signature:
Public Interface IEnumerator(Of Out T)
Inherits IDisposable
Inherits IEnumerator
ReadOnly Property Current As T
Function MoveNext() As Boolean
Sub Reset()
End Interface
It's the enumerator that actually does the hard work of returning all the files. As soon as the first call to MoveNext is made, the first file name is immediately available in Current. MoveNext is then called in a loop until it returns False, at which point you know the enumeration is complete. Meanwhile, you can pick up each file name from the Current property as it arrives.
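To make the mechanism concrete, here is a minimal sketch of roughly what a For Each loop does on your behalf, written out by hand. The folder path is reused from the test code above; the module name is just a placeholder.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO

Module ManualEnumerationDemo
    Sub Main()
        Dim files As IEnumerable(Of String) = Directory.EnumerateFiles("C:\Windows\System32")

        ' Using disposes the enumerator when the block ends.
        Using enumerator As IEnumerator(Of String) = files.GetEnumerator()
            ' MoveNext returns False once there are no more entries.
            While enumerator.MoveNext()
                ' Current holds the name fetched by the latest MoveNext call.
                Console.WriteLine(enumerator.Current)
            End While
        End Using
    End Sub
End Module
```

In everyday code For Each is the better choice; the explicit form is shown only to expose the MoveNext / Current cycle.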
So in your code, if you were performing some kind of action on every file as it is returned, then EnumerateFiles would be the way to go.
But since you do New Generic.List(Of String)(Directory.EnumerateFiles(SourceDirectory)), you immediately force an iteration over the entire enumerable. Any advantage of using EnumerateFiles is immediately lost.