How to use Directory.EnumerateFiles

MSDN (https://msdn.microsoft.com/en-us/library/dd383458(v=vs.110).aspx) says:

The EnumerateFiles and GetFiles methods differ in the following way: when you use EnumerateFiles, you can start enumerating the collection of names before returning the entire collection; when you use GetFiles you have to wait for the entire array of names to return before you can access the array. Therefore, when you are working with many files and directories, EnumerateFiles can be more efficient.

How can I start using a collection before returning the entire collection?

The following code gives an elapsed time of over 3 minutes for a directory with approximately 45,000 files:

Dim TIme1, TIme2 As String
TIme1 = TimeString
Dim DirFiles As Generic.List(Of String) = New Generic.List(Of String)(Directory.EnumerateFiles(SourceDirectory))
Dim NumberOfFiles As Integer
NumberOfFiles = DirFiles.Count()
TIme2 = TimeString
MsgBox("Begin time " & TIme1 & ". There are " & NumberOfFiles & " photos in the directory " & SourceDirectory & ". End time " & TIme2)

Can I use the entries in DirFiles before the entire collection has been read? How?

I used to be a professional programmer before Microsoft launched Windows. My experience with Windows programming is minimal.

+3



5 answers


While EnumerateFiles cannot give you the number of files up front, you can start working with the individual files in the collection without any delay by using a For Each loop or any other construct that doesn't need the item count to work.

So, for example, you can do:

Dim FileCount As Integer
' EnumerateFiles returns a lazy sequence; names arrive as the loop asks for them.
Dim files = Directory.EnumerateFiles(srcDir)
For Each file In files
    ' Do something with this file, e.g.:
    TextBox1.AppendText(file & vbCrLf)
    FileCount += 1
Next
MsgBox(FileCount.ToString() & " files processed.")

So, can you see how you can use it?

[NB: this code was typed by hand ... and may contain typos. It is intended only to explain the concept.]
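
As a further illustration of using entries before the whole directory has been read, here is a minimal sketch that looks at only the first ten names and then stops. It is not part of the original answer; the module name, the use of the current directory, and the limit of ten are assumptions made purely for the example.

```vbnet
Imports System
Imports System.IO
Imports System.Linq

Module FirstFilesDemo
    Sub Main()
        ' Assumption: the current directory stands in for your photo folder.
        Dim srcDir As String = Directory.GetCurrentDirectory()

        ' Take(10) stops asking the enumerator for names after ten items,
        ' so the enumeration ends early instead of walking the whole directory.
        For Each filePath As String In Directory.EnumerateFiles(srcDir).Take(10)
            Console.WriteLine(Path.GetFileName(filePath))
        Next
    End Sub
End Module
```

With GetFiles, the same loop would only start after the complete array had been built.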

+2



EnumerateFiles allows you to start processing files before all files have been found. It sounds like you want to know the number of files. You cannot know that until all files have been found, so EnumerateFiles won't help you in that case.



+1



The GetFiles method materializes the entire list of files in the directory before returning. The preferred method is now Directory.EnumerateFiles, as it hands file names back one at a time (via the yield mechanism) as the underlying OS call returns results.

Solutions using GetFiles / GetDirectories are slow because the whole array has to be built first. An enumeration, on the other hand, does not build that intermediate array.

Either way, if you need the count, the whole directory still has to be iterated in the end ...

Example of counting the files:

Directory.EnumerateFiles(directory, filetype, SearchOption.AllDirectories).Count()


0



I now use the following function before calling EnumerateFiles:

Public Function FileCount(PathName As String) As Long
    Dim fso As Scripting.FileSystemObject
    Dim fld As Scripting.Folder
    fso = CreateObject("Scripting.FileSystemObject")
    If fso.FolderExists(PathName) Then
        fld = fso.GetFolder(PathName)
        FileCount = fld.Files.Count
    End If
End Function

      

This requires the Microsoft Scripting Runtime (add a reference to the Microsoft Scripting Runtime library in your project).

0



The signature of GetFiles is Directory.GetFiles(path As String) As String(). In order to return a result, it must hit the hard drive and collect the entire array first. If there are 45,000 files, then it must build an array of 45,000 elements before it can give you a result.

The signature of EnumerateFiles is Directory.EnumerateFiles(path As String) As IEnumerable(Of String). In this case, the call does not have to read the whole directory before it returns. Thus, you get the result back almost instantly, regardless of the number of files.

Take this test code:

Dim sw = Stopwatch.StartNew()
Dim files = Directory.GetFiles("C:\Windows\System32")
sw.Stop()
Console.WriteLine(sw.Elapsed.TotalMilliseconds)

      

I get a result of about 6.5 milliseconds to get the files back.

But if I change GetFiles to EnumerateFiles, I get the result in 0.07 milliseconds. It is almost 100 times slower to call GetFiles for this folder!
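
Note that the 0.07 millisecond figure measures only the call that returns the lazy sequence, not the reading of the file names. The sketch below, which is an illustration and not part of the original test, separates the three stages: the call returning, the first name arriving, and the full count. The module name and the use of the current directory are assumptions; the timings will differ from machine to machine, so none are quoted here.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.IO
Imports System.Linq

Module LazyTimingDemo
    Sub Main()
        ' Assumption: point this at a large directory to see a clearer gap.
        Dim folder As String = Directory.GetCurrentDirectory()

        Dim sw = Stopwatch.StartNew()

        ' Stage 1: the call returns a sequence without listing every file.
        Dim files = Directory.EnumerateFiles(folder)
        Console.WriteLine("Call returned after {0} ms", sw.Elapsed.TotalMilliseconds)

        ' Stage 2: asking for one item pulls only the first name (Nothing if empty).
        Dim first = files.FirstOrDefault()
        Console.WriteLine("First name ({0}) after {1} ms", first, sw.Elapsed.TotalMilliseconds)

        ' Stage 3: Count() has to walk the whole directory.
        Dim total = files.Count()
        Console.WriteLine("Counted {0} files after {1} ms", total, sw.Elapsed.TotalMilliseconds)
    End Sub
End Module
```

On a directory with many files, the third stage is where most of the time should go.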

This is because EnumerateFiles returns IEnumerable(Of String). The interface for IEnumerable(Of T) is:

Public Interface IEnumerable(Of Out T)
    Inherits IEnumerable
    Function GetEnumerator() As IEnumerator(Of T)
End Interface

      

Whenever we call For Each, .Count() or .ToArray() on an enumerable, under the hood we call GetEnumerator(), which in turn returns another object of type IEnumerator(Of T) with this signature:

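Public Interface IEnumerator(Of Out T)
    Inherits IDisposable
    Inherits IEnumerator
    ReadOnly Property Current As T
    ' MoveNext and Reset are inherited from the non-generic IEnumerator;
    ' they are listed here to show the full surface.
    Function MoveNext() As Boolean
    Sub Reset()
End Interface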

      

It's the enumerator that actually does the hard work of returning the files. As soon as the first call to MoveNext is made, the first file name is immediately available in Current. MoveNext is then called in a loop until it returns False, at which point you know the enumeration is complete. Meanwhile, you can read each file name from the Current property.
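
To make the MoveNext / Current mechanics concrete, here is a rough sketch of driving the enumerator by hand, which is approximately what a For Each loop does for you. The module name and the use of the current directory are assumptions for the example only.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module ManualEnumeration
    Sub Main()
        ' Assumption: any readable directory will do.
        Dim folder As String = Directory.GetCurrentDirectory()
        Dim files As IEnumerable(Of String) = Directory.EnumerateFiles(folder)

        ' Using disposes the enumerator when we are done,
        ' which For Each would otherwise do on our behalf.
        Using enumerator As IEnumerator(Of String) = files.GetEnumerator()
            ' Each MoveNext call advances to the next file name;
            ' it returns False once the directory is exhausted.
            While enumerator.MoveNext()
                Console.WriteLine(enumerator.Current)
            End While
        End Using
    End Sub
End Module
```

In everyday code you would normally write the For Each loop instead; the explicit form is shown only to expose what happens underneath.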

So in your code, if you were performing some kind of action on every file returned, then EnumerateFiles would be the way to go.

But since you do New Generic.List(Of String)(Directory.EnumerateFiles(SourceDirectory)), you are immediately forced to iterate over the entire enumerable. Any advantage of using EnumerateFiles is immediately lost.

0

