Setting timestamps on files/directories is extremely slow

I am working on a project that requires copying a lot of files and directories while preserving their original timestamps. To do so, I have to call the target's methods SetCreationTime(), SetLastWriteTime() and SetLastAccessTime() many times to copy the original values from the source to the target. As shown in the screenshot below, these simple operations take up to 42% of the total computation time.

[Screenshot: performance analysis]

Since this limits my application's performance enormously, I would like to speed up the process. I assume that each of these calls opens and closes a new stream to the file/directory. If that is the reason, I would like to keep this stream open until I have finished writing all the attributes. How can I do that? I guess this will require some P/Invoke.

Update:

I followed Lucas's advice and used the WinAPI function CreateFile(..) with FILE_WRITE_ATTRIBUTES. To P/Invoke that function, I created the following wrapper:

using System;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

public class Win32ApiWrapper
{
    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
    private static extern SafeFileHandle CreateFile(string lpFileName,
                                                    [MarshalAs(UnmanagedType.U4)] FileAccess dwDesiredAccess,
                                                    [MarshalAs(UnmanagedType.U4)] FileShare dwShareMode,
                                                    IntPtr lpSecurityAttributes,
                                                    [MarshalAs(UnmanagedType.U4)] FileMode dwCreationDisposition,
                                                    [MarshalAs(UnmanagedType.U4)] FileAttributes dwFlagsAndAttributes,
                                                    IntPtr hTemplateFile);

    // Creates (or overwrites) the file and returns a handle that may write
    // both its data and its attributes, including the timestamps.
    // EFileAccess is not part of .NET; it has to be declared separately.
    public static SafeFileHandle CreateFileGetHandle(string path, int fileAttributes)
    {
        return CreateFile(path,
                (FileAccess)(EFileAccess.FILE_WRITE_ATTRIBUTES | EFileAccess.FILE_WRITE_DATA),
                0,
                IntPtr.Zero,
                FileMode.Create,
                (FileAttributes)fileAttributes,
                IntPtr.Zero);
    }
}

      

The enumerations used here can be found here. This let me do everything while opening the file only once: create the file, apply all attributes, set the timestamps, and copy the actual content from the original file.

FileInfo targetFile;
int fileAttributes;
IDictionary<string, long> timeStamps;

using (var hFile = Win32ApiWrapper.CreateFileGetHandle(targetFile.FullName, fileAttributes))
using (var targetStream = new FileStream(hFile, FileAccess.Write))
{
    // copy file
    // SetFileTime is a further wrapper method; its declaration is not shown here
    Win32ApiWrapper.SetFileTime(hFile, timeStamps);
}

      

Was it worth the effort? YES. It reduced the computation time by ~40%, from 86 to 51.

Results before optimization:

[Screenshot: before]

Results after optimization:

[Screenshot: after]



1 answer


I am not a C# programmer and I have no idea how these System.IO.FileSystemInfo methods are implemented. But I ran some tests with the Win32 API function SetFileTime(..), which C# will end up calling at some point.

Here is a snippet of my test loop:

#include <windows.h>
#include <stdio.h>

#define NO_OF_ITERATIONS   100000

int main(void)
{
   int iteration;
   DWORD tStart;
   SYSTEMTIME tSys;
   FILETIME tFile;
   HANDLE hFile;
   DWORD tElapsed;

   iteration = NO_OF_ITERATIONS;
   GetLocalTime(&tSys);
   tStart = GetTickCount();
   while (iteration)
   {
      /* use a different timestamp on every pass */
      tSys.wYear++;
      if (tSys.wYear > 2020)
      {
         tSys.wYear = 2000;
      }

      SystemTimeToFileTime(&tSys, &tFile);
      hFile = CreateFile("test.dat",
                         GENERIC_WRITE,   // alternative: FILE_WRITE_ATTRIBUTES
                         0,
                         NULL,
                         OPEN_EXISTING,
                         FILE_ATTRIBUTE_NORMAL,
                         NULL);
      if (hFile == INVALID_HANDLE_VALUE)
      {
         printf("CreateFile(..) failed (error: %lu)\n", GetLastError());
         break;
      }

      /* set creation, last-access and last-write time in one call */
      SetFileTime(hFile, &tFile, &tFile, &tFile);

      CloseHandle(hFile);
      iteration--;
   }
   tElapsed = GetTickCount() - tStart;
   printf("elapsed: %lu ms\n", tElapsed);
   return 0;
}

      

I found that the expensive part of setting the file time is opening and closing the file. About 60% of the time is spent opening the file and about 40% closing it (closing has to write the changes to disk). The above loop took about 9s for 10,000 iterations.

A little research has shown that calling CreateFile(..) with FILE_WRITE_ATTRIBUTES (instead of GENERIC_WRITE) is enough to change the file's time attributes.



This modification speeds things up a lot! The same loop now finishes within 2 seconds for 10,000 iterations. Since that number of iterations is quite small, I did a second series with 100,000 iterations to get a more reliable measurement:

  • FILE_WRITE_ATTRIBUTES: 5 runs with 100,000 iterations: 12.7-13.2s
  • GENERIC_WRITE: 5 runs with 100,000 iterations: 63.2-72.5s
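
To put these totals in per-call terms, here is a small, portable C sketch. It only does arithmetic on the run times quoted in the list above; the function names per_call_us, speedup and print_summary are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Average cost of one open / SetFileTime / close cycle, in microseconds. */
double per_call_us(double total_seconds, long iterations)
{
    assert(iterations > 0);
    return total_seconds * 1e6 / (double)iterations;
}

/* How many times longer the slow variant takes than the fast one. */
double speedup(double slow_seconds, double fast_seconds)
{
    assert(fast_seconds > 0.0);
    return slow_seconds / fast_seconds;
}

/* Prints the per-call cost and the speed-up range for the quoted run times. */
void print_summary(void)
{
    const long n = 100000; /* iterations per run, as in the measurements */

    printf("FILE_WRITE_ATTRIBUTES: %.0f-%.0f us per call\n",
           per_call_us(12.7, n), per_call_us(13.2, n));
    printf("GENERIC_WRITE:         %.0f-%.0f us per call\n",
           per_call_us(63.2, n), per_call_us(72.5, n));
    /* most conservative and most favourable pairing of the two ranges */
    printf("speed-up: %.1fx to %.1fx\n",
           speedup(63.2, 13.2), speedup(72.5, 12.7));
}
```

Calling print_summary() from main() shows about 127-132 µs per call with FILE_WRITE_ATTRIBUTES versus about 632-725 µs with GENERIC_WRITE, i.e. roughly a factor of five.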

Based on the numbers above, my guess is that the C# methods use the wrong access mode when opening a file to change its timestamps. Or some other C# behavior is slowing things down ...

So maybe the solution to your speed problem is to implement a DLL that exports a C function which changes the file time with SetFileTime(..)? Or maybe you can even import the functions CreateFile(..), SetFileTime(..) and CloseHandle(..) directly to avoid calling the C# methods?
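
The exported C function suggested above could look roughly like the sketch below. It is only an illustration under a few assumptions: the names split_ticks, to_filetime and set_all_file_times are hypothetical, the timestamps are passed as 64-bit counts of 100 ns ticks since 1601-01-01 UTC (the unit a FILETIME stores), and FILE_FLAG_BACKUP_SEMANTICS is added so that the same call can also open a directory. The Windows-specific part is guarded by _WIN32, so the file still compiles elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a 64-bit tick count into the low and high 32-bit halves that a
   Win32 FILETIME holds. Pure arithmetic, no Windows dependency. */
void split_ticks(uint64_t ticks, uint32_t *low, uint32_t *high)
{
    assert(low != NULL && high != NULL);
    *low  = (uint32_t)(ticks & 0xFFFFFFFFu);
    *high = (uint32_t)(ticks >> 32);
}

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: build a FILETIME from a 64-bit tick count. */
static FILETIME to_filetime(uint64_t ticks)
{
    FILETIME ft;
    uint32_t lo, hi;
    split_ticks(ticks, &lo, &hi);
    ft.dwLowDateTime  = lo;
    ft.dwHighDateTime = hi;
    return ft;
}

/* Hypothetical export: opens the file or directory once, requesting only
   FILE_WRITE_ATTRIBUTES, sets creation, last-access and last-write time in a
   single SetFileTime call, and closes the handle again.
   Returns 0 on success, otherwise the GetLastError() code. */
__declspec(dllexport) unsigned long set_all_file_times(
    const wchar_t *path, uint64_t created, uint64_t accessed, uint64_t written)
{
    FILETIME c = to_filetime(created);
    FILETIME a = to_filetime(accessed);
    FILETIME w = to_filetime(written);
    DWORD err = 0;

    HANDLE h = CreateFileW(path,
                           FILE_WRITE_ATTRIBUTES,
                           FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                           NULL,
                           OPEN_EXISTING,
                           FILE_FLAG_BACKUP_SEMANTICS, /* needed for directories */
                           NULL);
    if (h == INVALID_HANDLE_VALUE)
        return GetLastError();

    if (!SetFileTime(h, &c, &a, &w))
        err = GetLastError();

    CloseHandle(h);
    return err;
}
#endif
```

From C#, such a function would be imported with DllImport and a Unicode string path; that part is left out here. Setting all three timestamps through one handle means one open/close cycle per file instead of three.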

Good luck!
