Move files only if there is a corresponding file

I have an application that requires two files to process data: a zip file containing the actual data, and a control file that says what to do with that data.

These files are uploaded via SFTP to an intermediate directory. Once the zip file is complete, I need to check whether the control file is there. The two files share only a naming prefix (for example, 100001_ABCDEF_123456.zip is paired with 100001_ABCDEF_control_file.ctl).

I am trying to find a way to wait for the zip file to finish downloading and then move the files on the fly while maintaining the directory structure, which is important for the next step in processing.

I am currently waiting for the SFTP worker to finish and then calling robocopy to move everything. I need a more polished approach.

I have tried several things and always get the same result: the files are transferred but never moved. For some reason, I just cannot get the comparison to work correctly.

I tried using FileSystemWatcher to watch for the rename from .filepart to .zip, but it seems to miss a few transfers, and the handler dies for some reason when it reaches the foreach that searches the directory for the control file. Below are the watcher setup and the handler I attach to the Created and Changed events.

        watcher.Path = @"C:\Sync\";
        watcher.IncludeSubdirectories = true;
        watcher.EnableRaisingEvents = true;
        watcher.Filter = "*.zip";
        watcher.NotifyFilter = NotifyFilters.Attributes |
                               NotifyFilters.CreationTime |
                               NotifyFilters.FileName |
                               NotifyFilters.LastAccess |
                               NotifyFilters.LastWrite |
                               NotifyFilters.Size |
                               NotifyFilters.Security |
                               NotifyFilters.DirectoryName;
        watcher.Created += Watcher_Changed;
        watcher.Changed += Watcher_Changed;

    private void Watcher_Changed(object sender, FileSystemEventArgs e)
    {
        var dir = new DirectoryInfo(e.FullPath.Substring(0, e.FullPath.Length - e.Name.Length));
        var files = dir.GetFiles();

        FileInfo zipFile = new FileInfo(e.FullPath);

        foreach (FileInfo file in files)
        {
            MessageBox.Show(file.Extension);
            if (file.Extension == "ctl" && file.Name.StartsWith(e.Name.Substring(0, (e.Name.Length - 14))))
            {
                file.CopyTo(@"C:\inp\");
                zipFile.CopyTo(@"C:\inp\");
            }
        }
    }

      



2 answers


The FileSystemWatcher class is notoriously difficult to use correctly, because you get multiple events for the same file while it is being written, moved, or copied, as @WillStoltenberg also mentioned in his answer.

I found it much easier to just set up a task that runs periodically (for example, every 30 seconds). For your problem, you can do something like the code below. Note that a similar implementation using a timer instead of Task.Delay may be preferred.

using System;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class MyPeriodicWatcher
{
    private readonly string _watchPath;
    private readonly string _searchMask;
    private readonly Func<string, string> _commonPrefixFetcher;
    private readonly Action<FileInfo, FileInfo> _pairProcessor;
    private readonly TimeSpan _checkInterval;
    private readonly CancellationToken _cancelToken;

    public MyPeriodicWatcher(
        string watchPath,
        string searchMask,
        Func<string, string> commonPrefixFetcher,
        Action<FileInfo, FileInfo> pairProcessor,
        TimeSpan checkInterval,
        CancellationToken cancelToken)
    {
        _watchPath = watchPath;
        _searchMask = string.IsNullOrWhiteSpace(searchMask) ? "*.zip" : searchMask;
        _pairProcessor = pairProcessor;
        _commonPrefixFetcher = commonPrefixFetcher;
        _cancelToken = cancelToken;
        _checkInterval = checkInterval;
    }

    // Polls the directory until cancellation is requested.
    public async Task Watch()
    {
        while (!_cancelToken.IsCancellationRequested)
        {
            try
            {
                // EnumerateFiles returns paths that already include _watchPath.
                foreach (var file in Directory.EnumerateFiles(_watchPath, _searchMask))
                {
                    // Pass only the file name, so the prefix can be used as a search pattern.
                    var pairPrefix = _commonPrefixFetcher(Path.GetFileName(file));
                    if (!string.IsNullOrWhiteSpace(pairPrefix))
                    {
                        var match = Directory.EnumerateFiles(_watchPath, pairPrefix + "*.ctl").FirstOrDefault();
                        if (!string.IsNullOrEmpty(match) && !_cancelToken.IsCancellationRequested)
                            _pairProcessor(new FileInfo(file), new FileInfo(match));
                    }
                    if (_cancelToken.IsCancellationRequested)
                        break;
                }
                if (_cancelToken.IsCancellationRequested)
                    break;

                // Await the delay instead of blocking the thread with Wait().
                await Task.Delay(_checkInterval, _cancelToken).ConfigureAwait(false);
            }
            catch (OperationCanceledException)
            {
                break;
            }
        }
    }
}

      



You will need to provide:

  • the path to monitor
  • a search mask for the first file (e.g. *.zip)
  • a function delegate that gets the common filename prefix from the zip filename
  • the check interval
  • a delegate that performs the move and receives the FileInfo objects for the pair being processed/moved
  • a cancellation token to cleanly cancel monitoring
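As an illustration of the prefix delegate in the list above, here is a minimal sketch. The class name PairNaming and the method GetPairPrefix are hypothetical, and the rule itself is an assumption based on the single example in the question: the shared prefix is taken to be everything up to and including the last underscore of the zip name.

```csharp
using System;
using System.IO;

public static class PairNaming
{
    // Hypothetical helper. Assumes the shared prefix is everything up to and
    // including the last underscore of the zip name, for example
    // "100001_ABCDEF_123456.zip" -> "100001_ABCDEF_".
    // Returns null when the name does not fit that assumption.
    public static string GetPairPrefix(string zipFileName)
    {
        var name = Path.GetFileNameWithoutExtension(zipFileName);
        if (string.IsNullOrEmpty(name))
            return null;

        int lastUnderscore = name.LastIndexOf('_');

        // Require at least one character before the underscore.
        return lastUnderscore > 0 ? name.Substring(0, lastUnderscore + 1) : null;
    }
}
```

A method like this could be passed as the commonPrefixFetcher argument. Returning null for names that do not fit makes the watcher skip those files rather than match on an empty prefix.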

In your pairProcessor delegate, catch IO exceptions and check for a sharing or access violation (which probably means the file has not been completely written yet).
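A pair processor along those lines might look like the sketch below. PairMover and TryMovePair are hypothetical names, and the destination directory is a parameter the original code does not have. It treats any IOException or UnauthorizedAccessException as "not ready yet" and leaves the retry to the next polling pass; it does not try to recreate the source directory structure.

```csharp
using System;
using System.IO;

public static class PairMover
{
    // Hypothetical processor: moves both files into destDir under their own names.
    // Returns false when the move fails with an IO or access error, which is
    // assumed here to mean that a file is still being written.
    public static bool TryMovePair(FileInfo zip, FileInfo ctl, string destDir)
    {
        try
        {
            Directory.CreateDirectory(destDir);

            // Move the zip first: it is the file most likely to still be locked,
            // and if this throws, nothing has been moved yet.
            zip.MoveTo(Path.Combine(destDir, zip.Name));
            ctl.MoveTo(Path.Combine(destDir, ctl.Name));
            return true;
        }
        catch (IOException)
        {
            return false;
        }
        catch (UnauthorizedAccessException)
        {
            return false;
        }
    }
}
```

Note that MoveTo (like CopyTo) expects a full destination file name, not just a directory, hence the Path.Combine calls. If the control file fails to move after the zip has already been moved, the pair ends up split, so a real implementation would need to handle that case.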



Watcher_Changed will be called for all kinds of things, and you do not want to react every time it is called.

The first thing you should do in your event handler is try to open the zip file exclusively. If you cannot, ignore the event and wait for the next one. If the source is an FTP server, you will receive a Changed event every time a new chunk of data is written to disk. You can also put the file in a retry queue, or use some other mechanism, to check later whether it is available. We have the same need in our system, and we retry every 5 seconds after we notice the first change. Only once we can open the file exclusively for writing do we let it proceed to the next step.
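The exclusive-open check could be sketched roughly as follows. FileReadiness, IsReady, and WaitUntilReadyAsync are hypothetical names, and the attempt limit is an assumption added so the loop cannot run forever; the retry interval is left to the caller (5 seconds in our case).

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class FileReadiness
{
    // Returns true only if the file can be opened with no sharing at all,
    // which should fail while another process still holds it open.
    public static bool IsReady(string path)
    {
        try
        {
            using (new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
            {
                return true;
            }
        }
        catch (IOException)
        {
            // Covers sharing violations and missing files.
            return false;
        }
        catch (UnauthorizedAccessException)
        {
            return false;
        }
    }

    // Retries the check at a fixed interval, up to maxAttempts times.
    public static async Task<bool> WaitUntilReadyAsync(string path, TimeSpan retryInterval, int maxAttempts)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            if (IsReady(path))
                return true;
            await Task.Delay(retryInterval).ConfigureAwait(false);
        }
        return false;
    }
}
```

In a handler, one would return early when IsReady is false, or queue the path and call WaitUntilReadyAsync from a background task.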



I would tighten up your assumptions about what the filename looks like. You limit your search to *.zip, but do not rely on only your own .zip files existing in that target directory. Confirm that the parsing you do on the filename does not produce unexpected values. You can also check dir.Exists before calling dir.GetFiles(), which throws an exception if the directory does not exist.
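To make the parsing checks concrete, here is a hedged sketch of a lookup for the control file. ControlFileLookup and FindControlFile are hypothetical names, and the naming rule (the shared prefix ends at the last underscore of the zip name) is an assumption taken from the example in the question. One detail worth checking in the original handler: FileInfo.Extension includes the leading period, so it returns ".ctl" and never equals "ctl".

```csharp
using System;
using System.IO;

public static class ControlFileLookup
{
    // Hypothetical helper: returns the matching .ctl file next to the zip,
    // or null if the directory is missing, the name does not parse,
    // or no control file is present.
    public static FileInfo FindControlFile(string zipPath)
    {
        var dir = new DirectoryInfo(Path.GetDirectoryName(Path.GetFullPath(zipPath)));
        if (!dir.Exists)
            return null;

        // Assumed rule: the prefix is everything up to and including the last underscore.
        var zipName = Path.GetFileNameWithoutExtension(zipPath);
        int cut = zipName.LastIndexOf('_');
        if (cut <= 0)
            return null;
        var prefix = zipName.Substring(0, cut + 1);

        foreach (var file in dir.GetFiles())
        {
            // Extension includes the leading period: ".ctl", not "ctl".
            if (string.Equals(file.Extension, ".ctl", StringComparison.OrdinalIgnoreCase)
                && file.Name.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            {
                return file;
            }
        }
        return null;
    }
}
```

Deriving the prefix from the underscore position, instead of subtracting a fixed number of characters from the name, avoids silently wrong results when the numeric part of the zip name changes length.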

As for the missing events, see this good answer on buffer overflows: FileSystemWatcher InternalBufferOverflow
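As a rough sketch of what dealing with the overflow can look like: the watcher reports a full internal buffer through its Error event, and the buffer size (8 KB by default) can be raised with InternalBufferSize. WatcherFactory, Create, and the onOverflow callback are hypothetical names; what to do on overflow (for example, rescanning the directory) is left to the caller. The narrower NotifyFilter is also an assumption, on the idea that fewer notification types means fewer events filling the buffer.

```csharp
using System;
using System.IO;

public static class WatcherFactory
{
    // Hypothetical factory: builds a watcher with a larger buffer and reports
    // buffer overflows to the caller. Events are not enabled here; attach
    // handlers first, then set EnableRaisingEvents = true.
    public static FileSystemWatcher Create(string path, Action onOverflow)
    {
        var watcher = new FileSystemWatcher(path, "*.zip")
        {
            IncludeSubdirectories = true,
            // Fewer notification types mean fewer events filling the buffer.
            NotifyFilter = NotifyFilters.FileName | NotifyFilters.LastWrite | NotifyFilters.Size,
            InternalBufferSize = 64 * 1024
        };

        watcher.Error += (sender, e) =>
        {
            // Events were dropped; the caller should rescan the directory.
            if (e.GetException() is InternalBufferOverflowException)
                onOverflow();
        };

        return watcher;
    }
}
```

Because an overflow means some events are lost for good, pairing the watcher with an occasional full directory scan is a reasonable safety net.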
