What is an efficient way in C# to compute an MD5 hash and download a file at the same time?

I am working on a download followed by an MD5 check to verify that the download succeeded. I have the following code, which should work but is not the most efficient, especially for large files.

        using (var client = new System.Net.WebClient())
        {
            client.DownloadFile(url, destinationFile);
        }

        var fileHash = GetMD5HashAsStringFromFile(destinationFile);
        var successful = expectedHash.Equals(fileHash, StringComparison.OrdinalIgnoreCase);

      

I'm worried that all the bytes are written to disk and then MD5 ComputeHash() has to open the file and read all the bytes again. Is there a good, clean way to compute the MD5 as part of the download stream? Ideally, the MD5 should just fall out of the DownloadFile() function as a side effect of sorts. A function with this signature:

string DownloadFileAndComputeHash(string url, string filename, HashTypeEnum hashType);

      


Edit: Added the code for GetMD5HashAsStringFromFile()

    public string GetMD5HashAsStringFromFile(string filename)
    {
        using (FileStream file = File.Open(filename, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            var md5er = System.Security.Cryptography.MD5.Create();
            var md5HashBytes = md5er.ComputeHash(file);
            return BitConverter
                    .ToString(md5HashBytes)
                    .Replace("-", string.Empty)
                    .ToLower();
        }
    }

      

+3




3 answers


"Is there a good, clean way to compute the MD5 as part of the download stream? Ideally, the MD5 should just fall out of the DownloadFile() function as a side effect of sorts."

You could follow this strategy to do the calculation in chunks and minimize memory pressure (and duplicate reads):

  • Open the response stream on the web client.
  • Open the destination file.
  • While data is available, repeat:
    • Read a chunk from the response stream into a byte buffer.
    • Write it to the destination file stream.
    • Use the TransformBlock method to add the bytes to the hash calculation.
  • Call TransformFinalBlock to finalize the calculation, then read the computed hash.


The example code below shows how this can be achieved.

// Namespaces needed: System, System.IO, System.Security.Cryptography.
public static byte[] DownloadAndGetHash(Uri file, string destFilePath, int bufferSize)
{
    using (var md5 = MD5.Create())
    using (var client = new System.Net.WebClient())
    {
        using (var src = client.OpenRead(file))
        using (var dest = File.Create(destFilePath, bufferSize))
        {
            md5.Initialize();
            var buffer = new byte[bufferSize];
            while (true)
            {
                // Read the next chunk from the response stream.
                var read = src.Read(buffer, 0, buffer.Length);
                if (read > 0)
                {
                    // Save the chunk to disk and feed the same bytes to the hash.
                    dest.Write(buffer, 0, read);
                    md5.TransformBlock(buffer, 0, read, null, 0);
                }
                else // reached the end.
                {
                    // Finalize with an empty block; the result is then available in md5.Hash.
                    md5.TransformFinalBlock(buffer, 0, 0);
                    return md5.Hash;
                }
            }
        }
    }
}
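
The method above returns the raw hash bytes, while the question compares hex strings. A minimal sketch of how the two could be bridged follows; `HashCheck`, `ToHex` and `Matches` are hypothetical names introduced here for illustration, not part of the original answer.

```csharp
using System;

public static class HashCheck
{
    // Hypothetical helper: formats hash bytes as a lowercase hex string without dashes.
    public static string ToHex(byte[] hash)
    {
        return BitConverter.ToString(hash).Replace("-", string.Empty).ToLowerInvariant();
    }

    // Hypothetical helper: case-insensitive comparison against an expected hex string.
    public static bool Matches(byte[] hash, string expectedHex)
    {
        return string.Equals(ToHex(hash), expectedHex, StringComparison.OrdinalIgnoreCase);
    }
}
```

Assuming the variables from the question, usage might look like `var successful = HashCheck.Matches(DownloadAndGetHash(new Uri(url), destinationFile, 81920), expectedHash);`, where the buffer size of 81920 is an arbitrary choice.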

      

+7




If you are talking about large files (I am assuming more than 1 GB), you will want to read the data in chunks, then process each chunk using the MD5 algorithm, and then save it to disk. It's doable, but I don't know how many of the default .NET classes will help you with this.

One approach is a custom stream wrapper. First you get the Stream from the WebClient (through GetWebResponse() and then GetResponseStream()), then you wrap it and pass the wrapper to ComputeHash(stream). When MD5 calls Read() on your wrapper, the wrapper calls Read on the network stream, writes the data to disk as it arrives, and then passes it back to MD5.



I don't know what problems await you if you try to do this.
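
The wrapper idea described in this answer could be sketched roughly as follows. `TeeReadStream` and `TeeExample.SaveAndHash` are hypothetical names, not .NET classes, and the sketch only assumes that the source is a readable stream; it has not been tuned or tested against real network streams.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical wrapper: each Read() pulls bytes from the source stream and
// copies them to a second stream before handing them back to the caller.
public sealed class TeeReadStream : Stream
{
    private readonly Stream _source;
    private readonly Stream _copy;

    public TeeReadStream(Stream source, Stream copy)
    {
        _source = source;
        _copy = copy;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = _source.Read(buffer, offset, count);
        if (read > 0)
        {
            _copy.Write(buffer, offset, read);
        }
        return read;
    }

    // Read-only, forward-only stream: everything else is unsupported.
    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { _copy.Flush(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

public static class TeeExample
{
    // Writes everything read from source to destFilePath and returns its MD5 hash.
    public static byte[] SaveAndHash(Stream source, string destFilePath)
    {
        using (var md5 = MD5.Create())
        using (var dest = File.Create(destFilePath))
        using (var tee = new TeeReadStream(source, dest))
        {
            // ComputeHash reads the wrapper to the end, which also fills the file.
            return md5.ComputeHash(tee);
        }
    }
}
```

With a WebClient, the source stream could come from `client.OpenRead(url)`, so that `TeeExample.SaveAndHash(src, destinationFile)` saves and hashes the download in one pass.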

+1




Something like this:

byte[] result;
using (var webClient = new System.Net.WebClient())
{
    result = webClient.DownloadData("http://some.url");
}

byte[] hash = ((HashAlgorithm)CryptoConfig.CreateFromName("MD5")).ComputeHash(result);
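
The snippet above hashes the downloaded bytes but does not save them. A minimal sketch of one way to do both from the same in-memory array is shown below; `InMemoryExample.SaveAndHash` is a hypothetical helper, and the approach assumes the whole file fits comfortably in memory, which may not hold for the large files mentioned in the question.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class InMemoryExample
{
    // Hypothetical helper: writes the already-downloaded bytes to disk and
    // hashes the same array, so the file never has to be read back.
    public static string SaveAndHash(byte[] data, string destFilePath)
    {
        File.WriteAllBytes(destFilePath, data);
        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(data);
            return BitConverter.ToString(hash).Replace("-", string.Empty).ToLowerInvariant();
        }
    }
}
```

It could be called with the `result` array from `DownloadData`, for example `InMemoryExample.SaveAndHash(result, destinationFile)`.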

      

0








