FileStream Seek not working on large files on second call

I work with large files, starting at around 10 GB. I load parts of a file into memory for processing. The following code works fine for smaller files (around 700 MB):

byte[] byteArr = new byte[layerPixelCount];
using (FileStream fs = File.OpenRead(recFileName))
{
    using (BinaryReader br = new BinaryReader(fs))
    {
        fs.Seek(offset, SeekOrigin.Begin);

        for (int i = 0; i < byteArr.Length; i++)
        {
            byteArr[i] = (byte)(br.ReadUInt16() / 256);
        }
    }
}

      

After opening the 10 GB file, the first call of this function works fine, but on the second call Seek() throws an IOException:

An attempt was made to move the file pointer before the beginning of the file.

Numbers:

fs.Length = 11998628352

offset = 4252580352

byteArr.Length = 7746048

I assumed the GC was not collecting the closed FileStream fs before the second call, so I tried

    GC.Collect();
    GC.WaitForPendingFinalizers();

but no luck.

Any help would be appreciated.


2 answers


I am assuming your signed integer offset is overflowing into negative values: 4252580352 is larger than int.MaxValue (2147483647), so when the offset is computed in 32-bit arithmetic it wraps around to a negative number, and calling Seek() with a negative position relative to the beginning of the file produces exactly this exception. Try declaring offset and i as long:



// offset is now a long
long offset = 4252580352;

byte[] byteArr = new byte[layerPixelCount];
using (FileStream fs = File.OpenRead(recFileName))
{
    using (BinaryReader br = new BinaryReader(fs))
    {
        fs.Seek(offset, SeekOrigin.Begin);

        for (long i = 0; i < byteArr.Length; i++)
        {
            byteArr[i] = (byte)(br.ReadUInt16() / 256);
        }
    }
}
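
Note that declaring the variable as long is not enough if the value is still computed from int operands: the multiplication is performed in 32-bit arithmetic before the result is widened. A minimal sketch of that failure mode, assuming a hypothetical layer index and formula (not from the question; 549 * 7746048 just happens to reproduce the reported offset):

int layerIndex = 549;          // hypothetical value, not from the original post
int layerPixelCount = 7746048; // byteArr.Length from the question

// All operands are int, so the product is computed in 32-bit arithmetic and
// (with default unchecked arithmetic) silently wraps before being widened to long:
long badOffset = layerIndex * layerPixelCount;         // -42386944
// Casting one operand first makes the whole expression use 64-bit arithmetic:
long goodOffset = (long)layerIndex * layerPixelCount;  // 4252580352

Console.WriteLine($"{badOffset} vs {goodOffset}");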

      


The following code works for large files beyond 4 GB. The key point is to use the long data type for the position passed to Seek(), because long can address positions beyond the 2^32 boundary. The code first processes the file in 1 GB blocks and then processes the remaining (< 1 GB) bytes. I use it to calculate a CRC for files that are larger than 4 GB (using https://crc32c.machinezoo.com/ to calculate CRC32C in this example).



private uint Crc32CAlgorithmBigCrc(string fileName)
{
    // Crc32CAlgorithm comes from the Crc32C.NET library linked above.
    uint hash = 0;
    byte[] buffer = null;
    FileInfo fileInfo = new FileInfo(fileName);
    long fileLength = fileInfo.Length;

    int blockSize = 1024000000;                                     // ~1 GB blocks
    int blocks = (int)(fileLength / blockSize);                     // number of full blocks
    int restBytes = (int)(fileLength - ((long)blocks * blockSize)); // long cast avoids int overflow
    long offsetFile = 0;
    bool firstBlock = true;

    using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
    using (BinaryReader br = new BinaryReader(fs))
    {
        // Process the full 1 GB blocks.
        while (blocks > 0)
        {
            blocks -= 1;
            fs.Seek(offsetFile, SeekOrigin.Begin); // long offset, so positions beyond 2^32 work
            buffer = br.ReadBytes(blockSize);
            if (firstBlock)
            {
                firstBlock = false;
                hash = Crc32CAlgorithm.Compute(buffer);
            }
            else
            {
                // Continue the running CRC from the previous value.
                hash = Crc32CAlgorithm.Append(hash, buffer);
            }
            offsetFile += blockSize;
        }

        // Process the remaining (< 1 GB) bytes.
        if (restBytes > 0)
        {
            fs.Seek(offsetFile, SeekOrigin.Begin);
            buffer = br.ReadBytes(restBytes);
            hash = firstBlock ? Crc32CAlgorithm.Compute(buffer)
                              : Crc32CAlgorithm.Append(hash, buffer);
        }
    }
    //MessageBox.Show(hash.ToString());
    //MessageBox.Show(hash.ToString("X"));
    return hash;
}
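
A quick usage sketch (the file path is just a placeholder). Since the reads are sequential, the Seek() calls inside the loop are technically redundant, but they make the long-offset positioning explicit:

// Hypothetical call site; replace the path with a real file.
uint crc = Crc32CAlgorithmBigCrc(@"D:\data\recording.rec");
Console.WriteLine(crc.ToString("X8")); // CRC-32C as hex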

      
