FileStream Seek not working on large files on second call
I work with large files, starting at 10 GB, and load parts of a file into memory for processing. The following code works fine for smaller files (700 MB):
byte[] byteArr = new byte[layerPixelCount];
using (FileStream fs = File.OpenRead(recFileName))
{
    using (BinaryReader br = new BinaryReader(fs))
    {
        fs.Seek(offset, SeekOrigin.Begin);
        for (int i = 0; i < byteArr.Length; i++)
        {
            byteArr[i] = (byte)(br.ReadUInt16() / 256);
        }
    }
}
After opening the 10 GB file, the first run of this function is fine, but on the second run Seek() throws an IOException:
An attempt was made to move the file pointer before the beginning of the file.
Numbers:
fs.Length = 11998628352
offset = 4252580352
byteArr.Length = 7746048
I assumed the GC had not yet collected the closed fs handle before the second call, so I tried
GC.Collect();
GC.WaitForPendingFinalizers();
but no luck.
Any help would be appreciated.
I suspect that your signed integer offset or index is overflowing into negative values. The offset 4252580352 is larger than int.MaxValue (2147483647), so it cannot be held in an int. Try declaring offset and i as long.
// offset is now a long
long offset = 4252580352;
byte[] byteArr = new byte[layerPixelCount];
using (FileStream fs = File.OpenRead(recFileName))
{
    using (BinaryReader br = new BinaryReader(fs))
    {
        fs.Seek(offset, SeekOrigin.Begin);
        for (long i = 0; i < byteArr.Length; i++)
        {
            // Keep the high byte of each 16-bit sample
            byteArr[i] = (byte)(br.ReadUInt16() / 256);
        }
    }
}
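As a quick check of the overflow theory, here is a minimal sketch (written as top-level statements, so it assumes C# 9 or later) of what the question's offset becomes if it is squeezed into an int. TruncateToInt is a hypothetical helper used only for illustration, and whether the original offset was really declared as int is an assumption, since the question does not show the declaration.

```csharp
using System;

// The offset reported in the question; it does not fit in a 32-bit int.
long offset = 4252580352;

// Hypothetical helper: the value an int would hold after an unchecked narrowing cast.
static int TruncateToInt(long value) => unchecked((int)value);

int truncated = TruncateToInt(offset);

Console.WriteLine(offset > int.MaxValue); // True
Console.WriteLine(truncated);             // -42386944
```

A negative offset passed to Seek with SeekOrigin.Begin would be consistent with the "before the beginning of the file" message, and it would also explain why a first call with a smaller offset succeeds.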
The following code works for large files beyond 4 GB. The key point is to use the long data type for the offset passed to Seek, because a long can address positions beyond the 2^32 boundary. In this example, the code first processes the file in blocks of roughly 1 GB, then processes the remaining bytes (less than one block). I use this code to calculate the CRC of files larger than 4 GB (using https://crc32c.machinezoo.com/ to calculate crc32c in this example).
// Requires using System.IO; and the Crc32C.NET package (namespace Force.Crc32).
private uint Crc32CAlgorithmBigCrc(string fileName)
{
    uint hash = 0;
    FileInfo fileInfo = new FileInfo(fileName);
    long fileLength = fileInfo.Length;
    int blockSize = 1024000000;
    // Integer division already rounds down; keep the math in long to avoid int overflow.
    int blocks = (int)(fileLength / blockSize);
    int restBytes = (int)(fileLength - ((long)blocks * blockSize));
    long offsetFile = 0; // long, so Seek can address positions beyond int.MaxValue
    bool firstBlock = true;
    using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
    {
        using (BinaryReader br = new BinaryReader(fs))
        {
            // Process the full blocks first.
            while (blocks > 0)
            {
                blocks -= 1;
                fs.Seek(offsetFile, SeekOrigin.Begin);
                byte[] buffer = br.ReadBytes(blockSize);
                if (firstBlock)
                {
                    firstBlock = false;
                    hash = Crc32CAlgorithm.Compute(buffer);
                }
                else
                {
                    // Continue from the running hash, not from the first block's hash.
                    hash = Crc32CAlgorithm.Append(hash, buffer);
                }
                offsetFile += blockSize;
            }
            // Then process the remaining bytes (less than one block).
            if (restBytes > 0)
            {
                fs.Seek(offsetFile, SeekOrigin.Begin);
                byte[] buffer = br.ReadBytes(restBytes);
                hash = Crc32CAlgorithm.Append(hash, buffer);
            }
        }
    }
    //MessageBox.Show(hash.ToString());
    //MessageBox.Show(hash.ToString("X"));
    return hash;
}
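The block arithmetic above can also be checked in isolation. The sketch below (top-level statements, assuming C# 9 or later) uses a hypothetical helper, PlanChunks, which is not part of any library, to list the (offset, count) pairs such a loop would visit. It is fed the file length from the question and the block size from this answer, and it keeps every offset in a long.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: split fileLength bytes into (offset, count) chunks of at most blockSize.
static List<(long Offset, int Count)> PlanChunks(long fileLength, int blockSize)
{
    var chunks = new List<(long Offset, int Count)>();
    long offset = 0;
    while (offset < fileLength)
    {
        // Math.Min works on longs here; the result never exceeds blockSize, so the cast is safe.
        int count = (int)Math.Min(blockSize, fileLength - offset);
        chunks.Add((offset, count));
        offset += count;
    }
    return chunks;
}

// File length taken from the question, block size from the answer above.
var plan = PlanChunks(11998628352, 1024000000);
var last = plan[plan.Count - 1];

Console.WriteLine(plan.Count);  // 12
Console.WriteLine(last.Offset); // 11264000000
Console.WriteLine(last.Count);  // 734628352
```

Each pair could then drive one Seek and one ReadBytes call; only the offset needs to be a long, while each count stays within int range.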