Remove the last x lines from the stream

I need to read everything except the last x lines of a file into a StreamReader in C#. What's the best way to do this?

Many thanks!



3 answers


If it's a large file, could you simply seek to the end of the file and scan the bytes in reverse for the '\n' character? Both \n and \r\n line endings contain '\n', so counting that byte covers both. I hacked together the following code and tested it against a rather trivial file. Can you try it against the files you have? I know my solution looks long, but I think you will find that it is faster than reading from the beginning and rewriting the entire file.

public static void Truncate(string file, int linesToTruncate)
{
    using (FileStream fs = File.Open(file, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None))
    {
        fs.Position = fs.Length;

        // Both \n and \r\n line endings contain \n, so only \n is counted
        const int BUFFER_SIZE = 2048;

        // Scan backward from the end until the requested number of line breaks
        // has been seen, record the position, then truncate the file there.
        // Note: a '\n' that is the very last byte of the file counts as the
        // first line break, and the line break at the cut point is removed too.
        long currentPosition = fs.Position;
        int linesProcessed = 0;

        byte[] buffer = new byte[BUFFER_SIZE];
        while (linesProcessed < linesToTruncate && currentPosition > 0)
        {
            int bytesRead = FillBuffer(buffer, fs);

            // The buffer now holds the block that ends at currentPosition
            for (int i = bytesRead - 1; i >= 0; i--)
            {
                currentPosition--;
                if (buffer[i] == '\n')
                {
                    linesProcessed++;
                    if (linesProcessed == linesToTruncate)
                        break;
                }
            }
        }

        // Truncate the file
        fs.SetLength(currentPosition);
    }
}

private static int FillBuffer(byte[] buffer, FileStream fs)
{
    if (fs.Position == 0)
        return 0;

    // Calculate how many bytes of the buffer can be filled (remember that we're going in reverse)
    int expectedBytesToRead = (fs.Position < buffer.Length) ? (int)fs.Position : buffer.Length;
    fs.Position -= expectedBytesToRead;

    int bytesRead = 0;
    while (bytesRead < expectedBytesToRead)
    {
        // Request only the bytes still missing, continuing at the offset already filled
        int n = fs.Read(buffer, bytesRead, expectedBytesToRead - bytesRead);
        if (n == 0)
            break; // unexpected end of stream
        bytesRead += n;
    }

    // Reset the position to the start of the block, because reading moved it forward
    fs.Position -= bytesRead;
    return bytesRead;
}

      



Since you only plan to delete the end of the file, it seems wasteful to rewrite everything, especially when the file is large and N is small. Of course, one could argue that if someone wants to delete all of the lines, going from beginning to end is more efficient.
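The backward scan can also be used only to locate the cut point, leaving the choice between truncating and reading to the caller. The following is a minimal sketch, not the answer's own code: the names `TailCut` and `FindCutOffset` are hypothetical, and it assumes '\n'-terminated lines in an encoding where the byte 0x0A only ever means a line feed (ASCII or UTF-8, but not UTF-16). Unlike the code above, it treats a '\n' that is the final byte of the stream as the terminator of the last line rather than as a separate line break.

```csharp
using System;
using System.IO;

public static class TailCut
{
    // Hypothetical helper: returns the byte offset at which the last `lines`
    // lines begin, i.e. the length of the content that should be kept.
    public static long FindCutOffset(Stream stream, int lines)
    {
        if (lines <= 0)
            return stream.Length;

        byte[] buffer = new byte[4096];
        long blockStart = stream.Length;
        int lineBreaksSeen = 0;

        while (blockStart > 0)
        {
            // Step one block backward and read it completely
            int blockSize = (int)Math.Min(buffer.Length, blockStart);
            blockStart -= blockSize;
            stream.Position = blockStart;

            int filled = 0;
            while (filled < blockSize)
            {
                int n = stream.Read(buffer, filled, blockSize - filled);
                if (n == 0)
                    throw new EndOfStreamException("Stream ended while scanning backward.");
                filled += n;
            }

            // Walk the block from its last byte to its first
            for (int i = blockSize - 1; i >= 0; i--)
            {
                long offset = blockStart + i;
                bool isFinalByte = offset == stream.Length - 1;
                if (buffer[i] == (byte)'\n' && !isFinalByte && ++lineBreaksSeen == lines)
                    return offset + 1; // keep the line break that ends the last kept line
            }
        }

        // The stream holds `lines` lines or fewer, so nothing is kept
        return 0;
    }
}
```

With a FileStream opened for writing, the returned offset could be passed to `SetLength` to truncate the file; reading only the first `offset` bytes instead would leave the file untouched.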



You are not really reading INTO a StreamReader. In fact, you don't need a StreamReader at all for the pattern you are asking about. System.IO.File has a useful static ReadLines method that you can use instead:

IEnumerable<string> allBut = File.ReadLines(path).Reverse().Skip(5).Reverse();

      



Previous (incorrect) version, restored in response to the comment thread:

List<string> allLines = File.ReadLines(path).ToList();
IEnumerable<string> allBut = allLines.Take(allLines.Count - 5);
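Both snippets above hold every line of the file in memory before producing a result (`Reverse()` and `ToList()` each buffer the whole sequence). As a rough sketch of an alternative, the extension method below buffers only the last N lines in a queue and yields the rest lazily. The names `LineExtensions` and `SkipLastItems` are hypothetical; newer frameworks (.NET Core 2.0 and later) also ship a built-in `Enumerable.SkipLast` that serves the same purpose.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LineExtensions
{
    // Hypothetical helper: lazily yields every item except the last `count` items.
    public static IEnumerable<T> SkipLastItems<T>(this IEnumerable<T> source, int count)
    {
        var pending = new Queue<T>();
        foreach (T item in source)
        {
            pending.Enqueue(item);

            // Once more than `count` items are waiting, the oldest one
            // cannot be among the last `count`, so it is safe to emit.
            if (pending.Count > count)
                yield return pending.Dequeue();
        }
        // Whatever remains in the queue are the last `count` items; drop them.
    }
}
```

Assuming `path` names an existing text file, usage would look like `IEnumerable<string> allBut = File.ReadLines(path).SkipLastItems(5);`, which reads the file line by line as the result is enumerated.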

      



Since you are referring to lines in the file, I will assume it is a text file. If you just want the lines, you can read them into a string array like this:

string[] lines = File.ReadAllLines(@"C:\test.txt");

      

Or if you really need to work with a StreamReader:

using (StreamReader reader = new StreamReader(@"C:\test.txt"))
{
    while (!reader.EndOfStream)
    {
        Console.WriteLine(reader.ReadLine());
    }
}
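Neither snippet above drops the last x lines yet. One way to combine them, sketched here under assumptions rather than taken from the answer, is to read all lines, keep all but the last `skip`, and expose the result through a StreamReader over an in-memory stream. The names `ReaderFactory` and `OpenWithoutLastLines` are hypothetical, the kept text is re-encoded as UTF-8 without a byte order mark, and the whole file is held in memory, so this suits small files only.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReaderFactory
{
    // Hypothetical helper: returns a StreamReader over all but the last `skip` lines.
    public static StreamReader OpenWithoutLastLines(string path, int skip)
    {
        string[] lines = File.ReadAllLines(path);
        int keep = Math.Max(0, lines.Length - Math.Max(0, skip));

        var memory = new MemoryStream();

        // leaveOpen: true keeps the MemoryStream usable after the writer is disposed
        using (var writer = new StreamWriter(memory, new UTF8Encoding(false), 1024, leaveOpen: true))
        {
            for (int i = 0; i < keep; i++)
                writer.WriteLine(lines[i]);
        }

        // Rewind so the reader starts at the first kept line
        memory.Position = 0;
        return new StreamReader(memory, Encoding.UTF8);
    }
}
```

The caller owns the returned reader and should dispose it, for example with a `using` block around `ReaderFactory.OpenWithoutLastLines(path, 5)`.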

      
