Modifying or deleting a line in a text file in a low-level way?
I am working with a text file in Delphi and I don't want to use the load / save approach of a string list. I intend to keep a file stream open and read and write my data through it, keeping a huge amount of data on the hard drive rather than in memory. I understand the simple concept of writing new lines to a text file and reading them back, but when it comes to changing and deleting them, I cannot find any good resources.
Each line in this file contains a name, an equals sign, and then the data, e.g. SOMEUNIQUENAME=SomeStringValue. I intend to keep the file open for a period of time inside a thread. This thread executes incoming requests to get, set, or delete specific data fields. I use WriteLn and ReadLn in a loop while checking EOF. Below is an example of how I read the data:
FFile: TextFile;
...
function TFileWrapper.ReadData(const Name: String): String;
var
  S: String; //Temporary line to be parsed
  N: String; //Temporary name of field
begin
  Result:= '';
  Reset(FFile);
  while not EOF(FFile) do begin
    ReadLn(FFile, S);
    N:= UpperCase(Copy(S, 1, Pos('=', S)-1));
    if N = UpperCase(Name) then begin
      Delete(S, 1, Pos('=', S));
      Result:= S;
      Break;
    end;
  end;
end;
... and then I fire an event that informs the sender of the result. The requests sit in a queue, which acts as a kind of message pump. The thread simply processes the next request in the queue, over and over, similar to how typical applications work.
I have procedures ready to write and delete these fields, but I don't know what I need to do to actually perform an action on the file.
procedure TFileWrapper.WriteData(const Name, Value: String);
var
  S: String; //Temporary line to be parsed
  N: String; //Temporary name of field
begin
  Reset(FFile);
  while not EOF(FFile) do begin
    ReadLn(FFile, S);
    N:= UpperCase(Copy(S, 1, Pos('=', S)-1));
    if N = UpperCase(Name) then begin
      //How to re-write this line?
      Break;
    end;
  end;
end;
procedure TFileWrapper.DeleteData(const Name: String);
var
  S: String; //Temporary line to be parsed
  N: String; //Temporary name of field
begin
  Reset(FFile);
  while not EOF(FFile) do begin
    ReadLn(FFile, S);
    N:= UpperCase(Copy(S, 1, Pos('=', S)-1));
    if N = UpperCase(Name) then begin
      //How to delete this line?
      Break;
    end;
  end;
end;
In the end, I want to accomplish this without loading the entire file into memory.
I find this question interesting, so I made a small console application.
I used 3 methods:
- TStringList
- StreamReader / StreamWriter
- Text file
All methods are timed and repeated 100 times with a 10KB text file and a 1MB text file. Here's the program:
program Project16;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, StrUtils, Diagnostics, IOUtils;

procedure DeleteLine(StrList: TStringList; SearchPattern: String);
var
  Index : Integer;
begin
  for Index := 0 to StrList.Count-1 do
  begin
    if ContainsText(StrList[Index], SearchPattern) then
    begin
      StrList.Delete(Index);
      Break;
    end;
  end;
end;
procedure DeleteLineWithStringList(Filename : string; SearchPattern : String);
var
  StrList : TStringList;
begin
  StrList := TStringList.Create;
  try
    StrList.LoadFromFile(Filename);
    DeleteLine(StrList, SearchPattern);
    // don't overwrite our input file so we can test
    StrList.SaveToFile(TPath.ChangeExtension(Filename, '.new'));
  finally
    StrList.Free;
  end;
end;
procedure DeleteLineWithStreamReaderAndWriter(Filename : string; SearchPattern : String);
var
  Reader : TStreamReader;
  Writer : TStreamWriter;
  Line : String;
  DoSearch : Boolean;
  DoWrite : Boolean;
begin
  Reader := TStreamReader.Create(Filename);
  Writer := TStreamWriter.Create(TPath.ChangeExtension(Filename, '.new'));
  try
    DoSearch := True;
    DoWrite := True;
    while Reader.Peek >= 0 do
    begin
      Line := Reader.ReadLine;
      if DoSearch then
      begin
        DoSearch := not ContainsText(Line, SearchPattern);
        DoWrite := DoSearch;
      end;
      if DoWrite then
        Writer.WriteLine(Line)
      else
        DoWrite := True;
    end;
  finally
    Reader.Free;
    Writer.Free;
  end;
end;
procedure DeleteLineWithTextFile(Filename : string; SearchPattern : String);
var
  InFile : TextFile;
  OutFile : TextFile;
  Line : String;
  DoSearch : Boolean;
  DoWrite : Boolean;
begin
  AssignFile(InFile, Filename);
  AssignFile(OutFile, TPath.ChangeExtension(Filename, '.new'));
  Reset(InFile);
  Rewrite(OutFile);
  try
    DoSearch := True;
    DoWrite := True;
    while not EOF(InFile) do
    begin
      Readln(InFile, Line);
      if DoSearch then
      begin
        DoSearch := not ContainsText(Line, SearchPattern);
        DoWrite := DoSearch;
      end;
      if DoWrite then
        Writeln(OutFile, Line)
      else
        DoWrite := True;
    end;
  finally
    CloseFile(InFile);
    CloseFile(OutFile);
  end;
end;
procedure TimeDeleteLineWithStreamReaderAndWriter(Iterations : Integer);
var
  Count : Integer;
  Sw : TStopWatch;
begin
  Writeln(Format('Delete line with stream reader/writer - file 10kb, %d iterations', [Iterations]));
  Sw := TStopwatch.StartNew;
  for Count := 1 to Iterations do
    DeleteLineWithStreamReaderAndWriter('c:\temp\text10kb.txt', 'thislinewillbedeleted=');
  Sw.Stop;
  Writeln(Format('Elapsed time : %d milliseconds', [Sw.ElapsedMilliseconds]));
  Writeln(Format('Delete line with stream reader/writer - file 1Mb, %d iterations', [Iterations]));
  Sw := TStopwatch.StartNew;
  for Count := 1 to Iterations do
    DeleteLineWithStreamReaderAndWriter('c:\temp\text1Mb.txt', 'thislinewillbedeleted=');
  Sw.Stop;
  Writeln(Format('Elapsed time : %d milliseconds', [Sw.ElapsedMilliseconds]));
end;
procedure TimeDeleteLineWithStringList(Iterations : Integer);
var
  Count : Integer;
  Sw : TStopWatch;
begin
  Writeln(Format('Delete line with TStringlist - file 10kb, %d iterations', [Iterations]));
  Sw := TStopwatch.StartNew;
  for Count := 1 to Iterations do
    DeleteLineWithStringList('c:\temp\text10kb.txt', 'thislinewillbedeleted=');
  Sw.Stop;
  Writeln(Format('Elapsed time : %d milliseconds', [Sw.ElapsedMilliseconds]));
  Writeln(Format('Delete line with TStringlist - file 1Mb, %d iterations', [Iterations]));
  Sw := TStopwatch.StartNew;
  for Count := 1 to Iterations do
    DeleteLineWithStringList('c:\temp\text1Mb.txt', 'thislinewillbedeleted=');
  Sw.Stop;
  Writeln(Format('Elapsed time : %d milliseconds', [Sw.ElapsedMilliseconds]));
end;
procedure TimeDeleteLineWithTextFile(Iterations : Integer);
var
  Count : Integer;
  Sw : TStopWatch;
begin
  Writeln(Format('Delete line with text file - file 10kb, %d iterations', [Iterations]));
  Sw := TStopwatch.StartNew;
  for Count := 1 to Iterations do
    DeleteLineWithTextFile('c:\temp\text10kb.txt', 'thislinewillbedeleted=');
  Sw.Stop;
  Writeln(Format('Elapsed time : %d milliseconds', [Sw.ElapsedMilliseconds]));
  Writeln(Format('Delete line with text file - file 1Mb, %d iterations', [Iterations]));
  Sw := TStopwatch.StartNew;
  for Count := 1 to Iterations do
    DeleteLineWithTextFile('c:\temp\text1Mb.txt', 'thislinewillbedeleted=');
  Sw.Stop;
  Writeln(Format('Elapsed time : %d milliseconds', [Sw.ElapsedMilliseconds]));
end;
begin
  try
    TimeDeleteLineWithStringList(100);
    TimeDeleteLineWithStreamReaderAndWriter(100);
    TimeDeleteLineWithTextFile(100);
    Writeln('Press ENTER to quit');
    Readln;
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
Output:
Delete line with TStringlist - file 10kb, 100 iterations
Elapsed time : 188 milliseconds
Delete line with TStringlist - file 1Mb, 100 iterations
Elapsed time : 5137 milliseconds
Delete line with stream reader/writer - file 10kb, 100 iterations
Elapsed time : 456 milliseconds
Delete line with stream reader/writer - file 1Mb, 100 iterations
Elapsed time : 22382 milliseconds
Delete line with text file - file 10kb, 100 iterations
Elapsed time : 250 milliseconds
Delete line with text file - file 1Mb, 100 iterations
Elapsed time : 9656 milliseconds
Press ENTER to quit
As you can see, TStringList is the winner here. Since you cannot use TStringList, TextFile is not a bad choice after all ...
PS: this code skips the part where you have to delete the input file and rename the output file to the original filename
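The skipped step could look roughly like the sketch below. It assumes both files are already closed, and the helper name ReplaceWithNewFile is made up for illustration. Note that delete-then-rename is not atomic: if the rename fails, the original is already gone and only the '.new' file remains.

```pascal
program ReplaceOriginal;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: replaces Filename with the already written NewFilename.
// Returns False if the new file is missing or if either step fails.
function ReplaceWithNewFile(const Filename, NewFilename: string): Boolean;
begin
  Result := False;
  // Never delete the original unless the replacement actually exists
  if not FileExists(NewFilename) then
    Exit;
  if FileExists(Filename) and not DeleteFile(Filename) then
    Exit;
  Result := RenameFile(NewFilename, Filename);
end;

begin
  // Paths follow the test program above; adjust as needed
  if ReplaceWithNewFile('c:\temp\text10kb.txt', 'c:\temp\text10kb.new') then
    Writeln('Original file replaced')
  else
    Writeln('Could not replace the original file');
end.
```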
Without loading the entire file into a container such as TStringList, your only option is:
- Open the file for input
- Open a separate file for output
- Start a loop
- Read content line by line from the input file
- Write content line by line to the output file until you reach the line you want to change / delete
- Break out of the loop
- Read the input line from the input file
- Write the modified line (or skip writing the line you want to delete) to the output file
- Start a new loop
- Read the rest of the input content, line by line
- Write the rest of this input to the output file, line by line
- End the loop
- Close the files
So, to answer your specific questions:
if N = UpperCase(Name) then begin
  //How to re-write this line?
  Break;
end;
WriteLn the new output to the second (output) file.
if N = UpperCase(Name) then begin
  //How to delete this line?
  Break;
end;
Just skip the WriteLn that would output the matched line to the second (output) file.
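As a rough sketch, the two loops described above can also be folded into a single pass. The function name CopyWithChange, its parameters, and the file paths are invented for illustration; the NAME=value format is the one from the question.

```pascal
program UpdateLineDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Copies InName to OutName line by line. The first line whose name part
// (the text before '=') matches Name is replaced by Name=Value, or dropped
// when DeleteIt is True. Returns True if a matching line was found.
function CopyWithChange(const InName, OutName, Name, Value: string;
  DeleteIt: Boolean): Boolean;
var
  InFile, OutFile: TextFile;
  Line: string;
  P: Integer;
begin
  Result := False;
  AssignFile(InFile, InName);
  Reset(InFile);
  try
    AssignFile(OutFile, OutName);
    Rewrite(OutFile);
    try
      while not Eof(InFile) do
      begin
        Readln(InFile, Line);
        P := Pos('=', Line);
        if (not Result) and (P > 0) and SameText(Copy(Line, 1, P - 1), Name) then
        begin
          Result := True;
          // Replace: write the new line. Delete: write nothing at all.
          if not DeleteIt then
            Writeln(OutFile, Name, '=', Value);
        end
        else
          Writeln(OutFile, Line);
      end;
    finally
      CloseFile(OutFile);
    end;
  finally
    CloseFile(InFile);
  end;
end;

begin
  if CopyWithChange('c:\temp\data.txt', 'c:\temp\data.new',
    'SOMEUNIQUENAME', 'NewValue', False) then
    Writeln('Line replaced')
  else
    Writeln('Name not found; the output is an unchanged copy');
end.
```

Only one line is held in memory at a time. Afterwards the output file still has to take the place of the input file.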
Your artificial constraint "I don't want to use TStringList" just makes things harder for you when you can simply:
- Load the source file into a TStringList with LoadFromFile
- Find the line you want to change, either by index, iteration, or IndexOf()
- Change the line by modifying it directly or by removing it from the TStringList
- Write all content back to your original file with TStringList.SaveToFile
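For the NAME=value format in the question, those steps might look like the sketch below. It uses the Values property rather than IndexOf(), which is my substitution and not something the steps above prescribe; the file path and the name OBSOLETENAME are placeholders.

```pascal
program StringListDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  SL: TStringList;
begin
  SL := TStringList.Create;
  try
    SL.LoadFromFile('c:\temp\data.txt');
    // Change the value of an existing name, or add the line if it is missing
    SL.Values['SOMEUNIQUENAME'] := 'NewValue';
    // Assigning an empty string to Values removes that line
    SL.Values['OBSOLETENAME'] := '';
    SL.SaveToFile('c:\temp\data.txt');
  finally
    SL.Free;
  end;
end.
```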
The only reasons I have found not to use TStringList for these operations are that the file exceeds the capacity of a TStringList (which has never happened to me) or that the file is text but not really "line oriented" (e.g., EDI files, which are usually one very long single line of text, or XML files that contain no line breaks and are therefore also one very long single line of text). However, even in the case of EDI or XML, it is quite common to load them into a TStringList, convert them to a line-based format (insert line breaks or whatever), and search within the string list.
Basically, you can't do what you want if you treat the files as plain text files. Such files can be read (only from the beginning) or written (either from the very beginning, thus creating a new file, or from the end, appending to an existing file). They are not random access files.
On the other hand, you could define a typed file of strings (in practice, fixed-length short strings): each record in the file is a string, and you can access the file randomly. The problem then becomes knowing which record holds which line.
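A minimal sketch of that idea follows, assuming a record type TEntry with fixed-length short strings (the type, the field sizes, and the file path are my own choices). Because every record has the same size, Seek can jump straight to a record and overwrite it in place.

```pascal
program TypedFileDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Fixed-size record, so every entry occupies the same number of bytes
  TEntry = record
    Name: string[40];
    Value: string[200];
  end;

var
  F: file of TEntry;
  E: TEntry;
begin
  AssignFile(F, 'c:\temp\data.dat');

  // Create a small sample file with two records
  Rewrite(F);
  try
    E.Name := 'FIRSTNAME';
    E.Value := 'FirstValue';
    Write(F, E);
    E.Name := 'SOMEUNIQUENAME';
    E.Value := 'SomeStringValue';
    Write(F, E);
  finally
    CloseFile(F);
  end;

  // Reopen and change the second record in place (record indexes start at 0)
  Reset(F);
  try
    Seek(F, 1);
    Read(F, E);
    E.Value := 'NewValue';
    Seek(F, 1); // Read advanced the position, so go back before writing
    Write(F, E);

    Seek(F, 1);
    Read(F, E);
    Writeln(E.Name, '=', E.Value);
  finally
    CloseFile(F);
  end;
end.
```

Deleting could then be handled by marking a record as unused (for example with an empty Name) and reusing it later. Finding the record for a given name still needs a scan or a separate index.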
The third possibility is using INI files, which are more structured and sound like the best bet for your purposes. Apart from the section headers, they consist of a series of key=value strings that can be accessed by key.
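A short sketch with TIniFile from the IniFiles unit is shown below; the section name 'Data' and the file path are assumptions made for the example. How much of the file gets read or rewritten behind the scenes depends on the implementation, so it is worth measuring with a file of realistic size.

```pascal
program IniDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, IniFiles;

var
  Ini: TIniFile;
begin
  Ini := TIniFile.Create('c:\temp\data.ini');
  try
    // Set: creates the key or overwrites its value
    Ini.WriteString('Data', 'SOMEUNIQUENAME', 'SomeStringValue');
    // Get: the last argument is the default returned when the key is missing
    Writeln(Ini.ReadString('Data', 'SOMEUNIQUENAME', ''));
    // Delete: removes the key from the section
    Ini.DeleteKey('Data', 'SOMEUNIQUENAME');
  finally
    Ini.Free;
  end;
end.
```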