String loses data when assigned to a TStringList
I have this method,
var
  s: TStringList;
  fVar: string;
begin
  s := TStringList.Create;
  fVar := ZCompressStr('text');
  ShowMessage(IntToStr(Length(fVar) * SizeOf(Char)));
  // 24
  s.Text := fVar;
  ShowMessage(IntToStr(Length(s.Text) * SizeOf(Char)));
  // 18
end;
ZCompressStr is from http://www.base2ti.com/zlib.htm , with line 121 changed from {$ifndef UNICODE} to {$ifdef UNICODE} so that it compiles.
Anyway, I can call ZDecompressStr if I use the fVar variable. However, as soon as I move the data into the string list or a memo, those 6 bytes seem to get lost. If I call ZDecompressStr on s.Text, it fails with a buffer error.
There is no reason for you to have changed line 121 of ZLibEx.pas; it is correct for all versions of Delphi, including Delphi 2009. The UNICODE symbol should only be defined for Delphi 2009, and when it is, the type definitions for RawByteString, UnicodeString, and UnicodeChar should all be skipped because they are already intrinsic types in the language.
ZCompressStr produces a string that can contain non-printable characters, including null bytes. It stores its result in a RawByteString, which Delphi treats specially.
TStringList, like everything else in Delphi 2009, uses Unicode. Its Text property is of type UnicodeString. Whenever you assign any non-UnicodeString value to a UnicodeString, you get a conversion like the one performed by the MultiByteToWideChar API function. Even RawByteString is included in that rule. If you have not assigned a code-page-specific string value to a RawByteString, it will have code page 0, which is CP_ACP, the default code page for your system.
If the string does not actually contain characters encoded according to the system code page, then any conversion is asking for trouble: garbage in, garbage out. In particular, there is no guarantee that you will get back the same number of characters.
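To make the point about character counts concrete, here is a minimal sketch. It uses UTF-8 rather than CP_ACP as the source code page, purely because the result then does not depend on the machine's locale; the helper name CharCountAfterConversion is hypothetical and not part of ZLibEx or the RTL. Two bytes that form one UTF-8 sequence come out as a single UnicodeString character, which is the same kind of length change that hits compressed bytes interpreted through the system code page.

```pascal
program CodePageConversion;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only relevant when building with Free Pascal
{$APPTYPE CONSOLE}

// Hypothetical helper: converts code-page-tagged bytes to UnicodeString
// and reports how many UTF-16 code units the conversion produced.
function CharCountAfterConversion(const Bytes: UTF8String): Integer;
var
  Wide: UnicodeString;
begin
  // The same conversion happens implicitly on assignment;
  // the cast only makes it visible.
  Wide := UnicodeString(Bytes);
  Result := Length(Wide);
end;

var
  Encoded: UTF8String;
begin
  // $C3 $A9 is the UTF-8 encoding of U+00E9 (e with acute accent).
  SetLength(Encoded, 2);
  Encoded[1] := AnsiChar($C3);
  Encoded[2] := AnsiChar($A9);

  Writeln('Bytes in:  ', Length(Encoded));
  Writeln('Chars out: ', CharCountAfterConversion(Encoded));
end.
```

With arbitrary compressed bytes and the system code page, the outcome varies from machine to machine, which is why the conversion cannot be relied on to preserve the data.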
As Smok1 mentioned, TStringList.Text is a property. It has a setter method that splits the given string into separate lines. When you read the property, it concatenates all those lines into a single string again. When the property is set, TStrings.SetTextStr (in Classes.pas, if you're interested) splits the string at any #0, #10, or #13. That is, null characters, line feeds, and carriage returns. When concatenating the lines again, it uses its LineBreak property, which is initialized from the global variable sLineBreak. The line break is also placed after the last line, so every line ends with LineBreak. Therefore, the conversion is not guaranteed to round-trip.
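The round-trip problem can be seen with ordinary text, without any compression involved. The following is a small sketch; RoundTripThroughText is a hypothetical helper name introduced only for this illustration. It assigns a string containing a bare line feed to Text and reads it back.

```pascal
program TextRoundTrip;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only relevant when building with Free Pascal
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: writes S into TStringList.Text and reads it back.
function RoundTripThroughText(const S: string): string;
var
  List: TStringList;
begin
  List := TStringList.Create;
  try
    List.Text := S;      // setter splits S into lines
    Result := List.Text; // getter joins the lines, each followed by LineBreak
  finally
    List.Free;
  end;
end;

var
  Original, ReadBack: string;
begin
  Original := 'ab' + #10 + 'cd';
  ReadBack := RoundTripThroughText(Original);

  Writeln('Original length:   ', Length(Original));
  Writeln('Round-trip length: ', Length(ReadBack));
  Writeln('Identical:         ', Original = ReadBack);
end.
```

The string read back is 'ab' + sLineBreak + 'cd' + sLineBreak, so on Windows, where sLineBreak is #13#10, the five-character input grows to eight characters. Compressed data that happens to contain byte values 10 or 13 is altered in the same way.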
So, there are two things to be learned from this:
- Do not treat compressed data as text.
- Don't use TStrings descendants to store data that you don't want treated as multiple lines.
Another good tip: don't use string as a general-purpose data-storage type. Use it for actual text only. Choose TBytes or TMemoryStream to store arbitrary binary data. Using your example, you could compress a string like this:
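var
  ss: TStream;
  ms: TMemoryStream;
begin
  // Source stream holding the text to compress
  ss := TStringStream.Create('text');
  try
    // Destination stream; the compressed bytes stay binary and never pass through a string
    ms := TMemoryStream.Create;
    try
      ShowMessage(IntToStr(ss.Size)); // uncompressed size in bytes
      ZCompressStream(ss, ms);
      ShowMessage(IntToStr(ms.Size)); // compressed size in bytes
    finally
      ms.Free;
    end;
  finally
    ss.Free;
  end;
end;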
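As a quick check of the advice about TMemoryStream, the sketch below stores the byte values that cause trouble in TStrings (0, 10, and 13, plus 255) in a stream and reads them back. It deliberately leaves ZLibEx out so that it compiles with the RTL alone; BytesSurviveStream is a hypothetical helper name used only for this illustration.

```pascal
program BinaryInStream;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only relevant when building with Free Pascal
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: writes Data to a TMemoryStream, reads it back,
// and returns True if every byte came back unchanged.
function BytesSurviveStream(const Data: TBytes): Boolean;
var
  Stream: TMemoryStream;
  ReadBack: TBytes;
begin
  Stream := TMemoryStream.Create;
  try
    if Length(Data) > 0 then
      Stream.WriteBuffer(Data[0], Length(Data));

    Result := Stream.Size = Length(Data);
    if Result and (Length(Data) > 0) then
    begin
      Stream.Position := 0; // rewind before reading
      SetLength(ReadBack, Length(Data));
      Stream.ReadBuffer(ReadBack[0], Length(ReadBack));
      Result := CompareMem(@ReadBack[0], @Data[0], Length(Data));
    end;
  finally
    Stream.Free;
  end;
end;

var
  Sample: TBytes;
begin
  // Null, line feed, carriage return, and a high byte
  SetLength(Sample, 4);
  Sample[0] := 0;
  Sample[1] := 10;
  Sample[2] := 13;
  Sample[3] := 255;

  Writeln('Bytes preserved: ', BytesSurviveStream(Sample));
end.
```

A stream does no code-page conversion and no line splitting, so what goes in is what comes out. Remember to reset Position to 0 before handing a stream you have just written to any routine that reads from it.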