Java - GC big string
I have a method that reads and parses an extremely long XML file. The XML file is read into a string, which is then parsed by another class. However, this makes Java use a lot of memory (~500 MB). The program normally runs at about 30 MB, but when parse() is called it grows to 500 MB. After the parse() operation finishes, memory usage does not drop back down to 30 MB; instead, it stays at 500 MB.
I tried setting s = null and calling System.gc(), but memory usage still stays at 500 MB.
public void parse() {
    try {
        System.out.println("parsing data...");
        String path = dir + "/data.xml";
        InputStream i = new FileInputStream(path);
        BufferedReader reader = new BufferedReader(new InputStreamReader(i));
        String line;
        String s = "";
        while ((line = reader.readLine()) != null) {
            s += line + "\n";
        }
        ... parse ...
    } catch (FileNotFoundException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
Any ideas?
Thanks.
Resolving the memory leak problem
You must close the BufferedReader at the end to close the stream and free the associated system resources. You can close both the InputStream and the BufferedReader; however, closing the BufferedReader also closes the underlying stream.
It is generally best to close it in a finally block:
finally {
    i.close();
    reader.close();
}
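Applied to a full example, the classic pre-Java-7 pattern looks roughly like this (a sketch; the null check is my addition, to avoid a NullPointerException if construction fails, and a StringReader stands in for the question's file stream so the snippet is self-contained):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class CloseDemo {
    public static void main(String[] args) throws IOException {
        BufferedReader reader = null;
        try {
            // StringReader used instead of a FileInputStream for a runnable sketch.
            reader = new BufferedReader(new StringReader("one\ntwo"));
            System.out.println(reader.readLine());
        } finally {
            // Closing the BufferedReader also closes the wrapped reader/stream.
            if (reader != null) {
                reader.close();
            }
        }
    }
}
```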
An improved approach: try-with-resources
try (BufferedReader br = new BufferedReader(new FileReader(path))) {
    return br.readLine();
}
Bonus note
Use StringBuilder instead of string concatenation
String does not support in-place appending. Each append/concatenation creates a new String object and returns it. This is because String is immutable: it cannot change its internal state.
StringBuilder, on the other hand, is mutable. When you call append(), it modifies an internal char array rather than creating a new string object.
Thus, StringBuilder is more memory-efficient when you want to concatenate many strings.
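A minimal standalone sketch of the difference (not the asker's code): the String loop allocates a brand-new object on every +=, while the StringBuilder loop grows one internal buffer and only materializes a String at toString().

```java
public class AppendDemo {
    public static void main(String[] args) {
        // Each += builds a new String and copies all previous characters.
        String s = "";
        for (int i = 0; i < 3; i++) {
            s += "line" + i + "\n";
        }

        // append() mutates an internal char buffer; no intermediate Strings.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 3; i++) {
            sb.append("line").append(i).append('\n');
        }

        // Same final content, far fewer allocations along the way.
        System.out.println(s.equals(sb.toString())); // prints "true"
    }
}
```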
Just a side note: a try-with-resources block will help you a lot with IO objects like these readers.
try (InputStream i = new FileInputStream(path);
     BufferedReader reader = new BufferedReader(new InputStreamReader(i))) {
    // your reading here
}
This ensures that these objects are disposed of by calling close() on them, regardless of how the block exits (success, exception, ...). Closing these objects can also help free up memory.
What is probably causing the significant slowdown, and likely the bloat in memory usage, is your string concatenation. The call s += line + "\n" is fine for a single concatenation, but the + operator actually has to create a new String instance each time and copy the characters from the strings being concatenated. The StringBuilder class was designed for exactly this purpose. :)
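Putting both suggestions together, the read loop could look roughly like this (a sketch; it writes a small temp file first so it is runnable on its own, whereas the question reads dir + "/data.xml"):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadDemo {
    public static void main(String[] args) throws IOException {
        // Create a small sample file so this sketch is self-contained.
        Path path = Files.createTempFile("data", ".xml");
        Files.write(path, "<root>\n<item/>\n</root>\n".getBytes("UTF-8"));

        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader (and the underlying stream)
        // no matter how the block exits.
        try (BufferedReader reader = new BufferedReader(new FileReader(path.toFile()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n'); // one growing buffer, no per-line String copies
            }
        }
        String s = sb.toString();
        System.out.println(s.startsWith("<root>")); // prints "true"
    }
}
```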
The 500 MB is caused by the parsing itself, so it has nothing to do with the string or the BufferedReader. It is the DOM of the parsed XML. Release your reference to it and your memory usage will go back down.
But why read the entire file into a string at all? That is a waste of time and space. Just parse the input directly from the file.
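For example, the JDK's built-in DOM parser can be handed the file directly, skipping the intermediate String copy of the whole document (a sketch using javax.xml.parsers; the temp file stands in for the question's data.xml):

```java
import java.io.File;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ParseDemo {
    public static void main(String[] args) throws Exception {
        // Sample file so the sketch is runnable; in the question this is dir + "/data.xml".
        File xml = Files.createTempFile("data", ".xml").toFile();
        Files.write(xml.toPath(), "<root><item/></root>".getBytes("UTF-8"));

        // Parse straight from the File: the parser streams the bytes itself.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(xml);
        System.out.println(doc.getDocumentElement().getTagName()); // prints "root"
    }
}
```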
You should keep in mind that calling System.gc() will not necessarily run garbage collection; it merely suggests it to the GC, which is free to ignore the request. It is also better to use a StringBuilder to reduce the number of strings created in memory, because it only creates a String when toString() is called on it.