Program hangs after fetching 100 lines containing CLOB
I am fetching one text column (CLOB) from a table in database "remote" H2 (actually on local disk, but using tcp to access it) and after fetching the first 100 rows the program hangs on fetching the next row of the result. If, on the other hand, I am accessing the same database as the embedded database, there is no problem. If I try to display the rows of a table with an H2 console application accessing the database using the Server method (i.e. Tcp), then I get the following error:
IO Exception: "java.io.IOException: org.h2.message.DbException: The object is already closed [90007-164]";
"lob: null table: 14 id: 1" [90031-164] 90031/90031
Here is the program. If I uncomment the call that sets the system property, the program works. I also tried to get the column using a character stream or just a call to getString driven by the constant USE_STREAM. There is no difference in the results:
import java.sql.*;
import java.util.*;
import java.io.*;
public class Jdbc4
{
private static final boolean USE_STREAM = false;
public static void main(String[] args) throws Exception
{
//System.setProperty("h2.serverResultSetFetchSize", "50");
Connection conn = null;
try {
Class.forName("org.h2.Driver").newInstance();
conn = DriverManager.getConnection("jdbc:h2:tcp://localhost/file:C:/h2/db/test/test;IFEXISTS=TRUE", "sa", "");
Statement stmt = conn.createStatement();
String sql = "select select_variables from ipm_queues";
ResultSet rs = stmt.executeQuery(sql);
int count = 0;
while (rs.next()) {
++count;
String s;
if (USE_STREAM) {
Clob clob = rs.getClob(1);
Reader rdr = clob.getCharacterStream();
char[] cbuf = new char[1024];
StringBuffer sb = new StringBuffer();
int len;
while ((len = rdr.read(cbuf, 0, cbuf.length)) != -1)
sb.append(cbuf, 0, len);
rdr.close();
s = sb.toString();
clob.free();
}
else
s = rs.getString(1);
System.out.println(count + ": " + s);
}
}
finally {
if (conn != null)
conn.close();
}
}
}
Here is the DDL for creating the table (you can see that it was originally a MySql table):
CREATE TABLE `ipm_queues` (
`oid` bigint NOT NULL,
`queue_id` varchar(256) NOT NULL,
`store_id` bigint NOT NULL,
`creation_time` datetime NOT NULL,
`status` bigint NOT NULL,
`deleted` bigint NOT NULL,
`last_mod_time` datetime NOT NULL,
`queue_name` varchar(128),
`select_variables` text,
`where_clause` text,
`from_table` varchar(128),
`order_by` varchar(256),
`from_associate_table` varchar(256),
`from_view` varchar(128)
);
ALTER TABLE ipm_queues
ADD CONSTRAINT ipm_queues_pkey PRIMARY KEY (oid);
CREATE UNIQUE INDEX ipm_queues_key_idx ON ipm_queues(queue_id, store_id);
CREATE INDEX ipm_queues_str_idx ON ipm_queues(store_id);
source to share
I believe I understand the reason for the hang. I researched the simplest use case for the h2.serverResultSetFetchSize value 600 which is larger than the 523 rows I know I have. As I mentioned, I can get the first 3 rows (one CLOB column) ok and then I either hang on fetching the 4th row or get an "Object already closed" exception.
It turns out that the actual string containing the first three columns looks rather short in length and the getInputStream method in the org.h2.value.ValueLobDb class already has data and just returns a ByteArrayInputStream built on that data. The fourth row data is still on the server side and therefore an actual RemoteInputStream needs to be created to process the data from the LOB on the server side.
Here's what looks like the problem: Class org.h2.server.TcpServerThread caches these LOBs in, for example, SmallLRUCache. This cache appears to be designed to support only the least recently reserved LOBs !!! The default size of this cache is set by the system property h2.serverCachedObjects, which defaults to 64, while the default fetch size is 100. So even if I hadn't overridden the default h2.serverResultSetFetchSize property, if all my rows had large enough columns requiring cached LOBs, any fetch size> 64 would cause the LOB representing the first row to be flushed out of the cache and I couldn't even get the first row.
The LRU cache seems to be the wrong structure for storing LOBs that are in the active result set. Of course, the default cache size that is smaller than the default fetch size seems less ideal.
source to share