HSQLDB: optimize a 1,000,000-row bulk insert

Published 2019-07-01 23:45

Question:

I need to insert 1,000,000 rows into HSQLDB running on Tomcat as fast as possible, but 64m (the default MaxPermSize on Tomcat) is not enough for this code, and I get an "OutOfMemoryError" (I want to insert with the default settings).

connection.setAutoCommit(false);
PreparedStatement preparedStatement = connection.prepareStatement("INSERT INTO USER (firstName, secondName) VALUES(?,?)");
for (int i = 0; i < 1000000; i++) {
    preparedStatement.setString(1, "firstName");
    preparedStatement.setString(2, "secondName");
    preparedStatement.addBatch(); // every row is queued into one ever-growing batch
}
preparedStatement.executeBatch(); // all 1,000,000 rows are sent in a single batch
connection.commit();

I read this: http://hsqldb.org/doc/2.0/guide/deployment-chapt.html#dec_bulk_operations. I executed "SET FILES LOG FALSE", but it didn't help.

  1. Is there a way to insert 1,000,000 rows with MaxPermSize=64m?
  2. Why does Tomcat eat so much memory running this code? The raw data is only about 1,000,000 * 19 (the combined length of "firstName" and "secondName") * 2 (bytes per character) ≈ 40 MB.
  3. Why do inserts into MEMORY tables go faster than into CACHED tables? Am I doing something wrong? (See the table-type sketch below.)

Answer 1:

  1. Try doing it in smaller sets. It will consume less memory and will probably be more efficient.
  2. Calculating the memory footprint is much harder than that. For example, the string "firstName" will not be stored a million times: the literal is interned, but you still store a million references to it, plus the per-entry overhead of the batch, plus whatever memory your libraries consume, etc.
  3. What do you call "cached tables"?

Try this; at least you will consume less memory:

connection.setAutoCommit(false);
PreparedStatement preparedStatement = connection.prepareStatement("INSERT INTO USER (firstName, secondName) VALUES(?,?)");
for (int i = 0; i < 1000000; i++) {
    preparedStatement.setString(1, "firstName");
    preparedStatement.setString(2, "secondName");
    preparedStatement.addBatch();
    if (i % 1000 == 0) {
        // flush every 1000 rows so the batch never grows unbounded
        preparedStatement.executeBatch();
    }
}
preparedStatement.executeBatch(); // flush the remaining rows
connection.commit();

EDIT: Are you sure it is because of the perm size? Can you post the stack trace?