How to persist LARGE BLOBs (>100MB) in Oracle usin

2019-01-17 00:12发布

I'm struggling to find a way to insert LARGE images (>100MB, mostly TIFF format) in my Oracle database, using BLOB columns.

I've searched thoroughly across the web and even in StackOverflow, without being able to find an answer to this problem.
First of all, the problem...then a short section on the relevant code (java classes/configuration), finally a third section where i show the junit test i've written to test image persistence (i receive the error during my junit test execution)

Edit: i've added a section, at the end of the question, where i describe some tests and analysis using JConsole

The problem

I receive an java.lang.OutOfMemoryError: Java heap space error using hibernate and trying to persist very large images/documents:

java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:133)
at org.hibernate.type.descriptor.java.DataHelper.extractBytes(DataHelper.java:190)
at org.hibernate.type.descriptor.java.BlobTypeDescriptor.unwrap(BlobTypeDescriptor.java:123)
at org.hibernate.type.descriptor.java.BlobTypeDescriptor.unwrap(BlobTypeDescriptor.java:47)
at org.hibernate.type.descriptor.sql.BlobTypeDescriptor$4$1.doBind(BlobTypeDescriptor.java:101)
at org.hibernate.type.descriptor.sql.BasicBinder.bind(BasicBinder.java:91)
at org.hibernate.type.AbstractStandardBasicType.nullSafeSet(AbstractStandardBasicType.java:283)
at org.hibernate.type.AbstractStandardBasicType.nullSafeSet(AbstractStandardBasicType.java:278)
at org.hibernate.type.AbstractSingleColumnStandardBasicType.nullSafeSet(AbstractSingleColumnStandardBasicType.java:89)
at org.hibernate.persister.entity.AbstractEntityPersister.dehydrate(AbstractEntityPersister.java:2184)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2430)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2874)
at org.hibernate.action.EntityInsertAction.execute(EntityInsertAction.java:79)
at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:273)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:265)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:184)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:51)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1216)
at it.paoloyx.blobcrud.manager.DocumentManagerTest.testInsertDocumentVersion(DocumentManagerTest.java:929)

The code (domain objects, repository classes, configuration)

Here is the stack of technologies i'm using (from DB to business logic tier). I use JDK6.

  • Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
  • ojdbc6.jar (for 11.2.0.3 release)
  • Hibernate 4.0.1 Final
  • Spring 3.1.GA RELEASE

I've two domain classes, mapped in a one-to-many fashion. A DocumentVersion has many DocumentData, each of one can represent different binary content for the same DocumentVersion.

Relevant extract from DocumentVersion class:

@Entity
@Table(name = "DOCUMENT_VERSION")
public class DocumentVersion implements Serializable {

private static final long serialVersionUID = 1L;
private Long id;
private Set<DocumentData> otherDocumentContents = new HashSet<DocumentData>(0);


@Id
@GeneratedValue(strategy = GenerationType.TABLE)
@Column(name = "DOV_ID", nullable = false)
public Long getId() {
    return id;
}

@OneToMany
@Cascade({ CascadeType.SAVE_UPDATE })
@JoinColumn(name = "DOD_DOCUMENT_VERSION")
public Set<DocumentData> getOtherDocumentContents() {
    return otherDocumentContents;
}

Relevant extract from DocumentData class:

@Entity
@Table(name = "DOCUMENT_DATA")
public class DocumentData {

private Long id;

/**
 * The binary content (java.sql.Blob)
 */
private Blob binaryContent;

@Id
@GeneratedValue(strategy = GenerationType.TABLE)
@Column(name = "DOD_ID", nullable = false)
public Long getId() {
    return id;
}

@Lob
@Column(name = "DOD_CONTENT")
public Blob getBinaryContent() {
    return binaryContent;
}

Here are my Spring and Hibernate configuration main parameters:

<bean id="sessionFactory"
    class="org.springframework.orm.hibernate4.LocalSessionFactoryBean">
    <property name="dataSource" ref="dataSource" />
    <property name="packagesToScan" value="it.paoloyx.blobcrud.model" />
    <property name="hibernateProperties">
        <props>
            <prop key="hibernate.dialect">org.hibernate.dialect.Oracle10gDialect</prop>
            <prop key="hibernate.hbm2ddl.auto">create</prop>
            <prop key="hibernate.jdbc.batch_size">0</prop>
            <prop key="hibernate.jdbc.use_streams_for_binary">true</prop>
        </props>
    </property>
</bean>
<bean class="org.springframework.orm.hibernate4.HibernateTransactionManager"
    id="transactionManager">
    <property name="sessionFactory" ref="sessionFactory" />
</bean>
<tx:annotation-driven transaction-manager="transactionManager" />

My datasource definition:

<bean class="org.apache.commons.dbcp.BasicDataSource"
    destroy-method="close" id="dataSource">
    <property name="driverClassName" value="${database.driverClassName}" />
    <property name="url" value="${database.url}" />
    <property name="username" value="${database.username}" />
    <property name="password" value="${database.password}" />
    <property name="testOnBorrow" value="true" />
    <property name="testOnReturn" value="true" />
    <property name="testWhileIdle" value="true" />
    <property name="timeBetweenEvictionRunsMillis" value="1800000" />
    <property name="numTestsPerEvictionRun" value="3" />
    <property name="minEvictableIdleTimeMillis" value="1800000" />
    <property name="validationQuery" value="${database.validationQuery}" />
</bean>

where properties are taken from here:

database.driverClassName=oracle.jdbc.OracleDriver
database.url=jdbc:oracle:thin:@localhost:1521:devdb
database.username=blobcrud
database.password=blobcrud
database.validationQuery=SELECT 1 from dual

I've got a service class, that delegates to a repository class:

@Transactional
public class DocumentManagerImpl implements DocumentManager {

DocumentVersionDao documentVersionDao;

public void setDocumentVersionDao(DocumentVersionDao documentVersionDao) {
    this.documentVersionDao = documentVersionDao;
}

and now the relevant extracts from repository classes:

public class DocumentVersionDaoHibernate implements DocumentVersionDao {

@Autowired
private SessionFactory sessionFactory;

@Override
public DocumentVersion saveOrUpdate(DocumentVersion record) {
    this.sessionFactory.getCurrentSession().saveOrUpdate(record);
    return record;
}

The JUnit test that causes the error

If i run the following unit test i've got the aforementioned error (java.lang.OutOfMemoryError: Java heap space):

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = { "classpath*:META-INF/spring/applicationContext*.xml" })
@Transactional
public class DocumentManagerTest {

@Autowired
protected DocumentVersionDao documentVersionDao;

@Autowired
protected SessionFactory sessionFactory;

@Test
public void testInsertDocumentVersion() throws SQLException {

    // Original mock document content
    DocumentData dod = new DocumentData();
    // image.tiff is approx. 120MB
    File veryBigFile = new File("/Users/paoloyx/Desktop/image.tiff");
    try {
        Session session = this.sessionFactory.getCurrentSession();
        InputStream inStream = FileUtils.openInputStream(veryBigFile);
        Blob blob = Hibernate.getLobCreator(session).createBlob(inStream, veryBigFile.length());
        dod.setBinaryContent(blob);
    } catch (IOException e) {
        e.printStackTrace();
        dod.setBinaryContent(null);
    }

    // Save a document version linked to previous document contents
    DocumentVersion dov = new DocumentVersion();
    dov.getOtherDocumentContents().add(dod);
    documentVersionDao.saveOrUpdate(dov);
    this.sessionFactory.getCurrentSession().flush();

    // Clear session, then try retrieval
    this.sessionFactory.getCurrentSession().clear();
    DocumentVersion dbDov = documentVersionDao.findByPK(insertedId);
    Assert.assertNotNull("Il document version ritornato per l'id " + insertedId + " è nullo", dbDov);
    Assert.assertNotNull("Il document version recuperato non ha associato contenuti aggiuntivi", dbDov.getOtherDocumentContents());
    Assert.assertEquals("Il numero di contenuti secondari non corrisponde con quello salvato", 1, dbDov.getOtherDocumentContents().size());
}

The same code works against a PostreSQL 9 installation. The images is being written in the database. Debugging my code, i've been able to find that the PostgreSQL jdbc drivers writes on the database using a buffered output stream....while the Oracle OJDBC driver tries to allocate all at once all the byte[]representing the image.

From the error stack:

java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:133)

Is the error due to this behavior? Can anyone give me some insights on this problem?

Thanks everyone.

Memory Tests with JConsole

Thanks to the suggestions received for my question, i've tried to do some simple tests to show memory usage of my code using two different jdbc drivers, one for PostgreSQL and one for Oracle. Test setup:

  1. The test has been conducted using the JUnit test described in the previous section.
  2. JVM Heap Size has been set to 512MB, using parameter -Xmx512MB
  3. For Oracle database, I've used ojdbc6.jar driver
  4. For Postgres database, I've used 9.0-801.jdbc3 driver (via Maven)

First test, with a file of approx 150MB

In this first test, both Oracle and Postgres passed the test (this is BIG news). The file is sized 1/3 of available JVM heap size. Here the picture of JVM memory consumption:

Testing Oracle, 512MB Heap Size, 150MB file Testing Oracle, 512MB Heap Size, 150MB file

Testing PostgreSQL, 512MB Heap Size, 150MB file Testing PostgreSQL, 512MB Heap Size, 150MB file

Second test, with a file of approx 485MB

In this second test, only Postgres passed the test and Oracle failed it . The file is sized very near the size of the available JVM heap space. Here the picture of JVM memory consumption:

Testing Oracle, 512MB Heap Size, 485MB file Testing Oracle, 512MB Heap Size, 485MB file

Testing PostgreSQL, 512MB Heap Size, 485MB file Testing PostgreSQL, 512MB Heap Size, 485MB file

Analysis of the tests:

It seems that PostgreSQL driver handles memory without surpassing a certain threshold, while Oracle driver behaves very differently.

I can't honestly explain why Oracle jdbc driver leads me to error (the same java.lang.OutOfMemoryError: Java heap space) when using file sized near the available heap space.

Is there anyone that can give me more insights? Thanks a lot for you help :)

5条回答
劫难
2楼-- · 2019-01-17 00:39

Have you tried to define LobHandler and its version for oracle OracleLobHandler on your session factory?

Here is an example:

<bean id="sessionFactory" class="org.springframework.orm.hibernate3.annotation.AnnotationSessionFactoryBean">
    <property name="dataSource" ref="oracleDocDataSource"/>
    <property name="annotatedClasses">
        <list>
        ...
        </list>
    </property>
    <property name="lobHandler">
        <bean class="org.springframework.jdbc.support.lob.OracleLobHandler">
            <property name="nativeJdbcExtractor">
                <bean class="org.springframework.jdbc.support.nativejdbc.WebSphereNativeJdbcExtractor"/>
            </property>
        </bean>
    </property>
</bean>

UPDATE

I've just realized that the speech is about hibernate 4.

查看更多
3楼-- · 2019-01-17 00:49

Personally I store files up to 200MB in Oracle BLOB columns using Hibernate, so I can assure it works. So...

You should try newer version of Oracle JDBC driver. It seems that this behavior of using byte arrays instead of streams was changed a little bit over time. And the drivers are backward compatible. I'm not sure, if that's going to fix your problem, but it works for me. Additionally You should switch to org.hibernate.dialect.Oracle10gDialect - which retires the use of the oracle.jdbc.driver package in favor of oracle.jdbc - and it might also help.

查看更多
叼着烟拽天下
4楼-- · 2019-01-17 00:50

I was having the same problems as you in attempting to map using "blob" type. Here is a link to a post I made on the hibernate site: https://forum.hibernate.org/viewtopic.php?p=2452481#p2452481

Hibernate 3.6.9
Oracle Driver 11.2.0.2.0
Oracle Database 11.2.0.2.0

To fix the problem I used code that had a custom UserType for the Blob, I had the return type be java.sql.Blob.

Here are the key method implementations of this UserType:

public Object nullSafeGet(ResultSet rs, String[] names, Object owner) throws HibernateException, SQLException {

   Blob blob = rs.getBlob(names[0]);
   if (blob == null)
      return null;

   return blob;
}

public void nullSafeSet(PreparedStatement st, Object value, int index)
     throws HibernateException, SQLException {
   if (value == null) {
      st.setNull(index, sqlTypes()[0]);
   }
   else {
      InputStream in = null;
      OutputStream out = null;
      // oracle.sql.BLOB
      BLOB tempBlob = BLOB.createTemporary(st.getConnection(), true, BLOB.DURATION_SESSION);
      tempBlob.open(BLOB.MODE_READWRITE);
      out = tempBlob.getBinaryOutputStream();
      Blob valueAsBlob = (Blob) value;
      in = valueAsBlob.getBinaryStream();
      StreamUtil.toOutput(in, out);
      out.flush();
      StreamUtil.close(out);
      tempBlob.close();
      st.setBlob(index, tempBlob);
      StreamUtil.close(in);
   }
}
查看更多
老娘就宠你
5楼-- · 2019-01-17 00:54

It's not best solution, but you can allow Java to use more memory with -Xmx parametr

Edit: You should try to analyse the problem more into depth, try to use JConsole. It helps you see the memory load.

Even with Postgres you might get neaar the heap size limit, but not cross it because the loaded driver takes a bit less memory.

With default settings your heam size limit is about half your physical memory. Try how much bigger blob you can save into postgres.

查看更多
Viruses.
6楼-- · 2019-01-17 01:05

I just discovered this question when I was having the same problem with Oracle and Hibernate. The issue is in the Hibernate blob handling. It seems to copy the blob to memory depending on the Dialect in use. I guess they do so, because it's required by some databases/drivers. For Oracle though, this behaviour does not seem to be required.

The fix is pretty simple, just create a custom OracleDialect containing this code:

public class Oracle10DialectWithoutInputStreamToInsertBlob extends Oracle10gDialect {
    public boolean useInputStreamToInsertBlob() {
        return false;
    }
}

Next you need to configure your session factory to use this Dialect. I've tested it with the ojdbc6-11.2.0.1.0 driver towards Oracle 11g, and confirmed that this fixes the issue with memory consumption.

If some of you tries this with another Oracle database and/or with a different Oracle driver I would love to hear if it works for you. If it works with several configurations, I'll send a pull request to the Hibernate team.

查看更多
登录 后发表回答