I am using Hibernate Search and want to index my embedded H2 database.
I call this code:

EntityManager em = articleDao.getEntityManager();
FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(em);
try {
    fullTextEntityManager.createIndexer().progressMonitor(new CustomMassIndexerProcessMonitor()).startAndWait();
} catch (InterruptedException e) {
    e.printStackTrace();
}  

After a few minutes of indexing, it throws the following exception:

2013-09-04 09:01:41 ERROR LogErrorHandler.handleException():83 - HSEARCH000058: HSEARCH000116: Unexpected error during MassIndexer operation
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Unknown Source)
    at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
    at java.lang.AbstractStringBuilder.append(Unknown Source)
    at java.lang.StringBuffer.append(Unknown Source)
    at java.io.StringWriter.write(Unknown Source)
    at org.h2.util.IOUtils.copyAndCloseInput(IOUtils.java:201)
    at org.h2.util.IOUtils.readStringAndClose(IOUtils.java:301)
    at org.h2.value.ValueLobDb.getString(ValueLobDb.java:226)
    at org.h2.jdbc.JdbcResultSet.getString(JdbcResultSet.java:296)
    at org.hibernate.type.descriptor.sql.VarcharTypeDescriptor$2.doExtract(VarcharTypeDescriptor.java:66)
    at org.hibernate.type.descriptor.sql.BasicExtractor.extract(BasicExtractor.java:64)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:261)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:257)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:247)
    at org.hibernate.type.AbstractStandardBasicType.hydrate(AbstractStandardBasicType.java:332)
    at org.hibernate.persister.entity.AbstractEntityPersister.hydrate(AbstractEntityPersister.java:2912)
    at org.hibernate.loader.Loader.loadFromResultSet(Loader.java:1673)
    at org.hibernate.loader.Loader.instanceNotYetLoaded(Loader.java:1605)
    at org.hibernate.loader.Loader.getRow(Loader.java:1505)
    at org.hibernate.loader.Loader.getRowFromResultSet(Loader.java:713)
    at org.hibernate.loader.Loader.processResultSet(Loader.java:943)
    at org.hibernate.loader.Loader.doQuery(Loader.java:911)
    at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
    at org.hibernate.loader.Loader.doList(Loader.java:2526)
    at org.hibernate.loader.Loader.doList(Loader.java:2512)
    at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
    at org.hibernate.loader.Loader.list(Loader.java:2337)
    at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
    at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
    at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)
    at org.hibernate.search.batchindexing.impl.IdentifierConsumerEntityProducer.loadList(IdentifierConsumerEntityProducer.java:151)
Hibernate Search: entityloader-2, CustomMassIndexerProcessMonitor entitiesLoaded(10)
Hibernate Search: collectionsloader-2, CustomMassIndexerProcessMonitor documentsAdded(1)
Hibernate Search: collectionsloader-2, CustomMassIndexerProcessMonitor documentsBuilt(1)
Hibernate Search: collectionsloader-3, CustomMassIndexerProcessMonitor documentsAdded(1)
Hibernate Search: collectionsloader-3, CustomMassIndexerProcessMonitor documentsBuilt(1)
2013-09-04 09:01:47 ERROR LogErrorHandler.handleException():83 - HSEARCH000058: HSEARCH000116: Unexpected error during MassIndexer operation
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Unknown Source)
    at java.lang.String.<init>(Unknown Source)
    at java.lang.StringBuffer.toString(Unknown Source)
    at java.io.StringWriter.toString(Unknown Source)
    at org.h2.util.IOUtils.readStringAndClose(IOUtils.java:302)
    at org.h2.value.ValueLobDb.getString(ValueLobDb.java:226)
    at org.h2.jdbc.JdbcResultSet.getString(JdbcResultSet.java:296)
    at org.hibernate.type.descriptor.sql.VarcharTypeDescriptor$2.doExtract(VarcharTypeDescriptor.java:66)
    at org.hibernate.type.descriptor.sql.BasicExtractor.extract(BasicExtractor.java:64)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:261)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:257)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:247)
    at org.hibernate.type.AbstractStandardBasicType.hydrate(AbstractStandardBasicType.java:332)
    at org.hibernate.persister.entity.AbstractEntityPersister.hydrate(AbstractEntityPersister.java:2912)
    at org.hibernate.loader.Loader.loadFromResultSet(Loader.java:1673)
    at org.hibernate.loader.Loader.instanceNotYetLoaded(Loader.java:1605)
    at org.hibernate.loader.Loader.getRow(Loader.java:1505)
    at org.hibernate.loader.Loader.getRowFromResultSet(Loader.java:713)
    at org.hibernate.loader.Loader.processResultSet(Loader.java:943)
    at org.hibernate.loader.Loader.doQuery(Loader.java:911)
    at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
    at org.hibernate.loader.Loader.doList(Loader.java:2526)
    at org.hibernate.loader.Loader.doList(Loader.java:2512)
    at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
    at org.hibernate.loader.Loader.list(Loader.java:2337)
    at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
    at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
    at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)
    at org.hibernate.search.batchindexing.impl.IdentifierConsumerEntityProducer.loadList(IdentifierConsumerEntityProducer.java:151)
    at org.hibernate.search.batchindexing.impl.IdentifierConsumerEntityProducer.loadAllFromQueue(IdentifierConsumerEntityProducer.java:117)
    at org.hibernate.search.batchindexing.impl.IdentifierConsumerEntityProducer.run(IdentifierConsumerEntityProducer.java:94)
    at org.hibernate.search.batchindexing.impl.OptionallyWrapInJTATransaction.run(OptionallyWrapInJTATransaction.java:132)

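It seems that one of the H2 utility classes throws this exception while trying to read from the database. I tried increasing the heap size with '-Xms1024m -Xmx2048m', but that did not help :(
The scenario is as follows. Every entry in my H2 database has a field of type CLOB. If I write small content into that field, everything works and no error is thrown. But if those fields hold large content (900 KB each), the error is thrown during indexing.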

I am using the following JARs:
hibernate-entitymanager 4.2.4.Final
h2 1.3.173
hibernate-search 4.4.0.Alpha1

Here is my persistence unit configuration:

<persistence-unit name="hibernateSearchH2TestPersistenceUnit" transaction-type="RESOURCE_LOCAL">
    <provider>org.hibernate.ejb.HibernatePersistence</provider>

    <mapping-file>META-INF/queriesForTest.xml</mapping-file>

    <class>com.kaidex.db.entity.DocStatus</class>
    <class>com.kaidex.db.entity.DocType</class>
    <class>com.kaidex.db.entity.Article</class>
    <class>com.kaidex.db.entity.Issuer</class>
    <class>com.kaidex.db.entity.PublishingSource</class>

    <properties>
        <property name="hibernate.connection.url" value="jdbc:h2:D:\\kaidextestdb;CIPHER=XTEA"/>

        <property name="hibernate.dialect" value="org.hibernate.dialect.H2Dialect"/>
        <property name="hibernate.connection.driver_class" value="org.h2.Driver"/>

        <property name="hibernate.connection.username" value="sa"/>
        <property name="hibernate.connection.password" value="filepass userpass"/>

        <property name="hibernate.format_sql" value="true"/>
        <property name="hibernate.show_sql" value="false" />
        <property name="hibernate.hbm2ddl.auto" value="update" />


        <property name="hibernate.search.default.directory_provider" value="filesystem"/> 
        <property name="hibernate.search.default.indexBase" value="D:\lucene"/>
        <property name="hibernate.search.lucene_version" value="LUCENE_36"/>
    </properties>
</persistence-unit>

Update: added the entity configuration:

@Entity(name="Article")
@Table(name="Article", schema="Kaidexdb")
@Indexed
public class Article {
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    private long id;
    ... 
    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    @Column(columnDefinition="CLOB")
    private String contentRo;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    @Column(columnDefinition="CLOB")
    private String contentRu;

    @IndexedEmbedded
    @ManyToOne
    @JoinColumn(name="docType_id", nullable=false)  
    private DocType docType;

    @IndexedEmbedded
    @ManyToOne
    @JoinColumn(name="docStatus_id", nullable=false)    
    private DocStatus docStatus;

    @IndexedEmbedded
    @ManyToOne
    @JoinColumn(name="issuer_id", nullable=false)
    private Issuer issuer;

    @IndexedEmbedded
    @ManyToOne
    @JoinColumn(name="ps_id", nullable=false)
    private PublishingSource publishingSource;
...


@Entity(name="DocStatus")
@Table(name="DocStatus", schema="Kaidexdb")
@Indexed
public class DocStatus {
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    private Long id;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRo;
    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRu;

    @OneToMany(mappedBy="docStatus", targetEntity=Article.class)
    private List<Article> articles; 
...

@Entity(name="DocType")
@Table(name="DocType", schema="Kaidexdb")
@Indexed
public class DocType {

    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    private long id;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    @Column(unique=true)
    private String shortName;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRo;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRu;

    @OneToMany(mappedBy="docType", targetEntity=Article.class)
    private List<Article> articles;
...


@Entity(name="Issuer")
@Table(name="Issuer", schema="Kaidexdb")
@Indexed
public class Issuer {
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    private long id;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String shortNameRo; 

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRo;  

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRu;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    @Column(name="parent_id")
    private long parentId; 

    @OneToMany(mappedBy="issuer", targetEntity=Article.class)
    private List<Article> articles;
...


@Entity
@Table(name="PublishingSource", schema="Kaidexdb")
@Indexed
public class PublishingSource {
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    private long id;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRo;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRu;


    @OneToMany(mappedBy="publishingSource", targetEntity=Article.class)
    private List<Article> articles; 

Can someone help me solve this problem?
Maybe I need some specific configuration for my embedded H2 database to tell H2 that I use large CLOB fields?

Thanks in advance.

2 Answers

I have partially solved this problem. I reduced the number of objects Hibernate Search loads per query:

EntityManager em = articleDao.getEntityManager();
FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(em);
fullTextEntityManager.createIndexer().batchSizeToLoadObjects(2).startAndWait();

It would be great if the H2 developers could fix this in the next release.
I tested the same code with HSQLDB and Apache Derby, and those database drivers did not throw any exceptions.
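
As a rough illustration of why a smaller batch size helps, the sketch below estimates the heap held by CLOB strings that are loaded at the same time. It is a back-of-the-envelope lower bound, not a measurement. The assumptions are: a batch size of 10 before the change (consistent with the `entitiesLoaded(10)` line in the log), two loader threads, two CLOB fields per `Article`, "900 KB" read as 900 * 1024 characters, and 2 bytes per character in a Java `String`. The class and method names are hypothetical.

```java
class MassIndexerHeapEstimate {

    /**
     * Lower-bound estimate of the bytes occupied by CLOB strings that are
     * alive at once: every loader thread holds one full batch of entities,
     * and every entity holds its CLOB fields as fully materialized Strings.
     */
    static long estimateClobBytes(int batchSize, int loaderThreads,
                                  int clobFieldsPerEntity, long charsPerClob) {
        final long bytesPerChar = 2L; // assumption: UTF-16 backed String
        return (long) batchSize * loaderThreads * clobFieldsPerEntity
                * charsPerClob * bytesPerChar;
    }

    public static void main(String[] args) {
        long charsPerClob = 900L * 1024; // assumption: "900 KB" counted as characters

        // Assumed starting point: batches of 10 objects, 2 loader threads.
        System.out.println("batch 10: " + estimateClobBytes(10, 2, 2, charsPerClob) + " bytes");

        // After batchSizeToLoadObjects(2): same threads, smaller batches.
        System.out.println("batch  2: " + estimateClobBytes(2, 2, 2, charsPerClob) + " bytes");
    }
}
```

The estimate scales linearly with the batch size, so going from 10 to 2 cuts it by a factor of five. Real peak usage is higher, because the stack trace shows each value being copied through a `StringWriter` and `StringBuffer` before the final `String` exists.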

answered 2013-09-06T13:00:39.563

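According to the stack trace, Hibernate Search tries to load the CLOB from the database as a String (using java.sql.ResultSet.getString). H2 therefore has to load the CLOB completely. In addition, Hibernate Search seems to keep a list of such large strings in memory: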

at org.h2.value.ValueLobDb.getString(ValueLobDb.java:226)
at org.h2.jdbc.JdbcResultSet.getString(JdbcResultSet.java:296)
at org.hibernate.type.descriptor.sql.VarcharTypeDescriptor$2.doExtract(VarcharTypeDescriptor.java:66)
...
at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)
at org.hibernate.search.batchindexing.impl.IdentifierConsumerEntityProducer.loadList(IdentifierConsumerEntityProducer.java:151)

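So it looks like a problem in Hibernate Search. Memory issues related to Hibernate Search have been reported before (I know that was an older version), but I would first try a different version of Hibernate Search, especially since you are using an Alpha release (4.4.0.Alpha1). It may be a known issue.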
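
For contrast with `ResultSet.getString`, plain JDBC also offers `ResultSet.getCharacterStream`, which returns a `java.io.Reader` so that a large value can be consumed in fixed-size chunks instead of as one huge String. The sketch below shows only that reading pattern. It is not something Hibernate Search does here, and the class name, method name, and the `StringReader` stand-in for a real result set are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

class ClobStreamingSketch {

    /**
     * Reads the stream through a fixed-size buffer, hands each chunk to the
     * handler, and returns the total number of characters read. Only one
     * buffer's worth of text is held by this method at a time.
     */
    static long forEachChunk(Reader source, int bufferSize, Consumer<String> chunkHandler) {
        char[] buffer = new char[bufferSize];
        long total = 0;
        try (Reader reader = source) {
            int read;
            while ((read = reader.read(buffer, 0, buffer.length)) != -1) {
                chunkHandler.accept(new String(buffer, 0, read));
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for a Reader obtained via resultSet.getCharacterStream("contentRo").
        Reader clob = new StringReader("a large CLOB value would be streamed here");
        long chars = forEachChunk(clob, 8, chunk -> System.out.println("chunk: " + chunk));
        System.out.println("total characters: " + chars);
    }
}
```

With a mapped `String` field, Hibernate still needs the full value, so this pattern only applies when the column is read through JDBC directly or mapped to a streaming type.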

answered 2013-09-04T09:48:36.040