lucene - Hibernate Search: automatically updating a composite field

Question

I have a Person entity with several name-related properties (firstName, lastName, title). All name-related properties should be stored in a single Lucene index field, "fullName".

@Indexed
@Entity
public class Person {
   ...
   private String firstName;
   private String lastName;
   private String title;

   @Field(store=Store.NO, index=Index.TOKENIZED)
   public String getFullName() {
      return firstName + " " + lastName + " " + title;
   }
}

The only problem I face is getting fullName updated in the index automatically when one of the name-related properties is updated.

Is there a way to tell Hibernate Search that fullName is a composite field and must be updated whenever one of its parts changes? Maybe something like this?

@ComposedOf({"firstName", "lastName", "title"})

Thanks!

score 3 · Accepted Answer

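There are several solutions to your problem, and which one you choose is probably a matter of personal preference (you can also combine them):

1. Check the property _hibernate.search.enable_dirty_check_ and make sure it is set to false in your case. The default is true. For more information, see the online documentation: http://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/
2. Add the @Field annotation to firstName, lastName and title as well. The index gets larger, but usually that does not matter. As a side effect, dirty checking will then work (assuming your JPA annotations are correct; for example, I assume getFullName is transient).
3. Use a class bridge and optionally remove getFullName. Using a class bridge also disables the dirty-check optimisation automatically.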
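
As a minimal sketch of the first option, the property can be put into a plain java.util.Properties object in code. Only the property name and its value come from the answer above; the class SearchSettings and its method are hypothetical names used for illustration, and how the properties reach Hibernate (persistence.xml, hibernate.properties, or a programmatic bootstrap) depends on your setup and is not shown.

```java
import java.util.Properties;

// Hypothetical helper; only the property name and value come from the answer.
class SearchSettings {

    // Returns settings with the dirty-check optimisation switched off,
    // so that entity updates are no longer skipped for reindexing.
    static Properties withDirtyCheckDisabled() {
        Properties props = new Properties();
        props.setProperty("hibernate.search.enable_dirty_check", "false");
        return props;
    }

    public static void main(String[] args) {
        Properties props = withDirtyCheckDisabled();
        // Assumption: these properties are then passed to whatever
        // bootstraps Hibernate in your application.
        System.out.println(props);
    }
}
```

The same key/value pair can equally be written into an existing configuration file instead of being set in code.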

score 1 · Accepted Answer

@Indexed
@Entity
public class Person {
   ...
   @Field(name="fullName") String firstName;
   @Field(name="fullName") String lastName;
   @Field(name="fullName") String title;
}

This is possible because you chose TOKENIZED, and I'm assuming your analyzer splits tokens on whitespace, since you add whitespace to separate the parts. A document can contain several repetitions of the same field, and the result is almost the same as tokenizing the concatenated string. I say almost because the relative order of the terms is no longer determined, which matters if you need a PhraseQuery that looks for keywords in a specific order.
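
The "almost the same" claim can be illustrated without Lucene at all. The sketch below is a toy model, not the Lucene or Hibernate Search API: tokenize is a hypothetical stand-in for a whitespace analyzer, and the sample names are made up. It compares the terms produced from one concatenated value with those produced from three separate values added under the same field name.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

// Toy model only: a stand-in for whitespace tokenization, not Lucene itself.
class FullNameTokens {

    // Splits a value on runs of whitespace, dropping empty tokens.
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        for (String t : value.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // One field value built by concatenation, as getFullName() does.
    static List<String> fromConcatenation(String first, String last, String title) {
        return tokenize(first + " " + last + " " + title);
    }

    // Several values under the same field name, in whatever order they arrive.
    static List<String> fromRepeatedField(String... values) {
        List<String> tokens = new ArrayList<>();
        for (String v : values) {
            tokens.addAll(tokenize(v));
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> a = fromConcatenation("Ada", "Lovelace", "Countess");
        List<String> b = fromRepeatedField("Countess", "Ada", "Lovelace");
        // Same set of terms, so single-term queries behave alike.
        System.out.println(new HashSet<>(a).equals(new HashSet<>(b)));
        // The sequence differs, which is what an order-sensitive phrase query would notice.
        System.out.println(a.equals(b));
    }
}
```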

For more complex cases you would use a ClassBridge, which disables the dirty-checking optimisation that has been getting in your way here. Hibernate Search tracks whether any persistent field was actually written to in order to decide if it can skip an expensive reindexing operation, but it cannot detect tricks such as a getter computed from other fields.
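
To make the reasoning concrete, here is a deliberately simplified model of the decision described in both answers. It is an assumption-laden sketch, not Hibernate Search's actual implementation: DirtyCheckModel, needsReindex and its parameters are hypothetical names, and the real engine considers more than this.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Simplified model of the dirty-check decision; not the real engine code.
class DirtyCheckModel {

    // Reindex when the optimisation is off, when a class bridge is present,
    // or when at least one changed property is itself an indexed property.
    static boolean needsReindex(Set<String> indexedProperties,
                                Set<String> dirtyProperties,
                                boolean dirtyCheckEnabled,
                                boolean hasClassBridge) {
        if (!dirtyCheckEnabled || hasClassBridge) {
            return true;
        }
        for (String property : dirtyProperties) {
            if (indexedProperties.contains(property)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> dirty = new HashSet<>(Arrays.asList("firstName"));

        // Question's mapping: only the derived getter is indexed, so the change is missed.
        Set<String> getterOnly = new HashSet<>(Arrays.asList("fullName"));
        System.out.println(needsReindex(getterOnly, dirty, true, false));

        // Mapping with @Field on each part: the change is detected.
        Set<String> parts = new HashSet<>(Arrays.asList("firstName", "lastName", "title"));
        System.out.println(needsReindex(parts, dirty, true, false));
    }
}
```

In this model the three suggested fixes correspond to the three ways of making needsReindex return true: turning the check off, adding a class bridge, or indexing the properties that actually change.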
