Is it possible to modify Lucene 2.2 to add an Arabic analyzer? If anyone has already done this, where can I get the source or JAR?
3 Answers
Lucene 3.0.1 has an Arabic analyzer. It is in the contrib package.
You can upgrade to Lucene 3.0.1 to get this working out of the box. You probably won't be able to use it as-is with Lucene 2.2, since the TokenStream APIs have changed in this release. But back-porting the changes to 2.2 shouldn't be very difficult if you don't want to migrate to the latest Lucene release.
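To give a rough idea of what a back-port involves: as far as I know, much of the work in an Arabic analyzer is plain character-level normalization, which does not depend on the TokenStream API at all, so only the thin filter wrapper around it would need adapting. The sketch below is not Lucene code. `ArabicNormalizeSketch` and its `normalize` method are hypothetical names, and the mapping is just the kind of normalization commonly applied to Arabic text before indexing; the exact rules in the contrib analyzer may differ.

```java
// Illustrative only: a standalone sketch of typical Arabic normalization.
// This is NOT the Lucene contrib implementation; names are hypothetical.
final class ArabicNormalizeSketch {

    // Returns a copy of the input with common Arabic orthographic
    // variants folded together and diacritics removed.
    static String normalize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= '\u064B' && c <= '\u0652') {
                continue; // drop short-vowel marks, tanween, shadda, sukun
            }
            switch (c) {
                case '\u0622': // alef with madda above
                case '\u0623': // alef with hamza above
                case '\u0625': // alef with hamza below
                    out.append('\u0627'); // fold to bare alef
                    break;
                case '\u0649': // alef maksura (dotless yeh)
                    out.append('\u064A'); // fold to yeh
                    break;
                case '\u0629': // teh marbuta
                    out.append('\u0647'); // fold to heh
                    break;
                case '\u0640': // tatweel (elongation character)
                    break;     // drop it
                default:
                    out.append(c); // everything else passes through
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "Ahmad" written with hamza and vowel marks.
        String vocalized = "\u0623\u064E\u062D\u0652\u0645\u064E\u062F";
        String folded = normalize(vocalized);
        System.out.println(vocalized.length() + " chars in, "
                + folded.length() + " chars out");
    }
}
```

In a real back-port, logic like this would sit inside a TokenFilter written against the 2.2 API, applied to each token's character buffer.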
Someone asked me how to get Arabic and Farsi support on Lucene 2.4, so these have been unofficially back-ported here: http://people.apache.org/~rmuir/
http://people.apache.org/~rmuir/lucene-analyzers-2.4.1_with_arabic_and_farsi.jar
http://people.apache.org/~rmuir/arabicFarsiLucene241_contrib.patch
http://people.apache.org/~rmuir/arabicFarsiLucene241_core.patch
This means you would only need to upgrade to 2.4.1, which is probably easier than upgrading to 2.9 or 3.0.
Hope this helps.
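Alternatively, you could try using lucene-hunspell for the analyzer. It currently works with Lucene trunk; I don't know whether it works with Lucene 3.0.1. Here is Robert Muir's explanation and a list of dictionaries, which includes Arabic. I believe you could back-port this as well. Shashikant's suggestion seems easier to implement, while this one may give better quality.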