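I need to build a program that searches documents (for example, candidates' résumés) and populates metadata from them, such as the candidate's experience, skills, and location.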

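For this I would like to use Oracle's indexing mechanism (Oracle Text Search), because it indexes all of the data in a document. When it indexes a document, I would like to first update my metadata fields from the indexed data, and only then let the content server update its index. Can anyone help me understand how the indexer works and which events it raises, so that I can hook into those events and make the modifications needed to update my metadata?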

I need to update the metadata because the requirements are:

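A wide choice of search filter criteria (searching within the résumé itself rather than only on form keywords):

- Boolean search across multiple parameters
- Ability to search by skills, years of experience, specific companies, educational qualifications, geography/location, and profile submission date
- Search by referrer, name, team, BU, and so on
- A sufficiently large result window, with filters on the results
- Predefined résumé filter criteria to assist in screening job seekers who apply through the job portal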
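
To make the requirement concrete, here is a minimal Python sketch of a boolean search across several metadata parameters. It is not an Oracle Text or WebCenter Content API; the `Resume` record, the field names, and the `matches`/`search` helpers are all hypothetical and only illustrate why each criterion needs to exist as a structured metadata field.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Resume:
    # Hypothetical metadata fields; a real repository would define its own.
    name: str
    skills: set
    years_experience: float
    location: str
    submitted: date


def matches(resume, required_skills=(), any_skills=(), min_years=0.0,
            locations=None, submitted_after=None):
    """Return True if the resume satisfies every supplied criterion (AND).

    required_skills: all must be present (AND within the parameter).
    any_skills: at least one must be present (OR within the parameter).
    """
    if not set(required_skills) <= resume.skills:
        return False
    if any_skills and not (set(any_skills) & resume.skills):
        return False
    if resume.years_experience < min_years:
        return False
    if locations is not None and resume.location not in locations:
        return False
    if submitted_after is not None and resume.submitted <= submitted_after:
        return False
    return True


def search(resumes, **criteria):
    """Filter a list of resumes, preserving their original order."""
    return [r for r in resumes if matches(r, **criteria)]
```

A call such as `search(resumes, required_skills={"java"}, min_years=5, locations={"Pune"})` only works if skills, experience, and location are already stored as discrete values, which is the reason for wanting them populated as metadata.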

2 Answers

You are looking at this problem from the wrong end. The indexer (Oracle Text) is a powerful and complex tool embedded in the workings of the database. If I am not mistaken, you are proposing to interpret the results of text indexing and use them as metadata for your content. Oracle Text generates huge amounts of data and literally chops documents up word by word, so deriving meaningful metadata from that output would be a huge task.

Instead, you should capture the metadata as close to the source as possible. If you are using MS Office, this could be done with a Word VBA macro that runs when the user saves to the repository or filesystem. I believe you can fully manipulate a document's metadata at save time. You will of course need to install the Oracle WebCenter Content Desktop Integration Suite.
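
As a rough illustration of capturing metadata near the source, the sketch below pulls a few fields out of a résumé's plain text before it is checked in. It is written in Python rather than as a Word macro, it does not call any WebCenter Content API, and the vocabularies, the regular expression, and the `extract_metadata` helper are assumptions made up for the example.

```python
import re

# Hypothetical controlled vocabularies; a real system would maintain its own.
KNOWN_SKILLS = {"java", "oracle", "python", "sql"}
KNOWN_LOCATIONS = {"pune", "delhi", "london"}

# Matches phrases such as "6 years", "3.5 years" or "10+ years".
YEARS_RE = re.compile(r"(\d+(?:\.\d+)?)\s*\+?\s*years?", re.IGNORECASE)


def extract_metadata(text):
    """Derive simple metadata values from resume text.

    Heuristic only: the largest "N years" figure is taken as the
    experience, and the alphabetically first known location wins.
    """
    words = set(re.findall(r"[a-z+#]+", text.lower()))
    years = [float(m) for m in YEARS_RE.findall(text)]
    locations = sorted(KNOWN_LOCATIONS & words)
    return {
        "skills": sorted(KNOWN_SKILLS & words),
        "years_experience": max(years) if years else None,
        "location": locations[0] if locations else None,
    }
```

The resulting dictionary would then be supplied as metadata values by whatever check-in mechanism is in use, so the search index never has to be reverse-engineered.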

answered 2014-03-07T18:43:42.993

Take a look at Oracle WebCenter Capture. WebCenter Capture can scan documents and allows metadata to be tagged on them automatically. It integrates with WebCenter Content (WCC), so you can check scanned documents directly into WebCenter Content.

http://www.oracle.com/technetwork/middleware/webcenter/content/index-090596.html

answered 2014-03-10T01:52:53.727