
How to combine all nn tags into a single phrase tag using Java

nsubj(martyrdom-4, Today-1)
cop(martyrdom-4, is-2)
det(martyrdom-4, the-3)
root(ROOT-0, martyrdom-4)
nn(Mukherjee-7, Dr-6)
prep_of(martyrdom-4, Mukherjee-7)
det(founder-9, the-8)
dep(tribute-17, founder-9)

prep_of(founder-9, Jan-11)
nn(body-15, Sangh-12)
nn(body-15, BJP-13)
nn(body-15, parent-14)
dep(tribute-17, body-15)
poss(tribute-17, My-16)
dep(martyrdom-4, tribute-17)
prep_to(tribute-17, him-19)

I want to get a single noun phrase from:

prep_of(founder-9,Jan-11)
nn(body-15, Sangh-12)
nn(body-15, BJP-13)
nn(body-15, parent-14)

The output should be: jan sangh BJP parent

1 Answer

This looks like typed-dependency output from the Stanford Parser. If so, you should already have the noun phrases (NP nodes) in the sentence's parse tree, and you can extract the lowest-level NP nodes from it to get the noun phrases you need. For example, for the sentence "Today is the martyrdom of Dr. Mukherjee, founder of Jan Sangh BJP.", the parse tree would be:

(ROOT
  (S
    (NP (NNP Today))
    (VP (VBZ is)
      (NP
        (NP (DT the) (NN martyrdom))
        (PP (IN of)
          (NP
            (NP (NNP Dr.) (NNP Mukherjee))
            (, ,)
            (NP
              (NP (NN founder))
              (PP (IN of)
                (NP (NNP Jan) (NNP Sangh) (NNP BJP))))))))
    (. .)))

In this tree, the lowest-level NPs that contain proper nouns (NNPs) will give you most of the noun phrases you need, although not every one of them will be a named entity you are after. In this case the output will be:

(NP (NNP Today))
(NP (NNP Dr.) (NNP Mukherjee))
(NP (NNP Jan) (NNP Sangh) (NNP BJP))
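
To make the idea concrete, here is a minimal Java sketch of that extraction. It is an illustration under assumptions, not the Stanford Parser API: it reads the bracketed tree as a plain string with a small hand-written reader, and the names `LowestNpExtractor`, `Node`, `parse`, and `extract` are hypothetical. With the real parser you would walk its own tree objects instead, but the selection rule is the same: keep an NP only if it has no NP below it and at least one NNP child.

```java
import java.util.ArrayList;
import java.util.List;

class LowestNpExtractor {

    /** Minimal tree node (hypothetical type): a label, children, and a word for preterminals. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        String word; // set only for preterminals such as (NNP Jan)
        Node(String label) { this.label = label; }
    }

    private final String text;
    private int pos = 0;

    private LowestNpExtractor(String text) { this.text = text; }

    /** Parses a well-formed Penn-style bracketed tree such as "(NP (NNP Jan))". */
    static Node parse(String bracketed) {
        return new LowestNpExtractor(bracketed).readNode();
    }

    private void skipSpaces() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) pos++;
    }

    private String readToken() {
        skipSpaces();
        int start = pos;
        while (pos < text.length()) {
            char c = text.charAt(pos);
            if (c == '(' || c == ')' || Character.isWhitespace(c)) break;
            pos++;
        }
        return text.substring(start, pos);
    }

    private Node readNode() {
        skipSpaces();
        pos++; // consume '('
        Node node = new Node(readToken());
        skipSpaces();
        while (text.charAt(pos) != ')') {
            if (text.charAt(pos) == '(') node.children.add(readNode());
            else node.word = readToken();
            skipSpaces();
        }
        pos++; // consume ')'
        return node;
    }

    /** True if any descendant of n is itself an NP. */
    static boolean containsNp(Node n) {
        for (Node c : n.children) {
            if (c.label.equals("NP") || containsNp(c)) return true;
        }
        return false;
    }

    /** True if n has a direct child tagged NNP or NNPS. */
    static boolean hasProperNounChild(Node n) {
        for (Node c : n.children) {
            if (c.label.startsWith("NNP")) return true;
        }
        return false;
    }

    /** Appends the words under n, left to right. */
    static void words(Node n, List<String> out) {
        if (n.word != null) out.add(n.word);
        for (Node c : n.children) words(c, out);
    }

    static void collect(Node n, List<String> out) {
        if (n.label.equals("NP") && !containsNp(n) && hasProperNounChild(n)) {
            List<String> w = new ArrayList<>();
            words(n, w);
            out.add(String.join(" ", w));
        }
        for (Node c : n.children) collect(c, out);
    }

    /** Returns the text of every lowest-level NP that contains a proper noun. */
    static List<String> extract(String bracketed) {
        List<String> out = new ArrayList<>();
        collect(parse(bracketed), out);
        return out;
    }

    public static void main(String[] args) {
        String tree = "(ROOT (S (NP (NNP Today)) (VP (VBZ is) (NP (NP (DT the) (NN martyrdom))"
                + " (PP (IN of) (NP (NP (NNP Dr.) (NNP Mukherjee)) (, ,)"
                + " (NP (NP (NN founder)) (PP (IN of) (NP (NNP Jan) (NNP Sangh) (NNP BJP))))))))"
                + " (. .)))";
        System.out.println(extract(tree)); // prints [Today, Dr. Mukherjee, Jan Sangh BJP]
    }
}
```

The reader assumes balanced parentheses and does no error handling, which is enough for parser output but not for arbitrary input. NPs such as "the martyrdom" and "founder" are skipped because they contain no NNP.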
answered 2012-08-25T23:51:49.060