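The word RR is tagged NN in one case and NNP in the other. The word applicant is tagged NN in one case and JJ in the other. Why does the same word get different tags? Shouldn't anything that starts with a capital letter be tagged NNP?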
In [45]: testb
Out[45]:
['applicant',
'applicant',
'applicant',
'applicant',
'RR',
'RR',
'Khan',
'he',
'how',
'let',
'she',
'that',
'there',
'what',
'where',
'firm']
In [46]: [nltk.pos_tag([i]) for i in testb]
Out[46]:
[[('applicant', 'NN')],
[('applicant', 'NN')],
[('applicant', 'NN')],
[('applicant', 'NN')],
[('RR', 'NN')],
[('RR', 'NN')],
[('Khan', 'NNP')],
[('he', 'PRP')],
[('how', 'WRB')],
[('let', 'VB')],
[('she', 'PRP')],
[('that', 'IN')],
[('there', 'RB')],
[('what', 'WP')],
[('where', 'WRB')],
[('firm', 'NN')]]
In [47]: nltk.pos_tag(testb)
Out[47]:
[('applicant', 'JJ'),
('applicant', 'NN'),
('applicant', 'NN'),
('applicant', 'JJ'),
('RR', 'NNP'),
('RR', 'NNP'),
('Khan', 'NNP'),
('he', 'PRP'),
('how', 'WRB'),
('let', 'VB'),
('she', 'PRP'),
('that', 'IN'),
('there', 'EX'),
('what', 'WP'),
('where', 'WRB'),
('firm', 'NN')]
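
To make the discrepancy easier to see, the two results can be compared position by position. This is only a minimal sketch: the tag lists are copied by hand from Out[46] (each word tagged on its own) and Out[47] (the whole list tagged in one call), and the helper name `tag_differences` is made up for illustration. It does not call NLTK.

```python
# Words and tags copied from the outputs above; nothing here calls NLTK.
words = ['applicant', 'applicant', 'applicant', 'applicant', 'RR', 'RR',
         'Khan', 'he', 'how', 'let', 'she', 'that', 'there', 'what',
         'where', 'firm']

# Out[46]: every word passed to pos_tag as its own one-element list.
isolated = ['NN', 'NN', 'NN', 'NN', 'NN', 'NN', 'NNP', 'PRP', 'WRB', 'VB',
            'PRP', 'IN', 'RB', 'WP', 'WRB', 'NN']

# Out[47]: the full list passed to pos_tag in a single call.
in_context = ['JJ', 'NN', 'NN', 'JJ', 'NNP', 'NNP', 'NNP', 'PRP', 'WRB',
              'VB', 'PRP', 'IN', 'EX', 'WP', 'WRB', 'NN']


def tag_differences(words, tags_a, tags_b):
    """Return (index, word, tag_a, tag_b) for every position where the tags disagree."""
    return [
        (i, word, a, b)
        for i, (word, a, b) in enumerate(zip(words, tags_a, tags_b))
        if a != b
    ]


diffs = tag_differences(words, isolated, in_context)
for index, word, alone, together in diffs:
    print(f"{index:2d} {word:10s} alone={alone:4s} in list={together}")
```

Besides applicant and RR, this also flags there (RB on its own, EX in the list), so the disagreement is not limited to capitalized words.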