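I am learning the Google Cloud Natural Language API. The API basics page says that the response of the analyze_syntax() method should be:
- a "list" of sentences (with text and analysis)
- a "list" of tokens (with text and analysis)
Please refer to this: Syntactic Analysis basics
Instead, the output I receive is: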
sentences {
  text {
    content: "Once again i am typing a sentence to see if it finally return a proper value."
  }
}
sentences {
  text {
    content: "The problem is that offsets are -1 for all tokens which is not proper."
    begin_offset: 78
  }
}
tokens {
  text {
    content: "Once"
  }
  part_of_speech {
    tag: ADV
  }
  dependency_edge {
    head_token_index: 1
    label: ADVMOD
  }
  lemma: "Once"
}
tokens {
  text {
    content: "again"
    begin_offset: 5
  }
  part_of_speech {
    tag: ADV
  }
  dependency_edge {
    head_token_index: 4
    label: ADVMOD
  }
  lemma: "again"
}
tokens {
  text {
    content: "i"
    begin_offset: 11
  }
  part_of_speech {
    tag: PRON
    case: NOMINATIVE
    number: SINGULAR
    person: FIRST
  }
  dependency_edge {
    head_token_index: 4
    label: NSUBJ
  }
  lemma: "i"
}
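Note that there is no
- "list" of sentences, each of them analyzed
- "list" of tokens, each of them analyzed
Instead, every sentence and every word appears to be handled as a separate entry. Why does my result differ from the one shown in the documentation?
Here is the actual code.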
import os
# import argparse

from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

# Point the client library at the service account key file.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "C:\\Users\\user\\Downloads\\test-ee23cf382897.json"


def analyze(user_said):
    """Changed to suit my needs"""
    client = language.LanguageServiceClient()
    document = types.Document(content=user_said, type=enums.Document.Type.PLAIN_TEXT)
    syntax = client.analyze_syntax(document=document, encoding_type='UTF8')
    print(syntax)
    # Save the printed form of the response for later inspection.
    with open('syntax_analysis.txt', 'w') as file:
        file.write(str(syntax))

#
# if __name__ == '__main__':
#     parser = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
#     parser.add_argument('user_said', help='The filename of the movie review you would like to analyze.')
#     args = parser.parse_args()
#     analyze(args.user_said)
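One possible reading of the output above: print(syntax) shows the protobuf text format, which writes a repeated field as one `sentences { ... }` or `tokens { ... }` block per element. If so, the response may still hold list-like fields. A field missing from the text, such as begin_offset on the first token, is presumably the default value 0, which that format leaves out. Below is a minimal sketch of reading the response as lists, assuming it exposes `sentences` and `tokens` as iterables. The `summarize` helper and the SimpleNamespace stand-in are hypothetical and only mimic the printed output so the snippet runs without calling the API. Tags are plain strings here, whereas the real response is expected to use enum values.

```python
from types import SimpleNamespace


def summarize(response):
    """Return (sentence texts, token triples) from a response-like object.

    Assumption: `response.sentences` and `response.tokens` are iterables,
    as the repeated blocks in the printed output suggest.
    """
    sentences = [s.text.content for s in response.sentences]
    tokens = [
        (t.text.content, t.part_of_speech.tag, t.text.begin_offset)
        for t in response.tokens
    ]
    return sentences, tokens


def _span(content, begin_offset=0):
    # Hypothetical stand-in for a text span; the offset defaults to 0,
    # mirroring how the printed output omits zero offsets.
    return SimpleNamespace(content=content, begin_offset=begin_offset)


def _token(content, tag, begin_offset=0):
    # Hypothetical stand-in for a token; the tag is a plain string here.
    return SimpleNamespace(
        text=_span(content, begin_offset),
        part_of_speech=SimpleNamespace(tag=tag),
    )


# Stand-in data copied from the printed output; not a real API response.
fake_response = SimpleNamespace(
    sentences=[
        SimpleNamespace(text=_span(
            "Once again i am typing a sentence to see if it finally return a proper value.")),
        SimpleNamespace(text=_span(
            "The problem is that offsets are -1 for all tokens which is not proper.", 78)),
    ],
    tokens=[
        _token("Once", "ADV"),
        _token("again", "ADV", 5),
        _token("i", "PRON", 11),
    ],
)

sentences, tokens = summarize(fake_response)
for text, tag, offset in tokens:
    print(offset, tag, text)
```

With the real client, the same loop would be applied to the object returned by analyze_syntax() instead of fake_response, provided the assumption about iterable fields holds.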
Additional information:
- Python 3.6.5
- PyCharm Community Edition 2018.1