python - Improving prediction time of Hugging Face Transformers models without a GPU

Asked 2021-11-23T07:40:37.217

Viewed 131 times

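I have used Hugging Face Transformers models for many tasks and they work well, but the one problem is response time. Generating a result takes about 6-7 seconds, and sometimes as long as 15-20 seconds. I tried the same thing on Google Colab with a GPU, and there it was very fast, producing results within a few seconds. Since GPUs are restricted on my current server, is there any way to reduce the response time of these models using only the CPU?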

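I am currently using the Google Pegasus model for text summarization: https://huggingface.co/google/pegasus-xsum

and the Parrot paraphraser, which internally uses a Transformers model (T5-based, going by the model name): https://huggingface.co/prithivida/parrot_paraphraser_on_T5

Here is the code for the Pegasus model: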

from transformers import PegasusTokenizer, TFPegasusForConditionalGeneration

model = TFPegasusForConditionalGeneration.from_pretrained('google/pegasus-xsum')
tokenizer = PegasusTokenizer.from_pretrained('google/pegasus-xsum')

ARTICLE_TO_SUMMARIZE = (
"This is text to summarize"
)
# truncation=True is needed for max_length to take effect on long inputs
inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=1024, truncation=True, return_tensors='tf')

# Generate Summary
summary_ids = model.generate(inputs['input_ids'])
print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])

Even a slight improvement would help!
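
To judge whether any change actually helps, it is useful to measure the latency the same way each time. Below is a minimal sketch of a timing helper using only the standard library. The name `time_call` and its `repeats`/`warmup` parameters are hypothetical and not part of Transformers; the warm-up call is there on the assumption that the first `generate()` call is slower than later ones.

```python
import statistics
import time


def time_call(fn, *args, repeats=5, warmup=1, **kwargs):
    """Call fn(*args, **kwargs) repeatedly and return (median_seconds, timings).

    Hypothetical helper for illustration; not part of any library.
    """
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        fn(*args, **kwargs)

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        timings.append(time.perf_counter() - start)

    # The median is less sensitive to a single slow run than the mean.
    return statistics.median(timings), timings


# Usage with the model from the question (left commented out so the
# sketch runs without downloading any weights):
# median_s, _ = time_call(model.generate, inputs['input_ids'])
# print(f"median generate() latency: {median_s:.2f} s")
```

Running it before and after a change, with the same input text, gives comparable numbers.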

0 answers
