spaCy - Running spaCy on multiple GPUs to predict NER

Question

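I am using spaCy to predict NER tags on a GPU. I have a larger machine with 8 GPUs, and I would like to use all of them.

One way to use every GPU is to run 8 separate scripts, one per GPU, and pass text to each script through a Kafka queue.

Is there another way to use all of the GPUs for NER prediction from a single script?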

score 0 · Accepted Answer

Have you tried running spaCy with MPI? I am still experimenting with the following code myself, so please let me know whether it works for you!

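from mpi4py import MPI
import cupy
import spacy
from thinc.api import set_gpu_allocator, require_gpu

# Assumes one MPI process per GPU, e.g. launched with: mpirun -n 8 python script.py
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    texts = ["His friend Nicolas J. Smith is here with Bart Simpon and Fred."] * 100
    # scatter() needs exactly one item per rank, so split the texts into `size` chunks
    data = [texts[i::size] for i in range(size)]
else:
    data = None

# Each rank receives its own chunk of texts
unit = comm.scatter(data, root=0)

# Pin this rank to the GPU with the same index
with cupy.cuda.Device(rank):
    set_gpu_allocator("pytorch")  # requires PyTorch to be installed
    require_gpu(rank)
    nlp = spacy.load("en_core_web_lg")
    nlp.add_pipe("merge_entities")
    tmp_list = []
    for doc in nlp.pipe(unit):
        # Replace every entity token with its entity label
        res = " ".join([t.text if not t.ent_type_ else t.ent_type_ for t in doc])
        tmp_list.append(res)

# Rank 0 receives a list with one result list per rank; other ranks receive None
result = comm.gather(tmp_list, root=0)

if rank == 0:
    print(result)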
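One detail worth spelling out: comm.scatter expects a list with exactly one item per rank, and comm.gather returns one result list per rank rather than a flat list. The following is a minimal sketch, in plain Python without MPI or spaCy, of how the texts might be partitioned before the scatter and how the gathered results might be put back into input order. The helper names partition and merge are hypothetical, and the upper-casing step is only a stand-in for the per-rank NER work.

```python
def partition(texts, n_ranks):
    """Round-robin split: chunk r holds texts[r], texts[r + n_ranks], ..."""
    return [texts[r::n_ranks] for r in range(n_ranks)]


def merge(chunks):
    """Undo partition(): interleave per-rank results back into input order."""
    total = sum(len(chunk) for chunk in chunks)
    merged = [None] * total
    for r, chunk in enumerate(chunks):
        # Chunk r owns positions r, r + n_ranks, r + 2 * n_ranks, ...
        merged[r::len(chunks)] = chunk
    return merged


texts = [f"text {i}" for i in range(10)]
chunks = partition(texts, 4)  # what rank 0 would hand to comm.scatter
# Stand-in for the per-rank NER step (assumption: one output per input text)
results = [[t.upper() for t in chunk] for chunk in chunks]
restored = merge(results)  # what rank 0 would do with the comm.gather output
```

Round-robin splitting keeps the chunks within one item of each other in size, so no GPU sits idle while another works through a much longer chunk.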
