If you want processes to come and go while their memory stays reachable from other processes, you need something to keep that data alive in the meantime. One solution is to have a dedicated process own the allocations and then build the cudf columns in the consumers from IPC handles. I am not sure how to do this in Python; in C++ it is fairly straightforward.

Something like
//In the code handling your allocations
gdf_column col;
cudaIpcMemHandle_t handle_data, handle_valid;
cudaIpcGetMemHandle(&handle_data, col.data);
cudaIpcGetMemHandle(&handle_valid, col.valid);
//serialize the two handles and hand them to the other process
//(write them to a file, send them over a socket, etc.)

//In the code consuming it
gdf_column col;
//deserialize these by reading from a file or however you want to make this
//binary data available
cudaIpcMemHandle_t handle_data, handle_valid;
cudaIpcOpenMemHandle((void**) &col.data, handle_data, cudaIpcMemLazyEnablePeerAccess);
cudaIpcOpenMemHandle((void**) &col.valid, handle_valid, cudaIpcMemLazyEnablePeerAccess);
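For what it's worth, a rough sketch of the same idea from Python: Numba exposes CUDA IPC via DeviceNDArray.get_ipc_handle(), and cudf can build a Series on top of an object exposing __cuda_array_interface__. This is only an untested sketch, not something I can vouch for; the process layout, queue names, and array contents below are made up for illustration.

# A minimal sketch, assuming Numba's CUDA IPC support and cudf installed.
import multiprocessing as mp

import numpy as np
from numba import cuda


def producer(handle_q, done_q):
    # Allocate device memory in this process and export an IPC handle for it.
    d_arr = cuda.to_device(np.arange(10, dtype=np.int32))
    handle_q.put(d_arr.get_ipc_handle())  # the handle object is picklable
    done_q.get()  # keep this process (and the allocation) alive until the consumer is done


def consumer(handle_q, done_q):
    import cudf
    handle = handle_q.get()
    d_arr = handle.open()       # map the other process's allocation into this one
    print(cudf.Series(d_arr))   # build a cudf column on top of that memory
    handle.close()
    done_q.put(True)


if __name__ == '__main__':
    mp.set_start_method('spawn')  # CUDA contexts do not survive fork
    handle_q, done_q = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=producer, args=(handle_q, done_q)),
             mp.Process(target=consumer, args=(handle_q, done_q))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

The important part is the same as in the C++ version: the producing process has to stay alive for as long as the consumer uses the memory, which is why the producer blocks on the queue until the consumer signals it is done.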
There are also third-party solutions from RAPIDS contributors, such as BlazingSQL, that provide this in Python and give you a SQL interface to cudf DataFrames.

There you would do something like
#run this code in your service to basically select your entire table and get it
#back as a cudf DataFrame
from blazingsql import BlazingContext
import pickle

bc = BlazingContext()
bc.create_table('performance', some_valid_gdf)  #you can also pass a file or list of files here
result = bc.sql('SELECT * FROM main.performance', ['performance'])

with open('context.pkl', 'wb') as output:
    pickle.dump(bc, output, pickle.HIGHEST_PROTOCOL)
with open('result.pkl', 'wb') as output:
    pickle.dump(result, output, pickle.HIGHEST_PROTOCOL)

#the following code can be run in another process; as long as result
#contains the same information as above, its existence is managed by BlazingSQL
from blazingsql import BlazingContext
import pickle

with open('context.pkl', 'rb') as input:
    bc = pickle.load(input)
with open('result.pkl', 'rb') as input:
    result = pickle.load(input)

#get the result object
result = result.get()
#create a GDF (cudf DataFrame) from the result object
result_gdf = result.columns
Disclaimer: I work for Blazing.