I am new to Spark. I need to execute a function myfunc() in parallel for each path in a list and then concatenate all the generated dataframes into one.

Currently I am using a for loop, which I believe runs sequentially. How can I improve it?

import databricks.koalas as ks

appended_data = []

# This loop runs sequentially on the driver: each call to myfunc()
# must finish before the next one starts.
for path in paths_list:
    data = myfunc(path)
    appended_data.append(data)

appended_data = ks.concat(appended_data)
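One common way to speed this up, assuming myfunc() mainly triggers Spark jobs, is to call it from a thread pool on the driver: Spark can schedule jobs submitted from different threads concurrently. This is a minimal sketch, not a definitive fix; the max_workers value is an arbitrary assumption, and myfunc() and paths_list are assumed to be defined as above.

import databricks.koalas as ks
from concurrent.futures import ThreadPoolExecutor

# Jobs submitted from separate driver threads can run concurrently
# in Spark; myfunc and paths_list are assumed defined as above.
with ThreadPoolExecutor(max_workers=8) as executor:  # pool size is an assumption
    # executor.map keeps the results in the same order as paths_list
    frames = list(executor.map(myfunc, paths_list))

appended_data = ks.concat(frames)

If myfunc() is essentially just a single Spark read per path, letting Spark read all the paths in one job is usually simpler than threading, but that depends on what myfunc() actually does.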