I am trying to process a large binary file with PySpark, but I always get an OutOfMemoryError. I have tried everything I could think of, such as increasing executor and driver memory and repartitioning the RDD, without success. Does Spark partition a single large binary file? If not, how can such a file be processed? The binary file I am currently using is larger than 2 GB.
