I have a Python script running on an AWS EC2 instance (on AWS Linux), and the script pulls a Parquet file from S3 into a pandas DataFrame. I'm now migrating to a new AWS account and setting up a new EC2 instance. This time, when I execute the same script in a Python virtual environment, I get "Segmentation fault" and the execution ends.
import pandas as pd
import numpy as np
import pyarrow.parquet as pq
import s3fs
import boto3
from fastparquet import write
from fastparquet import ParquetFile
print("loading...")
df = pd.read_parquet('<my_s3_path.parquet>', engine='fastparquet')
All packages import without errors, and all S3 and AWS configurations are set.
When executing the full script I get:
loading...
Segmentation fault
As you can see, there's not much to work with. I've been googling for a few hours and have seen a lot of speculation about possible causes of this symptom. I'd appreciate any help here.
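To get more than a bare "Segmentation fault", one option is to turn on Python's built-in `faulthandler` and record which package versions the new virtual environment actually has, so they can be compared with the old instance. The following is only a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`); `installed_versions` is a hypothetical helper name, and the package list is simply taken from the imports in the script above.

```python
import faulthandler
import sys
from importlib import metadata

# Ask the interpreter to print the Python traceback to stderr
# if the process receives a fatal signal such as SIGSEGV.
faulthandler.enable()


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}.

    Hypothetical helper: it reads package metadata only, so it does not
    import (and cannot crash inside) the packages' native extensions.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages imported by the script in the question.
packages = ["pandas", "numpy", "pyarrow", "s3fs", "boto3", "fastparquet"]

print("python", sys.version)
for name, version in installed_versions(packages).items():
    print(name, version)
```

With these lines placed at the top of the script, a crash should also print the Python line that was executing when it happened. The same handler can be enabled without editing the script by running it as `python -X faulthandler script.py`. Running the version report on both the old and the new instance would show whether the two environments differ.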