
I want to read some logs, but I can't. So far I have tried:

  • hadoop fs -text <file>

But the only output I get is: INFO compress.CodecPool: Got brand-new decompressor [.lz4] (the same happens with .snappy)

  • val rawRdd = spark.sparkContext.sequenceFile[BytesWritable, String](<file>)

This fails with: <file> is not a SequenceFile

  • val rawRdd = spark.read.textFile(<file>)

In this case I get java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z

  • Downloading the file to the local file system, then running lz4 -d <file> to decompress it and trying to view the contents

  • I followed this SO post:

    with open(snappy_file, "r") as input_file:
        data = input_file.read()
        decompressor = snappy.hadoop_snappy.StreamDecompressor()
        uncompressed = decompressor.decompress(data)

But when I call print(uncompressed), all I get is b''
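One thing that might narrow this down is checking what container format the file actually uses, since each tool above expects a different one. The sketch below is only a diagnostic under stated assumptions: `sniff_format` and `sniff_file` are hypothetical helper names, and the premise that files written by Hadoop's LZ4 or Snappy codecs use a block layout without a magic number (so they would show up as "unknown" here) is an assumption about these particular logs, not something confirmed above. The three magic prefixes themselves are the standard ones for SequenceFile, the LZ4 frame format, and the Snappy framing format.

```python
# Leading bytes ("magic numbers") of a few container formats.
MAGIC_PREFIXES = [
    (b"SEQ", "Hadoop SequenceFile"),
    (b"\x04\x22\x4d\x18", "LZ4 frame (the format the lz4 CLI reads)"),
    (b"\xff\x06\x00\x00sNaPpY", "Snappy framing format"),
]


def sniff_format(header: bytes) -> str:
    """Return a best-guess container name for the first bytes of a file."""
    for prefix, name in MAGIC_PREFIXES:
        if header.startswith(prefix):
            return name
    # Assumption: Hadoop block-compressed output carries no magic number,
    # so it would land in this branch.
    return "unknown (possibly Hadoop block-compressed data)"


def sniff_file(path: str) -> str:
    """Read the first bytes of `path` and guess its container format."""
    # Binary mode matters: text mode would try to decode compressed bytes.
    with open(path, "rb") as f:
        return sniff_format(f.read(16))
```

Running `sniff_file` on a local copy of one of the logs would show whether `lz4 -d` or `sequenceFile` even has a chance of recognising it. If the result is "unknown", the file probably needs to be read through a Hadoop codec rather than a standalone decompressor.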
